#1
Blocking Dynamic URLs with Robots.txt
Ok friends,
Which robots.txt rule actually works for blocking dynamic URLs? We have gone back and forth on this many times in the forums, and the answers vary. Some say that:
user-agent: *
disallow: /filename.php
will block not only the filename.php file but also any query strings or parameters attached to it. Others (like me) say that:
user-agent: *
disallow: /filename.php*
is the one that works.

So I thought of starting a small test to see which rule is really effective in blocking dynamic URLs through the robots.txt file. The test will go like this: we take a domain that has dynamic pages in it, then try to block those pages with both of the rules above, plus any others that anyone here can suggest. We apply the rules one at a time and see which ones really block the dynamic pages.

I believe this test will be an eye-opener for everyone who, like me, is still unsure about this. Thoughts?
Thanks!
#2
Sure, that sounds nice. We need a volunteer, though.
#3
You should also try:
user-agent: *
disallow: /filename.php?
Cheers
#4
Boy, the Gbot is devouring SEOchat threads like crazy...
I made this thread about 30 minutes ago, and it's already ranking at #1 for the keyphrase "Blocking Dynamic URLs Through Robots.txt". I feel that I chose the right title for the thread. This also shows that Google meant it when they said they are now following "Minty Fresh Indexing".
#5
Thanks! We'll surely try that out as well. Any volunteers care to spare a domain for the test?
#6
I've just run:
Code:
user-agent: *
disallow: /filename.php?
through the robots.txt analysis tool in Google's webmaster tools, and it shows that /filename.php with no query string is allowed, whilst /filename.php?id=5 is blocked.

From personal experience, I've done something similar on an existing site:
Code:
user-agent: *
disallow: /contact.php?id=
Having only one variable in use for that file meant I could include it in the robots.txt file. Since I started using this, the existing indexed pages that use the variable have been dropped from Google's index, whilst the main page without a query string has remained.
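The results reported above are consistent with plain prefix matching, where a Disallow value blocks any URL whose path and query string start with that value. The following is only a minimal sketch of that idea in Python, not how any particular crawler is implemented. The function name `is_blocked` and the sample paths are assumptions made for illustration, and real robots may normalise or percent-decode URLs before comparing.

```python
# Sketch of prefix matching as described in the original 1994 robots.txt
# standard: a Disallow value blocks every URL that starts with it.
# Assumption: no wildcard support and no URL normalisation.

def is_blocked(disallow_value: str, url_path: str) -> bool:
    """Return True if url_path starts with the Disallow value.

    An empty Disallow value blocks nothing.
    """
    return bool(disallow_value) and url_path.startswith(disallow_value)


# Rules discussed in this thread and some hypothetical paths to test them on.
RULES = ["/filename.php", "/filename.php?", "/contact.php?id="]
PATHS = ["/filename.php", "/filename.php?id=5", "/contact.php", "/contact.php?id=7"]

if __name__ == "__main__":
    for rule in RULES:
        for path in PATHS:
            verdict = "blocked" if is_blocked(rule, path) else "allowed"
            print(f"Disallow: {rule:<18} {path:<22} {verdict}")
```

Under this model, `/filename.php` blocks the bare file and every query string on it, while `/filename.php?` leaves the bare file crawlable and blocks only URLs that carry a query string.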
#7
Thanks for your input. More observations, anybody?
#8
Nice Info...
Thanks
#9
Use this:
Code:
user-agent: *
disallow: /*filename.php
Code:
http://www.skatevideosonline.net/filename.php?id=23
Blocked by line 4: Disallow: /*filename.php
http://www.skatevideosonline.net/filename.php
Blocked by line 4: Disallow: /*filename.php
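For robots that support the `*` wildcard and the `$` end-of-URL marker, a rule such as `/*filename.php` can be thought of as a pattern anchored at the start of the path. The sketch below is one possible way to model that in Python. It is an illustration under that assumption, not the matching code of Googlebot or any other crawler, and the helper names `pattern_to_regex` and `is_blocked` are hypothetical.

```python
import re


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern with '*' and '$' into a regex.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the pattern at the end of the URL.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with "match anything".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_blocked(pattern: str, url_path: str) -> bool:
    """Return True if the pattern matches from the start of url_path."""
    # re.match anchors at the start, so a pattern without wildcards
    # behaves like an ordinary prefix match.
    return pattern_to_regex(pattern).match(url_path) is not None


if __name__ == "__main__":
    for rule in ["/*filename.php", "/filename.php*", "/filename.php$"]:
        for path in ["/filename.php", "/filename.php?id=23", "/dir/filename.php"]:
            verdict = "blocked" if is_blocked(rule, path) else "allowed"
            print(f"Disallow: {rule:<16} {path:<22} {verdict}")
```

In this model a trailing `*`, as in `/filename.php*`, matches exactly the same URLs as `/filename.php` on its own. A robot that does not understand wildcards would presumably treat the `*` as a literal character, in which case the rule would match almost nothing.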
#10
Thanks Galen
#11
Since 1994, there has been a universally accepted standard for robots.txt, and it is defined here: A Standard for Robot Exclusion. Yes, it is old. Yes, it is just one page. Yes, it looks like a personal web site, but it completely defines the standard, and every serious robot designer refers to it. Read that page carefully and you will know which rule in robots.txt should work for blocking dynamic URLs.

On top of that, Google, Yahoo, Microsoft and others have each defined their own private extensions to this standard, but they did not agree on these extensions among themselves. If you use these private extensions in the part of your robots.txt that follows User-agent: *, expect that a few robots will understand them and many will not. By the way, if you want to check this, you have to look at what several bots do, not only Googlebot.

I would recommend using these private extensions only after a User-agent line that names a robot that supports them.

Private extensions to the standard include:
- * used as a wildcard
- $ used as a mark for the end of the URL
- the Allow: directive
- the Crawl-delay: directive

Jean-Luc
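The recommendation above, to keep private extensions under a User-agent line for a robot that supports them, can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the sample robots.txt text, the helper names `parse_groups` and `rules_for`, and the simplified parsing, which handles only User-agent and Disallow lines. Agent names are compared as case-insensitive substrings, as the 1994 standard recommends.

```python
# Hypothetical robots.txt: the wildcard rule is given only to Googlebot,
# while all other robots get a plain prefix rule.
SAMPLE = """\
User-agent: Googlebot
Disallow: /*filename.php

User-agent: *
Disallow: /filename.php
"""


def parse_groups(text: str) -> dict:
    """Map each lower-cased user-agent name to its list of Disallow values."""
    groups = {}
    current_agents = []
    seen_rule = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if seen_rule:  # a User-agent line after rules starts a new record
                current_agents = []
                seen_rule = False
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field == "disallow":
            seen_rule = True
            for agent in current_agents:
                groups[agent].append(value)
    return groups


def rules_for(groups: dict, agent: str) -> list:
    """Return the rules of the first named record matching agent, else '*'."""
    agent = agent.lower()
    for name, rules in groups.items():
        if name != "*" and name in agent:
            return rules
    return groups.get("*", [])


if __name__ == "__main__":
    groups = parse_groups(SAMPLE)
    for bot in ["Googlebot/2.1", "SomeOtherBot/1.0"]:
        print(bot, "->", rules_for(groups, bot))
```

With a layout like this, a robot that does not understand wildcards never sees the `/*filename.php` rule and falls back to the plain prefix rule in the `*` record.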
#12
So you mean that, for example:
/shopping_cart.php?
would block all shopping cart pages, so that none of them are crawled, including dynamically generated pages? Does the above rule prevent all shopping cart pages from being indexed? Thanks.
#13