#1
Why do this with Robots.txt
I am researching some companies that appear in the top search results and noticed that one of them had this in their robots.txt file:

User-agent: *
Disallow: /google/
Disallow: /mirago/
Disallow: /overture/
Disallow: /looksmart/
Disallow: /looksmart2/
Disallow: /yell/
Disallow: /thomson/
Disallow: /discounts/

Could anyone explain the reason for this? I know it stops engines from crawling those directories; I just found it interesting that most of them are named after search engines themselves. It's not some dark secret robot trick, is it?

Also, what is the correct syntax for a robots.txt file that allows all engines to crawl your site? One of the posts in here said it was:

User-agent: *
Disallow:

But this resulted in my site NOT being crawled (i.e. no title and description in Google now!).

Last edited by Jasontnyc : January 27th, 2005 at 06:33 AM. Reason: testing reputation system - please ignore
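For anyone who wants to check what those rules actually block, here is a minimal sketch using Python's standard `urllib.robotparser`. The bot name `ExampleBot` and the `www.example.com` URLs are placeholders and not taken from the site in question.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted in the post above, one directive per line.
RULES = """\
User-agent: *
Disallow: /google/
Disallow: /mirago/
Disallow: /overture/
Disallow: /looksmart/
Disallow: /looksmart2/
Disallow: /yell/
Disallow: /thomson/
Disallow: /discounts/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# Hypothetical URLs: one inside a disallowed directory, one outside.
tracking_ok = parser.can_fetch("ExampleBot", "http://www.example.com/google/index.html")
home_ok = parser.can_fetch("ExampleBot", "http://www.example.com/index.html")

print(tracking_ok, home_ok)  # False True
```

Only the listed directories are off limits; the rest of the site stays crawlable.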
#2
See this post; I gave you the answer there with links. The advice you got was wrong.
http://forums.seochat.com/showthrea...96078#post96078

User-agent: *
Disallow:

tells the spider that it can crawl whatever it wants to.
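The difference between an empty `Disallow:` and `Disallow: /` is easy to mix up, so here is a small sketch that compares the two with Python's standard `urllib.robotparser`. The helper `build_parser`, the bot name, and the URL are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser


def build_parser(rules: str) -> RobotFileParser:
    """Parse robots.txt text held in a string (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


# Empty Disallow value: nothing is disallowed, so everything may be crawled.
allow_all = build_parser("User-agent: *\nDisallow:\n")

# A single slash disallows every path on the site.
block_all = build_parser("User-agent: *\nDisallow: /\n")

url = "http://www.example.com/index.html"  # placeholder URL
print(allow_all.can_fetch("ExampleBot", url))  # True
print(block_all.can_fetch("ExampleBot", url))  # False
```

So if a site dropped out of the index after switching to the empty `Disallow:` form, the cause is more likely something else, such as a stray slash or a different rule elsewhere in the file.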
#3
Quote:
Yep, a cheater, if I had to guess... On index.html or whatever, it's probably like this... PHP Code:
So it probably is Quote:
Dan
P.S. Report it to Google ;)
#4
Tim, I noticed no one had provided you with what would appear to be the proper answer, which is this:

All the engines listed provide some form of PPC service. By creating a page or directory on your site for each of those engines, the webmaster can monitor the hits received from each PPC service and analyse them with a log file analyser. The pages within those directories would be copies of your home page, or redirects to it, so the user is none the wiser and is not distracted from their experience.

One of the problems with PPC is that every service provides a different reporting system. If you monitor what is hitting your site using software under your own control, you can achieve several objectives: validate what the PPC services are charging you, watch for competitors draining your budget by clicking your PPC adverts, and see time-of-day and day-of-week effects.

If you run such a system to monitor PPC traffic, the last thing you want is a search bot crawling those pages too. Hence the robots.txt rules for those directories.

Last edited by dazzlindonna : October 19th, 2004 at 05:44 PM. Reason: no fake sigs allowed
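To make the log-analysis idea concrete, here is a rough sketch that counts hits per tracking directory from access-log lines. It assumes Common Log Format, treats the seven engine-named directories from the first post as the tracking set (leaving out `/discounts/`, which is a guess), and uses made-up sample log lines; a real log analyser would do far more.

```python
import re
from collections import Counter

# Assumed tracking directories, taken from the robots.txt in the first post.
TRACKING_DIRS = {
    "google", "mirago", "overture", "looksmart",
    "looksmart2", "yell", "thomson",
}

# Pulls the request path out of a Common Log Format request field.
REQUEST_RE = re.compile(r'"(?:GET|HEAD|POST) (\S+)')


def count_ppc_hits(log_lines):
    """Count requests whose first path segment is a tracking directory."""
    hits = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue
        path = match.group(1).split("?", 1)[0]     # drop any query string
        first_segment = path.lstrip("/").split("/", 1)[0]
        if first_segment in TRACKING_DIRS:
            hits[first_segment] += 1
    return hits


# Invented sample lines, for illustration only.
sample_log = [
    '203.0.113.5 - - [19/Oct/2004:10:00:01 +0000] "GET /google/ HTTP/1.1" 200 5120',
    '203.0.113.9 - - [19/Oct/2004:10:02:17 +0000] "GET /overture/ HTTP/1.1" 200 5120',
    '203.0.113.5 - - [19/Oct/2004:10:05:44 +0000] "GET /google/ HTTP/1.1" 200 5120',
    '198.51.100.2 - - [19/Oct/2004:10:06:03 +0000] "GET /index.html HTTP/1.1" 200 5120',
]

hits = count_ppc_hits(sample_log)
print(hits["google"], hits["overture"])  # 2 1
```

Comparing these counts with what each PPC service bills is the kind of validation described above, and it only stays meaningful if crawlers are kept out of those directories.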
#5
Interesting, but I can think of much better ways of tracking PPC without having to replicate content... Thanks for the reply, though.