#1
I am looking for an actual robots.txt file that I can put on my site. I would like the spider to visit my whole site.
Brian

#2
In that case you won't need a robots.txt file, not for that anyway.
Last edited by Wit : October 14th, 2004 at 06:41 AM. Reason: ...thanks Trevor for elaborating ;)

#3
Hi Brian
You don't actually need a robots.txt file to get spidered. It is standard search engine bot protocol to first check whether there is a robots.txt file. If there is, they read it; if there isn't, they go and spider all of the site they can anyway. It is sometimes necessary to stop certain bots from indexing your site, for example if they use too much bandwidth. Some rogue bots ignore robots.txt, so you have to block them with .htaccess.
If you just want your site spidered, submit it and wait. Having a link on a popular, frequently indexed site such as the Yahoo directory or DMOZ will also get you spidered soon.
Hope this was helpful.
Trevor Stolber

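The check-then-crawl behaviour described above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular search engine is implemented: the function name `may_crawl`, the handling of status codes other than 200 and 404, and the bot name `BadBot` are all assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser


def may_crawl(status, body, user_agent, path):
    """Decide whether a well-behaved bot may fetch `path`.

    `status` and `body` are the HTTP status code and text the bot got
    back when it requested /robots.txt (hypothetical inputs).
    """
    if status == 404:
        # No robots.txt on the site: nothing is restricted.
        return True
    if status != 200:
        # Assumption for this sketch: stay out on any other response.
        return False
    parser = RobotFileParser()
    parser.parse(body.splitlines())
    return parser.can_fetch(user_agent, path)


# A site with no robots.txt gets spidered in full.
print(may_crawl(404, "", "Googlebot", "/index.html"))

# A site that shuts out one bot by name and leaves the rest alone.
rules = "User-agent: BadBot\nDisallow: /\n"
print(may_crawl(200, rules, "BadBot", "/index.html"))
print(may_crawl(200, rules, "Googlebot", "/index.html"))
```

A rogue bot simply never runs a check like this, which is why server-side blocking is the fallback for those.
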
#4
Just to add to that, if you really do want one, here you go.
User-agent: Googlebot
Allow: /
or, to allow all bots:
User-agent: *
Allow: /
The forward slash represents all of your site from the root. If you had an area that you didn't want spidered, do the following:
User-agent: Googlebot
Disallow: /images

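One way to check what a set of rules actually does before uploading it is Python's standard `urllib.robotparser` module. The sketch below assumes the intent is to keep Googlebot out of /images (which takes a `Disallow` line) while leaving everything open to other bots; the file paths and the name `SomeOtherBot` are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Rules to test: Googlebot is kept out of /images, everyone else may go anywhere.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /images

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical bot/path combinations to check against the rules.
checks = [
    ("Googlebot", "/index.html"),
    ("Googlebot", "/images/logo.gif"),
    ("SomeOtherBot", "/images/logo.gif"),
]
for agent, path in checks:
    print(agent, path, parser.can_fetch(agent, path))
```

Only the second check should come back `False`; an `Allow: /images` line in its place would leave all three open.
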
#5
More info on the subject can be found here, for example:
http://www.searchengineworld.com/robots/robots_tutorial.htm
You might want to put up a robots.txt file to get rid of 404 errors in your logs (caused by bots expecting it to be there).

#6
Although tstolber has provided sound advice and examples, my advice to a newbie (or at least someone unfamiliar with robots.txt) is to not use one at all. Newbies make a lot of common mistakes with robots.txt and end up with problems on their sites because of it.
As Wit has stated, if you are looking to have your whole site indexed, then you can leave it just the way it is. Your logs will show a 404 error for robots.txt, but this does not affect your site in any way.

#7
The best thing for newbies is to save a blank Notepad file as robots.txt and upload it to the root directory. It helps reduce the number of 404 errors in log files.

#8
Yes, that is probably the best thing for a newbie!
The bots will see that they are OK to crawl the site and you won't get any 404s. As an added bonus, a blank robots.txt file is exactly 0 bytes, so it won't affect your bandwidth!
If you check your server logs and notice a bot indexing your site that you don't want there, you can then block it. I have a list of bots that I don't want indexing the site and just block them all.
Just make a text file, name it robots.txt, and put it in the root directory of your site; that is the only place bots will look for it. For example: www.yourdomain.com/robots.txt

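If the list of unwanted bots grows, the file can be generated rather than typed by hand. The following is only a sketch: the helper `build_robots_txt` and the bot names are hypothetical, standing in for whatever turns up in your own logs. An empty list produces an empty file, which is the blank robots.txt suggested above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical bot names; replace with the ones seen in your server logs.
UNWANTED_BOTS = ["BadBot", "BandwidthHog"]


def build_robots_txt(blocked):
    """Return robots.txt text that shuts each named bot out of the whole site."""
    blocks = [f"User-agent: {name}\nDisallow: /\n" for name in blocked]
    return "\n".join(blocks)


text = build_robots_txt(UNWANTED_BOTS)
print(text)

# Sanity check with the standard-library parser before uploading.
parser = RobotFileParser()
parser.parse(text.splitlines())
print(parser.can_fetch("BadBot", "/"))
print(parser.can_fetch("Googlebot", "/"))
```

The output text would be saved as robots.txt in the site root. Bots that ignore the file are unaffected by it, so this only helps with the well-behaved ones.
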
#9
Quote:
Last edited by quadcity : October 15th, 2004 at 03:11 PM. Reason: Typo.

#10
Quote:
Code:
User-agent: *
Disallow: /images/
I used "Disallow: /images/" instead of "Disallow: /images" because "Disallow: /images" would also disallow /images.html.
Last edited by quadcity : October 15th, 2004 at 03:08 PM. Reason: Typo.

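The trailing-slash point is easy to see by running both variants through Python's standard `urllib.robotparser`, which uses plain prefix matching on the path. The helper name `allowed` and the two test paths are made up for this sketch, and individual crawlers may add their own extensions on top of prefix matching.

```python
from urllib.robotparser import RobotFileParser


def allowed(disallow_value, path):
    """Check `path` against a one-rule robots.txt that applies to all bots."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", f"Disallow: {disallow_value}"])
    return parser.can_fetch("*", path)


# Compare the rule with and without the trailing slash.
for rule in ("/images", "/images/"):
    for path in ("/images/logo.gif", "/images.html"):
        print(f"Disallow: {rule:<9} {path:<17} allowed={allowed(rule, path)}")
```

Both rules block /images/logo.gif, but only the version without the trailing slash also blocks /images.html.
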
#11
Sometimes I put a check in my error404 file to see whether the requested file was in fact robots.txt. If it was, it emails me the user agent so that I can keep an eye on which spiders are "browsing".

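The poster's actual error404 script isn't shown, so the following is only a guess at the idea, written in Python. The function name `robots_alert` and both email addresses are hypothetical, and the sketch only builds the message rather than sending it.

```python
from email.message import EmailMessage

# Hypothetical addresses, for illustration only.
ADMIN = "webmaster@example.com"
SENDER = "error404@example.com"


def robots_alert(requested_path, user_agent):
    """Return a notification email if the 404 was for /robots.txt, else None."""
    if requested_path != "/robots.txt":
        return None
    msg = EmailMessage()
    msg["From"] = SENDER
    msg["To"] = ADMIN
    msg["Subject"] = "robots.txt requested"
    msg.set_content(f"User agent: {user_agent}\n")
    return msg


# Example: a 404 triggered by a bot asking for robots.txt.
alert = robots_alert("/robots.txt", "ExampleBot/1.0")
if alert is not None:
    print(alert["Subject"])
    print(alert.get_content())
```

In a real handler the message would be handed to a mail server, for example with `smtplib.SMTP(...).send_message(alert)`. On a busy site, logging to a file is probably kinder to your inbox than one email per request.
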
#12
This was a great post! I have been wondering about the importance of a bot file for a while now. Thanks for the great info, guys.

#13
OK, I'm starting to see a pattern here... you are just posting BS, adding nothing to the posts and just bumping up old threads.
We already had a spammer like you before, and he caused the admins to put in a 90-day and 100-post restriction to keep out more spammers. Stop posting crap and read the rules. I guess I will have to report this guy to the mods. I clicked on the "find new posts" link and there are over 4 pages thanks to him....
Last edited by fryman : December 1st, 2004 at 03:28 PM.