Search Engine Optimization
 
  #1  
October 13th, 2004, 10:24 PM
wonderman
Robot.txt file needed

I am looking for an actual robots.txt file that I can put on my site. I would like the spiders to visit my whole site.
Brian

  #2  
October 14th, 2004, 06:37 AM
Wit
In that case you won't need a robots.txt file, not for that anyway.

Last edited by Wit : October 14th, 2004 at 06:41 AM. Reason: ...thanks Trevor for elaborating ;)

  #3  
October 14th, 2004, 06:37 AM
tstolber
Hi Brian


You don't actually need a robots.txt file to get spidered. It is standard search engine bot protocol to first check whether there is a robots.txt file. If there is, they read it; if there isn't, they go and spider all of the site they can anyway.
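
To make that sequence concrete, here is a minimal Python sketch of the decision a well-behaved bot makes. It uses the standard library's urllib.robotparser; the function name can_crawl and the example URLs are made up for illustration, and real search engine bots will differ in the details.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, user_agent, url):
    """Decide whether a polite bot may fetch url.

    robots_txt is the body of the site's robots.txt, or None when the
    server answered 404 (assumption: the caller made the HTTP request).
    """
    if robots_txt is None:
        # No robots.txt on the site: nothing is off limits.
        return True
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical site without a robots.txt file.
print(can_crawl(None, "Googlebot", "http://www.yourdomain.com/page.html"))

# Hypothetical site that keeps every bot out of /private/.
rules = "User-agent: *\nDisallow: /private/\n"
print(can_crawl(rules, "Googlebot", "http://www.yourdomain.com/private/a.html"))
```

The first call prints True and the second prints False.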

It is sometimes necessary to stop certain bots from indexing your site, for example if they use too much bandwidth. Some rogue bots ignore robots.txt, so you have to block them with .htaccess.

If you just want your site spidered, submit it and wait. Also, having a link on a popular, frequently indexed site such as the Yahoo Directory or DMOZ will get you spidered soon.

Hope this was helpful.

Trevor Stolber

  #4  
October 14th, 2004, 06:41 AM
tstolber
Just to add to that, if you really do want one, here you go.


User-agent: Googlebot
Allow: /

or to allow all bots

User-agent: *
Allow: /

The forward slash represents all of your site from the root.

If you had an area that you didn't want spidered, do the following:

User-agent: Googlebot
Disallow: /images
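
If you want to double-check rules like these before uploading them, Python's standard urllib.robotparser can read them. This is only a rough sketch: the helper is_allowed and the yourdomain.com URLs are made up, and real bots may treat edge cases differently. Note that keeping a bot out of an area takes Disallow, not Allow.

```python
from urllib.robotparser import RobotFileParser

# Rule sets to try out (both are examples, not a recommendation).
ALLOW_ALL = "User-agent: *\nAllow: /\n"
BLOCK_IMAGES = "User-agent: Googlebot\nDisallow: /images\n"


def is_allowed(rules, user_agent, path):
    """Return True if user_agent may fetch path under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    # Hypothetical host name; only the path matters for the rules.
    return parser.can_fetch(user_agent, "http://www.yourdomain.com" + path)


print(is_allowed(ALLOW_ALL, "Googlebot", "/index.html"))
print(is_allowed(BLOCK_IMAGES, "Googlebot", "/images/logo.gif"))
print(is_allowed(BLOCK_IMAGES, "Googlebot", "/index.html"))
```

This prints True, False, True. Bots other than Googlebot are not covered by the second rule set, so they would still be allowed into /images.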

  #5  
October 14th, 2004, 06:46 AM
Wit
More info on the subject can be found here for example:

http://www.searchengineworld.com/robots/robots_tutorial.htm

You might want to put up a robots.txt file to get rid of 404 errors in your logs (caused by bots expecting it to be there).

  #6  
October 14th, 2004, 06:49 AM
Jasontnyc
Although tstolber has provided sound advice and examples, my advice to a newbie (or at least someone unfamiliar with robots.txt) is to not use one at all. Newbies make a lot of common mistakes with it and end up with problems on their sites.

As Wit has stated, if you are looking to have your whole site indexed, then you can leave it just the way it is.

Your logs will show a 404 error for that page but this does not affect your site in any way.

  #7  
October 14th, 2004, 07:26 AM
sem4u
The best thing for newbies is to save a blank Notepad file as robots.txt and upload it to the root directory. It helps reduce the number of 404 errors in log files.
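
If you would rather script it than use Notepad, something like the following Python sketch should do. The function names and the idea of passing in a document root folder are assumptions for illustration; the check uses the standard urllib.robotparser to confirm that an empty file places no restrictions on a bot.

```python
from pathlib import Path
from urllib.robotparser import RobotFileParser


def write_blank_robots(doc_root):
    """Create an empty robots.txt in doc_root and return its path.

    Careful: this overwrites any robots.txt that is already there.
    """
    path = Path(doc_root) / "robots.txt"
    path.write_text("")
    return path


def is_allowed(robots_path, user_agent, url_path):
    """Check url_path against the rules stored in robots_path."""
    parser = RobotFileParser()
    parser.parse(robots_path.read_text().splitlines())
    return parser.can_fetch(user_agent, url_path)
```

After uploading, is_allowed(path, "Googlebot", "/index.html") comes back True for the blank file, because there are no rules to match.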

  #8  
October 14th, 2004, 07:39 AM
tstolber
Yes that is probably the best thing for a newbie!

The bots will see that they are OK to crawl the site and you won't get any 404s. Also, as an added bonus, a blank robots.txt file is exactly 0 bytes, so it won't affect your bandwidth!
If you later check your server logs and notice a bot indexing your site that you don't want there, you can then block it. I have a list of bots that I don't want indexing my site and I just block them all.
Just make a text file, name it robots.txt and put it in the root directory of your site; it is the only place bots will look for it.

For example: www.yourdomain.com/robots.txt
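
For the list of unwanted bots, the file can be generated instead of typed by hand. Here is a small Python sketch; the bot names BadBot and GreedySpider are invented placeholders, build_robots_txt is a made-up helper, and this only helps with bots that actually obey robots.txt.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(blocked_agents):
    """Return robots.txt text that shuts out each listed bot entirely."""
    lines = []
    for agent in blocked_agents:
        # One record per bot, separated by a blank line.
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    return "\n".join(lines)


def is_blocked(robots_txt, user_agent):
    """Check whether user_agent is locked out of the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, "/")


# Placeholder names; replace with the user agents seen in your logs.
text = build_robots_txt(["BadBot", "GreedySpider"])
print(text)
print(is_blocked(text, "BadBot"), is_blocked(text, "Googlebot"))
```

The last line prints True False: the listed bots are shut out and everyone else is still welcome. An empty list gives an empty file, which is the blank robots.txt again.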

  #9  
October 15th, 2004, 02:58 PM
quadcity
Quote:
Originally Posted by Jasontnyc
Although tstolber has provided sound advice and examples, my advice to a newbie (or at least someone unfamiliar with robots.txt) is to not use one at all. Newbies make a lot of common mistakes with it and end up with problems on their sites.
Very true. One little mistake and you could block all SE spiders from your site.

Last edited by quadcity : October 15th, 2004 at 03:11 PM. Reason: Typo.

  #10  
October 15th, 2004, 03:06 PM
quadcity
Quote:
Originally Posted by tstolber
If you had an area that you didn't want spidered, do the following:

User-agent: Googlebot
Disallow: /images
I use
Code:
  User-agent: * 
  Disallow: /images/
  

I used "Disallow: /images/" instead of "Disallow: /images" because "Disallow: /images" would also disallow /images.html

Last edited by quadcity : October 15th, 2004 at 03:08 PM. Reason: Typo.
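
The trailing-slash difference is easy to see with Python's standard urllib.robotparser, which does plain prefix matching on the path. This is just a sketch: blocked_paths is a made-up helper and the three paths are examples.

```python
from urllib.robotparser import RobotFileParser

# Example paths to test against each rule.
PATHS = ["/images/logo.gif", "/images.html", "/index.html"]


def blocked_paths(disallow_rule, paths, user_agent="*"):
    """Return the paths that a single Disallow rule would block."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", f"Disallow: {disallow_rule}"])
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


print(blocked_paths("/images", PATHS))
print(blocked_paths("/images/", PATHS))
```

The first line prints ['/images/logo.gif', '/images.html'] and the second prints ['/images/logo.gif'], so only the version with the trailing slash leaves /images.html alone.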

  #11  
October 16th, 2004, 12:09 PM
wineo
Sometimes I put a check in my 404 error page to see whether the requested file was in fact robots.txt. If it was, it emails me the user agent so that I can keep an eye on which spiders are "browsing".
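
Roughly, the check could look like the Python sketch below. It is not the actual script: the function name, the subject line and the webmaster@yourdomain.com address are all placeholders, and the sending step (for example with smtplib) is left out on purpose.

```python
from email.message import EmailMessage


def build_robots_alert(request_path, user_agent,
                       sender="webmaster@yourdomain.com",
                       recipient="webmaster@yourdomain.com"):
    """Return an alert email if the 404 was for robots.txt, else None.

    The addresses are placeholders; swap in your own.
    """
    if request_path != "/robots.txt":
        # Some other missing page: nothing to report.
        return None
    msg = EmailMessage()
    msg["Subject"] = "robots.txt requested"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(
        f"A bot asked for robots.txt.\nUser agent: {user_agent}\n"
    )
    return msg
```

The 404 handler would call build_robots_alert with the requested path and the User-Agent header, and send the message only when the result is not None.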

  #12  
December 1st, 2004, 03:20 PM
double00
This was a great post! I have been wondering about the importance of a robots.txt file for a while now. Thanks for the great info, guys.

  #13  
December 1st, 2004, 03:25 PM
fryman

Ok I'm starting to see a pattern here... you are just posting BS, adding nothing to the posts and just bumping up old threads.

We already had a spammer like you before and he caused the admins to put a 90 day and 100 post restriction to avoid more spammers. Stop posting crap and read the rules.

I guess I will have to report this guy to the mods. I clicked on the "find new posts" link and there are over 4 pages thanks to him....

Last edited by fryman : December 1st, 2004 at 03:28 PM.

  #14  
December 1st, 2004, 03:51 PM
Wit
