#1
  1. No Profile Picture
    Contributing User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    Aug 2005
    Posts
    82
    Rep Power
    13

    How long for robots.txt


    I run a large vBulletin forum and recently added a robots.txt to my forums directory so the spiders couldn't access the reply, search, memberlist, etc. pages. It has been about 5 hours since I uploaded the robots.txt, and I still see the MSN and Google spiders accessing all those pages in the "Who's Online" section of my vBulletin forums. How long does it take for the robots.txt to start working? Thanks for the help.
  2. #2
  3. Contributing User
    SEO Chat Adventurer (500 - 999 posts)

    Join Date
    Jul 2005
    Location
    Canada
    Posts
    762
    Rep Power
    22
    When I look at my log files I see the search engines access the robots.txt file at the start of each session ... then download a series of pages.

    I think it should take effect almost immediately ...

    Can we see your file?
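One way to do the log check described above is to scan the access log for bot requests to /robots.txt. The following is only a minimal sketch: it assumes the Apache "combined" log format, and the `robots_fetches` helper and the `BOT_MARKERS` substrings are hypothetical names chosen for illustration, not anything from this thread.

```python
import re

# Assumption: Apache "combined" log format. Adjust the pattern for other formats.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

# Assumption: substrings that identify the crawlers of interest in the User-Agent.
BOT_MARKERS = ("googlebot", "msnbot")


def robots_fetches(lines):
    """Yield (time, agent, status) for each bot request to /robots.txt."""
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        agent = match.group("agent")
        is_bot = any(marker in agent.lower() for marker in BOT_MARKERS)
        if is_bot and match.group("path") == "/robots.txt":
            yield match.group("time"), agent, int(match.group("status"))


if __name__ == "__main__":
    # Hypothetical log line, for illustration only.
    sample = (
        '66.249.66.1 - - [11/Aug/2005:23:10:01 +0000] '
        '"GET /robots.txt HTTP/1.1" 404 312 "-" '
        '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
    )
    for when, agent, status in robots_fetches([sample]):
        print(when, status, agent)
```

A 404 status on these requests suggests the file is not at the location the bots are asking for.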
  4. #3
  5. http://tinyurl.com/cz56g
    SEO Chat Mastermind (5000+ posts)

    Join Date
    Sep 2004
    Location
    D0RDRECHT NL
    Posts
    6,063
    Rep Power
    30
    Both those SEs' bots are known to "accidentally" bypass the robots.txt file on occasion (e.g. during index updates). They are only reasonably well-behaved bots.

    Anyway, you can kinda "submit" your robots.txt file to Google here:
    http://www.google.com/intl/en/remove.html
    From there, use the automatic URL removal system.
  6. #4
  7. No Profile Picture
    Contributing User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    Aug 2004
    Posts
    48
    Rep Power
    14
    I'm pretty sure that Google mentions in their webmasters blurb that they may operate on a cached version of robots.txt and that they may only update that cache once a day, so worst case you should only have to wait for 24 hours before they start paying attention.

    That won't get them to remove the newly blocked pages from the index though. If you need to do that you can try their URL removal tool, but I've found that to be a bit hit and miss.
  8. #5
  9. Contributing User
    SEO Chat Adventurer (500 - 999 posts)

    Join Date
    Apr 2004
    Location
    Texas, USA
    Posts
    552
    Rep Power
    14
    I also believe robots may still follow links you have defined as "nofollow", but will generally obey your request to keep them out of their index.
  10. #6
  11. No Profile Picture
    Contributing User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    Aug 2005
    Posts
    82
    Rep Power
    13
    Hey guys.. the bots are still accessing all the "disallowed" pages in my forums. Here is a copy of my robots.txt placed in my forums directory:

    Code:
    User-agent: *
    Disallow: /search.php
    Disallow: /member.php
    Disallow: /memberlist.php
    Disallow: /private.php
    Disallow: /sendmessage.php
    Disallow: /report.php
    Disallow: /postings.php
    Disallow: /editpost.php
    Disallow: /newreply.php
    Disallow: /online.php
    Disallow: /calendar.php
    Disallow: /Warn.php
    Disallow: /shoutbox.php
    Disallow: /showgroups.php
    Any idea why this robots.txt is not working?
  12. #7
  13. No Profile Picture
    Contributing User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    Aug 2004
    Posts
    48
    Rep Power
    14
    Just a thought, but you said that you had your robots.txt in your forums directory. Is this a subdirectory of the main site?

    robots.txt files must be placed in the root directory, not in subdirectories. Once you've done that, you'll then need to edit it to add "/forums/" (or whatever) in front of each file you want to ban.
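To see why the path prefix matters, the rules can be tested locally with Python's standard `urllib.robotparser`. This is a rough sketch, assuming the forum lives under /forums/ on a placeholder host (www.example.com); `is_allowed` is a hypothetical helper name, and only a few of the rules from the file above are reproduced.

```python
from urllib.robotparser import RobotFileParser

# Rules as originally written: paths relative to the forum directory.
UNPREFIXED_RULES = """\
User-agent: *
Disallow: /search.php
Disallow: /newreply.php
"""

# Same rules rewritten for a robots.txt served from the site root,
# assuming the forum directory is /forums/ (the real name may differ).
ROOT_RULES = """\
User-agent: *
Disallow: /forums/search.php
Disallow: /forums/newreply.php
"""


def is_allowed(rules: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    base = "http://www.example.com"  # placeholder host
    reply = base + "/forums/newreply.php?p=1"
    thread = base + "/forums/showthread.php?t=1"
    print(is_allowed(UNPREFIXED_RULES, reply))  # True: rule does not match the real path
    print(is_allowed(ROOT_RULES, reply))        # False: blocked once the prefix is added
    print(is_allowed(ROOT_RULES, thread))       # True: thread pages stay crawlable
```

Rules are matched against the full URL path from the site root, so a rule for /newreply.php never applies to /forums/newreply.php.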
  14. #8
  15. No Profile Picture
    Contributing User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    Aug 2005
    Posts
    82
    Rep Power
    13
    ahhhhhhhh...... awesome man... thanks for pointing that out... no wonder it's not working!!!!
  16. #9
  17. No Profile Picture
    Contributing User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    Aug 2005
    Posts
    82
    Rep Power
    13
    Should the robots.txt be placed in the 'httpdocs' folder or the "real" root directory where the cgi-bin, stats, etc. folders are?
  18. #10
  19. Croatia - Hrvatska
    SEO Chat Skiller (1500 - 1999 posts)

    Join Date
    May 2005
    Location
    Croatia - Hrvatska
    Posts
    1,891
    Rep Power
    17
    Some hosts use public_html as the root directory, some use www, and some use something else. It depends, so ask your hosting administrator which directory is your root.
  20. #11
  21. Contributing User
    SEO Chat Adventurer (500 - 999 posts)

    Join Date
    Jul 2005
    Location
    Canada
    Posts
    762
    Rep Power
    22
    The root directory is always where your default page is (index.* or default.*).

    See the robots.txt validator:
    http://www.searchengineworld.com/cgi-bin/robotcheck.cgi

    For individual pages you can also use <meta name="robots" content="noindex, nofollow"> in the HEAD section.
    Last edited by rtchar; Aug 11th, 2005 at 11:47 PM.
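To confirm that a page actually carries the robots meta tag mentioned above, its HTML can be inspected with Python's standard `html.parser`. This is a small sketch only; `RobotsMetaParser` and `robots_directives` are hypothetical names, and the sample markup is made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None for valueless attributes.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for part in attr.get("content", "").split(","):
            directive = part.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    # Made-up page for illustration.
    page = (
        "<html><head>"
        '<meta name="robots" content="noindex, nofollow">'
        "</head><body>Member list</body></html>"
    )
    print(sorted(robots_directives(page)))  # ['nofollow', 'noindex']
```

A crawler can only read this tag if it is allowed to fetch the page, so a page blocked in robots.txt will not have its meta tag seen.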
