Thread: Is robots.txt required in my case?

  1. #1
    RajeshSah is offline Registered User SEO Chat Explorer (0 - 99 posts)
    Join Date
    Jan 2013
    Posts
    7
    Rep Power
    0

    is robots.txt required in my case ?

    I want web spiders to crawl the whole website, and I have no XML sitemap to reference in robots.txt. The only thing I need is to block bad bots, but if they really are bad they will ignore robots.txt anyway. Is robots.txt required in this case?

    Second question: will genuine web spiders like Googlebot still look for robots.txt if the site doesn't have one? I need to tell the bad bots apart from the genuine spiders.

  2. #2
    dzine is offline DIYSEO SEO Chat Mastermind (5000+ posts)
    Join Date
    Oct 2005
    Location
    sharing a room with my ego
    Posts
    5,184
    Rep Power
    1629
    What sort of 'bad bots' are you concerned about?

  3. #3
    RajeshSah is offline Registered User SEO Chat Explorer (0 - 99 posts)
    Join Date
    Jan 2013
    Posts
    7
    Rep Power
    0
    Quote Originally Posted by dzine View Post
    What sort of 'bad bots' are you concerned about?
    The bots that eat your bandwidth unnecessarily.

  4. #4
    dzine is offline DIYSEO SEO Chat Mastermind (5000+ posts)
    Join Date
    Oct 2005
    Location
    sharing a room with my ego
    Posts
    5,184
    Rep Power
    1629
    Do you have any examples? I have found that hardly any bots of that kind actually DO obey robots.txt.
    Also, I must admit that I'm not all that concerned about my bandwidth.

    If you want to limit crawling to only a small number of bots, say: Google + Bing/Yahoo + Ask + Baidu + Yandex, then you could do that with a robots.txt file. But it still wouldn't keep the 'bad bots' out -- if said bots ignore the robots.txt file.
    You'd have to check each one.
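
    As a rough sketch of the "trusted bots only" idea above: the robots.txt text below allows two named crawlers and disallows everyone else, and Python's standard urllib.robotparser is used to check how a compliant bot would read it. The user-agent tokens and the test URL are assumptions for illustration; check each search engine's documentation for the token it actually uses. A bot that ignores robots.txt is not affected by any of this.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical allow-list: an empty "Disallow:" means "nothing is disallowed".
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: bingbot
Disallow:

User-agent: *
Disallow: /
"""


def build_parser(robots_text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A compliant crawler would ask roughly this question before fetching a page.
for agent in ("Googlebot", "bingbot", "SomeUnknownBot"):
    print(agent, parser.can_fetch(agent, "/any/page.html"))
```

    Only bots that choose to honour the file are kept out by the final "User-agent: *" group.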

    What you could do is:
    - check your logs to see which bot is eating lots of your bandwidth without sending you any good visitors
    - check online whether that bot obeys the robots.txt protocol
    - if so: block it specifically in robots.txt
    - if not: try to block its IP address(es) using .htaccess ("allow/deny") or something similar
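
    The first step in the list above (finding the bandwidth-hungry bots in the logs) could be sketched like this. It assumes the Apache "combined" log format; logs from a Windows/IIS server use a different field layout, so the regular expression would need adjusting. The sample lines and the bot name are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed Apache "combined" format:
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) '
    r'(?P<size>\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
)


def bandwidth_by_agent(lines):
    """Sum hits and bytes per user-agent string, remembering the source IPs."""
    totals = defaultdict(lambda: {"hits": 0, "bytes": 0, "ips": set()})
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        entry = totals[match.group("agent")]
        entry["hits"] += 1
        size = match.group("size")
        entry["bytes"] += 0 if size == "-" else int(size)
        entry["ips"].add(match.group("ip"))
    return dict(totals)


def top_consumers(totals, n=5):
    """Return the n user agents that transferred the most bytes."""
    return sorted(totals.items(), key=lambda kv: kv[1]["bytes"], reverse=True)[:n]


# Made-up sample data; in practice, iterate over the real log file instead.
SAMPLE_LOG = [
    '203.0.113.7 - - [25/Dec/2012:00:52:10 +0000] "GET /a.html HTTP/1.1" 200 12000 "-" "ExampleBot/1.0"',
    '203.0.113.7 - - [25/Dec/2012:00:52:11 +0000] "GET /b.html HTTP/1.1" 200 8000 "-" "ExampleBot/1.0"',
    '198.51.100.4 - - [25/Dec/2012:01:00:00 +0000] "GET / HTTP/1.1" 200 500 "-" "Mozilla/5.0"',
    'a line that does not match the format',
]

totals = bandwidth_by_agent(SAMPLE_LOG)
for agent, stats in top_consumers(totals):
    print(agent, stats["hits"], stats["bytes"], sorted(stats["ips"]))
```

    The collected IP addresses are what you would feed into an allow/deny rule if the bot turns out to ignore robots.txt.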

  5. #5
    ZenReputation is offline Registered User SEO Chat Explorer (0 - 99 posts)
    Join Date
    Dec 2011
    Location
    Zen-Reputation - 66, avenue des Champs-Elysées - 75008 Paris - France
    Posts
    26
    Rep Power
    0
    I think robots.txt alone is not enough to protect the whole website...

  6. #6
    RajeshSah is offline Registered User SEO Chat Explorer (0 - 99 posts)
    Join Date
    Jan 2013
    Posts
    7
    Rep Power
    0
    Unknown robot (identified by 'bot/' or 'bot-'), Unknown robot (identified by 'robot'), Unknown robot (identified by 'spider'), Unknown robot (identified by 'crawl'). I have no idea who these are or how to block them in the robots.txt file.

    It's a hell of a job to analyze the log file and find the bad bots. I have some issues that I will post in a new thread.

    I also need to discuss what the alternative to .htaccess is on a Windows server.

  7. #7
    RajeshSah is offline Registered User SEO Chat Explorer (0 - 99 posts)
    Join Date
    Jan 2013
    Posts
    7
    Rep Power
    0
    But my original question still remains: is robots.txt required in my case? I am talking about bad bots, not Google, Yahoo or Ask.

  8. #8
    dzine is offline DIYSEO SEO Chat Mastermind (5000+ posts)
    Join Date
    Oct 2005
    Location
    sharing a room with my ego
    Posts
    5,184
    Rep Power
    1629
    Only you can tell if it's necessary.

    Personally I wouldn't bother. But if you really want to keep ALL bots out EXCEPT for a few trusted ones, then by all means do so using robots.txt.

  9. #9
    RajeshSah is offline Registered User SEO Chat Explorer (0 - 99 posts)
    Join Date
    Jan 2013
    Posts
    7
    Rep Power
    0
    Can I block these in robots.txt:
    Unknown robot (identified by 'bot/' or 'bot-'), Unknown robot (identified by 'robot'), Unknown robot (identified by 'spider'), Unknown robot (identified by 'crawl')

  10. #10
    dzine is offline DIYSEO SEO Chat Mastermind (5000+ posts)
    Join Date
    Oct 2005
    Location
    sharing a room with my ego
    Posts
    5,184
    Rep Power
    1629
    I think you cannot. As far as I know, one cannot use 'wildcards' for parts of the bots' names.

    How much bandwidth are these bots costing you, by the way?
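
    For what it's worth, the 'Unknown robot' labels appear to be groupings made by the statistics report (any user agent containing one of those fragments), not names of individual bots. Matching on fragments like that would have to happen on the server side rather than in robots.txt. A minimal sketch of such a check, with the fragment list copied from the labels quoted in this thread and the user-agent strings chosen only as examples:

```python
# Fragments taken from the report labels quoted in the thread.
UNKNOWN_ROBOT_MARKERS = ("bot/", "bot-", "robot", "spider", "crawl")


def matched_markers(user_agent):
    """Return the fragments found in a user-agent string (case-insensitive)."""
    lowered = user_agent.lower()
    return [marker for marker in UNKNOWN_ROBOT_MARKERS if marker in lowered]


def looks_like_robot(user_agent):
    """True if the user agent contains any of the fragments."""
    return bool(matched_markers(user_agent))


examples = [
    "ExampleBot/1.0",
    "SuperSpider-crawler",
    "Mozilla/5.0 (Windows NT 6.1)",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
]
for agent in examples:
    print(agent, matched_markers(agent))
```

    Note that the Googlebot string matches 'bot/' as well, so blocking purely on these fragments would also shut out the genuine spiders.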

  11. #11
    RajeshSah is offline Registered User SEO Chat Explorer (0 - 99 posts)
    Join Date
    Jan 2013
    Posts
    7
    Rep Power
    0
    Quote Originally Posted by dzine View Post
    I think you cannot. As far as I know, one cannot use 'wildcards' for parts of the bots' names.

    How much bandwidth are these bots costing you, by the way?
    For December 2012 (1 month):
    Unknown robot (identified by 'bot/' or 'bot-'): 6,994 hits, 80.34 MB bandwidth, last visit 25 Dec 2012 - 00:52
    The site is hosted on a Windows shared hosting server.
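
    To put those numbers in perspective, here is a quick back-of-the-envelope calculation. It assumes the figures cover all 31 days of December and that 1 MB = 1024 KB; both are assumptions, since the report does not say.

```python
# Figures quoted above for 'bot/' or 'bot-' in December 2012.
hits = 6994
bandwidth_mb = 80.34
days_in_month = 31  # assumption: the report covers the full month

mb_per_day = bandwidth_mb / days_in_month
hits_per_day = hits / days_in_month
kb_per_hit = bandwidth_mb * 1024 / hits  # assumption: 1 MB = 1024 KB

print(f"{mb_per_day:.2f} MB per day")
print(f"{hits_per_day:.0f} hits per day")
print(f"{kb_per_hit:.1f} KB per hit")
```

    That works out to roughly 2.6 MB per day and about 12 KB per hit for this group of bots.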
