Sep 28th, 2013, 08:27 AM
Could anyone please tell me the reason for using a robots.txt file? If we want to hide certain pages from bots, why are we keeping those pages on the site in the first place? Wouldn't it be better to get rid of them?
Also, what kinds of pages do we list in a robots.txt file?
Sep 30th, 2013, 05:03 PM
Contrary to what you said, robots.txt will not "hide" pages from the bots that crawl your site. It is a public file, and anyone can see what you're placing there. By listing pages in your robots.txt you are instructing robots to crawl or not crawl those pages. Keep in mind that this is only a request: not all robots will listen, but friendly robots like those of the search engines will most often follow your directions. You might add files or directories to your robots.txt for any number of reasons. Mostly it is up to you, and they tend to be pages that you do not want the search engines to crawl and index. Some common types of pages that are put into robots.txt are:
Login and Shopping Cart pages - necessary pages to have on your site, but they can also produce undesirable URLs through session IDs
Duplicate content from URLs created by your CMS - URLs generated by parameters, often caused by search functions; blogs and WordPress sites also often have tag pages, which create duplicate content
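To make those page types concrete, here is a rough sketch of what such a file might look like and how a well-behaved robot would read it. It uses Python's standard urllib.robotparser module. The domain example.com and every path (/login/, /cart/, /tag/, /search) are made up for illustration, so treat them as assumptions and not as rules for your own site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; all paths here are illustrative assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /login/
Disallow: /cart/
Disallow: /tag/
Disallow: /search
"""


def crawl_report(robots_txt, urls, agent="*"):
    """Return {url: True/False} where True means the agent may crawl the URL."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return {url: parser.can_fetch(agent, url) for url in urls}


if __name__ == "__main__":
    sample_urls = [
        "https://example.com/products/widget",
        "https://example.com/cart/checkout",
        "https://example.com/tag/seo/",
    ]
    for url, allowed in crawl_report(ROBOTS_TXT, sample_urls).items():
        print("crawl" if allowed else "skip ", url)
```

Remember this only shows what a polite robot would do with the file. Nothing stops a badly behaved bot from ignoring it.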
The list can go on and on and will change with the needs of your site, so without seeing the site it's hard to say what you should or should not place in the file. Robots.txt is more like a broadsword than a precision knife: a small error in it could block your whole site or very important parts of it. That means it may not always be the best solution to your problems.
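As a quick illustration of how small that error can be, the sketch below compares two hypothetical files that differ by a few characters: one disallows /cart/, the other disallows / and so blocks everything. The URLs and paths are assumptions for the example only, and the check again relies on Python's standard urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser

# Two hypothetical files: the second differs only in the Disallow path.
INTENDED = "User-agent: *\nDisallow: /cart/\n"
MISTAKE = "User-agent: *\nDisallow: /\n"

# Illustrative URLs on a made-up domain.
URLS = [
    "https://example.com/",
    "https://example.com/products/widget",
    "https://example.com/cart/",
]


def blocked_urls(robots_txt, urls, agent="*"):
    """Return the URLs a rule-following robot would refuse to crawl."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    print("intended file blocks:", blocked_urls(INTENDED, URLS))
    print("mistaken file blocks:", blocked_urls(MISTAKE, URLS))
```

Running a check like this against a list of your most important URLs before uploading a new robots.txt is one cheap way to catch that kind of slip.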
Check out The Web Robots Pages for more information about robots.txt
Hope that helps some.
Oct 2nd, 2013, 10:59 AM
Thanks Kevin. I got a few gold nuggets from it. Thank you for sharing that information.
I would really appreciate it if you could tell me what files like Admin files and cgi-bin files have in them. I mean, what do they contain?
Oct 3rd, 2013, 02:54 PM
I'm glad to hear that you got some useful information out of that! I would just like to reaffirm that the pages and folders I discussed above were only examples of the types of pages and folders commonly disallowed through robots.txt. What is and should be disallowed is different for every site.
As for the cgi-bin folder: historically it held the server-side scripts used on websites, such as the script behind a form. (Please note that this is a very simple explanation; I do not claim to be a programmer with extensive knowledge about these.) It is commonly disallowed for security reasons. This may not be applicable to your site, as .php and .asp have become more standard for scripting.
The admin folders can contain credentialing or login pages, like on a WordPress site (/wp-admin is the admin folder for WordPress sites), and are also disallowed for security reasons. Like I said, these may not be an issue for your site, as each site is different. A good way to see what could be showing up for those folders on your site is to do a site: search in Google.
I hope that helps answer your question and clarifies things a little more.