Google Optimization
 
#1
October 30th, 2008, 03:26 PM
roseberry
Contributing User
Join Date: Oct 2006
Location: Atlanta
Posts: 191
Google and Noindex blocked by robots.txt

There has been a lot of discussion lately (especially since Google's last webmaster conference call) about noindex pages getting indexed anyway. The cause is that the same pages are blocked by robots.txt, so the bots never fetch them and never see the meta noindex in the file.
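To make the conflict concrete, here is a minimal sketch of how a robots.txt-compliant crawler would behave. It uses only Python's standard library; the robots.txt contents, the page HTML, the `ExampleBot` user agent, and the `crawl_decision` helper are all made up for illustration and are not any engine's actual code.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks every path for every crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

# Hypothetical page that carries a meta noindex.
PAGE_HTML = (
    '<html><head><meta name="robots" content="noindex"></head>'
    "<body></body></html>"
)


class RobotsMetaParser(HTMLParser):
    """Collects the directives from <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def crawl_decision(robots_txt, url, html, user_agent="ExampleBot"):
    """Report whether a compliant bot fetches the page and sees noindex."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(user_agent, url):
        # The page body is never downloaded, so its noindex is never seen.
        return {"fetched": False, "noindex_seen": False}
    parser = RobotsMetaParser()
    parser.feed(html)
    return {"fetched": True, "noindex_seen": "noindex" in parser.directives}


blocked = crawl_decision(ROBOTS_TXT, "http://example.com/", PAGE_HTML)
allowed = crawl_decision("", "http://example.com/", PAGE_HTML)
```

With the blocking robots.txt the page is never fetched, so the noindex goes unseen; with an empty robots.txt the same page is fetched and the noindex is found.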

For a little background, Matt Cutts explains in comments here http://www.mattcutts.com/blog/noind...#comment-135566

I've done some research into the matter and have come up with a theory on how this is handled by G (as well as Y and MSN). I posted on the blog, but thought I'd share here as well.

At delicious.com there is a meta noindex on the home page, but all files are blocked by robots.txt. If you search Google for “delicious”, the new home page shows up, but with no snippet or meta description, because the page wasn’t crawled. There is a page title associated with it, though. That isn't because it actually is the page title, but because it’s the domain name and it is used in anchor text (Google will attribute a page title to any page that doesn’t have one, and it’s often the name of the root).
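Here is a toy model of that theory, purely as a sketch of the reasoning and not a claim about Google's real code. The `build_listing` function and the sample anchor texts are invented for illustration.

```python
from collections import Counter


def build_listing(url, crawled_title=None, crawled_snippet=None, anchor_texts=()):
    """Toy model: use on-page data if the page was crawled; otherwise fall
    back to the most common inbound anchor text, then to the bare URL."""
    if crawled_title:
        title = crawled_title
    elif anchor_texts:
        normalized = (text.strip().lower() for text in anchor_texts)
        title = Counter(normalized).most_common(1)[0][0]
    else:
        title = url
    # An uncrawled page has no snippet to show.
    return {"url": url, "title": title, "snippet": crawled_snippet}


# Hypothetical anchor texts pointing at a page the bot was not allowed to crawl.
listing = build_listing(
    "http://delicious.com/",
    anchor_texts=["Delicious", "delicious", "bookmarks"],
)
```

Under these assumptions the listing gets the title "delicious" from anchor text and no snippet, which matches what the Google result looks like.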

If you search for “delicious” on MSN, the page doesn’t come up, so it appears that MSN is accessing the file (despite what robots.txt says) and finding the noindex meta.

Yahoo! gives a page title and a description, but the description isn’t in the meta description or found anywhere on the page. It’s pulled from the Yahoo! directory listing of the site. Yahoo! tends to assign Y directory data to pages that don’t have it (and oftentimes even when they do). So it would appear that Y also follows the robots.txt directive. Of course delicious.com is a Yahoo! property, so you could draw a different conclusion, but this is my thought on the subject.

It's interesting to note how the three engines handle this differently. I don't know that this will have a major impact on any of my strategy going forward, but it could lend other insights into how the algos work. For example, notice how del.icio.us/ is still ranking #1 for the search even though there is a 301 in place to the new URL at delicious.com. Very interesting. I wonder if this is common practice in these instances?

I'd love to hear anyone's thoughts, reactions, or algo speculations, as I am still trying to figure out for myself whether this really points to anything of importance or is just something to file in the "hmm, that's interesting" folder.

© 2003-2009 by Developer Shed. All rights reserved. DS Cluster 3 hosted by Hostway