SEO Chat Forums > Search Engine Strategies > Search Engine Optimization

  #1  
May 12th, 2008, 06:03 AM
Marfola
Robots.txt syntax for wildcards?

I would like to exclude all pages ending in /print.html and all pages with a ? in the URL in my robots.txt file. Is the following syntax correct for Yahoo, Google and MSN?

Disallow: /*print.html$
Disallow: /*?

  #2  
May 12th, 2008, 06:34 AM
pro_seo
Quote:
Originally Posted by Marfola
I would like to exclude all pages ending in /print.html and all pages with a ? in the URL in my robots.txt file. Is the following syntax correct for Yahoo, Google and MSN?

Disallow: /*print.html$
Disallow: /*?


Put these in your robots.txt file:

user-agent: *
disallow: /*?
disallow: /print.html
Comments on this post
dzine agrees: Hmm /print.html would have a different effect...
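
To see why a plain /print.html rule behaves differently, here is a minimal sketch using Python's standard urllib.robotparser, which treats a Disallow path as a prefix of the URL path. Only the plain rule is fed to the parser, and the two URL paths are made up for illustration; real crawlers may apply additional logic.

```python
from urllib.robotparser import RobotFileParser

# The plain (non-wildcard) rule from the post above, parsed locally
# without any network access.
rules = [
    "User-agent: *",
    "Disallow: /print.html",
]

parser = RobotFileParser()
parser.parse(rules)

# Hypothetical paths, used only to illustrate prefix matching.
root_print_allowed = parser.can_fetch("*", "/print.html")
nested_print_allowed = parser.can_fetch("*", "/articles/print.html")

print("root /print.html allowed:", root_print_allowed)
print("nested /articles/print.html allowed:", nested_print_allowed)
```

Under prefix matching, only URLs whose path starts with /print.html are blocked, so a print page inside a subdirectory stays crawlable.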
  #3  
May 12th, 2008, 06:39 AM
dzine
Marfola, something like that looks like your best bet.

However, personally I'd prefer something like this in the <head> section of my pages:
PHP Code:
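<?php
// REQUEST_URI includes the query string; PHP_SELF does not.
$uri  = $_SERVER['REQUEST_URI'];
$path = (string) parse_url($uri, PHP_URL_PATH);
// Print pages and any URL with a query string get a noindex tag.
if (preg_match('#/print\.html$#', $path) || strpos($uri, '?') !== false) {
  echo '<meta name="robots" content="noindex, noarchive" />';
}
?>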


That would also get rid of pages that have already been (inadvertently) indexed.

You could even make those files 301-redirect to their indexable equivalents.

Last edited by dzine : May 12th, 2008 at 06:43 AM.

  #4  
May 15th, 2008, 07:14 AM
seo_ryan
I think it should be:
user-agent: *
disallow: /*?
disallow: /*print.html
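
The variants suggested in this thread can be compared with a rough sketch of wildcard matching. It assumes the commonly documented semantics where * matches any run of characters and a trailing $ anchors the end of the URL; rule_to_regex and is_blocked are hypothetical helper names, the sample paths are invented, and individual crawlers may not follow these semantics exactly.

```python
import re


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a Disallow path with '*' and an optional trailing '$' into a regex.

    Assumed semantics: '*' matches any run of characters, '$' anchors the end
    of the URL, and a rule without '$' is a prefix match.
    """
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape each literal piece, then join the pieces with "any characters".
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    # Without '$', anything may follow the matched prefix.
    return re.compile(pattern + ("" if anchored else ".*"))


def is_blocked(rule: str, url_path: str) -> bool:
    """Return True if the rule would block the given path (plus query string)."""
    return rule_to_regex(rule).fullmatch(url_path) is not None


if __name__ == "__main__":
    # Hypothetical URLs, chosen to show where the variants differ.
    samples = [
        "/print.html",
        "/articles/print.html",
        "/articles/print.html?x=1",
        "/blueprint.html",
        "/page.html?id=3",
        "/page.html",
    ]
    for rule in ["/print.html", "/*print.html", "/*print.html$", "/*?"]:
        blocked = [s for s in samples if is_blocked(rule, s)]
        print(rule, "->", blocked)
```

Under these assumptions, /*print.html also catches paths such as /blueprint.html, and /*print.html$ leaves /articles/print.html?x=1 to the /*? rule.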

© 2003-2008 by Developer Shed. All rights reserved. DS Cluster 2 hosted by Hostway