#1
  1. No Profile Picture
    Registered User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    Sep 2006
    Location
    NRW, Germany
    Posts
    2
    Rep Power
    0

    How to remove duplicate PHPSESSID pages from the index?


    We have run a health information website built with PHP since 2003. All pages were indexed well, but in March 2006 Googlebot began to index our pages with a PHPSESSID parameter, which we need for a "leaflet" feature for our users. In July we noticed that the number of our pages in the index was growing extremely fast, and we saw that Googlebot had begun to index the duplicate pages. We immediately fixed the problem as described in the Google webmaster resources, by adding this to robots.txt:

    Disallow:
    User-agent: Googlebot
    Disallow: /*PHPSESSID

    We also fixed our scripts so that Googlebot no longer gets a PHPSESSID when it spiders the site, because the session ID adds no extra content; it only makes the site easier to use for our visitors.
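    To check which URLs the Disallow: /*PHPSESSID rule covers, here is a minimal Python sketch. It assumes the wildcard matching Googlebot documents for robots.txt, where * stands for any run of characters and a trailing $ anchors the end of the URL. The function names and the example.com URLs are invented for illustration; this is not Google's actual matcher.

```python
import re
from urllib.parse import urlsplit


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a wildcard robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any sequence of characters and a trailing '$'
    anchors the end; otherwise the pattern is a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(url: str, pattern: str = "/*PHPSESSID") -> bool:
    """Return True if the path plus query string of `url` matches `pattern`."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return robots_pattern_to_regex(pattern).match(target) is not None


if __name__ == "__main__":
    for candidate in (
        "http://example.com/page.php?PHPSESSID=abc123",
        "http://example.com/page.php?id=5",
    ):
        print(candidate, is_blocked(candidate))
```

    Under these assumptions the first URL is blocked and the second is not. The match is case-sensitive, so a lowercase phpsessid parameter would not be covered by this rule.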

    Our website has nearly 55,000 unique pages, but a site: DOMAIN PHPSESSID query reported 230,000 pages, so there were a lot of duplicates. At first the fix worked well: within 3 weeks the PHPSESSID pages went down to 15,000 and everything looked fine. Then at the end of August a data center update came, and within one day the PHPSESSID pages grew to 82,000. Since that day the number has never changed. Google doesn't delete any of these unwanted pages.

    We wouldn't actually care about those results, but two weeks after the "data backfall" our search engine rankings and traffic dropped by 80%. It looks as if Google pushed our rankings down because of these duplicate pages or because of the rapid growth in new pages. But the main technical problem is these ghost pages, which we can't do anything about.

    Have you ever heard of this particular problem? What should we do now? We can't wait 10-20 months until Google drops the 82,000 ghost pages by itself.
  2. #2
  3. No Profile Picture
    Registered User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    Sep 2006
    Location
    NRW, Germany
    Posts
    2
    Rep Power
    0
    We still have this big duplicate content problem. Does anybody have an idea for a way out of it?
  4. #3
  5. No Profile Picture
    Contributing User
    SEO Chat Discoverer (100 - 499 posts)

    Join Date
    May 2006
    Location
    Barcelona
    Posts
    160
    Rep Power
    13
    We have a similar problem. Our website uses ASP.
    Four weeks ago Google started to duplicate a string in our URLs, something like this (home page):

    http://www.floraqueen.com://www.floraqueen.com

    So basically it always repeats the same string (://www.floraqueen.com). The about page:

    http://www.floraqueen.com://www.floraqueen.com/about.html

    Obviously these pages can't be found and return a 404 error. At the same time the number of pages Google has indexed has doubled. We haven't uploaded anything new, yet every day we have more pages indexed.

    Our technical department doesn't know how to deal with this problem. We did something similar to what you did with the robots.txt file, but it didn't work. We asked in many forums, but none of them had a solution.

    I will keep you up to date if we find a solution, and I would appreciate it if you could do the same. Thanks
    fernando@floraqueen.com

  6. #4
  7. Super Moderator
    SEO Chat Genius (4000 - 4499 posts)

    Join Date
    Aug 2004
    Location
    Calgary
    Posts
    4,033
    Rep Power
    920
    With most PHP programs you can use a 301 redirect in .htaccess to remove all session IDs. If you block session IDs, you sometimes block the entire bot as well, because bots are usually given session IDs by default (they are, in essence, users).
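    As a rough illustration of what such a redirect has to compute, here is a minimal Python sketch. It is not an .htaccess rule and not the poster's actual setup; strip_session_id, redirect_for and the example.com URLs are hypothetical names used only here. It assumes the session ID travels in a query parameter called PHPSESSID, as in the first post.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: the session ID is passed as this query parameter.
SESSION_PARAM = "PHPSESSID"


def strip_session_id(url: str) -> str:
    """Return `url` with every PHPSESSID query parameter removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != SESSION_PARAM
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def redirect_for(url: str):
    """Return (301, clean_url) if `url` carries a session ID, else None."""
    query = urlsplit(url).query
    has_session_id = any(
        key == SESSION_PARAM
        for key, _ in parse_qsl(query, keep_blank_values=True)
    )
    return (301, strip_session_id(url)) if has_session_id else None


if __name__ == "__main__":
    print(redirect_for("http://example.com/page.php?id=5&PHPSESSID=abc123"))
    print(redirect_for("http://example.com/page.php?id=5"))
```

    A crawler only sees a redirect like this if it is allowed to fetch the session URL in the first place, so a robots.txt rule that disallows the same URLs would hide the redirect from the bot.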
