#1
    Contributing User
    SEO Chat Good Citizen (1000 - 1499 posts)

    Join Date
    Nov 2006
    Location
    Germany
    Posts
    1,078
    Rep Power
    30

    Google indexing the wrong content


    There is a site called www.verwandt.de, and there are a few international versions of it, for instance www.meusparentes.com.pt in Portuguese.

    Unfortunately, if you search for meusparentes in Google, the result shows the Portuguese domain but a German title and German meta description.

    It seems to be theoretically possible to retrieve German content on the Portuguese site, but by default the only language you can access there is Portuguese. Does anyone have any idea why this happens?
  2. #2
  3. Live and Learn!
    SEO Chat Skiller (1500 - 1999 posts)

    Join Date
    Jun 2006
    Location
    London, England
    Posts
    1,803
    Rep Power
    352
    I would say it's probably because both sites are the same website in terms of duplicated template etc., so Google cannot tell which is which. As there is very little text on the pages, it is difficult for Google to determine which language is which, so it has mixed them up.

    I would say it's also possible that Google is considering them one site and considers meusparentes more relevant to the German meta tags, but lists the Portuguese URL as it is the same website anyway.
  4. #3
    Contributing User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    Feb 2007
    Location
    UK
    Posts
    47
    Rep Power
    12
    Is this a dynamic site that's just passing the content back depending on where, geographically, the user is viewing from? I.e., are all the URLs just pointing to the same server?

    I suspect that the page is just being spidered by Googlebot from one location, and that's the content that gets listed.
  6. #4
  7. Contributing User
    SEO Chat Discoverer (100 - 499 posts)

    Join Date
    Sep 2007
    Location
    Ohio, USA
    Posts
    344
    Rep Power
    19
    http://www.duplicatecontent.net/

    Total HTML similarity: 83.73%
    Standard text similarity: 80.81%
    Smart text similarity: 98.21%
    Total text similarity 89.51%
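
    For anyone who wants to reproduce a figure of this kind locally: the sketch below is not the method used by the tool above (that is not documented in this thread). It swaps in Python's standard `difflib.SequenceMatcher`, compared word by word. The function name `similarity_percent` is made up for illustration.

```python
from difflib import SequenceMatcher


def similarity_percent(text_a: str, text_b: str) -> float:
    """Word-level similarity of two texts as a percentage (0 to 100)."""
    # Compare sequences of whitespace-separated words rather than characters,
    # so a changed word counts as one difference.
    words_a = text_a.split()
    words_b = text_b.split()
    ratio = SequenceMatcher(None, words_a, words_b).ratio()
    return round(100.0 * ratio, 2)


if __name__ == "__main__":
    # Three of four words match in order: 2 * 3 / (4 + 4) = 0.75
    print(similarity_percent("a b c d", "a b x d"))
```

    Feeding it the raw HTML of two pages that share a template would be expected to score high even when the visible text differs, because the markup dominates the word list.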

    Add the fact that they are on the same IP address, and that you are calling the server for a script with the same address for both pages.

    The Flash is killing you. You also have 2 open meta tags: content language and content type. Check out the top 10 keywords for density, meta tags included.

    www.meusparentes.com.p
    class 18 3.22%
    script 18 3.22%
    http 16 2.86%
    static60 16 2.86%
    style 16 2.86%
    meusparentes 16 2.86%
    type 12 2.15%
    text 12 2.15%
    verwandt 10 1.79%
    href 10 1.79%

    www.verwandt.de
    verwandt 26 4.77%
    script 18 3.30%
    class 18 3.30%
    style 16 2.94%
    http 16 2.94%
    static60 16 2.94%
    text 12 2.20%
    type 12 2.20%
    href 10 1.83%
    javascript 9 1.65%


    They are definitely duplicates, and Google has simply chosen to display the www.verwandt.de site. I would guess Google chose the German version because it is on German hosting: relevance.
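
    The density columns above read as keyword, count, and count divided by total words. The listed percentages are consistent with totals of roughly 559 tokens for the Portuguese page (18 / 559 ≈ 3.22%) and 545 for the German one (26 / 545 ≈ 4.77%). A minimal sketch of that calculation follows, assuming a simple alphanumeric tokenizer; the tool's real tokenizer is unknown, and `keyword_density` is a made-up name.

```python
import re
from collections import Counter


def keyword_density(text: str, top: int = 10):
    """Return (word, count, percent_of_all_words) for the most frequent words."""
    # Assumption: a "word" is any run of ASCII letters or digits, case-insensitive.
    words = re.findall(r"[a-z0-9]+", text.lower())
    total = len(words)
    if total == 0:
        return []
    counts = Counter(words)
    return [
        (word, count, round(100.0 * count / total, 2))
        for word, count in counts.most_common(top)
    ]


if __name__ == "__main__":
    for row in keyword_density("verwandt verwandt script class"):
        print(row)
```

    Tokens such as class, script, and href near the top of both lists suggest the counts were taken over the HTML source rather than the visible text, which would make two pages built on one template look alike regardless of language.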

    --Melanie
  8. #5
    Contributing User
    SEO Chat Good Citizen (1000 - 1499 posts)

    Join Date
    Nov 2006
    Location
    Germany
    Posts
    1,078
    Rep Power
    30
    Thanks for the good feedback. It is a server for dynamic content, but the locale is chosen by looking in the session (Googlebot does not have one, AFAIK), then by looking at cookies (the bot doesn't accept them), and finally by checking the TLD, in this case .com.pt.
    So it's weird that Googlebot sees German content when requesting meusparentes.com.pt...
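
    The lookup order described above can be sketched roughly as follows. This is an illustration only, not the site's actual code: the function name `resolve_locale`, the `locale` key, the `TLD_LOCALES` table, and the German fallback are all assumptions.

```python
from typing import Mapping, Optional

# Hypothetical suffix-to-locale table; the real site's mapping is not shown here.
TLD_LOCALES = {
    ".com.pt": "pt",
    ".de": "de",
}
DEFAULT_LOCALE = "de"  # assumption: fallback when nothing else matches


def resolve_locale(
    host: str,
    session: Optional[Mapping[str, str]] = None,
    cookies: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick a locale from the session, then cookies, then the host's TLD."""
    # 1. Session, if the client has one.
    if session and session.get("locale"):
        return session["locale"]
    # 2. Cookie, if the client sent one.
    if cookies and cookies.get("locale"):
        return cookies["locale"]
    # 3. Domain suffix, checking the longest suffix first.
    host = host.lower().rstrip(".")
    for suffix in sorted(TLD_LOCALES, key=len, reverse=True):
        if host.endswith(suffix):
            return TLD_LOCALES[suffix]
    return DEFAULT_LOCALE


if __name__ == "__main__":
    # A crawler with no session and no cookies falls through to the TLD check.
    print(resolve_locale("www.meusparentes.com.pt"))
```

    Under these assumptions a client without session or cookies should get Portuguese on the .com.pt host, which is why the German result is surprising.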

    Originally Posted by mprough
    Total HTML similarity: 83.73%
    Standard text similarity: 80.81%
    Smart text similarity: 98.21%
    Total text similarity 89.51%

    Add the fact that they are on the same IP address. And you are calling to the server for a script with the same address for both pages.

    The Flash is killing you. You also have 2 open Meta tags. Content language and content type. Check out the top 10 keywords for density..Meta tags included. [...]
    Thanks for the analysis. I don't care much about keyword density. Even though the sites are very similar, it makes no sense that Google indexes content in a different language. Even the fact that it's the same IP should not account for the effect...
    After all, there are sites which share an IP with other ones, and I have heard of no case where a page from domain B was indexed for domain A...

    I noticed that the cached versions contain the correct header images and the right Flash content and footer. Only the other texts on the site are in German... It might be an internal server problem...
  10. #6
  11. Contributing User
    SEO Chat Discoverer (100 - 499 posts)

    Join Date
    Sep 2007
    Location
    Ohio, USA
    Posts
    344
    Rep Power
    19
    Thanks for the analysis. I don't care much about keyword density. Even though the sites are very similar, it makes no sense that Google indexes content in a different language.
    They are duplicates. One is supplemental, the other is not. When called upon in the SERPs, Google has chosen to display the German one as its choice of the 2, for many reasons (see my post above). Google believes they in fact contain the same content.

    The fact that they are on the same IP only adds to the problem, as it helps Google to identify it as the same site.

    After all, there are sites which share an IP with other ones and i have heard of no case where a page from domain B was indexed for domain A...
    This in fact does happen, but it's not what has happened here. It has only helped Google determine the duplication of your content.

    I noticed that the cached versions contain the correct header images and the right flash content and footer. Only the other texts on the site are in german... It might be an internal server problem
    Indexing and keyword search are different issues; supplemental pages are indexed and cached.

    Even the fact that it's the same IP should not account for the effect...
    I posted your keyword density because you have an error: the lack of closed tags and some coding errors, added to the crawler's inability to crawl Flash, mean Google does not see any different language. Your keyword crawls were nearly identical as well.

    Read up on the supplemental index. Google has chosen the German page to display for that content. In order to fix this, you must bring your content out of the Flash and make it crawlable, then add many strong backlinks to the supplemental page to pull it out of the supplemental index so it can be returned for keyword searches in Google.

    --Melanie
  12. #7
    Contributing User
    SEO Chat Good Citizen (1000 - 1499 posts)

    Join Date
    Nov 2006
    Location
    Germany
    Posts
    1,078
    Rep Power
    30
    Thanks for the additional feedback.

    But how do you know that the homepage of one domain is supplemental? Google removed the function to check for supplemental pages some time ago. Have they reinstated it?
    Your keyword crawls were nearly identical as well.
    What does this mean? And how do you know the outcome of a Googlebot crawl?

    Can you explain why the page in the cache contains a German header but a Portuguese footer, if this is not a server error? Why would the language change in the middle of crawling the page?

    Currently, the cached version has the same title and meta description as the entry in the index, which leads me to the conclusion that they are related or even the same.
  14. #8
  15. Contributing User
    SEO Chat Discoverer (100 - 499 posts)

    Join Date
    Sep 2007
    Location
    Ohio, USA
    Posts
    344
    Rep Power
    19
    http://www.duplicatecontent.net/

    Total HTML similarity: 83.73%
    Standard text similarity: 80.81%
    Smart text similarity: 98.21%
    Total text similarity 89.51%

    Crawl the page in a Lynx browser and you will see what Googlebot sees.
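
    If Lynx is not at hand (`lynx -dump <url>` prints the rendered text), a rough stand-in is to strip the markup and look at what text is left. The sketch below uses only Python's standard `html.parser`; it is an approximation of a text-only view, not a reproduction of what Lynx or Googlebot actually do, and the names `TextOnly` and `visible_text` are made up.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect text nodes, ignoring the contents of script and style."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def visible_text(html: str) -> str:
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


if __name__ == "__main__":
    sample = "<title>T</title><script>var x = 1;</script><p>Hello</p>"
    print(visible_text(sample))
```

    Text that exists only inside a Flash movie never appears in the HTML at all, so it will be missing from this output as well.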

    Comments on this post

    • Chris42 agrees : Ah, by 'keyword crawl' you mean 'viewing the page and analyzing the content'
