#1
  1. No Profile Picture
    Contributing User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    Nov 2017
    Posts
    93
    Rep Power
    1

    number of indexed pages has gone wild


    Two different tools (Webmaster Tools and the new Google Search Console) report very different totals for the indexed pages of this website: learn-greek-online.com

    WMT reports 867 indexed pages and 690 pages blocked by robots.txt.
    GSC reports 345 indexed pages.

    345 is a much more realistic figure.

    I have the impression that this unnatural rise of indexed pages reported by wmt hurts the rankings.

    Any idea what could be happening here?

    From google webmastertools:


    From the new google search console:


    This question is similar to this topic, however the issues spotted back then have been resolved by now:
    too many new indexed pages
  2. #2
  3. No Profile Picture
    Moderator
    SEO Chat Hero (2000 - 2499 posts)

    Join Date
    Sep 2016
    Location
    USA
    Posts
    2,095
    Rep Power
    3007
    Originally Posted by bobptz
    I have the impression that this unnatural rise of indexed pages reported by wmt hurts the rankings.
    I generally never complain when Google indexes my pages. In your case, about the only way I could see this impacting your rankings would be if you have competing pages, and you confuse Google as to which page should be listed in any given search query. Then the use of a canonical url would solve that issue.

    But anything could happen with Google. Have you seen any decrease in traffic? Any decrease in your CTR to your money pages ? Check those metrics and if they have not been impacted I would think you are unduly concerned.
  4. #3
  5. No Profile Picture
    Contributing User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    May 2018
    Posts
    47
    Rep Power
    4
    ^^ Exactly, I think this can only be a good thing.
  6. #4
  7. No Profile Picture
    Contributing User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    Nov 2017
    Posts
    93
    Rep Power
    1
    Impressions, CTR, traffic etc look normal.

    Last time I had a big rise in indexed pages it was an issue that had to be fixed. I do not have 867 pages in my website. The two different google tools display a different result. I am afraid google is somehow confused, and this is not a good thing.

    I looked for errors, I do use canonical. My site seems ok.
  8. #5
  9. No Profile Picture
    Moderator
    SEO Chat Hero (2000 - 2499 posts)

    Join Date
    Sep 2016
    Location
    USA
    Posts
    2,095
    Rep Power
    3007
    Originally Posted by bobptz
    Last time I had a big rise in indexed pages it was an issue that had to be fixed
    So what was the fix ?

    You say you do not have 867 pages on your site, so Google is seeing something. Have you considered that the site may be hacked ?

    I have another 40 questions to ask. Providing the domain name will eliminate most of these questions. We can then run our own scans on the site to see what problems may be lurking.
  10. #6
  11. No Profile Picture
    Contributing User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    Nov 2017
    Posts
    93
    Rep Power
    1
    Hello KnowOneSpecial

    I don't think the site is hacked, but I will scan it again.

    Last time there was a rise in indexed pages, it was because I had changed the structure of the URLs. I fixed it with 301 redirects and by editing the robots.txt file. For example, I added this:
    Disallow: /ask-greek/register*

    I also used the canonical for some duplicate pages.

    The domain was provided in the initial post: learn-greek-online.com
  12. #7
  13. No Profile Picture
    Moderator
    SEO Chat Hero (2000 - 2499 posts)

    Join Date
    Sep 2016
    Location
    USA
    Posts
    2,095
    Rep Power
    3007
    I scanned your site. Found some interesting things...

    I found 670 urls. The number Google is reporting is more accurate than you think. You have way more than 345 pages by far !

    I think your problems revolve around your solution dealing with dynamic urls and your robots.txt file

    Let me give you an example... (you have many such occurrences like this example )

    Page 645 is the one you have referenced with your rel=canonical and is a self referencing canonical.
    <link rel="canonical" href="https://learn-greek-online.com/ask-greek/997/epta-i-efta-does-it-matter">
    Pages 646, 647, and 648 all reference Page 645 as the page to give the juice to. They are all dynamic urls, in that they contain a "?".
    These pages are all index-able by all search engines.



    Yes, Q2A does explicitly solve the issue of which page to funnel the almighty juice to. What it doesn't do is prevent Google from indexing those pages.

    Your robots.txt file has the following in it...

    # Exclude all urls with parameters in ask-greek
    # No need to do this. Q2A uses the Canonical tag. Works perfectly.
    # Disallow: /ask-greek/*?*

    All of these pages here are very much index-able by Google



    So from simple calculations
    670 total pages - 405 dynamic duplicate pages = 265 unique pages without "?'s", aka dynamic urls.
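
    The subtraction above can be reproduced on any crawl export. This is only a minimal sketch: the four URLs are taken from this thread as sample data (not the full 670-url scan), and `split_dynamic` is a hypothetical helper name.

```python
from urllib.parse import urlsplit

def split_dynamic(urls):
    """Separate URLs that carry a query string ("?...") from those that do not."""
    dynamic = [u for u in urls if urlsplit(u).query]
    unique = [u for u in urls if not urlsplit(u).query]
    return unique, dynamic

# Sample data only: a real run would use the full list from a site crawl.
crawled = [
    "https://learn-greek-online.com/ask-greek/987/after-all-translation",
    "https://learn-greek-online.com/ask-greek/987/after-all-translation?show=989",
    "https://learn-greek-online.com/ask-greek/1004/help-with-a-phrase",
    "https://learn-greek-online.com/ask-greek/1004/help-with-a-phrase?show=1005",
]

unique, dynamic = split_dynamic(crawled)
print(len(crawled), "total -", len(dynamic), "dynamic =", len(unique), "unique")
```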

    My suggestion would be to re-activate this line in your robots.txt and give it a couple of weeks and see if the total number of pages being indexed returns to what you call normal.
    Meaning add this back to your robots.txt
    Disallow: /ask-greek/*?*
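
    To see which urls that rule would catch before deploying it, something like the sketch below may help. It is a hand-rolled matcher, assuming Google-style wildcard semantics (`*` matches any run of characters, a trailing `$` anchors the end); it is not Google's own parser, and `robots_pattern_to_regex` and `is_disallowed` are hypothetical helper names.

```python
import re
from urllib.parse import urlsplit

def robots_pattern_to_regex(pattern):
    """Translate a wildcard robots.txt path pattern into a compiled regex.

    '*' matches any run of characters, a trailing '$' anchors the end of the
    URL, and everything else is treated literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_disallowed(url, pattern):
    """Return True if the path (plus query string) of `url` matches `pattern`."""
    parts = urlsplit(url)
    target = parts.path + ("?" + parts.query if parts.query else "")
    # re.match anchors at the start, the way robots rules match from the path root.
    return robots_pattern_to_regex(pattern).match(target) is not None

RULE = "/ask-greek/*?*"
BASE = "https://learn-greek-online.com/ask-greek/987/after-all-translation"

print(is_disallowed(BASE + "?show=989", RULE))  # dynamic url
print(is_disallowed(BASE, RULE))                # clean url
```

    With this rule the `?show=` variant is matched and the clean url is not, which is the split the line is meant to produce.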

    I am still of the opinion that you are unduly worried in this particular situation. Why? Because you are using rel=canonical to solve any duplicate issue.

    Edited to add...

    I would consider the 1493 empty image alt text opportunities you have omitted, something to fix tho.
    Last edited by KnowOneSpecial; May 17th, 2018 at 08:09 AM.
  14. #8
  15. No Profile Picture
    Contributing User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    Nov 2017
    Posts
    93
    Rep Power
    1
    Hello KnowOneSpecial

    Thank you very much for the in-depth answer.

    The forum section of the website (the one that is not blocked by robots.txt) has about 340 unique pages. By unique I mean not including the ones with ?show=. The rest of the website is about 20 pages. So it should be about 360 total.

    You said: "All of these pages here are very much index-able by Google"

    I did not test all pages, but this is indicative of the situation:

    This page is indexed:
    Code:
    site:https://learn-greek-online.com/ask-greek/987/after-all-translation
    This page is not indexed:
    Code:
    site:https://learn-greek-online.com/ask-greek/987/after-all-translation?show=989



    So unless Google is lying, the canonical tag works perfectly passing the juice and preventing indexing of the wrong pages.


    GSC reports 345 indexed pages, which seems closer to reality. WMT reports 867. Obviously WMT counts pages in a different way. Maybe WMT counts the ?show= pages too. This could be an explanation.
    Last edited by bobptz; May 17th, 2018 at 08:40 AM.
  16. #9
  17. No Profile Picture
    Moderator
    SEO Chat Hero (2000 - 2499 posts)

    Join Date
    Sep 2016
    Location
    USA
    Posts
    2,095
    Rep Power
    3007
    Hi bobptz

    You are under some misconceptions, so let's clear up those issues first!

    1. The canonical url only determines which page gets the juice.
    2. The canonical url was never designed to prevent indexing a page and in fact does not prevent indexing by Google or any other search engine.
    3. The site: command only shows you which pages have been indexed by Google, and even then it only shows a partial list; it is known not to return every indexed page of a large site. So your usage of the command for this purpose is FLAWED! Using this command to determine whether a page is index-able does not work; you are using the command WRONG!

    Now to the meat of the matter....

    The example page you used to show that the page is not indexed is incorrect. That page is in fact index-able by Google. See the picture below...

    So you can understand this image correctly...

    The little green circle with the white letter I means this page is index-able
    The little green circle with the white letter F means links on this page are followed
    The little blue circle with the white letter C means this page uses a canonical and that links on page may not count
    The little circle that looks like a chocolate chip cookie means this page uses cookies



    Notice that this page learn-greek-online.com/ask-greek/1004/help-with-a-phrase?show=1005 is index-able by all search engines, but has a canonical pointing to learn-greek-online.com/ask-greek/1004/help-with-a-phrase
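
    The same check can be approximated without a crawler. This is a minimal sketch, assuming the only signals that matter are the rel=canonical link, a robots meta tag, and an `X-Robots-Tag` response header; `IndexSignals` and `inspect_page` are hypothetical names, the HTML below is a made-up stand-in for the real page, and the header lookup is case-sensitive for simplicity.

```python
from html.parser import HTMLParser

class IndexSignals(HTMLParser):
    """Collect the rel=canonical target and any robots meta directives."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.robots += [d.strip().lower() for d in content.split(",")]

def inspect_page(html, headers=None):
    """Report the canonical target and whether anything asks for noindex."""
    parser = IndexSignals()
    parser.feed(html)
    header = (headers or {}).get("X-Robots-Tag", "").lower()
    noindex = "noindex" in parser.robots or "noindex" in header
    return {"canonical": parser.canonical, "indexable": not noindex}

# Stand-in for the ?show=1005 page: a canonical is present, but no noindex.
page = (
    '<html><head><link rel="canonical" '
    'href="https://learn-greek-online.com/ask-greek/1004/help-with-a-phrase">'
    "</head><body></body></html>"
)
print(inspect_page(page))

# The same page with a robots meta tag added is no longer index-able.
blocked = page.replace("<head>", '<head><meta name="robots" content="noindex">')
print(inspect_page(blocked))
```

    The first result reports a canonical target and still counts as index-able, which is the point being made here: the canonical says where the juice goes, not whether the page may be indexed.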



    I did tell you that you have 405 other pages with this same indexing issue! You also need to understand that blocking these urls in your robots.txt, although some people do rely on it to prevent indexing, is not foolproof. I only suggested you put that line back in to help you determine whether removing it caused the issue.

    To make a page non-index-able, do one of the following two things... neither of which you have implemented!

    Reference URL https://support.google.com/webmaster...er/93710?hl=en

    Excerpt from above reference url (the first method requires that you do not block the page in your robots.txt)

  18. #10
  19. No Profile Picture
    Contributing User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    Nov 2017
    Posts
    93
    Rep Power
    1
    Hello KnowOneSpecial

    Thank you for all this great info. Indeed, I relied a lot on the "site:" command.

    Let me read all this again tomorrow. It is late now and I have had a couple of beers already.
