    #1
  1. No Profile Picture
    Contributing User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    May 2004
    Posts
    84
    Rep Power
    16

    Question on the Mozilla G-bot


    Hi all,

    Well, it's back. That new Mozilla 5 gbot is dancing around my sites. I'm not overly thrilled because every theory I've ever had about this bot has not panned out.

    A question for those of you who HAVE seen this thing: have ANY of your pages crawled by this new bot ever actually shown up in the index? Or have they always needed a crawl by the original bot?

    Thanks,

    Owen
  2. #2
  3. No Profile Picture
    Contributing User
    SEO Chat Discoverer (100 - 499 posts)

    Join Date
    Oct 2004
    Posts
    391
    Rep Power
    15
    What is the Mozilla 5 gbot?
  4. #3
  5. http://tinyurl.com/cz56g
    SEO Chat Mastermind (5000+ posts)

    Join Date
    Sep 2004
    Location
    D0RDRECHT NL
    Posts
    6,063
    Rep Power
    31
    It manifests itself in your logs like this:

    66.249.66.180 - - [01/Nov/2004:01:19:46 +0100] "GET /robots.txt HTTP/1.1" 200 484 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

    ...instead of this:

    66.249.64.52 - - [08/Oct/2004:17:41:43 +0200] "GET /robots.txt HTTP/1.0" 200 484 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"

    ...regardless of the IP address - mind you. The difference lies in the User-Agent string.

    There are theories that it's a test bot, or that it looks for outdated/removed or even duplicate pages... I still can't see a clear pattern (sorry Owen).
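A minimal sketch of telling the two bots apart in an access log, using the two sample lines quoted above. It assumes the Apache "combined" log format; the regex and the function name `classify_googlebot` are illustrative and not part of any Google or Apache tool.

```python
import re

# Apache "combined" format (assumed):
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def classify_googlebot(line):
    """Return 'mozilla', 'classic', or None for a single log line."""
    m = LOG_RE.match(line.strip())
    if m is None:
        return None  # not a combined-format line
    agent = m.group("agent")
    if "Googlebot" not in agent:
        return None  # some other visitor
    # The only difference between the two bots is the User-Agent prefix.
    return "mozilla" if agent.startswith("Mozilla/5.0") else "classic"

new_style = ('66.249.66.180 - - [01/Nov/2004:01:19:46 +0100] '
             '"GET /robots.txt HTTP/1.1" 200 484 "-" '
             '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"')
old_style = ('66.249.64.52 - - [08/Oct/2004:17:41:43 +0200] '
             '"GET /robots.txt HTTP/1.0" 200 484 "-" '
             '"Googlebot/2.1 (+http://www.google.com/bot.html)"')

print(classify_googlebot(new_style))  # mozilla
print(classify_googlebot(old_style))  # classic
```

The check keys on the User-Agent only, not on the IP address, since both bots show up from the same 66.249.x.x ranges in the samples.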
  6. #4
  7. Sick of BL's, PR + Google
    SEO Chat Adventurer (500 - 999 posts)

    Join Date
    Dec 2003
    Location
    UK
    Posts
    865
    Rep Power
    40
    This bot has been on my site for the last 4 hours, taking 5 pages a minute. These are all new pages that have never been spidered before.

    The interesting thing is that yesterday I had 20 gbots on site all day on and off indexing previously spidered pages but no Mozilla gbot.

    Today the Mozilla gbot is here and the others have gone...

    What do you deduce from that Wit?
    I'm just pleased that any gbot is on my site.
    Last edited by thewormman; Nov 1st, 2004 at 03:38 PM.
  8. #5
  9. http://tinyurl.com/cz56g
    SEO Chat Mastermind (5000+ posts)

    Join Date
    Sep 2004
    Location
    D0RDRECHT NL
    Posts
    6,063
    Rep Power
    31
    I can deduce from that "that your site ain't big enough for both of 'em" (or something like that - remember that Sparks song?).

    Seriously: more examples, more confusion... One speck of light though (supporting the dup pages theory). Moz.Gbot visited my site yesterday, when it spidered TWO (!) pages, both very similar in structure and content to some of my other pages (they are in fact translations). So the code of those pages is identical to code on some of my other pages (which weren't spidered this time BTW), and the url is similar, but the content differs. If I were a bot, I'd look into that... and it did. Time will tell how it interpreted its findings this time...
  10. #6
  11. Contributing User
    SEO Chat Discoverer (100 - 499 posts)

    Join Date
    Apr 2004
    Location
    Leeds, UK
    Posts
    151
    Rep Power
    16
    I'm getting activity from both bots today, both Mozilla and normal.

    But I get a hell of a lot more interest from the Mozilla one than from the normal gbot. It's a very new site, and the normal gbot has only found about 4 pages. The other bot has spidered about 3,000 so far.
  12. #7
  13. http://tinyurl.com/cz56g
    SEO Chat Mastermind (5000+ posts)

    Join Date
    Sep 2004
    Location
    D0RDRECHT NL
    Posts
    6,063
    Rep Power
    31
    LOL - I guess it's a blackhat bot. (kidding of course)
  14. #8
  15. Contributing User
    SEO Chat Discoverer (100 - 499 posts)

    Join Date
    Apr 2004
    Location
    Leeds, UK
    Posts
    151
    Rep Power
    16


    Well, to be fair, the bot spidering my site might mean it's a bot that checks red-flagged sites, maybe.
    I noticed mozillabot on one of my other sites not long before it got banned (indeed, one of my lovely blackhattered sites was finally banned after a year of being alive).

    Of course, it was indeed expected. A year was quite a lot of business though, I must say.
    We ended up getting banned because of a guy called "googleguy", of all people. He sent an email via the website saying "blah blah I know you're doing blackhat, Google are aware". A day later it had been removed from Google.

    But yeah, another site, which I'm sure a few people here will recognise (as I mentioned it a while back), is getting a lot of search engine activity at the moment.
  16. #9
  17. Sick of BL's, PR + Google
    SEO Chat Adventurer (500 - 999 posts)

    Join Date
    Dec 2003
    Location
    UK
    Posts
    865
    Rep Power
    40
    Update:

    Mozilla Gbot STILL on site and now spidering 4 to 6 pages PER SECOND

    4 normal gbots have also returned and are taking a more sedate 4 pages/hour each.

    Different numbers of indexed pages are still showing on different data centres.

    Something is brewing. The last time I saw this bot going this manic was just before the last PR update. In the last two days I have had pages being indexed, with descriptions in the SERPs, then being dropped back to just the URL listed.

    And the adsense media bot which works off the same IP has disappeared too.

    All hell seems to have broken loose here :-?
    Last edited by thewormman; Nov 1st, 2004 at 03:39 PM.
  18. #10
  19. Mostly sane...
    SEO Chat Adventurer (500 - 999 posts)

    Join Date
    Aug 2004
    Location
    St. Petersburg, FL
    Posts
    756
    Rep Power
    16
    Up to 42545 visits now, all from 66.249.65.72, all MozillaGbot. Seems to be coming in waves, with 5-10 minute breaks in between (smoking? coffee??)

    For comparison, in Oct gbot, all versions, visited my site a TOTAL of 79670 times. Weird to have this much activity after the BL update. Does this smell like maybe a PR update?

    -Michael

    PS. Went to 42931 in the time it took to write this post.
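The "waves with 5-10 minute breaks" and the pages-per-second figures reported in this thread can be measured from log timestamps rather than eyeballed. A rough sketch, assuming Apache-style timestamps and a 5-minute gap as the cutoff between waves; the function names, the threshold, and the sample timestamps are illustrative only.

```python
from datetime import datetime, timedelta

def parse_time(stamp):
    # Apache timestamps look like 01/Nov/2004:01:19:46 +0100
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")

def split_into_waves(times, max_gap=timedelta(minutes=5)):
    """Group timestamps into waves; a gap of max_gap or more starts a new wave."""
    waves = []
    for t in sorted(times):
        if waves and t - waves[-1][-1] < max_gap:
            waves[-1].append(t)
        else:
            waves.append([t])
    return waves

def pages_per_second(wave):
    """Requests divided by elapsed seconds; None if the wave has no duration."""
    span = (wave[-1] - wave[0]).total_seconds()
    return len(wave) / span if span > 0 else None

# Made-up timestamps for illustration, not taken from anyone's real log.
stamps = [
    "01/Nov/2004:01:19:46 +0100",
    "01/Nov/2004:01:19:47 +0100",
    "01/Nov/2004:01:19:48 +0100",
    "01/Nov/2004:01:27:00 +0100",
    "01/Nov/2004:01:27:01 +0100",
]
waves = split_into_waves([parse_time(s) for s in stamps])
for wave in waves:
    print(len(wave), "requests,", pages_per_second(wave), "per second")
```

Feed it only the timestamps of lines whose User-Agent is the Mozilla Googlebot to get per-bot figures.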
  20. #11
  21. No Profile Picture
    Contributing User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    May 2004
    Posts
    84
    Rep Power
    16
    Does anyone see the same filename pattern I do in their logs? For me, it's grabbing the files in filename length order.

    I.e., it grabbed all the 2-character filenames, then the 3-character filenames, etc.
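One way to check for this pattern is to take the requested paths in the order the bot fetched them and see how often the filename length stays the same or grows from one request to the next. A small sketch; the function names and the sample paths are made up for illustration.

```python
import posixpath
from urllib.parse import urlsplit

def filename_length(url_path):
    """Length of the last path segment, ignoring any query string."""
    return len(posixpath.basename(urlsplit(url_path).path))

def length_order_score(paths):
    """Fraction of consecutive requests whose filename is at least as long as the previous one."""
    lengths = [filename_length(p) for p in paths]
    pairs = list(zip(lengths, lengths[1:]))
    if not pairs:
        return 1.0  # zero or one request: trivially ordered
    return sum(b >= a for a, b in pairs) / len(pairs)

# Hypothetical crawl order with one out-of-place request.
crawl = ["/12.html", "/47.html", "/123.html", "/99.html", "/456.html", "/1234.html"]
print(length_order_score(crawl))  # 0.8
```

A score close to 1.0 would support the length-order theory; a crawl in no particular order should land noticeably lower. A fraction rather than a strict yes/no allows for some shorter filenames turning up among the longer ones.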
  22. #12
  23. Mostly sane...
    SEO Chat Adventurer (500 - 999 posts)

    Join Date
    Aug 2004
    Location
    St. Petersburg, FL
    Posts
    756
    Rep Power
    16
    Originally Posted by Owen
    ...it grabbed all the 2-character filenames, then the 3-character filenames, etc.
    Weird, all my pages are numbered, so it's easy to see that pattern too. It's not entirely like that, ie, when it went from the 3 digit to the 4 digit pages, and then the 5 digit pages, some of the 3 and 4 digit pages interleave, but predominantly that is the pattern it's following. Which means, in my case, MozillaGbot is not following links. These must just be the pages that are already stored in G's index.

    I had noticed that a ton of my internal links on the link: command were just showing up as URL's, I wonder if GBot is just filling in the missing info now.

    -Michael
  24. #13
  25. No Profile Picture
    Contributing User
    SEO Chat Adventurer (500 - 999 posts)

    Join Date
    Aug 2004
    Posts
    566
    Rep Power
    17
    It all makes sense. Huge update coming this week...
  26. #14
  27. No Profile Picture
    Contributing User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    May 2004
    Posts
    84
    Rep Power
    16
    Originally Posted by mvandemar
    Weird, all my pages are numbered, so it's easy to see that pattern too. It's not entirely like that, ie, when it went from the 3 digit to the 4 digit pages, and then the 5 digit pages, some of the 3 and 4 digit pages interleave, but predominantly that is the pattern it's following. Which means, in my case, MozillaGbot is not following links. These must just be the pages that are already stored in G's index.

    I had noticed that a ton of my internal links on the link: command were just showing up as URL's, I wonder if GBot is just filling in the missing info now.

    -Michael
    I'm fairly sure it IS following links, but sorting them by size first. I say this because one site it's crawling is new and it had never seen the lower pages except on the sitemap pages which it grabbed earlier. And they are sorted alphabetically there.

    As for the link: command, I actually got an answer from G on that. No big news, it's broken, and engineering is looking at it. I have one of my sites showing with 580+ backlinks, but the real number is like 76 (so sayeth the guy at google).
  28. #15
  29. Sick of BL's, PR + Google
    SEO Chat Adventurer (500 - 999 posts)

    Join Date
    Dec 2003
    Location
    UK
    Posts
    865
    Rep Power
    40
    Originally Posted by mvandemar
    Which means, in my case, MozillaGbot is not following links. These must just be the pages that are already stored in G's index.

    I had noticed that a ton of my internal links on the link: command were just showing up as URL's, I wonder if GBot is just filling in the missing info now.

    -Michael
    All the pages it is following for me have NEVER been indexed before, but the pages that contain the links to them were spidered by the normal gbot last week.

    The pages being currently spidered are not in the index at all.