#1
  1. King of da Wackos
    SEO Chat Adventurer (500 - 999 posts)

    Join Date
    Jan 2003
    Location
    Planet Zeekois
    Posts
    680
    Rep Power
    17

    If-Modified-Since can give you the Google Death Penalty.


    Is Google only hitting one page and leaving? Do your files end with .shtml? If you answered yes to both questions, If-Modified-Since could be giving you the Google Death Penalty.

    I started a new site two months ago and decided to test If-Modified-Since (IMS) on it, to see whether it saves bandwidth by making Google fetch only the files that have been updated. Before I tried IMS, Google got the files. Then I turned IMS on: Google came, hit one file, and left. After a few weeks I got rid of IMS, and BANG! Within a day Google crawled the whole site, so I dumped IMS. That was a month ago, and since then Google had only fetched the index.

    I started thinking it was just luck that I got crawled right after stopping IMS. But yesterday my next deepcrawl started, and of course the best time to test this is while you're getting deepcrawled, so for the last hour I've been watching Google deepcrawl the site while I change the permission settings to try to get IMS working.

    It looks like IMS CAN keep you from being crawled by Google, at least with .shtml files (I haven't tested this on .html files). You have to have the permission settings exactly right. I've only found one setting where Google will crawl the site with IMS on. I tried changing the files in a directory to different permissions, and here's what I got.

    Owner: Read-Write-Search
    Group: Read-Write-Search
    Everyone: Read-Write-Search

    XXX
    XXX Has IMS, but Google stopped crawling the site (even directories without this permission) (Chmod 777)
    XXX

    XXX
    X Has IMS, but Google stopped crawling the site (even directories without this permission) (Chmod 755)
    XXX

    XX
    X No IMS, but all directories get crawled. (Original permission setting) (Chmod 454)
    X

    X
    X X Has IMS, and also crawls the site. (Chmod 454)
    X

    So I changed the permission setting on every file except the section indexes to Chmod 454, and Google is now crawling the site with IMS. At the next deepcrawl I'll see if it really does make Google only get pages that are new or have been updated.
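
    For readers decoding the chmod numbers above, here is a minimal Python sketch that turns an octal mode into the usual rwx string. `describe_mode` is a name made up for this example, and it assumes the "Search" column in the grids corresponds to the execute bit (x).

```python
import stat


def describe_mode(octal_mode: str) -> str:
    """Turn an octal string such as "755" into "rwxr-xr-x"."""
    mode = int(octal_mode, 8)
    # stat.filemode() prefixes a file-type character; S_IFREG marks a regular
    # file, and [1:] drops that leading "-".
    return stat.filemode(stat.S_IFREG | mode)[1:]


for chmod in ("777", "755", "454"):
    # Characters 1-3 are the owner, 4-6 the group, 7-9 everyone else.
    print(chmod, describe_mode(chmod))
```

    Read this way, 454 is the only one of the three modes in which the owner has no write or execute bit while the group keeps its execute bit.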

    Edit: The chmod number was wrong in the setting that allowed Google to use IMS (454).
    Last edited by Nintendo; Nov 5th, 2003 at 12:53 PM.
#2
  2. Moderator
    SEO Chat Good Citizen (1000 - 1499 posts)

    Join Date
    Jan 2003
    Location
    Madrid, Spain
    Posts
    1,382
    Rep Power
    18
    Thanks for the info, Nintendo.

    Just one question: after setting IMS, did you update any files? Because if you didn't, it is correct that none were spidered. With IMS set, Google will only re-fetch files that have changed, which saves you bandwidth.
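
    To make that concrete, here is a rough Python sketch of the decision a server makes when a crawler sends If-Modified-Since. It is an illustration only, not how Apache or Google actually implement it, and `conditional_status` is a hypothetical name.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def conditional_status(last_modified: datetime,
                       if_modified_since: Optional[str]) -> int:
    """Return 304 if the client's copy is still current, otherwise 200."""
    if not if_modified_since:
        return 200  # no conditional header: send the full page
    try:
        client_time = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: ignore the header
    if client_time.tzinfo is None:
        # A "-0000" offset parses as a naive datetime; treat it as UTC.
        client_time = client_time.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds first.
    if last_modified.replace(microsecond=0) <= client_time:
        return 304  # Not Modified: headers only, no body
    return 200


page_changed = datetime(2003, 11, 5, 12, 53, tzinfo=timezone.utc)
print(conditional_status(page_changed, "Sat, 01 Nov 2003 00:00:00 GMT"))
print(conditional_status(page_changed, "Wed, 05 Nov 2003 12:53:00 GMT"))
```

    The first call stands for a crawler whose copy is older than the file, so it gets the full page; the second stands for a crawler whose copy is current, so it gets a 304 and no body.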

    Gringo.
    Last edited by Gringo; Nov 5th, 2003 at 05:12 AM.
#3
  3. King of da Wackos
    SEO Chat Adventurer (500 - 999 posts)

    Join Date
    Jan 2003
    Location
    Planet Zeekois
    Posts
    680
    Rep Power
    17
    I had added over 2,000 files to the site since Google last crawled it.

    Looking at the last 10 hours of the crawl, it looks like Google is skipping the files that haven't been updated. I only see a few files from sections that were crawled a month ago getting hit this time around, and yes, I did edit a few of those files since the last crawl. Before I did this, it was recrawling the sections that were crawled last time around, and it stopped doing that once I got IMS set up. So it looks like it's now doing exactly what it's supposed to do with IMS.
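
    One way to check this kind of thing without eyeballing the log is to count the status codes Googlebot receives: a 304 means the server answered an If-Modified-Since request with "not modified". The Python sketch below assumes an Apache combined-format access log. The sample lines are invented for illustration, and `googlebot_status_counts` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the request and status fields, e.g. "GET /page.shtml HTTP/1.0" 304
REQUEST = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def googlebot_status_counts(lines):
    """Count HTTP status codes for log lines whose text mentions Googlebot."""
    counts = Counter()
    for line in lines:
        if "Googlebot" not in line:
            continue
        match = REQUEST.search(line)
        if match:
            counts[match.group("status")] += 1
    return counts


# Invented sample lines, not taken from any real log.
sample = [
    '192.0.2.1 - - [05/Nov/2003:10:00:00 +0000] "GET /old.shtml HTTP/1.0" 304 - "-" "Googlebot/2.1"',
    '192.0.2.1 - - [05/Nov/2003:10:00:05 +0000] "GET /new.shtml HTTP/1.0" 200 5120 "-" "Googlebot/2.1"',
    '192.0.2.9 - - [05/Nov/2003:10:00:09 +0000] "GET /new.shtml HTTP/1.0" 200 5120 "-" "Mozilla/4.0"',
]
print(googlebot_status_counts(sample))
```

    Many 304s on old pages and 200s on new or edited ones would be consistent with IMS working as intended.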

    Edit: One more thing, you might have to have a .htaccess file with

    XBitHack Full

    to get IMS with .shtml files.
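
    A possible explanation for why the permissions mattered, going by the Apache mod_include documentation rather than anything confirmed in this thread: with XBitHack set to full, a file with the user-execute bit set is parsed for server-side includes, and a Last-Modified header is sent only when the group-execute bit is set. Without Last-Modified, a crawler has no date to send back in If-Modified-Since. The Python sketch below just checks those two bits. `xbithack_full_flags` is a hypothetical helper name, and whether the same rule applies to .shtml files parsed through a handler is an assumption.

```python
import stat


def xbithack_full_flags(mode: int) -> dict:
    """Report the two permission bits that XBitHack full is documented to test."""
    return {
        # user-execute: file is treated as server-parsed HTML
        "parsed_via_xbithack": bool(mode & stat.S_IXUSR),
        # group-execute: Last-Modified is set from the file's mtime
        "last_modified_sent": bool(mode & stat.S_IXGRP),
    }


for mode in (0o777, 0o755, 0o454, 0o644):
    print(oct(mode), xbithack_full_flags(mode))
```

    On a real server the mode would come from `os.stat(path).st_mode` instead of a literal.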
    Last edited by Nintendo; Nov 5th, 2003 at 08:39 PM.
