#1
  1. No Profile Picture
    Registered User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    Mar 2003
    Posts
    24
    Rep Power
    0

    CMSs, Dynamic URLs and googlebot


    Most content management systems are written in Perl (CGI), PHP or ASP. The problem is that most, if not all, of them use dynamic URLs. On our site, for example, if you want to read about the Information Security and Cyber-terrorism Conference 2003, you click this URL: http://www.wi-fitechnology.com/modul...=1&thold=0, which is dynamic. Googlebot will not touch it or spider it; in fact, chances are it will stop dead on its four paws and run off to another web site.
    Some sites have thousands of pages but only a handful of them are included in the index.
    There are reasons for that. One is that Googlebot may think the URL is an affiliate link (selling products or services through affiliate networks, for example). Another is that the URL is too long: the bot seems to refuse to crawl once a URL is over 70 characters, I think.

    Many webmasters have had to find all sorts of ways round the problem, and we are no exception. However, surely Google must realize this is a negative aspect of its crawler and do something about it.
  2. #2
  3. Contributing User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    Apr 2003
    Location
    UK
    Posts
    63
    Rep Power
    17
    I have a live problem akin to this. I have a basic site, www.johnpacker.co.uk, where I am trying to get Google to spider the content of the database (it contains some 5000 products). It used to be invisible (the only way you could search it was via a clunky Flash search). Now I have put in a full list with hard-coded links to the results page. The change was made a few weeks ago and, to be honest, it probably missed the deep crawl.

    However, do you think I stand a chance of the database being indexed next time round? Or am I missing some gem of knowledge that will take the ifs and buts out? (I may have time to implement it before the next deep crawl.)
  4. #3
  5. No Profile Picture
    Registered User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    Mar 2003
    Posts
    24
    Rep Power
    0
    Originally posted by "The Tank"

    I have a live problem akin to this. I have a basic site, www.johnpacker.co.uk, where I am trying to get Google to spider the content of the database (it contains some 5000 products). It used to be invisible (the only way you could search it was via a clunky Flash search). Now I have put in a full list with hard-coded links to the results page. The change was made a few weeks ago and, to be honest, it probably missed the deep crawl.

    However, do you think I stand a chance of the database being indexed next time round? Or am I missing some gem of knowledge that will take the ifs and buts out? (I may have time to implement it before the next deep crawl.)
    I like the site; the layout is nice, fast and quite well organized, although the dynamic HTML may confuse some surfers and probably some spiders.
    Your title is far too long; around 70 characters is the norm for most spiders. It would also help to include a DOCTYPE declaration at the start.
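
The two checks above (title length and a DOCTYPE at the start) are easy to automate. Here is a minimal sketch using only Python's standard library. The 70-character limit is the rule of thumb from this post, not a documented crawler limit, and `HeadAudit` and `audit_head` are names made up for illustration.

```python
from html.parser import HTMLParser

TITLE_LIMIT = 70  # rule of thumb from this thread, not a documented limit


class HeadAudit(HTMLParser):
    """Collects the <title> text and notes whether a DOCTYPE was declared."""

    def __init__(self):
        super().__init__()
        self.has_doctype = False
        self.title = ""
        self._in_title = False

    def handle_decl(self, decl):
        # Called for declarations such as <!DOCTYPE html>
        if decl.lower().startswith("doctype"):
            self.has_doctype = True

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so accumulate it
        if self._in_title:
            self.title += data


def audit_head(html):
    """Return DOCTYPE presence and title length for one page of HTML."""
    parser = HeadAudit()
    parser.feed(html)
    parser.close()
    title = parser.title.strip()
    return {
        "has_doctype": parser.has_doctype,
        "title_length": len(title),
        "title_too_long": len(title) > TITLE_LIMIT,
    }
```

Feed it the page source as a string, for instance `audit_head(open("index.html").read())`, and shorten the title if `title_too_long` comes back true.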

    Your site (like ours) is not search engine friendly. It relies on dynamic "unfriendly" URLs, that is, URLs with ?, & and = in them, most of them over 70 characters. Spiders would probably spider the shortest ones and leave the longest out. It may spider http://www.johnpacker.co.uk/shop/list.asp, for example, but leave out http://www.johnpacker.co.uk/shop/results.asp?fvarCategory=brass&fvarInstruments=trumpet&fvarRange=student.
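
To see how the two URLs above differ on those traits, here is a rough sketch. The 70-character threshold is only the guess made in this thread, and `describe_url` is a made-up helper, not part of any SEO tool.

```python
from urllib.parse import urlsplit

URL_LIMIT = 70  # the thread's rule of thumb, not a documented limit


def describe_url(url):
    """Report the traits this post calls spider-unfriendly."""
    query = urlsplit(url).query
    return {
        "length": len(url),
        "is_dynamic": bool(query),  # has a ?key=value part
        "over_limit": len(url) > URL_LIMIT,
    }


short_url = "http://www.johnpacker.co.uk/shop/list.asp"
long_url = ("http://www.johnpacker.co.uk/shop/results.asp"
            "?fvarCategory=brass&fvarInstruments=trumpet&fvarRange=student")

for url in (short_url, long_url):
    print(describe_url(url))
```

The first URL has no query string and stays under the threshold; the second is dynamic and well over it.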

    Don't get me wrong: the site does not need to be redesigned or rewritten. It needs techniques that make it appear friendly, so that more pages, and hence more product pages, get spidered. There are a few things you could do so that, for any instrument or anything else you sell, your site comes up in the top 10, at least in crawler-based search engines, but it will take 2-3 months of optimizing and watching.
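
One common technique of that kind is to publish static-looking paths and have the server translate them back into the real query string. The sketch below only shows the mapping in both directions. The `/shop/<category>/<instrument>/<range>` scheme and the fixed parameter order are assumptions made for illustration, and the web server still needs its own rewrite step to apply the reverse mapping.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

# Assumed order of the variables in the path form (hypothetical scheme)
PARAM_ORDER = ["fvarCategory", "fvarInstruments", "fvarRange"]


def to_friendly_path(dynamic_url, prefix="/shop"):
    """Turn results.asp?a=x&b=y&c=z into /shop/x/y/z."""
    params = dict(parse_qsl(urlsplit(dynamic_url).query))
    segments = [params[name] for name in PARAM_ORDER if name in params]
    return prefix + "/" + "/".join(segments)


def to_dynamic_query(friendly_path, prefix="/shop"):
    """Reverse mapping the server would apply before calling the real script."""
    segments = friendly_path[len(prefix):].strip("/").split("/")
    pairs = list(zip(PARAM_ORDER, segments))
    return prefix + "/results.asp?" + urlencode(pairs)


dynamic = ("http://www.johnpacker.co.uk/shop/results.asp"
           "?fvarCategory=brass&fvarInstruments=trumpet&fvarRange=student")
print(to_friendly_path(dynamic))
print(to_dynamic_query("/shop/brass/trumpet/student"))
```

The positional scheme only works if the variables are always filled in from left to right; a URL that skips the middle variable would need a different layout.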
    I'd leave it as it is, change nothing, and see whether any more pages are added over the next 5 weeks. If there is no improvement after that, I'd act accordingly. You are in Taunton and I am in Sheffield; I could have popped in for a chat about it, but it's too far.

    PS: you can shorten the title for now and leave everything else as it is!
  6. #4
  7. Contributing User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    Apr 2003
    Location
    UK
    Posts
    63
    Rep Power
    17
    I've just been reading a few other posts about having only two variables in the dynamic URL to allow spidering. On the full list I have some URLs with 2 and others with 3, so I will need to rethink those to get them to work. Are you able to qualify this? I am in Yeovil, by the way; my client is in Taunton.
  8. #5
  9. No Profile Picture
    Registered User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    Mar 2003
    Posts
    24
    Rep Power
    0
    Originally posted by "The Tank"

    I've just been reading a few other posts about having only two variables in the dynamic URL to allow spidering. On the full list I have some URLs with 2 and others with 3, so I will need to rethink those to get them to work. Are you able to qualify this? I am in Yeovil, by the way; my client is in Taunton.
    Well, that is the conclusion I came to. In 95% of cases the URLs spidered have no more than 2 variables; to be exact, they look like .com/bla.cgi?blabla=x&blabla=y or .com/products/bla.cgi?blabla=x&blabla=y or similar. The total length of the URL is also kept to a minimum, so shorter variable names are better.
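
Both observations (no more than 2 variables, and short variable names) can be checked with a few lines. This is only a sketch: the limit of 2 is my own observation rather than a published rule, the short names in `SHORT_NAMES` are invented, and the script behind the URL would have to be changed to accept them.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

MAX_VARS = 2  # observation from this thread, not a published rule

# Hypothetical short replacements for the long variable names
SHORT_NAMES = {"fvarCategory": "c", "fvarInstruments": "i", "fvarRange": "r"}


def count_vars(url):
    """Number of key=value pairs in the query string."""
    return len(parse_qsl(urlsplit(url).query, keep_blank_values=True))


def shorten_var_names(url, names=SHORT_NAMES):
    """Rewrite the query string using shorter variable names."""
    parts = urlsplit(url)
    pairs = [(names.get(key, key), value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)]
    return urlunsplit(parts._replace(query=urlencode(pairs)))


url = ("http://www.johnpacker.co.uk/shop/results.asp"
       "?fvarCategory=brass&fvarInstruments=trumpet&fvarRange=student")
print(count_vars(url), count_vars(url) > MAX_VARS)
print(shorten_var_names(url))
```

Renaming the variables cuts the length but not the count, so a URL with 3 variables still needs one of them dropped or folded into another.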

    Why that is remains a mystery! I also hear that people using PHP CMSs, or PHP in general, are at more of a disadvantage.
    The spiders, notably Googlebot, may have clear instructions, as they seem to have for titles (told to bite off only 70 characters of the title tag and spit the rest out, for example). If the same applies to URL lengths, a URL over the limit would not be eaten. Funny, choosy beast!

    will need to rethink those to get them to work
    I would advise that, but it is not a clear-cut 100% guarantee. It is only my analysis and advice, based on conducting a few experiments (and seeing others do so). The advice depends on the site and its accessibility, host, programs and language, security levels, site design and optimization.
    Yeovil and Taunton are both too far.
