#1
how do you tell a deep crawl?
I use AWStats, and I've only been able to tell when a bot visits, not which pages it visits. How do I determine when a deep crawl happens?
-Greg
__________________
new jersey web design guess you can't google bomb a site in the sandbox...
#2
Generally, you need to examine the raw log files to see which pages are being requested. Then you can see whether those pages are deep in your site's hierarchy or top-level pages. If your site only has a few pages, you might be able to gauge the relative activity by the number of pages requested, but this is less telling.
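For anyone who wants to try this, here is a rough sketch of the idea in Python. It assumes the server writes the Apache "combined" log format (not confirmed anywhere in this thread), and it uses the number of path segments in the URL as a stand-in for "depth", which is only a proxy for how far a page sits from the home page. The function names and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format. Adjust the pattern if your
# server logs in a different layout.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def path_depth(path):
    """Count non-empty path segments, ignoring any query string."""
    clean = path.split("?", 1)[0]
    return len([segment for segment in clean.split("/") if segment])


def depth_histogram(lines, bot_token="Googlebot"):
    """Map URL depth -> number of requests made by the given bot."""
    depths = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and bot_token.lower() in match.group("agent").lower():
            depths[path_depth(match.group("path"))] += 1
    return depths


# Made-up log lines, for illustration only.
SAMPLE = [
    '192.0.2.10 - - [08/Jul/2004:06:47:12 -0400] "GET / HTTP/1.1" 200 5120 '
    '"-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.10 - - [08/Jul/2004:06:47:15 -0400] '
    '"GET /plants/roses/pruning.html HTTP/1.1" 200 7300 '
    '"-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
]

for depth, hits in sorted(depth_histogram(SAMPLE).items()):
    print(f"depth {depth}: {hits} request(s)")
```

A visit where most requests land at the larger depths would suggest the bot went well past the top-level pages. To run it on a real log, pass an open file object instead of `SAMPLE`.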
__________________
Have a thumb? Check out my gardening forum.
#3
Good question - good answer. I've been wondering what a deep crawl is and how you find out about one.
#4
#5
Hmmm... that turned up bupkis. Bernard, what do you use? Well, first let me get something clarified: can you track a visitor's session, or is it done via IP address? Maybe I'm expecting too much...
#6
I don't worry about deep crawls or shallow crawls. All of my sites are static HTML, a maximum of 3 levels deep (from the home page) and completely indexed. If my sites were huge, I might be more interested in it, but in the end, checking for a deep crawl is a futile exercise (IMO) - it doesn't make it happen. I suppose it might be useful if you are testing changes that might affect crawlability. Otherwise, I would focus my energies on more productive pursuits.
Raw log files only show you page requests. Sessions reported by most log analyzers are an arbitrary designation that lumps together page requests occurring within a specific period of time (more or less). Each page request in your raw log should identify the user agent and IP of the requester. Bots/spiders are easy to spot because they (usually) have a unique user agent. You can load/import your raw log file into Excel and sort the data on that field for easier analysis.
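If Excel is not handy, the same "sort on the user agent field" idea can be sketched in a few lines of Python. This is only an illustration: it assumes the user agent is the last double-quoted field on each line and the IP is the first space-separated field, as in the Apache "combined" format, and the function name and sample lines are invented.

```python
import re
from collections import Counter, defaultdict

# Assumption: the user agent is the last double-quoted field on the line.
AGENT_FIELD = re.compile(r'"([^"]*)"\s*$')


def summarize_agents(lines):
    """Return (hits per user agent, set of IPs seen per user agent)."""
    hits = Counter()
    ips = defaultdict(set)
    for line in lines:
        match = AGENT_FIELD.search(line)
        if not match:
            continue  # skip lines that do not look like log records
        agent = match.group(1)
        hits[agent] += 1
        # Assumption: the requester's IP is the first field on the line.
        ips[agent].add(line.split(" ", 1)[0])
    return hits, ips


# Made-up log lines, for illustration only.
SAMPLE = [
    '192.0.2.10 - - [08/Jul/2004:06:47:12 -0400] "GET / HTTP/1.1" 200 5120 '
    '"-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.77 - - [08/Jul/2004:06:48:01 -0400] "GET /index.html HTTP/1.1" '
    '200 4100 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]

hits, ips = summarize_agents(SAMPLE)
for agent, count in hits.most_common():
    print(count, sorted(ips[agent]), agent)
```

Sorting by hit count puts the busiest agents first, which tends to make crawler user agents stand out from ordinary browsers.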
#7
"how can you tell a deep crawl?"
When your face is dragging on the ground. Seriously, I think it greatly depends on the vastness of your site, and how deep your subpages go.
#8
OK, let me rephrase the question:
How can you tell when a bot requests a specific page?
-Greg
#9
Greg, look at your raw log files and you will figure it out. Each record lists the page requested and the requester.
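As a concrete illustration of reading those records, the sketch below pulls out every request a given bot made for one specific page, with the time and IP of each hit. It assumes the Apache "combined" log format and English month abbreviations in the timestamp; the function name, the page path, and the sample line are hypothetical.

```python
import re
from datetime import datetime

# Assumption: Apache "combined" log format.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def bot_requests_for(lines, page, bot_token="Googlebot"):
    """List (timestamp, ip) for each request the bot made for `page`."""
    found = []
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue
        is_bot = bot_token.lower() in match.group("agent").lower()
        if is_bot and match.group("path") == page:
            # Timestamps look like 08/Jul/2004:06:47:15 -0400.
            when = datetime.strptime(match.group("time"),
                                     "%d/%b/%Y:%H:%M:%S %z")
            found.append((when, match.group("ip")))
    return found


# Made-up log line, for illustration only.
SAMPLE = [
    '192.0.2.10 - - [08/Jul/2004:06:47:15 -0400] '
    '"GET /plants/roses/pruning.html HTTP/1.1" 200 7300 '
    '"-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
]

for when, ip in bot_requests_for(SAMPLE, "/plants/roses/pruning.html"):
    print(when.isoformat(), ip)
```

Matching on a substring of the user agent is a convenience, not proof of identity, since any client can send whatever user agent it likes.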
#10
Oh snap! I never thought of that. Seriously, it's a little embarrassing.
Everyone uses log managers, so I just assumed you had to. Is it still true that Google's deep crawl comes from IP addresses starting with 216?
-Greg
#11
This is my log file and this is how I look at it. Maybe I am wrong.

Robots/Spiders visitors
2 different robots                                  Hits    Bandwidth   Last visit
Googlebot (Google)                                  106+5   3.78 MB     08 Jul 2004 - 06:47
Unknown robot (identified by hit on 'robots.txt')   0+9     5.26 KB     08 Jul 2004 - 10:04
* Numbers after + are successful hits on "robots.txt" files

This is how I interpret AWStats. I am a newbie at this and could be wrong. Googlebot hit 106 files, plus another 5 successful hits on robots.txt. It consumed 3.78 megs of bandwidth on the visit. Its last visit was July 8 at 6:47. I would classify 106 pages on my site as pretty deep, though I probably have around 300 pages. Good luck and I hope this helps. Sorry, but the format of the stats I copied is not going to hold.
Last edited by arthur1972 : July 8th, 2004 at 08:49 PM. Reason: Did not hold format of pasted text