Google Optimization
 
  #1  
Old February 1st, 2003, 02:47 AM
Wayne Wayne is offline
Junior Member
SEO Chat Newbie (0 - 499 posts)
 
Join Date: Feb 2003
Posts: 15 Wayne User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
bot tracking software

Hi, what software do you all use to see which pages Googlebot has spidered? I'm new to this SEO stuff and would like to know which pages Googlebot is visiting.

Thank you

  #2  
Old February 1st, 2003, 04:41 AM
Darrin Ward's Avatar
Darrin Ward Darrin Ward is offline
Founder, SEOChat.com :)
SEO Chat Beginner (1000 - 1499 posts)
 
Join Date: Dec 2002
Location: Miami, Florida
Posts: 1,453 Darrin Ward User rank is Sergeant (500 - 2000 Reputation Level)
Time spent in forums: 21 h 5 m 4 sec
Reputation Power: 20
I don't use any software .. I just download my log files and scan them for the spiders I'm tracking!!

It's much faster this way!
__________________
Darrin J. Ward, the Original Founder of SEO Chat (this site), Google Dance Tool & some other cool stuff! Read my: Professional SEO Site or Twitter: @DarrinJWard.

  #3  
Old February 1st, 2003, 01:28 PM
mario's Avatar
mario mario is offline
Contributing User
SEO Chat Newbie (0 - 499 posts)
 
Join Date: Dec 2002
Location: Spain
Posts: 329 mario User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: 2 m 31 sec
Reputation Power: 8
Quote:
I just download my log files and scan them for the spiders I'm tracking!! It's much faster this way!


Darrin, do you have a list of all the spider addresses?
__________________
best regards... mario

  #4  
Old February 1st, 2003, 02:03 PM
Darrin Ward's Avatar
Darrin Ward Darrin Ward is offline
Founder, SEOChat.com :)
SEO Chat Beginner (1000 - 1499 posts)
 
Join Date: Dec 2002
Location: Miami, Florida
Posts: 1,453 Darrin Ward User rank is Sergeant (500 - 2000 Reputation Level)
Time spent in forums: 21 h 5 m 4 sec
Reputation Power: 20
Quote:
Originally posted by "mario"

Darrin, do you have a list of all the spider addresses?


Nope. In the case of Google I search my log file for ".googlebot.com" (the remote host), then keep my eye on the requested URL and keep hitting F3 (which is find next), so I can see all the pages Google has requested. I do the same for Inktomi, except I replace ".googlebot.com" with "slurp@inktomi.com" (part of the user agent).

As I've said previously in this thread:
http://www.google-dance.com/chat/viewtopic.php?t=49&highlight=grep

If you can log in to your server via SSH / Telnet and know the location of your log file(s), you can use this command to email yourself all records of Googlebot:
Code:
grep 'googlebot.com' access_log | /usr/sbin/sendmail me@myaddress.com

I use this a LOT, especially if the log file is over about 30 megs i.e. would take me more than about 1 minute to download!!
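
A slightly more defensive variant of the command above, offered only as a sketch. It uses `grep -F` so that `googlebot.com` is matched as a literal string (in a plain pattern the dot matches any character), and it writes a `Subject:` header and a blank line before the log lines so the mail arrives with a subject. The helper name `googlebot_report` and the subject text are made up for illustration; the log file name and address are the same placeholders used in the post.

```shell
#!/bin/sh
# Sketch: build a mail message containing every Googlebot line from a log.
# googlebot_report is a hypothetical helper name, not part of any tool.
googlebot_report() {
    log="$1"
    # Header first, then the blank line that separates headers from body.
    printf 'Subject: Googlebot hits in %s\n\n' "$log"
    # -F: match 'googlebot.com' literally rather than as a regular expression.
    # grep exits non-zero when nothing matches; '|| true' makes the
    # function return success anyway.
    grep -F 'googlebot.com' "$log" || true
}

# Usage (file name and address are placeholders, as in the post):
#   googlebot_report access_log | /usr/sbin/sendmail me@myaddress.com
```

Keeping the message building in a function means the output can be checked on screen first, and only piped to sendmail once it looks right.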

  #5  
Old February 1st, 2003, 02:10 PM
Amygdala Amygdala is offline
Contributing User
SEO Chat Newbie (0 - 499 posts)
 
Join Date: Jan 2003
Posts: 66 Amygdala User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 8
Quote:
Originally posted by "Darrin Ward"

If you can log in to your server via SSH / Telnet and know the location of your log file(s), you can use this command to email yourself all records of Googlebot:
Code:
grep 'googlebot.com' access_log | /usr/sbin/sendmail me@myaddress.com

I use this a LOT, especially if the log file is over about 30 megs i.e. would take me more than about 1 minute to download!!


May I just say, that is sweet... why didn't I think of that! Using it now though

Amy
__________________
You've just read the posting of an airhead, take no notice whatsoever.

  #6  
Old February 1st, 2003, 05:08 PM
Janetteddy's Avatar
Janetteddy Janetteddy is offline
Contributing User
SEO Chat Newbie (0 - 499 posts)
 
Join Date: Dec 2002
Location: NH
Posts: 46 Janetteddy User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 8
Darrin, where exactly do you put this code? Very simple instructions, please: I don't understand the part about logging into your server, or the SSH and Telnet thingy, lol. I am a novice at this and I don't want to mess with something I do not understand. Thanks in advance. And God bless our astronauts and their families. So sad... so very, very sad.

  #7  
Old February 1st, 2003, 05:11 PM
johnnyb3's Avatar
johnnyb3 johnnyb3 is offline
Contributing User
SEO Chat Newbie (0 - 499 posts)
 
Join Date: Jan 2003
Posts: 76 johnnyb3 User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 8
If you have access to your server (Telnet or SSH), you can run these commands:

Watch live Googlebot hits to your site:
tail -f /home/virtual/path/to/site/access_log | grep googlebot

Look at the last 10 Googlebot hits to your site:
grep googlebot /home/virtual/path/to/site/access_log | tail

Look at all Googlebot hits to your site:
grep googlebot /home/virtual/path/to/site/access_log | less
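
Building on the commands above, here is a rough sketch of a per-page summary: it counts how many times Googlebot requested each URL. It assumes the Apache common or combined log format, where the requested path is the 7th whitespace-separated field; if your host uses a different format, the field number will differ. The helper name `googlebot_pages` is hypothetical.

```shell
#!/bin/sh
# Sketch: count Googlebot requests per URL, most requested first.
# Assumes the Apache common/combined log format, where the requested
# path is the 7th whitespace-separated field. googlebot_pages is a
# hypothetical helper name.
googlebot_pages() {
    # -i: match "googlebot" and "Googlebot" alike.
    grep -i 'googlebot' "$1" |
        awk '{ print $7 }' |
        sort | uniq -c | sort -rn
}

# Usage (path is a placeholder, as in the post):
#   googlebot_pages /home/virtual/path/to/site/access_log
```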

  #8  
Old February 1st, 2003, 05:44 PM
Wayne Wayne is offline
Junior Member
SEO Chat Newbie (0 - 499 posts)
 
Join Date: Feb 2003
Posts: 15 Wayne User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
Darrin, when I run that command I get:
grep : access_log: No such file or directory.

Thank you

  #9  
Old February 2nd, 2003, 05:09 AM
Amygdala Amygdala is offline
Contributing User
SEO Chat Newbie (0 - 499 posts)
 
Join Date: Jan 2003
Posts: 66 Amygdala User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 8
Look at johnny's post. You need to either specify the full path to the access_log or be in the directory that contains it when you run the command.

  #10  
Old February 5th, 2003, 07:26 PM
Wayne Wayne is offline
Junior Member
SEO Chat Newbie (0 - 499 posts)
 
Join Date: Feb 2003
Posts: 15 Wayne User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
I can't seem to find where my access_log is located. Can anyone help me find it? I need the path.

Thank you

  #11  
Old February 5th, 2003, 07:32 PM
Amygdala Amygdala is offline
Contributing User
SEO Chat Newbie (0 - 499 posts)
 
Join Date: Jan 2003
Posts: 66 Amygdala User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 8
At the command line, as root, type:

locate access_log

That'll list every file whose path contains access_log.

  #12  
Old February 5th, 2003, 08:36 PM
Wayne Wayne is offline
Junior Member
SEO Chat Newbie (0 - 499 posts)
 
Join Date: Feb 2003
Posts: 15 Wayne User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
Hi, thanks for the reply.

I was trying that before and was getting:
warning: locate: could not open database: /var/lib/slocate/slocate.db: Permission denied.

I've tried contacting my host, with no reply.

Thank you

  #13  
Old February 5th, 2003, 09:51 PM
johnnyb3's Avatar
johnnyb3 johnnyb3 is offline
Contributing User
SEO Chat Newbie (0 - 499 posts)
 
Join Date: Jan 2003
Posts: 76 johnnyb3 User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 8
Make sure you are logged in as root.

To become root (superuser), type: su
You should then be asked for your root password; type it and hit Enter.

Then run the locate command again and see if it works.

  #14  
Old February 6th, 2003, 02:59 AM
Amygdala Amygdala is offline
Contributing User
SEO Chat Newbie (0 - 499 posts)
 
Join Date: Jan 2003
Posts: 66 Amygdala User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 8
Quote:
Originally posted by "Wayne"

warning: locate: could not open database: /var/lib/slocate/slocate.db: Permission denied.

I've tried contacting my host, with no reply.


As I said, you need to be root. However, if you have a virtual account or a managed hosting account on a dedicated server then you may not get the root password from them.

In that case, tell them that you want full access to your access logs; they may set up something different for you.

I don't know your circumstances, but they're almost certainly different to mine (I manage my own servers).
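
If root is not available, one thing worth trying before waiting on the host is a plain `find` over the directories your own account can read, since `find` does not depend on the locate database. This is only a sketch: whether the logs are reachable from your account at all depends on how the host has set things up, and the helper name `find_access_logs` is hypothetical.

```shell
#!/bin/sh
# Sketch: look for access logs under a directory you can read, without
# needing root or the locate database. find_access_logs is a
# hypothetical helper name.
find_access_logs() {
    # Default to the home directory when no starting point is given.
    start="${1:-$HOME}"
    # 'access_log*' also matches rotated copies such as access_log.1;
    # permission errors on unreadable directories are discarded.
    find "$start" -type f -name 'access_log*' 2>/dev/null
}

# Usage:
#   find_access_logs              # search your home directory
#   find_access_logs /some/dir    # or another directory you may read
```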

Good Luck.

Amy

  #15  
Old February 7th, 2003, 12:31 PM
j-net's Avatar
j-net j-net is offline
Junior Member
SEO Chat Newbie (0 - 499 posts)
 
Join Date: Jan 2003
Posts: 12 j-net User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
other tracking options

For those of you without the knowledge or direct access to your server, consider a tracking software program, but make sure you pick one that is "interactive". WebTrends Log Analyzer will tell you how many spider visits there were and how many pages they went to, but it's basically a hard-copy report. I use NetTracker, which allows you to drill down into the spider visits and find out all of the information that is in the log, such as which pages were requested, in what order, the length of time spent, etc. While all this information is in the logs, it might be easier to understand in this format.
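
For comparison, the "which pages, in what order" part can also be read straight from the raw log. The sketch below prints the timestamp and requested path of each matching line, in log order. It assumes the Apache common or combined log format (field 4 is the timestamp with a leading `[`, field 7 is the path), and the helper name `spider_trail` is hypothetical. It is not a description of how any of the programs above work.

```shell
#!/bin/sh
# Sketch: list when a spider requested each page, in log order.
# Assumes the Apache common/combined log format: field 4 is the
# timestamp (with a leading '[') and field 7 is the requested path.
# spider_trail is a hypothetical helper name.
spider_trail() {
    pattern="$1"   # text identifying the spider, e.g. googlebot
    log="$2"
    grep -i -- "$pattern" "$log" |
        awk '{ sub(/^\[/, "", $4); print $4, $7 }'
}

# Usage (path is a placeholder):
#   spider_trail googlebot /home/virtual/path/to/site/access_log
```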

J-Net

© 2003-2010 by Developer Shed. All rights reserved. DS Cluster 4 Hosted by Hostway