#1
  1. No Profile Picture
    Registered User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    Jul 2012
    Posts
    11
    Rep Power
    0

    Google Analytics VS Log Analysis


    We use a very good web log analyzer for metrics. I know it accounts for spiders, image linkers, and some other noise traffic, and that it bases its counts on IP address rather than JavaScript/cookies. Even with the junk filters in place, its numbers are always much higher than GA's. Which number is closer to accurate? I would like to be able to use the web log numbers but don't want to misrepresent the site.
    Any help would be appreciated.
    Thanks.

    Comments on this post

    • DMN Webmaster agrees : See my later post, I could help you with this
  2. #2
  3. Here to help
    SEO Chat Adventurer (500 - 999 posts)

    Join Date
    Feb 2012
    Location
    Zebulon, North Carolina USA
    Posts
    570
    Rep Power
    406
    Originally Posted by Darrin Ward
    I would bet that GA is more accurate than your log file analyzer. The junk filters in your log file analyzer are probably never going to be able to filter enough. GA only tracks browsers/agents with JavaScript enabled, which I believe is 90+% of real visitors. Most other stuff will be junk/robots.
    For most analytics offered for free I would agree with you. If you do your own, I disagree with you. Browsers, user agents and such can be forged. G can't detect this the way you can if you capture your own stats. There isn't one feature analytics provides that keeping your own stats cannot provide, and then some.

    You can...
    determine the entry page, trace the visitor through the site, time on each page, who referred the visitor, whether it was a direct visit or not, IP address, exit page, what user agent was used, and what keyword was used to find you (all in real time). The best part is you don't have to share it with G or any other engine!


    Personally, I track all traffic to sites (mine and clients') using Access, MS SQL or MySQL, depending on what is available on their server. These files can become quite huge. One client gets 6,000 visits a day on average, and the database on that one alone is 1.5 GB. It has to be dumped every few months or so, meaning downloaded and archived for later. These databases are extremely accurate.

    All in all, I find that my data is more accurate and updated immediately, on a per-visit basis, whether bot, human or otherwise. I don't have to wait until tomorrow to get yesterday's stats, as I would with analytics.

    All done in either PHP, ASP, or ASP.NET. Again, which one I use depends on your host.
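
For anyone wondering what capturing your own stats on the server side might look like, here is a minimal sketch in Python with SQLite rather than the PHP/ASP setups mentioned above. The table name `tbl_RawStatistics` and the `Agent` column come from the queries later in this thread; every other column name and both helper functions are assumptions made up for illustration, not the poster's actual schema.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: one row per request, written by the server itself.
# Only the table name and the Agent column appear in this thread.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tbl_RawStatistics (
    HitTime  TEXT NOT NULL,  -- ISO 8601 timestamp in UTC
    IP       TEXT NOT NULL,  -- visitor IP address
    Page     TEXT NOT NULL,  -- requested URL path
    Referrer TEXT,           -- NULL means a direct visit
    Agent    TEXT            -- raw User-Agent header
)
"""


def open_stats_db(path=":memory:"):
    """Open (or create) the stats database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_hit(conn, ip, page, referrer=None, agent=None, when=None):
    """Store one request; call this from whatever handles the page request."""
    when = when or datetime.now(timezone.utc)
    conn.execute(
        "INSERT INTO tbl_RawStatistics (HitTime, IP, Page, Referrer, Agent) "
        "VALUES (?, ?, ?, ?, ?)",
        (when.isoformat(timespec="microseconds"), ip, page, referrer or None, agent),
    )
    conn.commit()


def entry_and_exit_pages(conn, ip):
    """Return (entry_page, exit_page) for one IP, or None if it has no hits."""
    rows = conn.execute(
        "SELECT Page FROM tbl_RawStatistics WHERE IP = ? ORDER BY HitTime, rowid",
        (ip,),
    ).fetchall()
    if not rows:
        return None
    return rows[0][0], rows[-1][0]
```

Because every request is a row, reports such as entry page, exit page or referrer become ordinary queries that can be rerun at any time.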

    Comments on this post

    • Jocelyn agrees : The paragraph under 'You can...' is what you send to GA so they can make you their report.
    Here to Help, Nothing More.....
    Good SEO isn't Cheap and Cheap SEO isn't Good !!
  4. #3
  5. Contributing
    SEO Chat Hero (2000 - 2499 posts)

    Join Date
    Sep 2003
    Location
    Montreal / Canada
    Posts
    2,186
    Rep Power
    707
    Yep... I made my own log analyzer, because none of the solutions available at the time could give me 'per page', 'per keyword', 'per IP' or whatever reports I wanted. That was before GA, and maybe it does have that now. But why slow my stuff down with an external data tracking system when my logs are the most accurate and my analyzer does exactly what I need? When you code your own, it fits like a glove, and you build the reports you really need to save time or figure stuff out.
    Disclaimer : My posts on SEO are just from my observations and I do not say it is a true fact... A real fact of life is that, I'm often wrong...
  6. #4
  7. No Profile Picture
    Registered User
    SEO Chat Explorer (0 - 99 posts)

    Join Date
    Jul 2012
    Posts
    11
    Rep Power
    0
    I really like the power of my log analyzer; it's just that the number of unique visitors is 2-3 times higher than GA's, and with my 2 sites that is 200K-300K monthly. Both systems show the same trends with peaks and valleys, but the log analysis number is always much greater. I have written a lot of filters to get rid of the noise and the numbers are still radically different. I'll always run both systems but am unclear on which numbers to report.
  8. #5
  9. Contributing
    SEO Chat Hero (2000 - 2499 posts)

    Join Date
    Sep 2003
    Location
    Montreal / Canada
    Posts
    2,186
    Rep Power
    707
    Well, if GA uses JavaScript and a user has that disabled, GA will most probably report lower numbers than your logs. I'm not sure what type of code they use or whether it always works on every browser and platform.

    Comments on this post

    • DMN Webmaster agrees : My point exactly, you can't disable my data collection, it's done on the server side!!!
  10. #6
  11. Here to help
    SEO Chat Adventurer (500 - 999 posts)

    Join Date
    Feb 2012
    Location
    Zebulon, North Carolina USA
    Posts
    570
    Rep Power
    406
    Originally Posted by jpascone
    I really like the power of my log analyzer; it's just that the number of unique visitors is 2-3 times higher than GA's, and with my 2 sites that is 200K-300K monthly. Both systems show the same trends with peaks and valleys, but the log analysis number is always much greater. I have written a lot of filters to get rid of the noise and the numbers are still radically different. I'll always run both systems but am unclear on which numbers to report.
    The reason for that is the way the logs are analyzed. Most of the time, if hits from the same visitor are more than, say, 20 to 30 minutes apart, each one gets counted as a separate visit. So if it's you making updates to the site and such, all of that gets mixed into the unique visits. This is typical behavior for the stats most hosting companies provide. This is just one of the reasons I decided on my own stats database.
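
To make the timeout effect concrete, here is a rough Python sketch. It assumes a simple rule (a hit starts a new visit when the same IP has been idle longer than the timeout) and uses made-up function names; real log analyzers each have their own rules, so treat it only as an illustration of why one person can show up several times.

```python
from datetime import datetime, timedelta


def count_visits(hits, timeout_minutes=30):
    """Count visits from (ip, datetime) pairs using an inactivity timeout.

    Assumed rule for illustration: a hit opens a new visit when the same IP
    has been idle for longer than the timeout.
    """
    timeout = timedelta(minutes=timeout_minutes)
    last_seen = {}  # ip -> time of that IP's most recent hit
    visits = 0
    for ip, when in sorted(hits, key=lambda hit: hit[1]):
        previous = last_seen.get(ip)
        if previous is None or when - previous > timeout:
            visits += 1
        last_seen[ip] = when
    return visits


def count_unique_ips(hits):
    """Count distinct IP addresses, ignoring timing entirely."""
    return len({ip for ip, _ in hits})


# One person (one IP) working on the site on and off during the day.
day = datetime(2012, 8, 10)
hits = [
    ("203.0.113.5", day.replace(hour=9, minute=0)),
    ("203.0.113.5", day.replace(hour=9, minute=10)),
    ("203.0.113.5", day.replace(hour=10, minute=0)),
    ("203.0.113.5", day.replace(hour=14, minute=0)),
]
print(count_visits(hits), count_unique_ips(hits))
```

With these sample hits the timeout rule reports three visits for a single IP, which is the kind of inflation described above.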

    Generally I find Google reports more visits than there actually are. But it is really close. Sometimes it's dead on, but most of the time it is off.

    Here is a sample of the bots you will find on the web.
    Something Google will not tell you... these all came from one database.

    AdsBot-Google (+http://www.google.com/adsbot.html)
    AdsBot-Google-Mobile (+http://www.google.com/mobile/adsbot.html) Mozilla (iPhone; U; CPU iPhone OS 3 0 like Mac OS X) AppleWebKit (KHTML, like Gecko) Mobile Safari
    Baiduspider
    Baiduspider+(+http://www.baidu.com/search/spider.htm)
    CCBot/1.0 (+http://www.commoncrawl.org/bot.html)
    checkgzipcompression.com robot
    COMODOSpider/Nutch-1.2
    DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
    Domnutch-Bot/Nutch-1.0 (Domnutch; http://www.Nutch.de/)
    EdisterBot (http://www.edister.com/bot.html)
    findfiles.net/0.98 (Robot;test_robot@gmx-topmail.de)
    freelinkexplorerbot
    Gigabot/3.0 (http://www.gigablast.com/spider.html)
    Google Bot
    GoogleBot 1.0
    Googlebot/2.1 (+http://www.google.com/bot.html)
    Googlebot-richsnippets
    GrepNetstat.com Bot/1.0; +http://www.grepnetstat.com)
    GSLFbot
    http://SiteIntel.net Bot
    http://www.activesearchresults.com/addwebsite.php ASR Ranking Technology/Spider/Crawler
    Influencebot/0.9; (Automatic classification of websites; http://www.influencebox.com/; info@influencebox.com)
    intelium_bot
    Keyword Density Analyzer v1.01 ( http://www.ranks.nl/tools/spider.html )
    LinkedInBot/1.0 (compatible; Mozilla/5.0; Jakarta Commons-HttpClient/3.1 +http://www.linkedin.com)
    MLBot (www.metadatalabs.com/mlbot)
    Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0) AddSugarSpiderBot www.idealobserver.com
    Mozilla/5.0 (compatible; AboutUsBot Johnny5/2.0; +http://www.AboutUs.org/)
    Mozilla/5.0 (compatible; AhrefsBot/1.0; +http://ahrefs.com/robot/)
    Mozilla/5.0 (compatible; AhrefsBot/2.0; +http://ahrefs.com/robot/)
    Mozilla/5.0 (compatible; AhrefsBot/3.0; +http://ahrefs.com/robot/)
    Mozilla/5.0 (compatible; AhrefsBot/3.1; +http://ahrefs.com/robot/)
    Mozilla/5.0 (compatible; aiHitBot/1.0; +http://www.aihit.com/)
    Mozilla/5.0 (compatible; aiHitBot/1.1; +http://www.aihit.com/)
    Mozilla/5.0 (compatible; alexa verifiybot/1.0; +http://www.alexa.com/help; help@alexa.com)
    Mozilla/5.0 (compatible; archive.org_bot +http://www.archive.org/details/archive.org_bot)
    Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)
    Mozilla/5.0 (compatible; Bender; http://benderthewebrobot.tumblr.com)
    Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
    Mozilla/5.0 (compatible; Blekkobot; ScoutJet; +http://blekko.com/about/blekkobot)
    Mozilla/5.0 (compatible; CareerBot/1.1; +http://www.career-x.de/bot.html)
    Mozilla/5.0 (compatible; DCPbot/1.0; +http://domains.checkparams.com/)
    Mozilla/5.0 (compatible; DCPbot/1.1; +http://domains.checkparams.com/)
    Mozilla/5.0 (compatible; DCPbot/1.2; +http://domains.checkparams.com/)
    Mozilla/5.0 (compatible; discobot/2.0; +http://discoveryengine.com/discobot.html)
    Mozilla/5.0 (compatible; DIY-SEOBot/0.1a; +http://www.diyseo.com/bot.html)
    Mozilla/5.0 (compatible; DomainVader/1.0; +http://domainvader.com/bot/info.php)
    Mozilla/5.0 (compatible; en-US; ReverseGet/1.0; http://reverseget.com/; robot@reverseget.com)
    Mozilla/5.0 (compatible; Exabot/3.0; +http://www.exabot.com/go/robot)
    Mozilla/5.0 (compatible; Ezooms/1.0; ezooms.bot@gmail.com)
    Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
    Mozilla/5.0 (compatible; GrepNetstatBot/1.0; +http://www.grepnetstat.com/bot)
    Mozilla/5.0 (compatible; Hailoobot/1.2; +http://www.hailoo.com/spider.html)
    Mozilla/5.0 (compatible; IntelCSbot/0.2beta)
    Mozilla/5.0 (compatible; IPTCBOT;
    Mozilla/5.0 (compatible; JikeSpider; +http://shoulu.jike.com/spider.html)
    Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.5 (like Gecko) (Exabot-Thumbnails)
    Mozilla/5.0 (compatible; LinksManager.com_bot +http://linksmanager.com/linkchecker.html)
    Mozilla/5.0 (compatible; ltbot/1.3; bot@language-tools.com)
    Mozilla/5.0 (compatible; MJ12bot/v1.4.0; http://www.majestic12.co.uk/bot.php?+)
    Mozilla/5.0 (compatible; MJ12bot/v1.4.1; http://www.majestic12.co.uk/bot.php?+)
    Mozilla/5.0 (compatible; MJ12bot/v1.4.2; http://www.majestic12.co.uk/bot.php?+)
    Mozilla/5.0 (compatible; MJ12bot/v1.4.3; http://www.majestic12.co.uk/bot.php?+)
    Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 5.1) KomodiaBot/1.0
    Mozilla/5.0 (compatible; NerdByNature.Bot; http://www.nerdbynature.net/bot)
    Mozilla/5.0 (compatible; oBot/2.3.1; +http://filterdb.iss.net/crawler/)
    Mozilla/5.0 (compatible; PaperLiBot/2.1; http://support.paper.li/entries/20023257-what-is-paper-li)
    Mozilla/5.0 (compatible; Plukkie/1.4; http://www.botje.com/plukkie.htm)
    Mozilla/5.0 (compatible; ProCogBot/1.0; +http://www.procog.com/spider.html)
    Mozilla/5.0 (compatible; Purebot/1.1; +http://www.puritysearch.net/)
    Mozilla/5.0 (compatible; Search17Bot/1.1; http://www.search17.com/bot.php)
    Mozilla/5.0 (compatible; SEOkicks-Robot +http://www.seokicks.de/robot.html)
    Mozilla/5.0 (compatible; SIBot/1.0; +http://www.profound.net/sibot.html)
    Mozilla/5.0 (compatible; SinaaBot/1.0; +http://www.marketdefender.com/bot.html)
    Mozilla/5.0 (compatible; SWEBot/1.0; +http://swebot-crawler.net)
    Mozilla/5.0 (compatible; WBSearchBot/1.1; +http://www.warebay.com/bot.html)
    Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; http://help.yahoo.com/help/us/ysearch/slurp)
    Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
    Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)
    Mozilla/5.0 (compatible; YandexFavicons/1.0; +http://yandex.com/bots)
    Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 (KHTML, like Gecko) Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
    Mozilla/5.0 (seoanalyzer; compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
    Mozilla/5.0 (Windows; U; Windows NT 5.1; en; rv:1.9.0.13) Gecko/2009073022 Firefox/3.5.2 (.NET CLR 3.5.30729) SurveyBot/2.3 (DomainTools)
    Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) Speedy Spider (http://www.entireweb.com/about/search_tech/speedy_spider/)
    Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.835.186 Safari/535.1 solfo-linkchecker/1.0 (http://solfo.com/linkbot.html)
    Mozilla/5.0+(compatible;+googlebot/2.1;++http://www.google.com/bot.html)
    msnbot/2.0b (+http://search.msn.com/msnbot.htm)
    msnbot/2.0b (+http://search.msn.com/msnbot.htm)._
    msnbot-media/1.1 (+http://search.msn.com/msnbot.htm)
    msnbot-NewsBlogs/2.0b (+http://search.msn.com/msnbot.htm)
    My Nutch Spider/Nutch-1.5
    nb-bot
    New-Sogou-Spider/1.0 (compatible; MSIE 5.5; Windows 98)
    NextGenSearchBot 1 (for information visit http://www.zoominfo.com/About/misc/NextGenSearchBot.aspx)
    OSS-bot/0.02 (see http://michaelnielsen.org/blog/oss-bot/ or contact Michael Nielsen, mn@michaelnielsen.org)
    PagePeeker.com (info: http://pagepeeker.com/robots)
    Primo Web Spider/Nutch-1.4
    SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
    SemrushBot/0.91
    SEOENGWorldBot/1.0 (+http://www.seoengine.com/seoengbot.htm)
    SeznamBot/3.0 (+http://fulltext.sblog.cz/)
    ShowyouBot (http://showyou.com/crawler)
    Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)
    spider
    SuperPagesBot/0.1
    SuperPagesUrlVerifyBot/1.0
    Toplistbot
    TurnitinBot/2.1 (http://www.turnitin.com/robot/crawlerinfo.html)
    Twitterbot/0.1
    Twitterbot/1.0
    Updownerbot (+http://www.updowner.com/bot)
    Who.is Bot
    Wotbox/2.0 (bot@wotbox.com; http://www.wotbox.com)
    Yahoo! Slurp China
    Yeti/1.0 (NHN Corp.; http://help.naver.com/robots/)
    YPBot/Raven1.1.3 (compatible; Googlebot/2.1;+http://www.yellowpages.com/about/legal/crawl)
  12. #7
  13. Here to help
    SEO Chat Adventurer (500 - 999 posts)

    Join Date
    Feb 2012
    Location
    Zebulon, North Carolina USA
    Posts
    570
    Rep Power
    406
    Originally Posted by Darrin Ward
    How do you know that your logfile analyzer is more accurate? How do you know you are really eliminating all of the bots, including those that load page elements such as images?
    Darrin, you've got a good point.

    First
    Every visit to the site is captured and entered into a database. This happens no matter where you host your system, unless you run your own server and turn off logging (which would defeat the purpose of logging).

    Second
    I can examine and analyze the data at my leisure. I can make a mistake, yes, but my data is complete. It is how I interpret the data that matters. As long as I don't make an error, my analysis is correct.

    Third
    Nothing is left out of the data stored in the logs, absolutely nothing. As for detecting all the bots, you have to know what to look for. See my previous post with the 116 bots in it. It is not a complete list, since new bots come out all the time. But when you eliminate all the known bots and look at the traffic that is left, you will find the new bots. Then you just account for them with a little tweak of the query.
    For example, here is the query I used to get the list of bots in my previous post...
    SELECT DISTINCT tbl_RawStatistics.Agent
    FROM tbl_RawStatistics
    WHERE (((tbl_RawStatistics.Agent) Like "*bot*" Or (tbl_RawStatistics.Agent) Like "*spider*" Or (tbl_RawStatistics.Agent) Like "*slurp*"));

    If I find a new bot, and its name is "Darrin" ( just kidding ) I could modify the query as follows and immediately account for the new bot....
    SELECT DISTINCT tbl_RawStatistics.Agent
    FROM tbl_RawStatistics
    WHERE (((tbl_RawStatistics.Agent) Like "*bot*" Or (tbl_RawStatistics.Agent) Like "*spider*" Or (tbl_RawStatistics.Agent) Like "*NewBotName*" Or (tbl_RawStatistics.Agent) Like "*slurp*"));

    Now that I have tweaked the query, all data is immediately corrected and updated to include the new bot. This is a retroactive adjustment: it corrects from the beginning, not just from today forward.

    The trick is to use just enough characters to wildcard match on the name. Notice I didn't search for google, msn, yahoo or baidu. Just looked for agents with bot, spider, slurp and so on.

    Edit
    These examples use Access syntax. In the real world, the wildcard character * only works inside Access on your own system; on a web server (through ADO, for instance) you have to use the character % instead. Just to clarify, because if you decided to use these as is on a server, they wouldn't work.
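
Here is the same idea with the % wildcard, as a small sketch in Python with SQLite. It is an illustration only: the `Mozilla ... Firefox/14.0` and `ExampleCrawler/1.0` agents and both function names are hypothetical, and SQLite's `LIKE` happens to ignore ASCII case by default, which other databases may not. It also shows the second half of the trick: listing the agents that did not match, which is where new bots turn up.

```python
import sqlite3

# Wildcard patterns from the queries above, written with % instead of *.
BOT_PATTERNS = ["%bot%", "%spider%", "%slurp%"]


def make_sample_db():
    """Build an in-memory table with a few agents (the last two are made up)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl_RawStatistics (Agent TEXT)")
    agents = [
        "Googlebot/2.1 (+http://www.google.com/bot.html)",
        "Baiduspider",
        "Yahoo! Slurp China",
        "Mozilla/5.0 (Windows NT 6.1; rv:14.0) Gecko/20100101 Firefox/14.0",
        "ExampleCrawler/1.0",  # hypothetical new bot the patterns do not catch yet
    ]
    conn.executemany(
        "INSERT INTO tbl_RawStatistics (Agent) VALUES (?)", [(a,) for a in agents]
    )
    return conn


def split_agents(conn, patterns=BOT_PATTERNS):
    """Return (known_bots, remaining) lists of distinct agents."""
    where = " OR ".join("Agent LIKE ?" for _ in patterns)
    bots = [row[0] for row in conn.execute(
        f"SELECT DISTINCT Agent FROM tbl_RawStatistics WHERE {where} ORDER BY Agent",
        patterns,
    )]
    remaining = [row[0] for row in conn.execute(
        f"SELECT DISTINCT Agent FROM tbl_RawStatistics WHERE NOT ({where}) ORDER BY Agent",
        patterns,
    )]
    return bots, remaining


conn = make_sample_db()
bots, remaining = split_agents(conn)
# Adding one pattern reclassifies the whole history, not just future hits.
bots_after, remaining_after = split_agents(conn, BOT_PATTERNS + ["%crawler%"])
```

Before the extra pattern, `ExampleCrawler/1.0` sits in `remaining` next to the real browser; after adding `%crawler%` it moves to the bot list for every stored hit.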
    Last edited by DMN Webmaster; Aug 10th, 2012 at 05:56 PM.
