This is a cross-posting of an article I wrote over at Technology Report about internet marketing:
One of the cornerstones of good internet marketing is knowing your statistics, and you’d think that with all the elaborate, inexpensive, and free measurement and analytics tools, everybody would have a great sense of how their site stacks up against the competition.
But you’d be wrong.
In fact, even many large companies struggle with high-quality analysis, even as the tools get better and the measures s-l-o-w-l-y reach some level of standardization. For most small companies, metrics are more misses than “hits”. Webmasters routinely misinterpret or misrepresent website “hits” as viable traffic, when hits are often simply a count of the total files downloaded from the site. A graphics- or data-intensive website can register hundreds of hits from a single visitor.
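The hits-versus-visitors gap is easy to see in miniature. Here is a small Python sketch using made-up, simplified log lines (real server logs carry more fields, and real visitor counting uses cookies or sessions rather than bare IPs): every page view triggers a request for the HTML page plus each image and stylesheet it loads, so hits pile up far faster than visitors.

```python
# Made-up, simplified access-log lines: client IP, method, requested file.
# Each visitor's single page view generates one hit per file loaded.
sample_log = """\
203.0.113.7 GET /index.html
203.0.113.7 GET /logo.png
203.0.113.7 GET /style.css
203.0.113.7 GET /photo1.jpg
203.0.113.7 GET /photo2.jpg
198.51.100.4 GET /index.html
198.51.100.4 GET /logo.png
198.51.100.4 GET /style.css
"""

lines = sample_log.strip().splitlines()
hits = len(lines)                                    # every file request is a "hit"
visitors = len({line.split()[0] for line in lines})  # unique client IPs (a crude proxy)

print(f"{hits} hits, but only {visitors} visitors")  # prints: 8 hits, but only 2 visitors
```

Two visitors produce eight hits here; on a page with dozens of images, the multiplier is correspondingly larger, which is why reporting hits as traffic flatters the numbers so badly.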
Even when the analysis is good, the reporting is often opportunistic or manipulative, and it is frequently done by the same team that is accountable for the results. This is a common problem throughout the business metrics field. Executives are well advised to have business-critical measurements independently audited by unbiased parties.
Consider learning and using analysis packages like Google Analytics – a brilliantly robust and free tool provided by Google to anyone.
A while back Peter Norvig, one of the top search experts at Google (and a leading world authority on Artificial Intelligence), published a little study indicating how unreliable the Alexa metrics were with regard to website traffic. (Thanks to Matt Cutts for pointing out Peter’s paper.)
The results there demonstrate that Alexa was off by a factor of 50x (i.e., an error of roughly five thousand percent!) when comparing Matt Cutts’ and Peter’s site traffic.
Although this is just an anecdotal snapshot of the problem, and perhaps Alexa is better now, I’ve also noted many discrepancies when comparing Alexa’s numbers to sites where I knew the real traffic. 50x is a spectacular level of error for sites read mostly by technology-sector folks. It suggests that Alexa is a questionable comparison tool unless there is abundant other data to support the comparison, in which case you probably don’t need Alexa anyway.
Of course, the very expensive statistics services don’t fare all that well either. A larger and excellent comparison study by Rand Fishkin over at SEOMOZ collected data from several prominent technology sites, including Matt Cutts’ blog, and concluded that no external metrics were reasonably in line with the actual log files. Rand notes that he examined only about 25 blogs, so the sample was somewhat small and targeted, but he concludes:
“Based on the evidence we’ve gathered here, it’s safe to say that no external metric, traffic prediction service or ranking system available on the web today provides any accuracy when compared with real numbers.”
It’s interesting how problematic it has been to accurately compare what is arguably the most important aspect of internet traffic – simple site visits and pageviews. Hopefully, as data becomes more widely circulated and more studies like these are done, we’ll be able to build tools that allow quick comparisons. Google Analytics is coming into widespread use, but Fishkin told me at a conference that even that “internal metrics” tool seemed to have several problems when compared with the log files he reviewed. My own experience with Analytics has not been extensive, but the data seems to line up with my log stats, and I’d continue to recommend this excellent analytics package.