|
||
Understanding Web Statistics

MWS Newsletter: Volume 4, Issue 1

First, a quick announcement: Margaret and I are going to be in New York for the Folio Show next week. If you'll be at the show, or if you're in the New York area and you'd like to set up a meeting with us, please give me a call or send an email.

Now on to the subject of this newsletter: Web stats. I'm going to try to keep this short and to the point. This is fundamental information that anyone who's trying to make money from Web content needs to understand, but I'm going to approach it from a slightly different angle than usual. Rather than simply defining the terms "hit", "page view", "visit", and "referrer" (which you should know, or look up now if you don't), I'm going to show you a small section of a Web server log file and explain how these stats are derived from it.

Many Web stat collection systems (such as Google Analytics and Omniture) now use JavaScript and cookies to collect visitor data, but the basic idea is the same. Every time a file is requested from a Web server, a line is written to a text file. There are several different formats for logging Web server requests, but one of the most common and most useful is the 'Combined Log Format'. Here are two example lines from a log in Combined Log Format:
If you take apart each of these lines, you'll see that they aren't that complicated, and they contain quite a bit of information. Here's a quick rundown, in order, of each important piece of data in this log file format:
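To make the rundown concrete, here is a minimal Python sketch that takes a Combined Log Format line apart into its fields, in order, and then counts hits and page views. The two sample lines are illustrative only: the IP address, timestamps, sizes, referrers, and user agent are invented, and only the file names aboutme.html and picture.jpg come from this newsletter's example. The function names, the rule that a "page" is any path ending in .html, .htm, or a slash, and the rule that a "hit" is any 2xx response are assumptions made for the sketch, not a description of how any particular stats package works.

```python
import re

# Illustrative sample lines (all values invented except the two file names).
EXAMPLE_LINES = [
    '192.0.2.10 - - [10/Oct/2008:13:55:36 -0700] "GET /aboutme.html HTTP/1.1" '
    '200 2326 "http://www.example.com/index.html" '
    '"Mozilla/5.0 (Windows; U; Windows NT 5.1)"',
    '192.0.2.10 - - [10/Oct/2008:13:55:37 -0700] "GET /picture.jpg HTTP/1.1" '
    '200 48213 "http://www.example.com/aboutme.html" '
    '"Mozilla/5.0 (Windows; U; Windows NT 5.1)"',
]

# Fields of the Combined Log Format, in the order they appear on the line.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) '            # IP address or host name of the client
    r'(?P<ident>\S+) '           # identd value, usually "-"
    r'(?P<user>\S+) '            # authenticated user name, usually "-"
    r'\[(?P<time>[^\]]+)\] '     # date, time, and time zone of the request
    r'"(?P<request>[^"]*)" '     # request line: method, path, protocol
    r'(?P<status>\d{3}) '        # HTTP status code sent to the client
    r'(?P<size>\S+) '            # bytes sent, or "-" if none
    r'"(?P<referrer>[^"]*)" '    # page that linked to this file
    r'"(?P<user_agent>[^"]*)"'   # browser identification string
)

# Assumption for this sketch: these path endings mark a Web page.
PAGE_SUFFIXES = (".html", ".htm", "/")


def parse_line(line):
    """Return a dict of the named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def request_path(entry):
    """Extract the requested path (without query string) from a parsed entry."""
    parts = entry["request"].split()
    path = parts[1] if len(parts) >= 2 else ""
    return path.split("?")[0]


def count_hits_and_page_views(lines):
    """Count successful requests (hits) and successful page requests (page views)."""
    hits = 0
    page_views = 0
    for line in lines:
        entry = parse_line(line)
        if entry is None or not entry["status"].startswith("2"):
            continue  # skip unparseable lines and unsuccessful requests
        hits += 1
        if request_path(entry).lower().endswith(PAGE_SUFFIXES):
            page_views += 1
    return hits, page_views


hits, page_views = count_hits_and_page_views(EXAMPLE_LINES)
print("hits:", hits, "page views:", page_views)
```

For the two sample lines, this reports 2 hits and 1 page view, because the page and its image are logged as separate requests.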
You can find out more about each of the parts by visiting: http://httpd.apache.org/docs/2.2/logs.html (I'm trying to keep this short, after all).

I will say this much, just because it's so important: each time any file is successfully downloaded from a Web server, it counts as a "hit". Each time a Web page is downloaded, it counts as a "page view". In the above example, the file aboutme.html contains an image (picture.jpg). The two files download separately and register two hits but only one page view. Page views are the more accurate measure of how much traffic a site gets. Hopefully you already know this.

The raw data that can be found in a single line of a log file is pretty impressive. However, even more important is what can be calculated, or at least estimated, from multiple lines of a log file. For example, by looking at all of the logged hits over a certain period of time, log file analysis software can figure out how long someone stayed on your site, which keywords are most commonly used to find your site on which search engine, or how many unique visitors your site got during a specified period of time.

All of this data is essential for knowing how your site is doing, for figuring out where to focus your efforts, and for comparing it with other sites. But wouldn't it be nice if there were a number that would give you a handle on how people feel about your content, or how "engaged" they are with it? Yes, of course. Is it possible? Sort of.

Defining and measuring "engagement" will be the topic of next month's newsletter. In the meantime, please contact me if you have any questions.

Thanks,
Chris Minnick

------

If you would like to suggest a topic for a future newsletter issue, please send mail to newsletter@minnickweb.com.

In the meantime: For more information about how Minnick Web Services can help you achieve your goals, please visit our web site, www.minnickweb.com, or contact us.
To find out more about our digital proofing, publishing, and reporting technologies, visit the eBookHost demo at demo.ebookhost.com. ------ Complete Archives and Subscription Information ------ |
||