Who's Visiting You?
First of all, a big thanks to everyone who gave such great feedback on last month's tech feature, "Creating Your First Website".
Now that you have all created web masterpieces, I thought we would take a look at tracking visitors to your web pages.
The very first question I usually ask customers when they are discussing their online needs is 'Why?' As in, what are your central reasons for putting a website onto the internet? It wasn't all that long ago that a very common answer to that question was 'Well, because everyone else is and I need one'. From there we'd go right back to the beginning and talk about online strategies. Over the last few years, users have become much more savvy. They sign up for web hosting with a complete online business plan already in hand, a clear set of objectives (in terms of delivering traffic to their sites) and a path to achieving those results.
One of the most important tools that a site owner relies on is website statistics. These can show trends in user behaviour, for example after you've made changes to your site, or the effectiveness of marketing campaigns, be it placing an advertisement in a magazine or managing a pay-per-click online strategy. This tech feature will show you how this information is collected, how to access it, and what options you have for working with the data.
Your website at NetRegistry resides on a machine called a webserver. A webserver works by responding to requests for information. A request is generated whenever a user's browser asks for something on your website, such as a page or an image. Each time a request is made to the webserver, it records some essential information about that request in a 'logfile'.
An example of a log file entry is shown below:
10.1.1.56 - - [31/Jul/2007:00:30:54 +1000] "GET /index.php HTTP/1.0" 200 18672 "http://test.com/index.php" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1) Gecko/20061010 Firefox/2.0" "-"
This example shows the IP address the page request came from, a timestamp for when the event occurred, the page that was requested, a status for the request (200 = successful), the size of the response in bytes (18672), the referring page, and as much information as is available about the machine making the request (the operating system and the web browser software used).
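To make the fields concrete, here is a minimal Python sketch that splits the example entry into its parts. It assumes the log follows the common Apache-style "combined" layout that the example appears to use; the pattern and the function name parse_line are illustrative only and are not part of any NetRegistry tool.

```python
import re
from datetime import datetime

# Pattern for the Apache-style "combined" layout seen in the example entry.
# Anything after the user agent (the final "-" in the example) is ignored.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    return entry


example = (
    '10.1.1.56 - - [31/Jul/2007:00:30:54 +1000] "GET /index.php HTTP/1.0" '
    '200 18672 "http://test.com/index.php" '
    '"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1) '
    'Gecko/20061010 Firefox/2.0" "-"'
)

entry = parse_line(example)
print(entry["ip"], entry["path"], entry["status"], entry["size"])
# prints: 10.1.1.56 /index.php 200 18672
```

Lines that do not fit the pattern come back as None, so a real script would want to count and inspect those rather than silently drop them.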
As mentioned above, every single request is simply added as a new line in the log. Now, if you have a busy site that gets, say, 100,000 hits a month, what you'll have at the end of the month is a very long log file full of raw data that is difficult to extract useful information from. Fortunately, there is clever software available which can take your log file, analyze all the data in it, and produce a nice graphical summary showing you a great deal of useful information. This type of software is very creatively called a 'logfile analyzer'.
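To give a feel for what a logfile analyzer does under the hood, here is a very small Python sketch that tallies requests per page, per status code, and total bytes sent. The sample log lines are made up for illustration, and the real packages do far more than this (graphs, referrer reports, browser breakdowns and so on).

```python
import re
from collections import Counter

# Picks the request path, status code and response size out of one log line.
REQUEST = re.compile(
    r'"(?:GET|POST|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def summarise(lines):
    """Tally hits per page, hits per status code, and total bytes sent."""
    pages = Counter()
    statuses = Counter()
    total_bytes = 0
    for line in lines:
        match = REQUEST.search(line)
        if match is None:
            continue  # skip lines that do not look like a request
        pages[match.group("path")] += 1
        statuses[match.group("status")] += 1
        if match.group("size") != "-":
            total_bytes += int(match.group("size"))
    return pages, statuses, total_bytes


# Hypothetical sample data in the same layout as the example entry.
sample_log = [
    '10.1.1.56 - - [31/Jul/2007:00:30:54 +1000] "GET /index.php HTTP/1.0" 200 18672 "-" "Firefox/2.0"',
    '10.1.1.56 - - [31/Jul/2007:00:30:55 +1000] "GET /logo.png HTTP/1.0" 200 4096 "-" "Firefox/2.0"',
    '10.1.1.80 - - [31/Jul/2007:09:12:01 +1000] "GET /index.php HTTP/1.0" 200 18672 "-" "Firefox/2.0"',
    '10.1.1.80 - - [31/Jul/2007:09:12:30 +1000] "GET /missing.html HTTP/1.0" 404 512 "-" "Firefox/2.0"',
]

pages, statuses, total_bytes = summarise(sample_log)
for path, hits in pages.most_common():
    print(f"{hits:>4}  {path}")
print("status codes:", dict(statuses))
print("bytes sent:", total_bytes)
```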
There is a range of logfile analyzers available, from free open source solutions right through to very expensive enterprise solutions.
How does all this relate to the services that NetRegistry offers?
Well, every single web hosting package that NetRegistry offers to clients comes with free 'stats packages' which are simply logfile analyzers (as described above). We currently offer:
- Webalyzer
- Funnelweb
- Analog
- AWStats
All four are provided at no additional cost. If you want something a bit more powerful, there is always Urchin, which is provided at a small additional cost.
Many of our users like to take a look around the four free options and the one paid option before making a decision. We've made this process very simple with the launch of stats.au.com. This site has web stats data for each of the packages, and users are welcome to navigate through each application's output to decide which information best suits their requirements. The exact pages are:
- http://stats.au.com/webalyzer
- http://stats.au.com/funnelweb
- http://stats.au.com/analog
- http://stats.au.com/awstats
- http://stats.au.com/urchin
Unfortunately, I'm going to have to add a disclaimer to this month's article. So here it is: DISCLAIMER!! The most common question I get asked in relation to website statistics is:
How can I find out how many people visited my site?
Sounds simple, doesn't it? A single number that sums up all the visits to your site. Hang on, you may be asking yourself, is that total visits or unique visits? Hmmm, maybe it's not quite so simple after all.
There are a number of reasons why it's not simple. To understand them, you will need a basic understanding of how a web page works. So let's take a simple example. You have built a single page of text that also has a single image on it. I'm going to visit your site. I land on the single page and my web browser loads the text, which it does by submitting a single request to the web server. As the page is downloading, the web browser sees that the page has an image on it, so it sends another request to the server, which in turn sends back the image. I read your page, then decide to go off and visit another site.
The two requests that I've made to the web server each generate a line in the log file. As I've shown before, other than listing 'my' IP address, the log doesn't store (or otherwise identify) any personal information about me.
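The single visit described above can be sketched in Python to show why 'hits', 'page views' and 'visitors' are three different numbers. The list of file extensions treated as supporting files is an assumption made for this example, as are the file names; each stats package has its own rules for what counts as a page.

```python
import os
from urllib.parse import urlparse

# Assumption: requests for these file types support a page rather than being one.
SUPPORTING = {".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico"}


def is_page(path):
    """Treat a request as a page view unless it is for a supporting file."""
    extension = os.path.splitext(urlparse(path).path)[1].lower()
    return extension not in SUPPORTING


# The one visit from the example: one page of text plus one image.
requests = [
    ("10.1.1.56", "/index.html"),
    ("10.1.1.56", "/photo.jpg"),
]

hits = len(requests)
page_views = sum(1 for _, path in requests if is_page(path))
unique_ips = len({ip for ip, _ in requests})

print(hits, page_views, unique_ips)  # prints: 2 1 1
```

One person reading one page produced two hits, which is why raw hit counts always look more impressive than visitor counts.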
The complexity in this example lies in what are termed 'caches'. There are two types of these. Firstly, there is one on your computer. It keeps local copies of files that you frequently request, which makes pages load more quickly for the internet user. At a larger scale, your ISP also uses caching, both to speed up page loading and to lower its data carriage costs. So in the above example, what may happen is that the text is loaded from your ISP's cache and the image comes out of the cache on your PC. No requests get through to the server at all, and no lines are added to the log file.
Some studies have shown that for popular sites, up to 50% of requests are served from a cache. This means the 'visitors' figure shown in a web stats program is somewhat meaningless in absolute terms. Of course, using the information comparatively to show trends is highly valuable.
In summary, there is much that you can't accurately know about visitors to your site. Many analytics programs make some assumptions or estimations to try to provide a 'best guess', but you as a savvy site manager need to take the numbers provided with a grain of salt.
Finally, two very common questions that I'll address are:
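As an illustration of the kind of 'best guess' involved, the Python sketch below estimates visits by treating requests from the same IP address as one visit until there is a gap of more than 30 minutes. Both the 30-minute timeout and the use of the IP address alone to identify a visitor are assumptions made for this example. They are not necessarily what any particular stats package does, and neither is reliable when many users share one IP address or sit behind a cache.

```python
from datetime import datetime, timedelta

# Assumption: a gap longer than this between requests starts a new visit.
VISIT_TIMEOUT = timedelta(minutes=30)


def count_visits(requests):
    """Estimate visits from (ip, timestamp) pairs using an inactivity timeout."""
    last_seen = {}
    visits = 0
    for ip, when in sorted(requests, key=lambda request: request[1]):
        previous = last_seen.get(ip)
        if previous is None or when - previous > VISIT_TIMEOUT:
            visits += 1
        last_seen[ip] = when
    return visits


# Hypothetical sample data: one address returns later the same day.
requests = [
    ("10.1.1.56", datetime(2007, 7, 31, 0, 30, 54)),
    ("10.1.1.56", datetime(2007, 7, 31, 0, 30, 55)),
    ("10.1.1.56", datetime(2007, 7, 31, 14, 5, 0)),
    ("10.1.1.80", datetime(2007, 7, 31, 9, 12, 1)),
]

total_visits = count_visits(requests)
unique_ips = len({ip for ip, _ in requests})
print(total_visits, unique_ips)  # prints: 3 2
```

Changing the timeout changes the answer, which is one reason two stats packages can report different visit counts from the same log file.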
- "How can I see the web stats for my site?" and
- "Can I download my raw server logfiles?"
The answer to the first question is very simple. Just visit:
http://theconsole.netregistry.net, then click the 'Web Hosting' icon. At the bottom of that screen is the 'Web Server Statistics' section. There is a link to take you to the actual web stats, as well as some administrative tools that let you change your analytics tool. Please note that if you change the tool, the change doesn't take effect until the next time we run the analytics software, which is each night. Also, if you change the package, it only starts from the current log file. That is, it doesn't regenerate all of your historical statistics. So exercise caution in making changes.
The answer to the second question is yes. The raw log files are available for you to download. NetRegistry policy, however, is that we don't archive the raw access logfiles for more than 60 days. The reason is that if we were to keep all logs forever, the data storage requirements would simply be too enormous to contemplate. For clients who wish to run their own analytics tools, we strongly recommend that you download the log file for the previous month within the first week or two of the following month. To download the raw log files, you need to use an FTP program and access them via:
stats.netregistry.net
The username and password required for this system are the same as the general FTP details that you use to access your website's files on the web servers. The logs are archived in neat, monthly compressed files in /logs. The compression method we use is bzip2, for which there are many free tools to uncompress the file.
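If you would rather script the download than use an FTP program, the steps above can be sketched in Python using its built-in ftplib and bz2 modules. The host and the /logs directory come from this article; the function names, the remote file name you pass in, and the demo file are placeholders for illustration, so check the actual file names in /logs before relying on this.

```python
import bz2
import os
import tempfile
from ftplib import FTP


def download_log(username, password, remote_name, local_path):
    """Fetch one compressed log file from /logs on the stats FTP server."""
    with FTP("stats.netregistry.net") as ftp:
        ftp.login(user=username, passwd=password)
        ftp.cwd("/logs")
        with open(local_path, "wb") as out:
            ftp.retrbinary("RETR " + remote_name, out.write)


def read_log_lines(local_path):
    """Yield the decompressed lines of a bzip2-compressed log file."""
    with bz2.open(local_path, "rt", encoding="utf-8", errors="replace") as log:
        for line in log:
            yield line.rstrip("\n")


# Offline demo: build a small bzip2 file locally instead of contacting the server.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.log.bz2")
with bz2.open(demo_path, "wt", encoding="utf-8") as demo:
    demo.write('10.1.1.56 - - [31/Jul/2007:00:30:54 +1000] '
               '"GET /index.php HTTP/1.0" 200 18672\n')

lines = list(read_log_lines(demo_path))
print(len(lines), "line(s) read")
```

In real use you would call download_log with your own FTP details and a file name from /logs, then pass the saved file to read_log_lines.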
And that brings us to the end of another action-packed and exciting technical feature. I'm happy to answer any questions readers may have via email. My email address is tina.tyler@netregistry.com.au. Finally, if there is a tech feature that you'd really like to see, I'm more than happy to take requests!