One hosting service I use serves several domains. To find out which of these web sites was the most
trafficked, I needed to analyze the logs a bit.
There are several Apache log analysis solutions out there, but frankly I just needed some basic information, and didn't
want software bloated with a lot of features, 99% of which I didn't need.
Perl came to the rescue, and in some 15 minutes I wrote a working script.
What I needed to know
Basically, I wanted to know the number of hits and the bytes transferred for each of the web sites, in order
to make a ranking.
The Apache log format is pretty much a standard by now, and a line looks something like this:
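(The line below is the Common Log Format example from the Apache documentation, not one of my real entries.)

    127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326

The two fields at the end are the HTTP status code and the number of bytes sent, and the latter is what we are after.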
Since the Apache is shared, my log also has an extra field at the beginning, containing the domain name of each virtual host:
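Something like this, with a made-up domain in front of the same example line:

    www.example.com 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326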
The script
The script is meant to be used by feeding it a log on its standard input, for instance:
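Assuming the script is saved as logstats.pl (both the file name and the log path here are just examples):

    perl logstats.pl < /var/log/apache2/access.log

or, for a bunch of rotated and compressed logs:

    zcat access.log.*.gz | perl logstats.pl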
Here goes the script, with some comments where needed.
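The listing below is a minimal sketch of it: it assumes the virtual-host-prefixed format shown above, so the regular expressions may need tweaking for a different LogFormat.

    #!/usr/bin/perl
    use strict;
    use warnings;

    use Number::Bytes::Human qw(format_bytes);
    use Text::Table;

    my (%hits, %bytes);

    while ( my $line = <STDIN> ) {

        # The first field is the domain name added by the shared setup
        my ($domain) = $line =~ /^(\S+)/;

        # The size sent follows the quoted request and the status code;
        # Apache logs a '-' when no bytes were sent
        my ($size) = $line =~ /"\s+\d{3}\s+(\d+|-)/;

        next unless defined $domain and defined $size;

        $hits{$domain}++;
        $bytes{$domain} += $size eq '-' ? 0 : $size;
    }

    # Rank the web sites by transferred bytes, most trafficked first
    my $table = Text::Table->new( 'Domain', 'Hits', 'Bytes' );
    for my $domain ( sort { $bytes{$b} <=> $bytes{$a} } keys %bytes ) {
        $table->add( $domain, $hits{$domain}, format_bytes( $bytes{$domain} ) );
    }
    print $table;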
We use a couple of nice Perl modules here: Number::Bytes::Human, used to automatically convert byte counts
into something easier to read, and especially Text::Table, which produces a nicely formatted table on
the console.
Here’s the output:
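With made-up domains and numbers, just to show the shape of the table, it is something like:

    Domain            Hits Bytes
    files.example.net 9123 2.3G
    www.example.com   5421 1.1G
    blog.example.org   980 310M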
Text::Table also supports borders for cells and other niceties, which I didn’t use.
This quick and dirty solution is very easy to adapt and improve, for example to show the IP addresses that hit the server
the most, grouped by web site.
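As a sketch of that idea (under the same assumptions about the log format), it is enough to add an %ip_hits hash next to %hits and %bytes and collect the client address, which is the second field of each line:

    # Declared together with %hits and %bytes
    my %ip_hits;

    # Inside the while loop, after $domain has been extracted:
    # the client IP is the second field of the virtual-host-prefixed line
    my ($ip) = $line =~ /^\S+\s+(\S+)/;
    $ip_hits{$domain}{$ip}++ if defined $ip;

    # After the loop: for each web site, print the client that hit it most
    for my $domain ( sort keys %ip_hits ) {
        my ($top) = sort { $ip_hits{$domain}{$b} <=> $ip_hits{$domain}{$a} }
                    keys %{ $ip_hits{$domain} };
        print "$domain: $top ($ip_hits{$domain}{$top} hits)\n";
    }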