Webalizer HOWTO

Updated 05/20/03

Webalizer is a tool for creating nice graphs of web site usage. It's not too hard to set up, and this document shows how CougaarForge uses it to create usage graphs for each virtual host, plus a combined usage page.

First, when I add a new project to CougaarForge, I create a new virtual host entry in httpd.conf with a log file for that host:

<VirtualHost *>
 ServerName csmart.cougaar.org
 DocumentRoot /var/www/gforge-3.0/csmart
 CustomLog /var/log/httpd/csmart-access_log common
 ErrorLog /var/log/httpd/csmart-error_log
</VirtualHost>
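
Since every project's entry differs only by name, the stanza can be generated rather than typed. This is a minimal sketch, not part of Apache or GForge: vhost_entry is a hypothetical helper, and it assumes every project follows the same naming pattern as the csmart entry above.

```shell
#!/bin/bash
# Sketch: print a VirtualHost stanza for one project.
# vhost_entry is a hypothetical helper; the paths and the cougaar.org
# domain are copied from the csmart example and assumed to hold for
# every project.
vhost_entry() {
    local project="$1"
    cat <<EOF
<VirtualHost *>
 ServerName ${project}.cougaar.org
 DocumentRoot /var/www/gforge-3.0/${project}
 CustomLog /var/log/httpd/${project}-access_log common
 ErrorLog /var/log/httpd/${project}-error_log
</VirtualHost>
EOF
}

# Print the stanza for review; it still has to be added to httpd.conf.
vhost_entry csmart
```

The function only prints to standard output, so nothing changes until you paste the result into httpd.conf and restart Apache.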

Then I create a $project/usage/ directory and put a placeholder page in there, like this:

# mkdir "/var/www/gforge-3.0/$project/usage/"
# cp /var/www/gforge-3.0/www/usage/msfree.png "/var/www/gforge-3.0/$project/usage/"
# cp /root/webalizer/index.html "/var/www/gforge-3.0/$project/usage/"

Note that I also copied the msfree.png file in there to prevent future broken links.

Next, I create a webalizer.conf file specifically for this host. This can usually be done by running another site's webalizer.conf through sed:

# sed -e "s/project_one/project_two/g" /etc/webalizer/project_one-webalizer.conf > /etc/webalizer/project_two-webalizer.conf
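
The sed trick works because the project name shows up in every project-specific value in the conf file. The sketch below runs the same substitution on a small stand-in conf in a temporary directory. The conf contents are an assumption for illustration (the real CougaarForge files aren't shown here and may set more options), though LogFile, OutputDir, HostName and Incremental are all standard Webalizer directives.

```shell
#!/bin/bash
# Sketch: apply the sed substitution to a minimal, assumed conf file.
confdir=$(mktemp -d)   # stand-in for /etc/webalizer

# Hypothetical contents; the real project_one conf may differ.
cat > "$confdir/project_one-webalizer.conf" <<'EOF'
LogFile     /var/log/httpd/project_one-access_log
OutputDir   /var/www/gforge-3.0/project_one/usage
HostName    project_one.cougaar.org
Incremental yes
EOF

# Same substitution as above, pointed at the temporary directory.
sed -e "s/project_one/project_two/g" \
    "$confdir/project_one-webalizer.conf" > "$confdir/project_two-webalizer.conf"

# Keep the result for inspection, then clean up.
new_conf=$(cat "$confdir/project_two-webalizer.conf")
rm -r "$confdir"
printf '%s\n' "$new_conf"
```

One thing to watch: the substitution is a plain text match, so if one project's name is contained in some other value in the file, that value gets rewritten too. A quick look at the generated conf catches this.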

I've got a shell script that runs every night and feeds each of those log files to Webalizer, like this:

#!/bin/bash

# run webalizer on cougaar.org first
/usr/bin/webalizer -c /etc/webalizer/cougaarforge-webalizer.conf

# run webalizer on all the logs after combining and sorting them
cat /var/log/httpd/*access_log > /var/log/httpd/all_access_log_unsorted
# sort on field 4, the [dd/Mon/yyyy:hh:mm:ss timestamp:
# year, month, day, hour, minute, second
sort -t ' ' -k 4.9,4.12n -k 4.5,4.7M -k 4.2,4.3n -k 4.14,4.15n -k 4.17,4.18n -k 4.20,4.21n \
    /var/log/httpd/all_access_log_unsorted > /var/log/httpd/all_access_log
/usr/bin/webalizer -c /etc/webalizer/cougaarforge-all-webalizer.conf
rm /var/log/httpd/all_access_log
rm /var/log/httpd/all_access_log_unsorted

for project in `ls /var/www/gforge-3.0/ | grep -v prototype | grep -v www | grep -v common`
do
        # create the usage directory if it's not there
        if [ ! -e "/var/www/gforge-3.0/$project/usage/" ]; then
                mkdir "/var/www/gforge-3.0/$project/usage/"
                cp /var/www/gforge-3.0/www/usage/msfree.png "/var/www/gforge-3.0/$project/usage/"
                cp /root/webalizer/index.html "/var/www/gforge-3.0/$project/usage/"
        fi

        # run webalizer on this project
        /usr/bin/webalizer -c "/etc/webalizer/$project-webalizer.conf"
done

Nifty, huh? So now you can get to each project's usage page with a URL like http://tutorials.cougaar.org/usage/, and there's a combined page, too!
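
The sort keys in the nightly script are the least obvious part, so it's worth trying them on a few lines before trusting them with real logs. Here's a small sketch that does that. The log lines are invented sample data, and LC_ALL=C is my addition so that the month key (M) reads English month abbreviations regardless of the system locale; it assumes a sort that supports -M, such as GNU sort.

```shell
#!/bin/bash
# Sketch: check the chronological sort on a few made-up log lines.
sample=$(mktemp)

# Invented sample data in common log format, deliberately out of order.
cat > "$sample" <<'EOF'
host1 - - [21/May/2003:08:00:00 -0400] "GET /a HTTP/1.0" 200 100
host2 - - [20/May/2003:23:59:59 -0400] "GET /b HTTP/1.0" 200 100
host3 - - [20/Feb/2003:10:00:00 -0400] "GET /c HTTP/1.0" 200 100
host4 - - [01/Jan/2004:00:00:00 -0400] "GET /d HTTP/1.0" 200 100
host5 - - [20/May/2003:09:00:00 -0400] "GET /e HTTP/1.0" 200 100
EOF

# Same keys as the nightly script: field 4 is "[dd/Mon/yyyy:hh:mm:ss",
# so characters 9-12 are the year, 5-7 the month, 2-3 the day, and so on.
sorted=$(LC_ALL=C sort -t ' ' -k 4.9,4.12n -k 4.5,4.7M -k 4.2,4.3n \
    -k 4.14,4.15n -k 4.17,4.18n -k 4.20,4.21n "$sample")
rm "$sample"

printf '%s\n' "$sorted"
```

The lines should come out oldest first: host3, host5, host2, host1, host4. A plain alphabetical sort would put the January 2004 line first, which is why the script spells out each piece of the timestamp as its own key.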

I've probably forgotten some important details; if you can think of a way to improve this document, please post to the GForge forums.