Using Extended Common Log Format

Mark Avnet (mavnet@banta-im.com)
Wed, 30 Jun 1999 12:44:29 -0400


Hi.  I am modifying wwwstat 2.0 to accept the extended common log
format.  To the original fields I am adding referrer, browser, and
platform, and right now I am working on the referrer.  I have added
everything I believe is necessary to the code, but I think there is a
problem in the way I am parsing the log file, because I am now getting
0s for all of the entries.  The original line of code for common
logs is:

($host, $rfc931, $authuser, $timestamp, $request, $status, $bytes) =
            /^(\S+) (\S+) (\S+) \[([^\]]*)\] \"([^"]*)\" (\S+) (\S+)/;

My new line for parsing the extended log format is:

     ($host, $rfc931, $authuser, $timestamp, $request, $status, $bytes,
      $ref, $null1, $null2, $browser, $platform) =
            /^(\S+) (\S+) (\S+) \[([^\]]*)\] \"([^"]*)\" (\S+) (\S+)
\"([^"]*)\ (\S+) (\S+) (\S+) (\S+)/;

An example of the logs that I am parsing (a single line in the log
file, wrapped here by the mailer) is:

1Cust105.tnt17.dfw5.da.uu.net - - [20/Jun/1999:11:31:11 -0400] "GET
/home.htm HTTP/1.1" 200 3455 "http://www.dickinson.com/" "Mozilla/4.0
(compatible; MSIE 4.01; Windows 98)"

There is doubtless a mistake in how I have done this.  Any ideas as to
what the problem could be?
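
In case it helps to reproduce this outside wwwstat, here is a minimal
standalone sketch that runs a candidate pattern against the example
line above.  It is only a guess at something workable, not what wwwstat
does: it assumes the referrer and the user agent each arrive as one
double-quoted field, and the $agent variable and the rough split into
$browser and $platform are my own inventions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The example log entry from above, rebuilt as one unwrapped line.
my $line = '1Cust105.tnt17.dfw5.da.uu.net - - [20/Jun/1999:11:31:11 -0400] '
         . '"GET /home.htm HTTP/1.1" 200 3455 "http://www.dickinson.com/" '
         . '"Mozilla/4.0 (compatible; MSIE 4.01; Windows 98)"';

# Same seven common-log fields as before, followed by two quoted fields.
# Assumption: the user agent contains spaces, so it is captured as one
# quoted string rather than as several \S+ tokens.
my ($host, $rfc931, $authuser, $timestamp, $request, $status, $bytes,
    $ref, $agent) =
    $line =~ /^(\S+) (\S+) (\S+) \[([^\]]*)\] "([^"]*)" (\S+) (\S+) "([^"]*)" "([^"]*)"/;

die "pattern did not match\n" unless defined $host;

# Rough, hypothetical split of the agent string: the browser is the first
# token, the platform is the last ';'-separated item inside the parentheses.
my ($browser)  = $agent =~ /^(\S+)/;
my ($platform) = $agent =~ /\(.*;\s*([^;)]+)\)/;
$platform = '-' unless defined $platform;

print "referrer: $ref\n";
print "browser:  $browser\n";
print "platform: $platform\n";
```

Running this prints the three extra fields for the one sample entry;
agent strings of a different shape would need their own handling.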

Thanks a lot.

       Mark S. Avnet
       Banta Integrated Media
       Cambridge, MA