Using Extended Common Log Format
Mark Avnet (mavnet@banta-im.com)
Wed, 30 Jun 1999 12:44:29 -0400
Hi. I am in the process of modifying wwwstat 2.0 to accept extended
common log format. To the original information, I am adding referrer,
browser, and platform. Right now, I am trying to get referrer on
there. I have added everything necessary to the code, but I think there
is a problem in the way I am parsing the log file, as I am now getting
0s for all of the entries. Whereas the original line of code for common
logs is:
($host, $rfc931, $authuser, $timestamp, $request, $status, $bytes) =
/^(\S+) (\S+) (\S+) \[([^\]]*)\] \"([^"]*)\" (\S+) (\S+)/;
my new line for parsing the extended log format is:
($host, $rfc931, $authuser, $timestamp, $request, $status, $bytes,
$ref,
$null1, $null2, $browser, $platform) =
/^(\S+) (\S+) (\S+) \[([^\]]*)\] \"([^"]*)\" (\S+) (\S+)
\"([^"]*)\ (\S+) (\S+) (\S+) (\S+)/;
An example of the logs that I am parsing is:
1Cust105.tnt17.dfw5.da.uu.net - - [20/Jun/1999:11:31:11 -0400] "GET /home.htm HTTP/1.1" 200 3455 "http://www.dickinson.com/" "Mozilla/4.0 (compatible; MSIE 4.01; Windows 98)"
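A standalone check along the following lines, run against the sample entry, would show whether a given pattern matches at all and what each group captures. This is only a minimal sketch, not wwwstat code. It assumes the referrer and the user agent are each a single double-quoted field, as in the sample, and $line and $agent are names introduced here for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The sample entry from this post, written as one log line.
my $line = '1Cust105.tnt17.dfw5.da.uu.net - - [20/Jun/1999:11:31:11 -0400] '
         . '"GET /home.htm HTTP/1.1" 200 3455 '
         . '"http://www.dickinson.com/" '
         . '"Mozilla/4.0 (compatible; MSIE 4.01; Windows 98)"';

# Assumption: referrer and user agent are each one quoted field, so each
# gets a "([^"]*)" group instead of several (\S+) groups.
my ($host, $rfc931, $authuser, $timestamp, $request, $status, $bytes,
    $ref, $agent) =
    $line =~ /^(\S+) (\S+) (\S+) \[([^\]]*)\] "([^"]*)" (\S+) (\S+) "([^"]*)" "([^"]*)"/;

if (defined $host) {
    # The pattern matched; show the two new fields.
    print "referrer: $ref\n";
    print "agent:    $agent\n";
} else {
    # A failed match leaves every variable undefined.
    print "no match\n";
}
```

Browser and platform could then be pulled out of $agent in a second step, since the user agent string contains spaces and so cannot be split reliably with (\S+) groups.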
There is doubtless a mistake in how I have done this. Any ideas as to
what the problem could be?
Thanks a lot.
Mark S. Avnet
Banta Integrated Media
Cambridge, MA