Visitors/visits vs Requests (hits)

David Welton (davidw@efn.org)
Fri, 18 Jul 1997 14:54:52 -0700 (PDT)


Hello,

I am going to be developing a web statistics package for my employer,
CKS|Partners, that can be given to clients as a standard statistical
package for their web sites.  I would prefer to use free software and
contribute my modifications back to the community, and it seems as if it
will be ok with CKS.  Yay!  It looks as if wwwstat is the best choice, as
I know Perl better than C, which analog is written in (analog is, however,
*very* fast).

To avoid duplication of effort, I have a few questions regarding current
development of this program.

The big advantage that the nasty proprietary software we are currently
using (hitlist for windoze) has is its ability to count visitors and
visits.  Visitors is pretty simple - the total number of distinct IP
addresses.  Visits is not so simple - "a visit is a collection of requests
that represent all the pages and graphics seen by a particular visitor at
one time".  I.e., they are using some kind of timeout - if no more requests
from a particular user are received within, say, 15 minutes, then that
'visit' is over.  This shouldn't be a tremendous amount of work - I think
a hash keyed by IP address, holding the time of that address's last
request, will work.  Optimization might be trickier (but
the yucky windoze program takes more than an *hour*, so anything under
that is good - currently wwwstat takes about 10 minutes for our logs, and
analog takes around 27 *seconds*).
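
A minimal sketch of the timeout idea, written in Python rather than Perl
purely for illustration.  Everything in it is an assumption and not part of
wwwstat or hitlist: the function name, the (ip, time) input format, and the
15-minute cutoff taken from the "say 15 minutes" above.  It expects requests
in chronological order, as they appear in a server log, and keeps one hash
(dict) entry per IP address holding the time of its last request.

```python
from datetime import datetime, timedelta

# Assumed cutoff; the post only says "say 15 minutes".
VISIT_TIMEOUT = timedelta(minutes=15)


def count_visitors_and_visits(requests, timeout=VISIT_TIMEOUT):
    """Count distinct IP addresses and timeout-delimited visits.

    `requests` is an iterable of (ip, datetime) pairs in chronological
    order (a hypothetical, already-parsed form of the log).
    """
    last_seen = {}  # ip -> time of that address's most recent request
    visits = 0
    for ip, when in requests:
        previous = last_seen.get(ip)
        # A new visit starts on the first request from an address, or
        # when the gap since its previous request exceeds the timeout.
        if previous is None or when - previous > timeout:
            visits += 1
        last_seen[ip] = when
    return len(last_seen), visits


sample = [
    ("10.0.0.1", datetime(1997, 7, 18, 14, 0)),
    ("10.0.0.2", datetime(1997, 7, 18, 14, 5)),
    ("10.0.0.1", datetime(1997, 7, 18, 14, 10)),  # same visit (10 min gap)
    ("10.0.0.1", datetime(1997, 7, 18, 14, 40)),  # new visit (30 min gap)
]
visitors, visits = count_visitors_and_visits(sample)
print(visitors, visits)  # prints "2 3"
```

This is a single pass over the log with one dictionary lookup per request,
so the extra cost over plain hit counting should be small.  Note that
counting by IP address treats everyone behind one proxy as a single visitor.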

Other issues, of lesser importance, include some sort of database or text
file for storing old information so that old logs can be dispensed with,
tracking how many pages each visit includes, which pages are the final
ones seen during a visit, and similar things...
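
As a rough illustration of the per-visit tracking mentioned above, here is a
small Python sketch that records how many requests each visit contains and
which path was requested last in each visit.  The function name and the
(ip, time, path) input format are hypothetical, the 15-minute timeout is an
assumed value, and for simplicity every request is counted as a page; a real
version would presumably skip graphics.

```python
from collections import Counter
from datetime import datetime, timedelta


def summarize_visits(requests, timeout=timedelta(minutes=15)):
    """Return (pages-per-visit list, Counter of final pages).

    `requests` is an iterable of (ip, datetime, path) tuples in
    chronological order (a hypothetical, already-parsed log format).
    """
    open_visits = {}  # ip -> [time of last request, page count, last path]
    page_counts = []
    exit_pages = Counter()

    def close(visit):
        # Record a finished visit: its length and its final page.
        page_counts.append(visit[1])
        exit_pages[visit[2]] += 1

    for ip, when, path in requests:
        visit = open_visits.get(ip)
        if visit is not None and when - visit[0] > timeout:
            close(visit)  # the gap was too long, so the old visit ended
            visit = None
        if visit is None:
            open_visits[ip] = [when, 1, path]
        else:
            visit[0], visit[1], visit[2] = when, visit[1] + 1, path

    # Visits still open at the end of the log are closed as well.
    for visit in open_visits.values():
        close(visit)
    return page_counts, exit_pages


sample = [
    ("10.0.0.1", datetime(1997, 7, 18, 14, 0), "/index.html"),
    ("10.0.0.1", datetime(1997, 7, 18, 14, 3), "/products.html"),
    ("10.0.0.2", datetime(1997, 7, 18, 14, 5), "/index.html"),
    ("10.0.0.1", datetime(1997, 7, 18, 14, 40), "/index.html"),
]
page_counts, exit_pages = summarize_visits(sample)
print(sorted(page_counts))            # prints [1, 1, 2]
print(exit_pages["/products.html"])   # prints 1
```

The two results are small enough to be written to a text file after each
run, which is one way the old logs could later be discarded.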

Well, I'll leave it at that for now, awaiting your comments.

Thanks,

David Welton   
davidw@efn.org  davidw@freenet.hut.fi  http://www.efn.org/~davidw
Se quest'email e` in Italiano, mi dispiace per gli errori:-) FORZA PANTANI!
			 --Debian GNU/Linux--