Re: Visitors/visits vs Requests (hits)
David Welton (davidw@efn.org)
Mon, 21 Jul 1997 10:01:06 -0700 (PDT)
On Sat, 19 Jul 1997, Mike Whitaker wrote:
> On 18/07/1997 10:54 pm, David Welton said:
>
> >To avoid duplication of effort, I have a few questions regarding current
> >development of this program.
>
> Time for wwwstat-dev@ics, and the SSH/CVS tree?
Are there really that many people working on it? I just arrived, so to
speak, and know relatively little about the current status of things.
> >The big advantage that the nasty proprietary software we are currently
> >using (hitlist for windoze) has is its ability to count visitors and
> >visits. Visitors is pretty simple - the total number of distinct IP
> >addresses. Visits is not so simple - "a visit is a collection of requests
> >that represent all the pages and graphics seen by a particular visitor at
> >one time". I.e., they are using some kind of timeout - if no more requests
> >from a particular user are received within say 15 minutes, then that
> >'visit' is over.
> >This shouldn't be a tremendous amount of work - I think
> >a hash with the time in it will work.
>
> Methinks an early step perhaps needs to be a little Perl library for
> doing maths on times in a logfile (including timezone stuff, and concepts
> like 'yesterday', 'last Monday', 'last month','a day ago'). I'm working
> on cleaning up mine.
Hmm yes... sounds interesting...
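
As a rough illustration of the kind of time maths mentioned above, here is a
minimal sketch in Python (not Perl, and not Mike's library). It assumes the
log uses Common Log Format timestamps such as "21/Jul/1997:10:01:06 -0700";
the function names are made up for this example.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout: day/month-abbrev/year:HH:MM:SS +-zone offset.
CLF_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def parse_clf_time(stamp):
    """Turn a log timestamp into a timezone-aware datetime."""
    return datetime.strptime(stamp, CLF_FORMAT)


def midnight_of(moment):
    """Midnight at the start of moment's day, in moment's own timezone."""
    return moment.replace(hour=0, minute=0, second=0, microsecond=0)


def a_day_ago(moment):
    """Exactly 24 hours before moment."""
    return moment - timedelta(days=1)


def start_of_yesterday(moment):
    """Midnight at the start of the day before moment's day."""
    return midnight_of(moment) - timedelta(days=1)


def last_monday(moment):
    """Midnight of the most recent Monday before moment's day."""
    # weekday() is 0 on Monday; on a Monday, go back a full week.
    days_back = moment.weekday() or 7
    return midnight_of(moment) - timedelta(days=days_back)


stamp = parse_clf_time("21/Jul/1997:10:01:06 -0700")
same_instant = parse_clf_time("21/Jul/1997:17:01:06 +0000")
```

Because the parsed values carry their offset, two timestamps written in
different timezones compare equal when they name the same instant, which is
most of what a visit timeout needs.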
My original idea (without looking at the existing source much) is a hash
of arrays, keyed by visitor address. The first element of each array could
be a time (of the latest request, say), with subsequent elements being the
documents seen. For every log file line, a test could check whether the
time has advanced, in which case one would loop through the hash to see
which entries have 'expired'. For each one, the 'visits' counter gets
incremented (hmm, maybe an array with the Unix date in it..). Additional
calculations might include a hash counting the most common 'last pages'.
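
To make the brainstorm above a bit more concrete, here is a minimal sketch in
Python rather than Perl. It assumes the log lines have already been reduced
to (epoch seconds, address, path) tuples in time order, and it borrows the
15 minute timeout from the earlier message. All names are invented for
illustration; none of this is taken from the existing source.

```python
from collections import Counter

TIMEOUT = 15 * 60  # seconds; the "say 15 minutes" figure, an assumption


def track_visits(requests, timeout=TIMEOUT):
    """Count visits and tally the last page of each visit.

    requests: iterable of (epoch_seconds, address, path), in time order.
    Returns (visits, last_pages) where last_pages is a Counter.
    """
    open_visits = {}       # address -> [time of latest request, page, page, ...]
    last_pages = Counter()
    visits = 0
    current_time = None

    def close(address):
        # An expired entry becomes one finished visit.
        nonlocal visits
        entry = open_visits.pop(address)
        visits += 1
        last_pages[entry[-1]] += 1

    for when, address, path in requests:
        # Only sweep the hash when the log time has moved forward.
        if when != current_time:
            current_time = when
            stale = [a for a, e in open_visits.items() if when - e[0] > timeout]
            for a in stale:
                close(a)
        entry = open_visits.setdefault(address, [when])
        entry[0] = when
        entry.append(path)

    # Whatever is still open at end of log counts as a visit too.
    for address in list(open_visits):
        close(address)
    return visits, last_pages


sample = [
    (0, "10.0.0.1", "/index.html"),
    (30, "10.0.0.1", "/about.html"),
    (40, "10.0.0.2", "/index.html"),
    (2000, "10.0.0.1", "/index.html"),
]
visits, last_pages = track_visits(sample)
```

Sweeping the whole hash every time the clock ticks is simple but costs a full
pass per distinct timestamp; on a busy log one might sweep less often.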
> >tracking how many pages each visit includes, which pages are the final
> >ones seen during a visit, and similar things...
>
> I'd love this!
@pages would be taken from the length of each array in our hash, and the
last page from each array could be put in another array.
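
A small sketch of that bookkeeping, again in Python and under the assumption
that each finished visit is an array shaped [time, page, page, ...]. The
function name is hypothetical.

```python
def summarize(finished):
    """Return (pages, last_pages) for a list of finished visit arrays."""
    # The first slot holds the time, so the page count is the length minus one.
    pages = [len(entry) - 1 for entry in finished]
    # Skip any entry that somehow holds a time but no pages.
    last_pages = [entry[-1] for entry in finished if len(entry) > 1]
    return pages, last_pages


finished = [
    [30, "/index.html", "/about.html"],
    [40, "/index.html"],
]
pages, last_pages = summarize(finished)
```

The one thing to watch is the off-by-one: the raw array length would count
the time slot as a page.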
These are just initial brainstorms.. now I start looking at the code...
David Welton
davidw@efn.org davidw@freenet.hut.fi http://www.efn.org/~davidw
If this email is in Italian, I apologize for the errors :-) FORZA PANTANI!
--Debian GNU/Linux--