Re: Perl 5 Classes for the Web (CGI and libwww)

Jack Shirazi - BIU (js@bison.lif.icnet.uk)
Thu, 16 Mar 95 11:10:52 GMT


Other points taken. I'll only address what I see as left.

> > Say timeout is 20 mins.
> > With MiniSvr, 5 initiating connections to the original CGI script
> > within 20 minutes gives 5 servers. 50 initiating connections
> > gives 50 servers. Have a number of CGI scripts which use this,
> > and hey presto 'No swap available, cannot start anything - and
> > all other processes start thrashing from the lack of swap'.
> > 
> You are assuming a) a long default timeout and b) that everyone orphans
> their mini-server. The former is possible, the latter is unrealistic.
> 
> The only time a mini-server would be orphaned (for want of a better term)
> would be if the user does not follow one of the links on a page produced
> by the mini-server. Typically a 'commit' button would commit the changes
> to the database and the mini-server would produce a new page and exit.

I was not assuming a) or b). The timeout only applies when people don't
respond in time. But if people do respond in time, the processes still stay
around until the entire session is finished ('commit time'). That may span
many transactions, so sessions could last a long time.

> How many times must I say that the mini-server is only intended for
> specific types of application? I won't get mail because the users will
> understand what is happening. They will be told. It will be clear.

And how many times must I say that once MiniSvr is made available
from the CGI library, people are going to use it for everything it
was not intended for?

> > How many CGI scripts are you going to have on that machine that use MiniSvr?
> > How many people are going to use each of those scripts, Tim?
> 
> In my case not many, say a hundred processes. I'll let others comment on
> their possible applications.

Ugh. This has pretty much made up my mind that if I were sys-admin, any
CGI script with a fork would be banned. One of the nice things
about HTTP is that transactions are generally short and so easily
serializable.


I've distilled my main objections down to two dislikes - 1 is opening
an extra socket to the machine by a CGI script, and 2 is the extra
processes that hang around. There is nothing that can be done about 2
if you insist on the need for a full extra process per session.

To strengthen the validation for 1, I suggest that starting the MiniSvr
from the CGI script returns a randomly generated session key in addition
to the URL for the new server (you could use $$ and time to seed rand).
This session key should be returned to the client as a hidden field,
and used to provide an extra validation for requests sent to the MiniSvr.
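
A rough sketch of the session key idea, in Perl. The helper names
(make_session_key, key_is_valid) and the field name session_key are made up
for illustration and are not part of MiniSvr or the CGI library. Seeding from
$$ and time is guessable, so this is only an extra check, not real security.

```perl
use strict;
use warnings;

# Seed rand from the clock and the process id, as suggested above.
# NOTE: both values are guessable; this is an extra check, not real security.
srand(time ^ $$);

# Hypothetical helper: build a key of $len random hex digits.
sub make_session_key {
    my ($len) = @_;
    $len = 16 unless defined $len;
    return join '', map { sprintf('%x', int(rand(16))) } 1 .. $len;
}

# Hypothetical helper: the mini-server accepts a request only if the key
# submitted by the client matches the one issued when it was started.
sub key_is_valid {
    my ($issued, $submitted) = @_;
    return (defined $submitted && $submitted eq $issued) ? 1 : 0;
}

my $key = make_session_key();

# The CGI script would emit this inside the form it returns to the client.
my $hidden_field = qq{<input type="hidden" name="session_key" value="$key">};
print "$hidden_field\n";
```

The mini-server would keep $key in memory and call key_is_valid() on the
session_key field of every request before doing anything else.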


Other things in general:

1. Transient URLs. Ugly but not serious.

2. An encrypted HTTPD giving an unencrypted MiniSvr is potentially compromising.

3. Connecting multiple incarnations of a CGI script to one server:

CGI Writes to server & CGI triggers server (either way round)
Server reads & processes
Server writes to CGI & triggers CGI (either way round)

Connections can be:
Through a disk based file system (write a file to a queue directory, with the CGI
process id as extension, e.g. QUEUEDIR/data.$$, and other process reads it)
Through a memory based file system (as above using tmpfs)
Through socket connections
Through named pipes (mknod NAME p, then just write to/read from NAME)
Through shared memory
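
As an illustration of the named pipe option, here is a minimal Perl sketch.
It uses POSIX::mkfifo instead of the mknod command, and a fork so that one
script can play both sides. The pipe name and the request text are invented
for the example.

```perl
use strict;
use warnings;
use POSIX qw(mkfifo);
use File::Temp qw(tempdir);

# A scratch directory stands in for wherever the pipe would really live.
my $dir  = tempdir(CLEANUP => 1);
my $fifo = "$dir/request.fifo";            # hypothetical pipe name
mkfifo($fifo, 0600) or die "mkfifo failed: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child plays the CGI script: write one request line and leave.
    open(my $out, '>', $fifo) or die "cannot open pipe for writing: $!";
    print $out "pid=$$ action=lookup\n";
    close $out;
    POSIX::_exit(0);                       # skip the parent's cleanup handlers
}

# Parent plays the server: the open blocks until a writer appears.
open(my $in, '<', $fifo) or die "cannot open pipe for reading: $!";
my $request = <$in>;
close $in;
waitpid($pid, 0);

print "server received: $request";
```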

Triggers can be:
A signal (e.g. 'ALRM' or 'USR1')
A 'ready' file (i.e. create QUEUEDIR/dataReady.$$ when finished writing data)
A socket connect/disconnect/transmitted token
A named pipe specifically used for semaphores
Shared memory semaphores


Easiest and quickest to implement is:

CGI script dumps data straight to file QUEUEDIR/data.cgipid, then
creates QUEUEDIR/dataReady.cgipid and goes to sleep.
Server loops to detect any QUEUEDIR/dataReady.* files. When it finds any, it
reads the corresponding data file, removes both files, and processes the data.
When the result is ready, it writes file QUEUEDIR/dataComplete.cgipid, wakes
the CGI process (kill 'ALRM',cgipid) and goes back to its loop.
CGI wakes, reads the dataComplete file, removes it and returns the data to HTTPD.
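
The steps above could be sketched in Perl roughly as follows. This is a toy
under stated assumptions: a temporary directory stands in for QUEUEDIR, the
"processing" is just uppercasing the request, the server handles a single
request and exits, and a fork lets one script play both roles. The sub names
are invented for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use POSIX ();

my $queue_dir = tempdir(CLEANUP => 1);     # stands in for QUEUEDIR

sub write_file {
    my ($path, $text) = @_;
    open(my $fh, '>', $path) or die "cannot write $path: $!";
    print $fh $text;
    close $fh or die "cannot close $path: $!";
}

sub read_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "cannot read $path: $!";
    local $/;                              # slurp the whole file
    my $text = <$fh>;
    close $fh;
    return $text;
}

# Server side: poll for dataReady.* files, handle one request, wake the CGI.
sub serve_one_request {
    while (1) {
        opendir(my $dh, $queue_dir) or die "cannot read $queue_dir: $!";
        my @ready = grep { /^dataReady\.\d+$/ } readdir $dh;
        closedir $dh;
        for my $ready (@ready) {
            my ($cgi_pid) = $ready =~ /\.(\d+)$/;
            my $request = read_file("$queue_dir/data.$cgi_pid");
            unlink "$queue_dir/data.$cgi_pid", "$queue_dir/$ready";
            # Placeholder "processing": uppercase the request.
            write_file("$queue_dir/dataComplete.$cgi_pid", uc $request);
            kill 'ALRM', $cgi_pid;         # wake the sleeping CGI process
            return;
        }
        select(undef, undef, undef, 0.1);  # short pause between polls
    }
}

# CGI side: queue the request, sleep until signalled, collect the reply.
sub cgi_request {
    my ($request) = @_;
    my $woken = 0;
    # Install the handler before the ready file exists, so the wake-up
    # cannot arrive while ALRM still has its default (fatal) action.
    local $SIG{ALRM} = sub { $woken = 1 };
    write_file("$queue_dir/data.$$", $request);
    write_file("$queue_dir/dataReady.$$", '');
    sleep 1 until $woken;                  # the signal cuts the sleep short
    my $complete = "$queue_dir/dataComplete.$$";
    my $reply = read_file($complete);
    unlink $complete;
    return $reply;
}

my $server_pid = fork();
die "fork failed: $!" unless defined $server_pid;
if ($server_pid == 0) {
    serve_one_request();
    POSIX::_exit(0);                       # skip the parent's cleanup handlers
}

my $reply = cgi_request("hello from cgi $$");
waitpid($server_pid, 0);
print "$reply\n";
```

Writing the data file first and the ready file second is what keeps the
server from reading a half-written request.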

Note that since the CGI script is so small and does so little, it has
very little startup overhead - should in fact be very little more than
a fork.

You should use tmpfs to optimize this type of scenario if it's available
on your OS.

> Thanks again for the useful discussion.

A pleasure, Tim. I haven't had a good on-line argument in ages.