Re: Perl 5 Classes for the Web (CGI and libwww)
Tim Bunce (Tim.Bunce@ig.co.uk)
Wed, 15 Mar 1995 15:34:43 +0000
> From: Jack Shirazi - BIU <js@bison.lif.icnet.uk>
>
> > From Tim Bunce <Tim.Bunce@ig.co.uk>
> > > From: Jack Shirazi - BIU <js@bison.lif.icnet.uk>
> > >
> > > > The Mini-Server Concept:
> > >
> > > > The job of the CGI::MiniSvr class is to allow a CGI application to
> > > > fork a new process, sit on a new socket and act as a basic server.
> > > > This allows a mini-server to be dedicated to a particular client.
> > > > The form returned by the CGI application prior to forking will contain
> > > > some links (generally form actions) that refer back to the specific
> > > > (private) mini-server socket port number. Others, such as images,
> > > > could still refer to the main server.
> > > >
> > > > This approach avoids the need to implement complicated state
> > > > save/restore code and simplifies application coding. It also allows
> > > > database transactions to span multiple pages/forms. The CGI::MiniSvr
> > > > class validates the client host and has a built-in timeout mechanism.
> > >
> > > My instant reaction is 'oh dear, I hope this does not take off'.
> > > As I'm sure you know, state information can be kept in hidden fields -
> > > and the 'complicated state save/restore code' should be extremely simple
> > > for the CGI developer if someone does a class for it.
> > >
> > Care to tell me how hidden fields or external state servers will allow you
> > to rollback an Oracle database change made on a previous form ?
>
> (Didn't I hear that there is an Oracle supplied WWW interface?).
It's a toy (no offence to its authors intended).
> In exactly the same way you plan to do it with the MiniSvr. You either keep
> a list of changes, or maintain the session in the external state server
> and rollback from there. How do you plan to do it with the MiniSvr?
>
I think you're missing the point. I'm talking about relational database
transactions made by a client over a series of forms.
You *cannot* simply 'keep a list of changes' because once a change is
committed other database users may have started using your changed data
and/or your changes may have triggered database procedures. Basically
once committed it's too late! With a process-per-form you'd have to commit
at the end of each form, since the database session ends with the process.
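To make the difference concrete, here is a toy sketch in Perl. ToySession
is a made-up stand-in for one open database session, not a real database
interface. The point it illustrates: while the process (and so the session)
stays alive, changes from several forms remain pending and can still be
discarded; nothing reaches the shared data until commit.

```perl
use strict;
use warnings;

{
    package ToySession;    # hypothetical stand-in for an open database session

    sub new {
        my ($class, $shared) = @_;
        return bless { shared => $shared, pending => {} }, $class;
    }

    # Record a change that only this session can see.
    sub update {
        my ($self, $key, $value) = @_;
        $self->{pending}{$key} = $value;
        return;
    }

    # Publish the pending changes; from here on other users can read them.
    sub commit {
        my ($self) = @_;
        $self->{shared}{$_} = $self->{pending}{$_} for keys %{ $self->{pending} };
        $self->{pending} = {};
        return;
    }

    # Discard the pending changes; the shared data was never touched.
    sub rollback {
        my ($self) = @_;
        $self->{pending} = {};
        return;
    }
}

my %table   = (balance => 100);            # what every other user sees
my $session = ToySession->new(\%table);    # lives as long as the process does

$session->update(balance => 50);           # change made on form 1
$session->update(owner   => 'tim');        # change made on form 2
$session->rollback;                        # user cancels on form 3
```

A process-per-form script has no session left to call rollback on by the
time form 3 arrives, which is why it would have had to commit on form 1.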
Using an external server for this type of application:
a) is *much* more complicated for many reasons, including the need to
pipe returned query data (how do you send rows of data, possibly
including image blobs, back from the server to the cgi application?)
b) is only possible for databases which allow one process to manage/juggle
many distinct database connections.
c) has poor performance (latency) under load (unless you make the server
multi-threaded, and few if any database vendors support multi-threaded clients).
> > > The troubles with MiniSvr are that:
> > > 1. It scales very badly. One process hanging around per initial query
> > > means that batches of queries will bring the server machine to its knees.
> >
> > On the contrary, the process startup overheads of the traditional
> > approach have a much greater impact. The per-transaction cost of the
> > mini-server approach is very low. The first thing a colleague said when
> > I showed her the first prototype was "wow, that's fast!".
>
> That's the transactions after the initial connection. Obviously subsequent
> transactions are fast to the user since the server has no startup overhead.
> I am talking about the cost to the server _machine_ in having a separate
> extra process hanging around after every time a specific cgi script
> is started.
>
> Say timeout is 20 mins.
> With MiniSvr, 5 initiating connections to the original CGI script
> within 20 minutes gives 5 servers. 50 initiating connections
> gives 50 servers. Have a number of CGI scripts which use this,
> and hey presto 'No swap available, cannot start anything - and
> all other processes start thrashing from the lack of swap'.
>
You are assuming a) a long default timeout and b) that everyone orphans
their mini-server. The former is possible, the latter is unrealistic.
The only time a mini-server would be orphaned (for want of a better term)
would be if the user does not follow one of the links on a page produced
by the mini-server. Typically a 'commit' button would commit the changes
to the database and the mini-server would produce a new page and exit.
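For what it's worth, the idle-timeout loop could look roughly like the
sketch below. The names wait_for_client and serve_until_idle are
hypothetical, not the CGI::MiniSvr interface, and the handler is assumed
to return true once the application is finished (for example after a
commit). Forking and request parsing are left out.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use IO::Select;

# Hypothetical helper: wait up to $timeout seconds for one connection.
# Returns the accepted socket, or undef if nobody connected in time.
sub wait_for_client {
    my ($listener, $timeout) = @_;
    my $select = IO::Select->new($listener);
    return unless $select->can_read($timeout);
    return $listener->accept;
}

# Hypothetical main loop: serve requests until the handler says it is
# done or the client stays silent for $timeout seconds, then return so
# the process can exit. Returns the number of requests served.
sub serve_until_idle {
    my ($listener, $timeout, $handler) = @_;
    my $served = 0;
    while (my $conn = wait_for_client($listener, $timeout)) {
        my $done = $handler->($conn);
        close $conn;
        $served++;
        last if $done;
    }
    return $served;
}
```

With this shape an orphaned mini-server costs one idle process for at most
$timeout seconds after its last request, and a session that ends with a
commit costs nothing beyond that final page.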
> > > 2. Many's the time I've sent off a www query, got the result and left it
> > > for a long time before responding to it (sometimes overnight). How long
> > > is the MiniSvr going to wait for?
> >
> > That's an application choice. The default will probably be around 5 minutes.
> > Some applications may set it higher, some lower. Some may change it
> > dynamically.
>
> Set it to 5 minutes and I would guess that you are going to get
> a lot of mail asking why your URL is 'broken'. Actually, you
> won't get the mail - most people will just not use the service again
> after trying once. Unless you plan on having _very_ simple forms.
>
How many times must I say that the mini-server is only intended for
specific types of application? I won't get mail because the users will
understand what is happening. They will be told. It will be clear.
> > > 3. Security of CGI bin scripts has a bad enough reputation at the moment.
> > > This will make it worse.
> > >
> > Why? Please explain in detail. Remember that the mini-server is started
> > by a full server which can deal with the 'big' authentication issues
> > and 'set the scene' before it starts the cgi script. I'm very happy to
> > add any checks as required but I need hard facts. The mini-server
> > already always re-validates the client ip address.
>
> The mini-server is not started by the full server. It is started by the
> CGI script.
Which is started by the full server. The mini-server forks but doesn't exec.
> What uid is that running as?
The same as the CGI script.
> It starts up an internet socket. Anyone can connect to the socket -
> how are you validating that it was exactly the client that started the
> connection (in general - not for password registered forms only)?
The mini-server currently validates that the same ip address is being
used. Other authentication checks can be added. Within reason any
authentication checks a full server can do a mini-server could also do.
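A minimal sketch of that address check, assuming the CGI script recorded
the client's address (typically from the REMOTE_ADDR environment variable)
before forking. same_client is a hypothetical name, not part of
CGI::MiniSvr; $conn is assumed to be a socket returned by accept on an
IO::Socket::INET listener.

```perl
use strict;
use warnings;

# Hypothetical check: does this accepted connection come from the same
# address that made the original request to the full server?
sub same_client {
    my ($conn, $expected_ip) = @_;
    my $peer = $conn->peerhost;    # peer address in dotted-quad form
    return 0 unless defined($peer) && defined($expected_ip);
    return $peer eq $expected_ip ? 1 : 0;
}
```

The mini-server would call this on every accepted connection, passing the
address saved before the fork, and close the socket without reading
anything if it returns 0.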
> Each MiniSvr is going to have its own request processing - can
> you guarantee that this is always loophole free? No system accessing
> except where you stated - i.e. taint free on all data inputted through
> the socket? Remember that the MiniSvr is a general mechanism - I'd
> be foolish to have mine be
>
> startMiniSvr; $r=readRequest; eval $r;
>
> But someone could do the equivalent without realizing it. HTTPD daemons
> have some of these problems with CGI scripts, but at least they get to
> 'validate' all the input to the machine. You would be bypassing that
> mechanism.
>
This is no different to existing CGI applications. HTTPD daemons don't
'validate' the input in any meaningful sense. If you have a form with
a text field and your cgi script evals the contents of that field
the httpd will not stand in your way.
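Either way the defence has to live in the script itself. One common way to
stay out of that trap, sketched here with made-up action names, is to look
the submitted value up in a fixed table instead of eval'ing it. %ACTIONS
and run_action are illustrative only and apply equally to an ordinary CGI
script and to a mini-server.

```perl
use strict;
use warnings;

# Hypothetical actions a form might request; the names are illustrative.
my %ACTIONS = (
    commit   => sub { return 'committed' },
    rollback => sub { return 'rolled back' },
);

# Run the named action only if it appears in the table.
# Returns the action's result, or undef for anything else.
sub run_action {
    my ($name) = @_;
    return undef unless defined($name) && exists $ACTIONS{$name};
    return $ACTIONS{$name}->();
}
```

Whatever the client sends is only ever used as a hash key, so a field
containing Perl code simply fails the lookup and is never executed.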
> > I have never said that the mini-server is a panacea. Quite the
> > opposite. It's horses for courses. This is a horse that has some
> > mileage for me and my colleagues right now. It may not be perfect but
> > it's simple, effective and here now. I'm releasing it to others in the
> > hopes that it might be useful to some people.
> >
> > Other people will prefer other mechanisms.
>
> It's not that I prefer another mechanism. It's that the CGI classes being
> created in these groups are going to be used by people all over the place
> with all sorts of experience, and I think we have some responsibility in
> not making it easy to do 'bad' things. If the users want to go out of their
> way to do it - fine. But it shouldn't be made easier by us. And I do
> think this is a bad idea.
>
I intend that the docs for the mini-server make all these issues very clear.
(I'll base much of the text on this discussion, so thanks :-)
> How many CGI scripts are you going to have on that machine that use MiniSvr?
> How many people are going to use each of those scripts, Tim?
In my case not many, say a hundred processes. I'll let others comment on
their possible applications.
> What happens if the service proves popular - how many simultaneous
> processes are going to be hanging around at any one time?
>
It's a trade-off. Swap-space vs process creation overhead.
Thanks again for the useful discussion.
Regards,
Tim.