Perl 5 Classes for the Web (CGI and libwww)

Tim Bunce (Tim.Bunce@ig.co.uk)
Tue, 14 Mar 1995 17:07:37 +0000


This message started out as a response to a message on the
CGI-perl@webstorm.com list but it grew into something more general.
I have included libwww-perl@ics.uci.edu to let them know what is
happening and get their input on URL and HTTP classes.


	I hope we don't get bogged down in small issues.
	It's the big picture we need to get into focus right now.


Introduction:

The World Wide Web standards are split into several parts:
 
    HTML     - Markup
    HTTP     - Transfer
    CGI      - Server to external program interface (for client requests)
    URI/URL  - Addressing
 
I believe that World Wide Web software should reflect these divisions
by implementing general classes (and maybe specialised subclasses) for
each area. Note that most of these areas have producer/consumer splits
which complicate the issues slightly.
 
So far we have only seen a few libraries which have an ad hoc
selection of functions addressing a variety of areas. CGI libraries
typically have code addressing CGI (env vars), HTTP (headers), HTML
(parsing QUERY_STRING) and URI/URLs! This is not an ideal foundation
for the future.

There also seems to be a fear of splitting up code into smaller parts.
Classes should be small. Individual methods should be smaller still
(for some definition of small :-).


The New CGI::* Classes:

The CGI is an 'interface'. The interface defines a protocol/syntax
(using those terms in the broadest sense). It overlaps slightly into
HTTP but does not overlap HTML. I see a need for an object to represent
and encapsulate (hide) that interface and provide methods to 'pull data
across' the interface into the local (perl) dataspace in a defined manner.
 
This is exactly what the CGI::Base class does. The CGI::Base module knows
nothing about HTML. It does have some knowledge of HTTP and URLs but
I'd like to see that factored out into separate classes at some point.
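
To make the idea of 'pulling data across' the interface concrete, here
is a minimal sketch. It is not the CGI::Base code: the Sketch::Gateway
name, its constructor arguments and its methods are all invented for
this illustration. The only things taken from CGI itself are the
REQUEST_METHOD, QUERY_STRING and CONTENT_LENGTH environment variables
and the convention that a POST body arrives on standard input.

```perl
use strict;
use warnings;

package Sketch::Gateway;    # hypothetical name, not the real CGI::Base

sub new {
    my ($class, %arg) = @_;
    my $self = {
        # CGI hands request details over as environment variables ...
        env   => $arg{env}   || \%ENV,
        # ... and the body of a POST on standard input.
        input => $arg{input} || \*STDIN,
    };
    return bless $self, $class;
}

sub method {
    my ($self) = @_;
    return uc($self->{env}{REQUEST_METHOD} || 'GET');
}

# Return the still-encoded query text, wherever the server put it.
# Nothing here tries to understand the text; that is someone else's job.
sub raw_query {
    my ($self) = @_;
    if ($self->method ne 'POST') {
        my $query = $self->{env}{QUERY_STRING};
        return defined $query ? $query : '';
    }
    my $length = $self->{env}{CONTENT_LENGTH} || 0;
    my $body   = '';
    read($self->{input}, $body, $length) if $length > 0;
    return $body;
}

package main;

# Simulate a POST without a web server: a fake environment and an
# in-memory filehandle standing in for STDIN.
my $posted = 'colour=blue&size=9';
open my $fake_stdin, '<', \$posted or die "cannot open in-memory handle: $!";

my $gateway = Sketch::Gateway->new(
    env   => { REQUEST_METHOD => 'POST', CONTENT_LENGTH => length $posted },
    input => $fake_stdin,
);
my $raw = $gateway->raw_query;
print "$raw\n";
```

Passing the environment and input handle to the constructor is only a
convenience for testing; a real script would rely on the defaults.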

Once you've got the data from the interface you need to be able to
understand what it means. This is the job of the CGI::Query module.
This module uses a CGI::Base object (or subclass) to get the data.
It has no knowledge of how that is done. It then uses a little
knowledge of HTTP to break up the query (this should also be factored
out into a separate class at some point). The CGI::Query module then
builds application-friendly data structures which represent the query.
It also provides methods to write the values directly into perl
variables in a named package thus avoiding the need to say $q->{...}
all the time. Note that CGI::Query is *not* a subclass of CGI::Base.
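
The split described above might look roughly like the following. This
is only a sketch under my own assumptions, not the CGI::Query code: the
Sketch::Interface and Sketch::Query packages and the raw_query,
values_of and export_to methods are hypothetical names. The query
object holds no knowledge of where the text came from; it just asks the
interface object for it, decodes it, and can copy the fields into a
package of the caller's choosing.

```perl
use strict;
use warnings;

package Sketch::Interface;    # hypothetical stand-in for the interface object

sub new {
    my ($class, %env) = @_;
    return bless { env => {%env} }, $class;
}

# The only thing the query parser needs: the raw, still-encoded text.
sub raw_query {
    my ($self) = @_;
    my $query = $self->{env}{QUERY_STRING};
    return defined $query ? $query : '';
}

package Sketch::Query;        # hypothetical; deliberately NOT a subclass

sub new {
    my ($class, $interface) = @_;
    my %param;
    for my $pair (split /[&;]/, $interface->raw_query) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = '' unless defined $value;
        # A field may appear more than once, so keep a list per name.
        push @{ $param{ _decode($name) } }, _decode($value);
    }
    return bless { param => \%param }, $class;
}

# Undo form encoding: '+' means space, %XX is a hex-encoded byte.
sub _decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    return $text;
}

sub values_of {
    my ($self, $name) = @_;
    return @{ $self->{param}{$name} || [] };
}

# Copy the first value of each field into a scalar in the named package.
sub export_to {
    my ($self, $package) = @_;
    no strict 'refs';
    for my $name (keys %{ $self->{param} }) {
        next unless $name =~ /^[A-Za-z_]\w*$/;   # only plain identifiers
        ${"${package}::$name"} = $self->{param}{$name}[0];
    }
    return;
}

package main;

no warnings 'once';    # $Form::name is only mentioned once below

my $cgi = Sketch::Interface->new(
    QUERY_STRING => 'name=Tim+Bunce&lang=perl&lang=c%2B%2B',
);
my $q = Sketch::Query->new($cgi);
$q->export_to('Form');

print "$Form::name\n";
print join(', ', $q->values_of('lang')), "\n";
```

Exporting into a dedicated package (here 'Form') rather than into main
keeps client-supplied field names from overwriting the script's own
variables.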
 
The two classes above will satisfy most CGI script writers' needs for
data _input_. (We have not addressed HTML generation in any depth yet
and I'd rather not just now. Maybe next week.)


The Mini-Server Concept:
 
In addition to the CGI::Base class I've also implemented a CGI::MiniSvr
class (which is a subclass of CGI::Base). Note: This may only be of
interest to a minority and can safely be ignored by anyone who wants to.
 
The job of the CGI::MiniSvr class is to allow a CGI application to
fork a new process, sit on a new socket and act as a basic server.
This allows a mini-server to be dedicated to a particular client.
The form returned by the CGI application prior to forking will contain
some links (generally form actions) that refer back to the specific
(private) mini-server socket port number. Others, such as images,
could still refer to the main server.
 
This approach avoids the need to implement complicated state
save/restore code and simplifies application coding. It also allows
database transactions to span multiple pages/forms. The CGI::MiniSvr
class validates the client host and has a built-in timeout mechanism.
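
For those who want to picture the mechanics, here is a rough sketch of
the fork/socket/timeout pattern using the standard IO::Socket::INET and
IO::Select modules. It is not the CGI::MiniSvr implementation: the
function names (open_private_listener, accept_from, spawn_mini_server)
are invented for this example, and binding to the loopback address is
an assumption made so the sketch can be tried on one machine.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use IO::Select;

# Hypothetical helper: listen on a port chosen by the operating system.
sub open_private_listener {
    my $listener = IO::Socket::INET->new(
        Listen    => 5,
        LocalAddr => '127.0.0.1',   # assumption: loopback only, for the sketch
        LocalPort => 0,             # 0 asks the OS for any free port
        Proto     => 'tcp',
    ) or die "cannot listen: $!";
    return $listener;
}

# Wait up to $timeout seconds for a connection from $allowed_host
# (a dotted-quad address). Returns the client socket, or nothing
# if the time runs out first.
sub accept_from {
    my ($listener, $allowed_host, $timeout) = @_;
    my $select   = IO::Select->new($listener);
    my $deadline = time() + $timeout;
    while ((my $left = $deadline - time()) > 0) {
        my @ready = $select->can_read($left);
        next unless @ready;
        my $client = $listener->accept or next;
        return $client if $client->peerhost eq $allowed_host;
        close $client;              # some other host: drop it, keep waiting
    }
    return;
}

# Fork a child that serves one client until it has been idle for
# $timeout seconds. The parent gets back the private port number so it
# can write it into the form actions it returns to the browser.
sub spawn_mini_server {
    my ($allowed_host, $timeout, $handler) = @_;
    my $listener = open_private_listener();
    my $port     = $listener->sockport;

    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {                # child: become the mini-server
        while (my $client = accept_from($listener, $allowed_host, $timeout)) {
            $handler->($client);
            close $client;
        }
        exit 0;                     # idle for too long: go away quietly
    }
    close $listener;                # parent keeps only the port number
    return ($port, $pid);
}
```

In a CGI script the allowed host would typically be the REMOTE_ADDR
value the main server supplied for the original request, and the
handler would be the application's own request-handling code.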
 
The CGI::MiniSvr class is small and simple. It just overrides a few
methods from the CGI::Base class. It does *not* try to be a full
server. Anything an application doesn't want to handle can be passed on
to a main server with a single method call. The MiniSvr is not a
panacea but it does provide an excellent mechanism for some applications.


The libwww-perl work:

I've sent this to libwww-perl because there are common classes which
we both need. Specifically URI/URLs and HTTP. I hope the two groups
(CGI-perl@webstorm.com and libwww-perl@ics.uci.edu) can cooperate in
the definition and implementation of these classes.

I'd appreciate it if someone could advise us of the state of any work 
going on in these areas or any other related issues.

I currently see libwww-perl as being primarily client-side code.
Is that a fair perspective?


That's enough preaching for now. I look forward to your comments.
(... now where did I put that flame-proof suit ... ;-)

Regards,
Tim Bunce.