CGI one-shot program -> CGI server for better performance

Mark-Jason Dominus (mjd@plover.com)
Sun, 2 Jun 1996 19:24:55 -0400 (EDT)


I've sometimes had performance problems with CGI scripts written in
Perl.  The scripts would run fairly quickly once they got started, but
the startup costs were very high.  The cost of loading Perl into memory
and compiling the script for every HTTP request often far outweighs the
cost of actually running the script.

I thought I'd look into the following possible solution to this:

Instead of the CGI script being started for each request by the HTTP
server, and receiving the CGI inputs via its environment and stdin, the
script will become a network `application server' program that waits for
CGI requests to come over the network.  The server will sit in an
infinite loop, accepting a request, serving it, and then looping again.
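
To make the loop concrete, here is a minimal sketch of such a server.
It is in Python only for brevity; the real thing would be the Perl
script itself.  The names handle_request and serve_forever, the
max_requests parameter, and the convention that the client half-closes
the connection to mark the end of the request are all assumptions made
for illustration.

```python
import socket


def handle_request(request: bytes) -> bytes:
    """Hypothetical stand-in for the real application logic."""
    body = "<p>Received %d bytes of CGI input.</p>\n" % len(request)
    return ("Content-Type: text/html\r\n\r\n" + body).encode("ascii")


def serve_forever(listener: socket.socket, max_requests=None) -> int:
    """Accept a request, serve it, and loop again.

    max_requests is an assumption added so the loop can be stopped in
    a test; None means loop forever.
    """
    served = 0
    while max_requests is None or served < max_requests:
        conn, _addr = listener.accept()
        with conn:
            chunks = []
            while True:
                # Assumed convention: the client half-closes the
                # connection when it has sent the whole request.
                data = conn.recv(4096)
                if not data:
                    break
                chunks.append(data)
            conn.sendall(handle_request(b"".join(chunks)))
        served += 1
    return served
```

The interpreter and the program are loaded once, before the first
accept; each later request pays only for the work inside the loop.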

Instead of running the actual application, the HTTP server will start
a little tiny network client (`microgateway') which will read the CGI
inputs from the HTTP server and send them over the network to the
application server.  The application server will send back its HTML
output or HTTP response, and the microgateway will pass this response
unchanged to the HTTP server.
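
A sketch of what the microgateway might do, again in Python purely for
illustration (a real one would want to be native code so that it starts
quickly).  The wire format here, NAME=value lines, a blank line, then
the request body, is invented for this example, as are the names
encode_request and microgateway.  The environment variable names are
the standard CGI ones.

```python
import socket

# Standard CGI variables worth forwarding; the selection is an assumption.
CGI_VARS = ("REQUEST_METHOD", "QUERY_STRING", "CONTENT_TYPE",
            "CONTENT_LENGTH", "PATH_INFO")


def encode_request(env, body: bytes) -> bytes:
    """Hypothetical wire format: NAME=value lines, blank line, body."""
    lines = ["%s=%s" % (name, env[name]) for name in CGI_VARS if name in env]
    return ("\n".join(lines) + "\n\n").encode("utf-8") + body


def microgateway(env, stdin, stdout, addr) -> None:
    """Forward one CGI request to the application server at addr."""
    length = int(env.get("CONTENT_LENGTH") or 0)
    body = stdin.read(length) if length else b""
    with socket.create_connection(addr) as sock:
        sock.sendall(encode_request(env, body))
        # Half-close to tell the server the request is complete.
        sock.shutdown(socket.SHUT_WR)
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            # Pass the response through unchanged.
            stdout.write(chunk)
```

As a real CGI program it would be called with os.environ,
sys.stdin.buffer, sys.stdout.buffer, and the address of the
application server.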

If the microgateway starts up a lot more quickly than the perl CGI
script would have, then we win.  I don't know whether or not we'll
win, but it seems like it's worth a try.

Software you'd need to do this:

	1. Microgateway program, probably written in native machine
	   code or something that compiles to native machine code.

	2. Kit for turning one-shot CGI programs into application
	   servers, including drop-in replacements for CGI.pm,
	   CGI::Base.pm, cgi-lib.pl. These replacements will set up
	   the network listening socket, accept client connections,
	   read the CGI input from the network instead of from stdin,
	   and redirect stdout to the network client.
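
The second item could look roughly like the following sketch, written
in Python rather than Perl for illustration.  It assumes a made-up wire
format (NAME=value lines, a blank line, then the body) and an
application written as a function app(env, stdin) that prints its
response to stdout; decode_request and run_as_server are hypothetical
names, not part of any existing library.

```python
import contextlib
import io
import socket


def decode_request(raw: bytes):
    """Split the hypothetical wire format into (env dict, body bytes)."""
    head, _sep, body = raw.partition(b"\n\n")
    env = dict(line.split("=", 1)
               for line in head.decode("utf-8").split("\n")
               if "=" in line)
    return env, body


def run_as_server(app, listener: socket.socket, max_requests=None) -> int:
    """Run a one-shot style app(env, stdin) as a looping network server."""
    served = 0
    while max_requests is None or served < max_requests:
        conn, _addr = listener.accept()
        with conn:
            raw = b""
            while True:
                chunk = conn.recv(4096)
                if not chunk:
                    break
                raw += chunk
            env, body = decode_request(raw)
            # Capture whatever the app prints to stdout, then send it
            # to the client.  Buffering the whole response is a
            # simplification of "redirect stdout to the network".
            out = io.StringIO()
            with contextlib.redirect_stdout(out):
                app(env, io.BytesIO(body))
            conn.sendall(out.getvalue().encode("utf-8"))
        served += 1
    return served
```

The application code keeps printing to stdout as before; only the
wrapper knows that a socket is involved.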

In addition to possibly reducing startup costs, this system might
offer the following advantages:

	* If the CGI script accesses a database or other sharable
	   resource, you don't need to worry about concurrency control
	   any more; it's been automatically centralized.

	* The databases or other sharable resources no longer need to
	   reside on the same machine as the HTTP server; it's trivial
	   to move them around.
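
To illustrate the first point: in the sketch below (Python, with the
hypothetical name serve_counter), a hit counter lives in the server
process and is updated without any locking.  This works only under the
assumption that the server handles one request at a time; a server
that forks or uses threads would need concurrency control again.

```python
import socket


def serve_counter(listener: socket.socket, max_requests: int) -> int:
    """Serve max_requests connections, reporting a running hit count."""
    hits = 0  # shared state, owned by this single process
    for _ in range(max_requests):
        conn, _addr = listener.accept()
        with conn:
            # No lock needed: requests are handled strictly in sequence.
            hits += 1
            reply = "Content-Type: text/plain\r\n\r\nhit %d\n" % hits
            conn.sendall(reply.encode("ascii"))
    return hits
```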

My questions:

	1. Is there an obvious reason why this isn't worth investigating?

	2. Has anyone already done this?

If the answers are both `no,' I'll have my intern work on it as his
next project.

Mark-Jason Dominus 	  			               mjd@plover.com