Re: Wishlist for libwww-perl...

Roy T. Fielding (fielding@simplon.ICS.UCI.EDU)
Wed, 13 Jul 1994 05:30:53 -0700


Brooks wrote:

> I've recently started work on a WWW browser written in tkperl, and I
> am using libwww-perl to retrieve documents, etc..  Listed below are
> some enhancements that I plan to make to the library.. (not this week,
> but in the next few weeks..)

Wow, that'll be an adventure.  Most of the tools built so far have been
maintenance drones that don't care about interactive issues.  It'll
be interesting to see how that affects the interface.

> just to be clear - these are things I plan to hack into the library,
> and don't want or expect anyone else to be doing these - this is just
> an FYI to the list as to what I plan on writing (and suggesting for
> inclusion into the package) and would be interested in hearing
> anyone's thoughts or if anyone else has similar needs..
> 
> - Connection and Transfer status messages
> 
> So I can make my browser like Mosaic.  I'd like to submit a
> (optional) routine to the package that will be called whenever
> there is anything interesting to report.  "Interesting"
> would be something like:
> 
> Proxy connection established to proxy.host.com

no problem

> Connection established to www.ncsa.uiuc.edu

no problem

> Retrieving file /somefile.foo

no problem

> 123 bytes of 456 transferred...

big problem :(  Currently the library just slurps the input all at once
(it's much faster that way) and lets perl handle the read blocks, etc.
Mosaic knows that information because it is performing reads at a very
low level.  To do this, you will have to change how the input works
(and probably the interface as well).
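
As a rough sketch of what block-at-a-time input with a progress hook might
look like (this assumes present-day Perl 5; read_with_progress and its
callback are hypothetical names, not anything in libwww-perl, and an
in-memory filehandle stands in for the socket):

```perl
use strict;
use warnings;

# Hypothetical helper, not part of libwww-perl: read from a filehandle in
# fixed-size blocks and call $report after each block arrives.
sub read_with_progress {
    my ($fh, $total, $report, $blocksize) = @_;
    $blocksize ||= 4096;
    my ($content, $got) = ('', 0);
    while (1) {
        my $n = read($fh, my $buf, $blocksize);
        die "read failed: $!" unless defined $n;
        last if $n == 0;             # end of input
        $content .= $buf;
        $got += $n;
        $report->($got, $total);     # let the client display progress
    }
    return $content;
}

# Usage: an in-memory filehandle stands in for the network connection.
my $data = 'x' x 10_000;
open(my $fh, '<', \$data) or die "cannot open in-memory handle: $!";
my @progress;
my $content = read_with_progress($fh, length $data,
    sub { push @progress, "$_[0] bytes of $_[1] transferred..." });
close($fh);
print "$_\n" for @progress;
```

The cost is the one noted above: many small reads instead of one slurp, and
a callback argument that every caller of the interface now has to know about.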

>...

> - Interruptible I/O
> 
> Like Mosaic, the ability to abort a transfer after it's been started.
> I'm not sure how this is implemented in Mosaic - I figure it uses
> alarm and catches SIGALRM - where it spins the icon and checks for mouse
> input..

This should not be a problem -- just send a SIGALRM, since the ALRM
handler is set up before the socket connection is made.  The returned
response will look like a Timed Out message.
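
A minimal sketch of that idea, assuming a Unix system and present-day Perl 5
(with_alarm_abort is a hypothetical name, and the library's real handler may
be arranged differently): the ALRM handler is installed before the blocking
operation starts, so a SIGALRM from any source aborts it the same way a
timeout would.

```perl
use strict;
use warnings;

# Hypothetical wrapper, not part of libwww-perl: run a blocking operation
# with an ALRM handler in place, so either the timer or an explicitly sent
# SIGALRM aborts it.
sub with_alarm_abort {
    my ($timeout, $operation) = @_;
    my $result = eval {
        local $SIG{ALRM} = sub { die "Timed Out\n" };
        alarm($timeout);             # normal timeout
        my $r = $operation->();
        alarm(0);                    # finished in time, cancel the timer
        $r;
    };
    my $error = $@;
    alarm(0);                        # make sure no timer is left pending
    return ($result, $error);
}

# Usage: the operation signals its own process, standing in for the
# browser's abort button; sleep stands in for a blocking socket read.
my ($result, $error) = with_alarm_abort(30, sub {
    kill 'ALRM', $$;
    sleep 5;
    return 'finished';
});
print defined $result ? "got: $result\n" : "aborted: $error";
```

The caller cannot tell a user abort from a real timeout this way; both come
back as the same error string.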

> - Ability to load data to local file rather than pass back in $content
> 
> I want to limit how big (in memory) my browser will grow, and I'm worried
> that someone will pull down a 20mb mpeg via HTTP and perl has to allocate
> all that memory... and while it will page out memory that isn't used,
> there is a problem with fragmentation of memory..

Yep, that is a fatal flaw with the current architecture -- it is designed
for retrieving HTML files rather than huge stuff.  Fortunately, that is all
that is needed for most web tools.

> I'd like to be able to pass it a filename (like /tmp/$$) or filehandle 
> and have it write the data out to it... That way, for non-text files
> I can spawn a program (from .mailcap) and pass the file to that program..
> For HTML/Text files, I can read/parse it a line at a time...

Hmmm...that sounds more like a streams interface.  I think CERN's libwww
is supposed to be set up like that, but it's difficult to tell with all
their macros.  The only problem is that the interface between the client
and library becomes much more complex.
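
For illustration only, writing the body to a filehandle instead of returning
it in $content could be sketched like this (present-day Perl 5 is assumed;
save_to_handle is a hypothetical name, and an in-memory filehandle again
stands in for the socket). Memory use stays at one block regardless of the
size of the document.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper, not part of libwww-perl: copy everything from $in to
# $out one block at a time and return the number of bytes written.
sub save_to_handle {
    my ($in, $out, $blocksize) = @_;
    $blocksize ||= 8192;
    my $total = 0;
    while (1) {
        my $n = read($in, my $buf, $blocksize);
        die "read failed: $!" unless defined $n;
        last if $n == 0;             # end of input
        print {$out} $buf or die "write failed: $!";
        $total += $n;
    }
    return $total;
}

# Usage: copy a stand-in response body into a temporary file.
my $body = join '', map { "line $_\n" } 1 .. 5000;
open(my $in, '<', \$body) or die "cannot open in-memory handle: $!";
my ($out, $path) = tempfile(UNLINK => 1);
binmode($out);
my $written = save_to_handle($in, $out);
close($out) or die "close failed: $!";
close($in);
print "wrote $written bytes to $path\n";
```

The client then has to open and parse the file itself, which is where the
extra interface complexity comes in.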

It'll be interesting to see what you come up with.  If the changes are
massive, we may want to include two separate interfaces to the library
or a simple shell-like interface overlaid on top of the more complex one.

.......Roy