Re: libwww, a different approach

Roy T. Fielding (fielding@avron.ICS.UCI.EDU)
Wed, 15 Mar 1995 03:17:50 -0800


> My class hierarchy was completely different and I feel it has several 
> advantages over the hierarchy adopted by libwww-perl, but the main one 
> being that protocols are loaded on demand and nothing about any protocol
> is 'hard-coded' into any package. This made the adding of a protocol as 
> simple as writing it and placing the file in the correct place.

That is how the perl4 libwww-perl is designed as well.

> My other goal was to write as few specific packages as possible and use those
> from perl5-porters (eg IPC::Chat, Net::FTP etc); as some people will know,
> I have shown a great interest in their development.

Yep, mine too.

> Here is a description of the main WWW classes/packages I used
[...]
> WWW::Scheme
> 
>   This is the base package for all the schemes.
>   The package performs the following tasks:
> 
>    - Locates and AUTOLOAD's schemes when required
>    - Provides a method (SchemeClass) for scheme registration, all inherited
>       schemes should call this. This is similar to nTk::Widget
>    - Keeps track of all loaded schemes
>    - Provides user interface for all schemes to the WWW requests get,put etc.
>    - Provides base functions for schemes (GET,PUT etc) which all call fail
>    - Provides a fail method for schemes to inherit

Excellent -- that's (essentially) how the original works as well.
Does it depend on knowing the "base functions" names in advance?
These should really be scheme-dependent.
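
To illustrate what scheme-dependent could look like, here is a minimal
sketch in which the base class names no request methods at all and simply
asks the scheme's class whether it can handle the one requested. The
package names (Sketch::Scheme, Sketch::Scheme::file) and the request()
method are assumptions made for this example; they are not taken from
either library, and the GET shown does no real I/O.

```perl
use strict;
use warnings;

# Hypothetical base class: it names no request methods of its own.
package Sketch::Scheme;

sub new {
    my ($class, $url) = @_;
    return bless { url => $url, error => undef }, $class;
}

# Record why a request failed and return undef (empty list in list context).
sub fail {
    my ($self, $why) = @_;
    $self->{error} = $why;
    return;
}

# Dispatch by name: the scheme's class decides which methods exist.
sub request {
    my ($self, $method, @args) = @_;
    my $code = $self->can(uc $method)
        or return $self->fail("$method not supported by " . ref $self);
    return $self->$code(@args);
}

# Hypothetical scheme that supports only GET (a stand-in, no real I/O).
package Sketch::Scheme::file;
our @ISA = ('Sketch::Scheme');

sub GET {
    my ($self) = @_;
    return "contents of $self->{url}";
}

package main;

my $obj  = Sketch::Scheme::file->new('file:/etc/motd');
my $body = $obj->request('get');   # dispatched to Sketch::Scheme::file::GET
print "$body\n";
my $none = $obj->request('put');   # no PUT in this scheme, so fail() runs
print "error: $obj->{error}\n" unless defined $none;
```

With this arrangement a new scheme can add a method the base class has
never heard of, and an unsupported method fails without the base class
having to list GET, PUT and so on in advance.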

> WWW::URL
> 
>   This package is the base URL object which inherits from  WWW::Scheme
>   and adds the following
>     - URL Creation via WWW::URL->new(...);
>     - useful functions and variables to aid parsing
>     - url_encode & url_decode functions

Why inherit from WWW::Scheme?  Does it also inherit Scheme's methods?
Ah, I see that it does (below).

> WWW::Proxy
> 
>   Provides method to perform proxy-ing
>     proxy() Given a url, returns a reference to a WWW::URL object which
>             represents the server to be connected to. Could be just the URL
>             passed in

I would have expected that to be in Scheme -- the proxy check needs to
be done before autoloading.  Hmmm, I'm probably missing something here.
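
A rough sketch of doing the proxy check first, so that only the scheme of
the server actually contacted ever needs loading. The table %proxy_for,
the function server_for() and the proxy.example.com address are all
hypothetical names invented for this example.

```perl
use strict;
use warnings;

# Hypothetical proxy table: scheme name => URL of the proxy server.
my %proxy_for = (
    ftp    => 'http://proxy.example.com:8080/',
    gopher => 'http://proxy.example.com:8080/',
);

# Return the URL of the server to contact for $url: the proxy if one is
# configured for the URL's scheme, otherwise the URL itself.
sub server_for {
    my ($url) = @_;
    my ($scheme) = $url =~ /^([A-Za-z][A-Za-z0-9+.-]*):/;
    return $url unless defined $scheme;      # no scheme, nothing to look up
    $scheme = lc $scheme;                    # scheme names are case-insensitive
    return exists $proxy_for{$scheme} ? $proxy_for{$scheme} : $url;
}

my $via_proxy = server_for('ftp://ftp.ti.com/pub/');    # proxied
my $direct    = server_for('http://www.ics.uci.edu/');  # returned unchanged
print "$via_proxy\n$direct\n";
```

Here only the http code would be needed for the ftp request, because the
lookup happens before any scheme package is located or loaded.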

> WWW::http WWW::ftp WWW::mailto WWW::telnet WWW::url WWW::file
> 
>   Implementations of specific WWW protocol schemes. All inherit from WWW::URL
>   and some from WWW::Proxy
> 
> Current Inheritance tree
> ========================
> 
>                     WWW::Scheme
>                         |
>                     WWW::URL     
>                         |        
>       +----------+------+----+   
>       |          |           |   
>   WWW::file  WWW::mailto     |  WWW::Proxy
>                              |      |
>            +---------+-------+--+---+-----+
>            |         |          |         |
>        WWW::url  WWW::http  WWW::ftp  WWW::telnet  

Very interesting -- I've never considered separating proxyable from
non-proxyable.  Actually, come to think of it, there is no reason that
distinction should be made -- a local proxy is capable of proxying both
file and mailto (the latter as a request->returned form->POST).

> The interface required by each scheme is
> 
>  ->parse($url)  parse the URL into segments
>  ->stringify    reconstruct the URL
>  ->GET($object, ...)
>  ->PUT($object, ...)
>    etc          The methods are called on the server object and the requested
>                 object passed as the first parameter. They will be the same if
>                 proxy-ing is not being performed.
> 
> How it works
> ============
> 
> A new URL is created by
> 
>   WWW::URL->new('ftp://ftp.ti.com/pub/');
> 
> WWW::URL->new extracts the scheme from this url and calls the WWW::Scheme
> method ->scheme to set the scheme. This setting involves locating the package
> WWW::scheme (eg WWW::http), blessing the object into this package and calling
> the method parse.
> 
> Each package is responsible for parsing its own url as there is no generic
> format for a url (eg mailto:bodg@tiuk.ti.com and http://....). Each package,
> for the same reasons, is also responsible for stringifying the url.

Hmmmmm...I'll contest that.  It is okay (in fact, necessary) to have a
generic URL parser -- the schemes just have to know how to stringify the
appropriate parts prior to a request.  BTW, where do you put the resolver
for relative -> absolute URLs?
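
As an illustration of the generic-parser idea, the sketch below splits any
URL into scheme, authority, path, query and fragment with one pattern, and
leaves it to a scheme-specific helper to put together only the parts it
needs. The function names parse_url() and http_request_target() are
assumptions made for this example, not part of any existing library.

```perl
use strict;
use warnings;

# Split any URL into generic parts; components that are absent are undef
# (the path is always defined, but may be the empty string).
sub parse_url {
    my ($url) = @_;
    my ($scheme, $authority, $path, $query, $fragment) = $url =~
        m{^(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:\#(.*))?\z}s;
    return {
        scheme    => (defined $scheme ? lc $scheme : undef),
        authority => $authority,
        path      => $path,
        query     => $query,
        fragment  => $fragment,
    };
}

# Hypothetical scheme-specific step: the part an HTTP request line needs.
sub http_request_target {
    my ($parts) = @_;
    my $target = length($parts->{path}) ? $parts->{path} : '/';
    $target .= "?$parts->{query}" if defined $parts->{query};
    return $target;
}

my $mail = parse_url('mailto:bodg@tiuk.ti.com');
my $ftp  = parse_url('ftp://ftp.ti.com/pub/');
print "$mail->{scheme} path=$mail->{path}\n";
print "$ftp->{scheme} host=$ftp->{authority} path=$ftp->{path}\n";
```

The same parser handles both the mailto and the ftp form; only the meaning
given to each part differs by scheme.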

> The user then calls methods on this object to retrieve/send the object.
> 
> The user interface functions (get,put etc) call the method proxy, which returns
> a WWW::URL object to represent the server to connect to and scheme to use.
> This could be the same URL as the object. The appropriate method (GET,PUT etc)
> is then called on the server object, passing the object as the first argument.
> 
> The main advantage I found is that if an unknown scheme is specified, it
> is blessed into the package WWW::url. This package does not define any methods
> except parse and stringify, but it does allow any currently unknown scheme
> to be proxied via a different scheme (eg http)

That is what I would expect Scheme to do.

> The methods GET, PUT etc create a MIME message and set the content and response
> via methods ->content and ->status. They then return the contents or undef.

A MIME message?  What does this mean for protocols that don't use MIME
messages?

> If people are interested in seeing what I have done then I will put it on a 
> ftp site

Please do -- the more examples, the better.  I could also put it in the
libwww-perl contrib directory if you are short on space/accessibility.


......Roy Fielding   ICS Grad Student, University of California, Irvine  USA
                                     <fielding@ics.uci.edu>
                     <URL:http://www.ics.uci.edu/dir/grad/Software/fielding>