Re: URL's (was Re: Perl 5 LWP Design)

Martijn Koster (m.koster@nexor.co.uk)
Wed, 08 Mar 1995 08:44:35 +0000


> > I would argue that 'Proxiness' is almost completely outside the
> > URL class - it should be contained ONLY in a socket/communication/
> > connection class.
>
> Or an HTTP::Client class, I guess.
 
> > The URL should not care whether it is a proxy or not. If proxies
> > are in use, then the socket connections should all just go to the
> > proxy host. The only difference for the URL is whether to include
> > the hostname when passed to the server - and that should depend on
> > whether this is a 'proxy' session.
> > 
> That all sounds very reasonable.

Quite. We're getting to the crunch of it now...

If you were to keep the proxy info separate, the request() method
should be passed a "this should be proxied to such-and-such" argument
(probably the URL of the proxy).  The HTTP implementation should then
use $url->str() or $url->path() in the HTTP request depending on
whether it's a proxy or a normal request. The implementations of all
other protocols should then complain if they are asked to proxy,
because they can't.  This has three costs: the request() method needs
an extra parameter, which may need type checking; the HTTP
implementation needs extra logic; and extra error checking code has
to be replicated in all other protocol modules. This is how I had it
at first, and it worked fine.
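
To make the first scheme concrete, here is a minimal sketch of it, not
the actual code. The Demo::* package names, the request_line() helper
and the hash-based URL object are assumptions made up for illustration;
only str() and path() are names taken from the description above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the URL class. Only str() and path() are
# names from the discussion; the constructor and fields are assumed.
package Demo::URL;
sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}
sub path { $_[0]{path} }
sub str {
    my $self = shift;
    return "$self->{scheme}://$self->{host}$self->{path}";
}

# HTTP knows about proxying: it picks the full URL or just the path.
package Demo::HTTP;
sub request_line {
    my ($class, $url, $proxy) = @_;    # $proxy is the optional proxy URL
    my $target = defined $proxy ? $url->str : $url->path;
    return "GET $target HTTP/1.0";
}

# Every other protocol has to repeat this check, since it cannot proxy.
package Demo::Gopher;
sub request_line {
    my ($class, $url, $proxy) = @_;
    die "gopher: cannot proxy\n" if defined $proxy;
    return $url->path;
}

package main;
my $url   = Demo::URL->new(scheme => 'http', host => 'blah',
                           path   => '/welcome.html');
my $proxy = Demo::URL->new(scheme => 'http', host => 'proxy.example',
                           path   => '/');
print Demo::HTTP->request_line($url), "\n";          # direct request
print Demo::HTTP->request_line($url, $proxy), "\n";  # proxied request
```

The "cannot proxy" check in Demo::Gopher is the part that would have to
be copied into each non-HTTP protocol module.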

In an effort to concentrate Proxy handling I tried out an alternative:
have a WWW::URL::Proxy that has a different constructor and overrides
fullpath(): where a WWW::URL returns '/welcome.html', the
WWW::URL::Proxy returns 'http://blah/welcome.html'.

This class is only used internally by the WWW::Request, which passes a
normal WWW::URL if there is no need to proxy, or a WWW::URL::Proxy if
there is. The HTTP implementation just does $url->fullpath(), and the
right thing happens. Other implementations need not do a thing.
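
A minimal sketch of this alternative, again not the actual code. The
Alt::* package names, url_for(), request_line() and the proxy_host
field are assumptions made up for illustration; only fullpath() and
the idea of a Proxy subclass with its own constructor come from the
description above.

```perl
use strict;
use warnings;

# Hypothetical plain URL: fullpath() is just the path.
package Alt::URL;
sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}
sub fullpath { $_[0]{path} }

# Hypothetical proxy variant: a different constructor, and fullpath()
# overridden to return the whole URL.
package Alt::URL::Proxy;
our @ISA = ('Alt::URL');
sub new {
    my ($class, $target, $proxy_host) = @_;
    # Copy the target URL's fields and remember where to connect.
    return bless { %$target, proxy_host => $proxy_host }, $class;
}
sub fullpath {
    my $self = shift;
    return "$self->{scheme}://$self->{host}$self->{path}";
}

# The request layer is the only place that decides whether to proxy.
package Alt::Request;
sub url_for {
    my ($class, $url, $proxy_host) = @_;
    return defined $proxy_host
        ? Alt::URL::Proxy->new($url, $proxy_host)
        : $url;
}

# HTTP no longer knows about proxies; it just asks for fullpath().
package Alt::HTTP;
sub request_line {
    my ($class, $url) = @_;
    return 'GET ' . $url->fullpath . ' HTTP/1.0';
}

package main;
my $plain = Alt::URL->new(scheme => 'http', host => 'blah',
                          path   => '/welcome.html');
print Alt::HTTP->request_line(Alt::Request->url_for($plain)), "\n";
print Alt::HTTP->request_line(
    Alt::Request->url_for($plain, 'proxy.example')), "\n";
```

Here no protocol module carries any proxy check; the cost is that an
Alt::URL::Proxy object's fullpath() is no longer the path part of a
URL.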

This also works, looks a lot simpler, and concentrates the Proxy code
into the WWW::Request class. Even philosophically you could argue in
favour of this: because WWW proxying is implemented by using the URL
part of an HTTP request, implementing it by using a specialised form
of a URL matches well. At the same time you could argue that the way
WWW proxying is implemented isn't a pure and abstract view, and that a
URL should only ever be a valid URL as in the RFCs, and should never
be derived from.

I'm not 100% convinced either way, but have a nagging feeling the
URL::Proxy method is a neat hack which might bite back in future, and
am leaning towards reverting to the original scheme. However, I could
easily be convinced either way by a vote or good reasons :-) What do
you think?

If anyone wants to have a peek at these classes, live development (not
a release, we're still talking architecture here) is sometimes going
on at http://web.nexor.co.uk/users/mak/doc/libwww-perl5/. I tend to
update dist.tar.gz when I have something that sort of works.

-- Martijn
__________
Internet: m.koster@nexor.co.uk
X-400: C=GB; A= ; P=Nexor; O=Nexor; S=koster; I=M
X-500: c=GB@o=NEXOR Ltd@cn=Martijn Koster
Telephone: +44 115 9 520576
WWW: http://web.nexor.co.uk/users/mak/mak.html