URL's (was Re: Perl 5 LWP Design)

Martijn Koster (m.koster@nexor.co.uk)
Mon, 06 Mar 1995 08:47:01 +0000


> Are you sure that URL should be an object?  That seems like an awful
> lot of overhead for a piece of data,

I don't quite know what the overhead is, and given that Larry hasn't
given much thought to optimisation in Perl 5 I'm not sure it's
something we should worry about.

> and the behaviour is not really object-like.  Wouldn't a data
> structure be more appropriate?  I know it would be in C++, but am
> not sure about perl5.

In Perl 5 there appears to be little difference between the two:
an object is just a data structure blessed into a class.

The URL "object" is simply an array, and its "methods" are effectively
mnemonics for individual array locations. The advantage of having it
as an object is that you could add syntax checking to the methods, and
give further object specific methods without having them separate from
the object.
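
To make that concrete, here is a rough sketch of such an array-based
object. The package name Sketch::ArrayURL, the slot layout, and the
deliberately simple http-style pattern are assumptions made up for
illustration; this is not the actual library code.

```perl
use strict;
use warnings;

package Sketch::ArrayURL;    # hypothetical name, for illustration only

# Array slots: each accessor below is a mnemonic for one index.
use constant { SCHEME => 0, HOST => 1, PORT => 2, PATH => 3 };

sub new {
    my ($class, $string) = @_;
    # Simplified pattern: scheme://host[:port][/path], nothing more.
    my ($scheme, $host, $port, $path) =
        $string =~ m{^([a-z][a-z0-9+.-]*)://([^/:]+)(?::(\d+))?(/.*)?$}i
        or die "cannot parse URL: $string\n";
    my $self = [ lc $scheme, lc $host, $port, defined $path ? $path : '/' ];
    return bless $self, $class;    # the array is now an object
}

# Plain read accessors.
sub scheme { $_[0][SCHEME] }
sub host   { $_[0][HOST] }
sub path   { $_[0][PATH] }

# Accessor with syntax checking: rejects anything that is not a valid port.
sub port {
    my $self = shift;
    if (@_) {
        my $port = shift;
        die "bad port: $port\n"
            unless $port =~ /^\d+$/ && $port > 0 && $port < 65536;
        $self->[PORT] = $port;
    }
    return $self->[PORT];
}

package main;

my $url = Sketch::ArrayURL->new('http://web/index.html');
$url->port(8080);
print $url->scheme, ' ', $url->host, ' ', $url->port, ' ', $url->path, "\n";
```

A bare array would let a caller store any value in the port slot;
the method is where the check lives.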

I've been wondering about URLs and proxies actually. I don't like
$url->proxify really, because routing a specific URL through a proxy
does not actually give a different location, so after ->proxify, is it
really still a URL? It becomes a problem when creating the request: for

	http://web/index.html

you want to create

	GET /index.html HTTP/1.0

whereas for

	http://cache/http://web/index.html

you want to create

	GET http://web/index.html HTTP/1.0

i.e. without the leading slash. But how do you know when to include
or omit the leading slash?

You could have $url->proxify store a variable IveBeenProxied, but I
wondered about actually using the URL class, and subclassing a
specific ProxyURL from it:

	$url = new WWW::URL 'http://web/index.html';
	$proxy = new WWW::ProxyURL $url, 'cache.com', '8001';

or
	$proxy = new WWW::URL 'http://cache:8001';
	$url = new WWW::URL 'http://web/index.html';

	if ($proxy) {
		$url = new WWW::ProxyURL $proxy, $url;
	}

How's that for overhead :-) I actually think it may not be so bad,
especially as in the latter case you only parse the proxy URL once.
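
A minimal sketch of how the subclass could settle the leading-slash
question, following the second form above. The package names
Sketch::URL and Sketch::ProxyURL, the methods connect_to and
request_line, and the http-only parsing are all assumptions for
illustration, not a settled interface.

```perl
use strict;
use warnings;

package Sketch::URL;    # hypothetical stand-in for the URL class

sub new {
    my ($class, $string) = @_;
    # Simplified: http URLs only, host[:port][/path].
    my ($host, $port, $path) =
        $string =~ m{^http://([^/:]+)(?::(\d+))?(/.*)?$}i
        or die "cannot parse URL: $string\n";
    return bless {
        host => $host,
        port => defined $port ? $port : 80,
        path => defined $path ? $path : '/',
        text => $string,
    }, $class;
}

sub host      { $_[0]{host} }
sub port      { $_[0]{port} }
sub as_string { $_[0]{text} }

# Direct request: talk to the origin server and send only the path.
sub connect_to   { my $self = shift; return ($self->host, $self->port) }
sub request_line { my $self = shift; return "GET $self->{path} HTTP/1.0" }

package Sketch::ProxyURL;    # hypothetical subclass
our @ISA = ('Sketch::URL');

# Holds two already-parsed URLs, so the proxy is parsed only once.
sub new {
    my ($class, $proxy, $target) = @_;
    return bless { proxy => $proxy, target => $target }, $class;
}

# The location is still that of the target...
sub host      { $_[0]{target}->host }
sub port      { $_[0]{target}->port }
sub as_string { $_[0]{target}->as_string }

# ...but the connection goes to the proxy, with the full URL in the request.
sub connect_to   { $_[0]{proxy}->connect_to }
sub request_line { 'GET ' . $_[0]{target}->as_string . ' HTTP/1.0' }

package main;

my $proxy = Sketch::URL->new('http://cache:8001');
my $url   = Sketch::URL->new('http://web/index.html');
$url = Sketch::ProxyURL->new($proxy, $url) if $proxy;

print join(':', $url->connect_to), "\n";
print $url->request_line, "\n";
```

The code creating the request never has to ask whether a URL has been
proxied; it calls the same two methods either way.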

Alternatively you could do it completely outside the URL class, but
that breaks quite a lot of the cleanliness of the internal interface.

-- Martijn
__________
Internet: m.koster@nexor.co.uk
X-400: C=GB; A= ; P=Nexor; O=Nexor; S=koster; I=M
X-500: c=GB@o=NEXOR Ltd@cn=Martijn Koster
WWW: http://web.nexor.co.uk/mak/mak.html