Re: Please review new Perl 5 Module List
Martijn Koster (m.koster@nexor.co.uk)
Fri, 24 Feb 1995 18:17:26 +0000
And all of a sudden things move fast :-)
> I don't really like the comments to fill more space than the actual
> code. With proper documentation (pods included in the module files)
> the need for that many comments should go away, but I take your point.
> Some of the comments will come back.
I do agree attributions should be in a README, but design
comments should be in the code.
Also consider that people will use this code as a basis to learn
Perl5/Web programming...
> > One thing I dislike is the exporting of so much into the user name
> > space (e.g. the RC_* constants). The Perl5 Module List suggests using
> > @EXPORT_OK instead of @EXPORT; I'd even quite happily use complete
> > name space specifications (WWW::WC_OK) myself and not export
> > anything. In my library I kept the error messages in an array, which
> > is probably slower, but keeps the names more together (but I'm not
> > fussed).
>
> I agree, but why is WC_* better than RC_*? What does WC stand for?
> This might be ugly if the module eventually will be called WWW::WWW.
That's my typo, sorry; I meant WWW::RC_OK instead of having RC_OK
imported into my namespace.    ^^^^^
> > I do miss some of the higher level error management functions I have,
> > for example to translate codes to mnemonics (and error messages).
>
> Do you mean callbacks or some functions to do:
>
> $WWW::RespMessage{WWW::RC_SOMETHING}
>
> stuff.
I need to lookup rc_code->mnemonic. Say I do a get, and receive 600,
then I need to be able to translate that to RC_WHATEVER, if these
mnemonics are used as indices into hashes. This might be another
argument not to use subs to define the constants.
I also want to be able to do WWW::isRedirect(600) etc to group
errors without having to check for all possible values.
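A minimal sketch of what I mean, assuming the codes live in a plain hash rather than in constant subs. The table contents and the names rc_name and is_redirect are hypothetical, chosen only for illustration:

```perl
use strict;
use warnings;

# Hypothetical table: numeric code => mnemonic. Only three entries shown.
my %RC_NAME = (
    200 => 'RC_OK',
    302 => 'RC_MOVED_TEMPORARILY',
    404 => 'RC_NOT_FOUND',
);

# Reverse table, so a mnemonic can be mapped back to its numeric code.
my %RC_CODE = reverse %RC_NAME;

# Translate a numeric code to its mnemonic; returns undef if unknown.
sub rc_name {
    my ($code) = @_;
    return $RC_NAME{$code};
}

# Group codes by class instead of checking every possible value.
sub is_redirect {
    my ($code) = @_;
    return ($code >= 300 && $code < 400) ? 1 : 0;
}

my $name     = rc_name(302);       # 'RC_MOVED_TEMPORARILY'
my $redirect = is_redirect(302);   # 1
my $unknown  = rc_name(600);       # undef: not in the table
```

Because the table is ordinary data, the reverse lookup costs one line, which is harder to get when each constant is its own sub.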
> > I'm not sure about the 'tie' interface to MIME::Header and URL
> > though.
>
> I am not sure either. It was just a clever thought.
:-) It's tempting to use all new Perl features under the sun. Do you
agree a simpler and more explicit OO architecture would be easier to
learn for new users?
Especially the URL seems ill-suited for a hash; I expect that in a
hash I can store whatever keys I want (eg a tied GDBM file), whereas
with URL's we are talking about a small set of properties of an
object.
> > With
> > the URL I'd prefer using $url->scheme to $url->FETCH(SCHEME). Maybe
> > I've been programming C++ too long, but I think an OO approach looks
> > more obvious than opaque hashes, and I don't think it buys you
> > anything. This is a fairly major point interface-wise :-)
>
> But then we would need two methods for each attribute:
>
> $scheme = $url->get_scheme;
> and
> $url->set_scheme($scheme);
Not really:
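    # Combined getter/setter: always returns the previous value,
    # and stores a new one only if an argument was given.
    sub scheme {
        my($this, $scheme) = @_;
        my($oldscheme) = $this->{'scheme'};
        $this->{'scheme'} = $scheme if (defined $scheme);
        return($oldscheme);
    }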
Apart from looking better when used, these explicit routines might
make it easier to add syntax checking, or disallow certain changes.
For example, one shouldn't go around changing just the scheme part of
a url; so you might want to disallow that.
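To illustrate the syntax-checking point, here is a sketch of the same accessor with a check added. The class name My::URL, the constructor, and the pattern used to validate the scheme are my assumptions for the example, not part of either library:

```perl
use strict;
use warnings;

package My::URL;    # hypothetical class name

sub new {
    my ($class, %args) = @_;
    return bless { scheme => $args{scheme} }, $class;
}

# Combined getter/setter: returns the previous value, and rejects
# anything that does not look like a scheme name.
sub scheme {
    my ($this, $scheme) = @_;
    my $old = $this->{scheme};
    if (defined $scheme) {
        die "invalid scheme '$scheme'\n"
            unless $scheme =~ /^[A-Za-z][A-Za-z0-9+.-]*$/;
        $this->{scheme} = lc $scheme;
    }
    return $old;
}

package main;

my $url = My::URL->new(scheme => 'http');
my $old = $url->scheme('FTP');      # $old is 'http'; 'ftp' is now stored
my $ok  = eval { $url->scheme('not a scheme'); 1 };   # dies, $ok is undef
```

Disallowing the change altogether would just mean dying whenever an argument is passed.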
> I am thinking about change in the WWW::request interface, because I am
> not sure it is a good idea to use the same parameter both for passing
> and returning arguments. Using the same parameter makes it more
> difficult if you want to pass the same headers or content to
> subsequent requests. The downside is that the parameter list gets
> longer. I suggest to change the interface to this:
>
>     WWW::request $method, $url, $headIn, $contentIn,
>                  $headOut, $contentOut, $timeout;
>
> where $headIn and $headOut are references to MIME::Header objects or
> undef if you don't want them.
My gut feeling is to prefer the simpler looking one, but I don't mind
much either way.
> Callbacks are one possibility. Getting or returning the content in
> files is another one.
With a callback you could always implement a file based one, although
I don't see why all three different interfaces couldn't exist.
[He said before reading on :-)]
> For the $content parameter I would like 3 opportunities to exist:
>
> 1) pass a string
> 2) pass a file name (content found in (or written to) this file)
> 3) pass a callback routine
>
> These can be differentiated by these rules:
>
> - if it is a reference to a scalar then the scalar is treated as a
> string. (ref($content) eq 'SCALAR')
>
> - if it is a scalar (not containing a newline) then it is treated as
> a filename. (!ref($content))
>
> - if it is a reference to a sub then it is treated as a callback
> (ref($content) eq 'CODE')
>
> What do you think about this??
Like it. Leaves a couple of issues: what about failing file writes?
Should that return a new error code (probably not good), should we
have a shadow error code (v. ugly), or die (can always catch it)?
What interface is required for the callback (ie what do we pass it)? I
currently pass &$callback($content, $nread);, but you might want to
pass more... The library might want to supply headers and result
codes, and the user might need some specific data. Maybe you could
pass a magic cookie along so people can always pass their own objects
to store state data the callback might need:
    &request(... $code, $magic);

    sub request {
        ...
        &$code($content, $nread, $magic, $outheaderref, ...);
    }
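A rough sketch of how the three content styles and the magic cookie could fit together, assuming failed file writes simply die. The helper name deliver and its argument order are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: hand one chunk of response data to whatever
# the caller supplied as $content.
sub deliver {
    my ($content, $chunk, $magic) = @_;
    my $type = ref $content;
    if ($type eq 'SCALAR') {          # reference to a string: append to it
        $$content .= $chunk;
    }
    elsif ($type eq 'CODE') {         # callback: pass data, length, cookie
        $content->($chunk, length($chunk), $magic);
    }
    elsif (!$type) {                  # plain scalar: treat it as a file name
        open(my $fh, '>>', $content) or die "cannot open $content: $!\n";
        print {$fh} $chunk           or die "cannot write $content: $!\n";
        close($fh)                   or die "cannot close $content: $!\n";
    }
    else {
        die "unsupported content argument ($type)\n";
    }
    return;
}

# String mode: chunks accumulate in the caller's buffer.
my $buffer = '';
deliver(\$buffer, "hello ");
deliver(\$buffer, "world");

# Callback mode: the cookie carries the caller's own state.
my %state = (bytes => 0);
deliver(sub { my ($data, $n, $cookie) = @_; $cookie->{bytes} += $n },
        "12345", \%state);
```

A caller who wants to survive a failed write wraps the call in eval and inspects $@, so no extra error code is needed.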
More fundamental issue: I split &request into &request/&response.
&request returns with the headers and result code, &response then
reads the rest. This is nice in combination with file/callback
reading; I can read a file into memory if the Content-Length is small,
but read it in chunks writing to a file if it's large. Or I can pass
a "unzipper" callback if the encoding is gzip etc. This could save
an extra copy, and help pipelining encoding/translation operations.
Hello future :-)
Another thought is about proxies. In my library the request/response
functions are offered by an HTRequest object; this allows you to
cleanly set defaults:
    my($htrequest) = new WWW::HTTP;
    $htrequest->setheader('User-Agent', 'w3p/0.1 libwww-perl5/0.1');
    $htrequest->proxy('ftp', 'http://web.nexor.co.uk:8001/');
    my($response) = $htrequest->request('GET', $url);
etc. Which is nicer than setting variables in the library, or using %ENV.
Again, you can do some sanity checking.
This means you could have multiple objects, each with their own
defaults. Which brings up another point: the FS socket in the HTTP.pm
prevents multiple simultaneous conversations. With a per HTRequest
indirect file handle this is no problem.
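As a sketch of the per-object defaults, assuming each object keeps its settings in its own hash. The class name My::Requester and its internals are hypothetical; only the setheader and proxy method names come from the example above:

```perl
use strict;
use warnings;

package My::Requester;    # hypothetical class name

sub new {
    my ($class) = @_;
    return bless { headers => {}, proxy => {} }, $class;
}

# Store a default header for every request made through this object.
sub setheader {
    my ($this, $name, $value) = @_;
    $this->{headers}{$name} = $value;
    return;
}

# Get or set the proxy for one scheme, with a simple sanity check.
sub proxy {
    my ($this, $scheme, $proxy_url) = @_;
    if (defined $proxy_url) {
        die "proxy must be an http URL\n"
            unless $proxy_url =~ m{^http://}i;
        $this->{proxy}{lc $scheme} = $proxy_url;
    }
    return $this->{proxy}{lc $scheme};
}

package main;

my $direct  = My::Requester->new;
my $proxied = My::Requester->new;
$proxied->setheader('User-Agent', 'w3p/0.1 libwww-perl5/0.1');
$proxied->proxy('ftp', 'http://web.nexor.co.uk:8001/');
# $direct is unaffected: $direct->proxy('ftp') is still undef.
```

A per-object file handle would live in the same hash, next to the headers and proxies.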
Previously I thought this might not fit well with the different
protocols, but thinking about it, it's fairly straightforward (even if it
does introduce a dependency loop): In your structure this would need
to be "my($htrequest) = new WWW; ", and means that WWW.pm will have to
pass a reference to itself to the individual request implementation
for the different protocols:
    eval {
        $rc = &{$request{$scheme}}($this, $method, $url, $headers,
                                   $content, $timeout);
    };
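Fleshed out a little, the dispatch might look like the sketch below. The handler bodies, the dispatch function, and the way the scheme is extracted from the URL are assumptions for illustration only:

```perl
use strict;
use warnings;

# Hypothetical per-scheme handlers; each receives the calling object first,
# so it can read that object's defaults.
my %request = (
    http => sub {
        my ($this, $method, $url) = @_;
        return "$this->{agent}: $method $url via http";
    },
    ftp => sub { die "ftp is not implemented in this sketch\n" },
);

# Returns (result, undef) on success or (undef, error text) on failure.
sub dispatch {
    my ($this, $method, $url) = @_;
    my ($scheme) = $url =~ m{^([A-Za-z][A-Za-z0-9+.-]*):}
        or return (undef, "no scheme in '$url'");
    my $handler = $request{lc $scheme}
        or return (undef, "unsupported scheme '$scheme'");
    my $rc = eval { $handler->($this, $method, $url) };
    return defined $rc ? ($rc, undef) : (undef, $@);
}

my $this = { agent => 'w3p/0.1' };
my ($rc,  $err)  = dispatch($this, 'GET', 'http://web.nexor.co.uk/');
my ($rc2, $err2) = dispatch($this, 'GET', 'ftp://example.org/file');
```

The eval keeps a die inside one protocol handler from taking down the caller, and the error text comes back in $@.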
I think it's worth the effort if it allows us to create an interface
that will cater for future needs.
Comments?
-- Martijn
__________
Internet: m.koster@nexor.co.uk
X-400: C=GB; A= ; P=Nexor; O=Nexor; S=koster; I=M
X-500: c=GB@o=NEXOR Ltd@cn=Martijn Koster
WWW: http://web.nexor.co.uk/mak/mak.html