Re: old bug in 5b6

Gisle Aas (aas@bergen.sn.no)
Fri, 19 Jan 1996 12:25:52 +0100


> Just picked up libwww-perl5b6, and I noticed that URI::URL.pm has not
> changed since 5b5. I was wondering if either of the bugs that I
> reported (escaping/unescaping of queries and the "if $query" problem)
> have been fixed in an unreleased version? These problems prevent me
> from doing any serious work with the package, which is terribly
> unfortunate, since I am happy to provide feedback from the
> front-lines.

This bug has not been fixed properly yet.  What we need to do is to store both 
the path and the query parts of the URL in escaped form, and then unescape 
these components only when they are used. We would also need some new methods 
to access these components.
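
A minimal sketch of that idea for the path part, in plain Perl. The helper
names (escape_component, unescape_component, components_to_path,
path_to_components) are made up for illustration and are not part of URI::URL;
the set of characters left unescaped is also an assumption.

```perl
use strict;
use warnings;

# Hypothetical helpers, not part of URI::URL.

# Escape everything except a conservative set of safe characters,
# so that '/' and '%' inside a component are escaped too.
sub escape_component {
    my ($str) = @_;
    $str =~ s/([^A-Za-z0-9\-_.!~*'()])/sprintf("%%%02x", ord($1))/ge;
    return $str;
}

# Turn each %XX sequence back into the character it stands for.
sub unescape_component {
    my ($str) = @_;
    $str =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $str;
}

# Build the escaped path that would be stored internally.
sub components_to_path {
    return '/' . join('/', map { escape_component($_) } @_);
}

# Split the stored (escaped) path first, then unescape each component,
# so an escaped '/' never gets mistaken for a separator.
sub path_to_components {
    my ($path) = @_;
    $path =~ s{^/}{};
    return map { unescape_component($_) } split m{/}, $path, -1;
}

my $stored = components_to_path('a %2f', 'and a', 'slash/too');
print "$stored\n";
print join('|', path_to_components($stored)), "\n";
```

The point of the ordering is that splitting happens on the escaped form; a
component such as 'slash/too' survives the round trip because its '/' is
stored as %2f.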

If I am lucky (with getting the X11 server running on my home Linux system) 
then I probably will have some time to look at this during the next two weeks.

This is a note about the problem written by Martijn and me in a bar in Boston:


-----
Hi all,

Gisle & I have been chatting about what to do next, and what needs
to be done towards libwww-perl 5.0. One of the main stumbling blocks
is the URL handling (surprise! :-). We'd like to fix the currently
broken handling of paths and query-strings. This makes the URL module
even more complicated, but we feel it is required to allow complete
conformance to the spec (Can you check this Roy?). The proposals are
included below in pseudocode; we'd appreciate feedback on the
following questions:
- should we be fixing this (when everyone else is broken too)?
- does this break existing code?
- is there a better way (hey, we wrote this under the influence,
  what can you expect :-)?
- anyone like to contribute implementations?

Martijn & Gisle, live from Boston.

# The current (b6) interface doesn't deal with filenames with slashes,
# even though they can exist on the Mac, and can be represented in URLs:

$url = new URI::URL 'http://www.w3.org';
$url->path('/this/is/a/slash%2ftoo');
# this means file named 'slash%2ftoo'
$url->path() == '/this/is/a/slash%2ftoo';
$url->as_string() == 'http://www.w3.org/this/is/a/slash%252ftoo';


# However, to address this a preferred interface has been added,
# which treats path components as path components :-)

$url->path_components(['a %2f', 'and a', 'slash/too']);
$url->path_components == ('a %2f', 'and a', 'slash/too');
$url->path() == warn "component 'slash/too' contains '/', ".
                        "use path_components";
$url->as_string() == 'http://www.w3.org/a%20%252f/and%20a/slash%2ftoo';
$url->localpath() == warn "UNIX doesn't support '/' in filenames"; # on UNIX
$url->localpath() == 'a %2f:and a:slash/too'; # on Mac


# The issue with the query part of a URL is that there are two
# uses, which have become intertwined: the use by ISINDEX to
# send keywords, and the use by FORMs to encode fields.
# These really need separate interfaces, but for
# backwards compatibility we retain current behaviour
# (no escaping whatsoever), and add two new methods:

# The user responds to an ISINDEX query with the keywords
# "dog" and "bones"

$url->keywords(['dog', 'bones']);
$url->keywords() == ('dog', 'bones');

$url->fullpath() == '...?dog+bones';

# the user submitted a GET form (XXX what about POST?)
# of field 'foo' with value 'bar',
# and field 'perc' with value 'car = 10% lower'

$url->query_form([['foo' => 'bar'], ['perc' => 'car = 10% lower']]);

$url->fullpath() == '...?foo=bar&perc=car%20%3d%2010%25%20lower';

$url->query() == warn "Can't do this because components contain = or &";

# XXX now what about someone using both:
$url->keywords(...);
$url->query_form() == die 'Cannot mix keywords and query_form';

-- Martijn
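
To make the two query encodings in the note concrete, here is a rough sketch
in plain Perl. The function names (escape_query_part, keywords_to_query,
form_to_query) are hypothetical and not part of URI::URL, and the choice of
which characters to leave unescaped is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of URI::URL: escape everything except
# a conservative set of safe characters, so '=', '&', '+', '%' and
# spaces inside a keyword, field name or value are all escaped.
sub escape_query_part {
    my ($str) = @_;
    $str =~ s/([^A-Za-z0-9\-_.!~*'()])/sprintf("%%%02x", ord($1))/ge;
    return $str;
}

# ISINDEX style: escaped keywords joined with '+'.
sub keywords_to_query {
    return join '+', map { escape_query_part($_) } @_;
}

# FORM style: takes a list of [name, value] array refs and joins the
# escaped name=value pairs with '&'.
sub form_to_query {
    my @pairs = @_;
    return join '&', map {
        my ($name, $value) = @$_;
        escape_query_part($name) . '=' . escape_query_part($value);
    } @pairs;
}

print keywords_to_query('dog', 'bones'), "\n";
print form_to_query(['foo' => 'bar'], ['perc' => 'car = 10% lower']), "\n";
```

Because each keyword or field is escaped before the '+', '=' and '&'
separators are added, the separators stay unambiguous when the query is split
again later.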


-- 
Gisle Aas                                         <aas@a.sn.no>
Schibsted Nett AS                                 http://www.sn.no/~aas/