Re: Further enhancements to the URL module
Roy T. Fielding (fielding@avron.ICS.UCI.EDU)
Tue, 21 Mar 1995 12:25:16 -0800
>> I've taken the liberty of calling it URI::URL :-)
>
> Whatever.
Okay, so long as you don't pre-define names for URI::URN (which doesn't
exist except as just another URL) and URI::URC (which has never been defined).
>> Can we freeze this soon ?
>
> Ehr, I still think the escaping is insufficient, as it doesn't specify
> class-specific reserved character sets. Roy wrote:
>
> ] Each component of a URL has separate requirements regarding
> ] what must be escaped, and those requirements are also
> ] dependent on the URL scheme.
>
> and rfc1738 specifies different sets of reserved chars.
>
> For example, I think the following test should succeed:
>
> $url = new URI::URL 'file://h/test?ing';
> $url->_expect('path', 'test?ing');
>
> because:
>
> fileurl = "file://" [ host | "localhost" ] "/" fpath
> fpath = fsegment *[ "/" fsegment ]
> fsegment = *[ uchar | "?" | ":" | "@" | "&" | "=" ]
Ummm, actually, I believe that to be an error in the RFC. In fact,
I included a NOTE on that in the relative URL spec. However, the principle
is correct in that a path needs to be escaped one fsegment at a time.
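
As a rough illustration of escaping one segment at a time, here is a minimal
sketch. The helpers escape_segment and escape_path are hypothetical names,
not part of URI::URL. The sketch assumes byte-string input and an allowed set
of the RFC 1738 uchar characters plus ":", "@", "&" and "=", with "?" left
out on the view that its presence in fsegment is an error.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of URI::URL: escape a single path segment.
# Assumption: alphanumerics, the RFC 1738 safe/extra marks, and : @ & =
# pass through; every other byte (including "?" and "/") becomes %XX.
sub escape_segment {
    my ($segment) = @_;
    $segment =~ s/([^A-Za-z0-9\$\-_.+!*'(),:\@&=])/sprintf('%%%02X', ord($1))/ge;
    return $segment;
}

# Escape each segment on its own, then join with a literal "/".
# A "/" inside a segment is data, so it is escaped rather than
# being mistaken for a separator.
sub escape_path {
    my (@segments) = @_;
    return join '/', map { escape_segment($_) } @segments;
}

print escape_path('dir', 'test?ing'), "\n";   # dir/test%3Fing
print escape_path('a/b', 'c'), "\n";          # a%2Fb/c
```

Escaping the joined path in one pass could not tell a separator "/" from a
"/" that belongs to a segment, which is why the sketch works on a list of
segments.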
> The RFC also says:
>
> | On the other hand, characters that are not required to be encoded
> | (including alphanumerics) may be encoded within the scheme-specific
> | part of a URL, as long as they are not being used for a reserved
> | purpose.
>
> Which leaves me to wonder if the following test should not succeed too:
>
> $url = new URI::URL 'file://h/';
> $url->path('question?mark');
> $url->_expect('str', 'file://h/question?mark');
>
> instead of
>
> $url->_expect('str', 'file://h/question%3Fmark');
>
> Roy, can you tell me if I'm right or wrong here?
Well, that depends on what "path" means. If you know it does not
include the query information, and "?" is not allowed in that URL's
path segments, then it is appropriate to escape it.
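
To make that concrete, here is a small sketch under the assumption that
"path" never carries the query. The names build_url and escape_segment are
hypothetical, not the URI::URL interface, and the query is assumed to arrive
already escaped.

```perl
use strict;
use warnings;

# Hypothetical helper: escape one path segment. Assumption: "?" is not
# allowed in a segment, so it is always turned into %3F.
sub escape_segment {
    my ($segment) = @_;
    $segment =~ s/([^A-Za-z0-9\$\-_.+!*'(),:\@&=])/sprintf('%%%02X', ord($1))/ge;
    return $segment;
}

# Hypothetical builder: the path and the query are separate components,
# so a "?" found inside the path is data and gets escaped, while the "?"
# that introduces the query is added literally.
sub build_url {
    my (%part) = @_;
    my $path = join '/', map { escape_segment($_) } split m{/}, $part{path}, -1;
    my $url  = "$part{scheme}://$part{host}/$path";
    $url .= '?' . $part{query} if defined $part{query};
    return $url;
}

print build_url(scheme => 'file', host => 'h', path => 'question?mark'), "\n";
# file://h/question%3Fmark
print build_url(scheme => 'http', host => 'h', path => 'search', query => 'q=mark'), "\n";
# http://h/search?q=mark
```

If "path" were instead defined to include the query, the first call would
have to leave the "?" alone, so the answer follows from that definition.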
.......Roy