Re: Bug report: using HTTP::Request::Common to send a POST request to a CGI.pm-based script
Gisle Aas (gisle@aas.no)
20 Jun 1999 01:55:03 +0200
"Schwartz, Todd" <todd.schwartz@intel.com> writes:
> CGI module version: 2.46
> LWP version: 5.42
> URI::URL version: 5.01
>
> Problem description: I am using HTTP::Request::Common to build a POST
> request with name/value pairs. One of the value strings is a text block
> that contains, among other things, a semicolon. When the request is sent to
> a CGI.pm-based script, the value string containing the semicolon is
> truncated at the position of the semicolon. I have attached two short
> scripts that duplicate this problem (see below).
>
> The cause: CGI treats the semicolon as a name/value pair delimiter (see
> parse_params in CGI), but when the request is built, a semicolon appearing
> in one of the value fields does not get escaped (see query_form in package
> URI::_query). In my opinion, RFC 2396 requires semicolon characters
> appearing in message content, including URI-encoded name/value pairs, to
> be escaped.
I guess you say this based on the fact that ";" is a "reserved"
character, but as I read RFC 2396 that does not necessarily mean it
has to be escaped inside HTTP query components. Quote:
    Characters in the "reserved" set are not reserved in all contexts.
    The set of characters actually reserved within any given URI
    component is defined by that component. In general, a character is
    reserved if the semantics of the URI changes if the character is
    replaced with its escaped US-ASCII encoding.
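To make the "semantics changes" test concrete for this report, here is a
minimal sketch in Python. It is not CGI.pm's parse_params code; parse_form
is a hypothetical helper that splits a form body on a configurable set of
delimiter characters, and the two sample bodies are my own rendering of
the TEXT value from the report. It suggests that ";" only becomes
significant once the receiving parser treats it as a pair delimiter.

```python
import re
from urllib.parse import unquote_plus


def parse_form(body, delimiters="&"):
    """Split a urlencoded body into (name, value) pairs.

    Hypothetical helper for illustration only: `delimiters` is the set of
    characters the parser treats as name/value pair separators.
    """
    pairs = []
    splitter = "[" + re.escape(delimiters) + "]"
    for chunk in re.split(splitter, body):
        if not chunk:
            continue
        name, _, value = chunk.partition("=")
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


# The same value sent with a raw ";" and with ";" escaped as %3B.
body_raw = "TEXT=This+is+one+clause;+this+is+another."
body_escaped = "TEXT=This+is+one+clause%3B+this+is+another."

# "&" only: the raw semicolon is ordinary data.
ampersand_only = parse_form(body_raw, "&")

# "&" and ";": the raw semicolon ends the value early.
both_raw = parse_form(body_raw, "&;")

# "&" and ";" with the semicolon escaped: the value survives intact.
both_escaped = parse_form(body_escaped, "&;")
```

With "&" as the only delimiter, the raw and escaped bodies decode to the
same value, so by the RFC wording ";" is not reserved there. Once ";" is
also a delimiter, the two bodies decode differently, and the sender has to
escape it.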
I have never seen anything looking like an official specification on
how an 'application/x-www-form-urlencoded' string is to be encoded.
Does anybody have a reference to something?
Some experimenting with my current Netscape seems to indicate that it
escapes every character matching /[^\w.*-]/. I would not object to
changing URI.pm if people rely on this behavior.
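As a rough illustration of that rule, here is a small Python sketch. It is
not URI.pm code; form_escape is a hypothetical name, \w is taken to mean
ASCII letters, digits and underscore, and encoding a space as "+" is my
assumption about the usual form-encoding convention rather than something
I verified against Netscape.

```python
import re

# Bytes outside [A-Za-z0-9_.*-] get escaped (ASCII reading of /[^\w.*-]/).
_UNSAFE = re.compile(rb"[^A-Za-z0-9_.*-]")


def _escape_byte(match):
    byte = match.group()
    if byte == b" ":
        # Assumed convention: form encoding writes a space as "+".
        return b"+"
    return b"%%%02X" % byte[0]


def form_escape(text):
    """Escape a form field value; hypothetical helper for illustration."""
    return _UNSAFE.sub(_escape_byte, text.encode("utf-8")).decode("ascii")


encoded = form_escape("This is one clause; this is another.")
print("TEXT=" + encoded)
```

Under this rule the semicolon goes out as %3B, so a receiver that splits on
";" as well as "&" would no longer cut the value short.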
Regards,
Gisle
> I am not sure whether the semicolon is really a valid
> name/value pair delimiter - this is why I included Lincoln in this posting.
>
> This problem does not occur with CGI version 2.42, which uses only the
> ampersand as a separator. I have not tried this with LWP 5.43, but I don't
> see anything in the code that would change this behavior.
>
> Thanks,
> Todd
>
> #!C:/Perl/bin/perl.exe
> # This is the request script
> use HTTP::Request::Common;
> use LWP::UserAgent;
> $ua = new LWP::UserAgent;
> $cgi_uri = "http://localhost/cgi-bin/testcgi.pl";
> $response = $ua->request(POST $cgi_uri,
>     [TEXT => "This is one clause; this is another."]);
> print $response->as_string;
>
> #!C:/Perl/bin/perl.exe
> # This is the CGI script (testcgi.pl)
> use CGI qw/:all/;
> $query = new CGI;
> $text = $query->param('TEXT');
> print header('text/plain');
> print "TEXT=$text\n";
>
> Expected output: TEXT=This is one clause; this is another.
> Actual output: TEXT=This is one clause