[PATCH URI/Escape.pm] Problem with the eval-ed REx in uri_escape

Daniel Chetlin (daniel@chetlin.com)
Tue, 5 Sep 2000 07:37:12 -0700


If you try to specify a '/' in the second argument to URI::Escape::uri_escape,
you get a croak because within the eval-ed regex compilation, '/' is used as
the delimiter.

The real answer is to use qr//, because that's all that's really being done
here anyways. The problem with that is that qr// doesn't exist for 5.004, and
MacPerl is still there.

So, I'm not sure of the solution. Enclosed find a patch for URI 1.07 which
uses qr//, as well as a test for the '/' case. Also, in case it helps, here's
a benchmark of pre- and post-patched Escape.pm:
  
  [~] $ perl bench.pl
  Benchmark: timing 65536 iterations of New, Old...
         New:  7 wallclock secs ( 8.06 usr +  0.00 sys =  8.06 CPU) @ 8131.02/s (n=65536)
         Old:  8 wallclock secs ( 8.19 usr +  0.00 sys =  8.19 CPU) @ 8001.95/s (n=65536)
  [~] $ cat bench.pl
  use lib '/tmp';
  use URI::Escape;
  use EscapeOld;
  use Benchmark;
  
  timethese(1 << 16, {
      'Old' => sub { URI::EscapeOld::uri_escape(
                         "http://www.dmoz.org/foo/bar/baz?blarch=bing",
                         "\0-\377")
                   },
      'New' => sub { URI::Escape::uri_escape(
                         "http://www.dmoz.org/foo/bar/baz?blarch=bing",
                         "\0-\377")
                   }
  });
  [~] $

I'm inclined to think that the correct solution is to do MakeMaker stuff so
that those using 5.004 can get it without qr//. If that's deemed ucky, I'd be
happy to put in a new patch that doesn't use qr// but avoids the '/' problem
anyways.

Thanks!

-dlc

--- Escape.pm	2000/09/05 13:31:42	1.1
+++ Escape.pm	2000/09/05 13:40:10	1.2
@@ -124,12 +124,9 @@
     return undef unless defined $text;
     if (defined $patn){
 	unless (exists  $subst{$patn}) {
-	    # Because we can't compile the regex we fake it with a cached sub
-	    $subst{$patn} =
-	      eval "sub {\$_[0] =~ s/([$patn])/\$escapes{\$1}/g; }";
-	    Carp::croak("uri_escape: $@") if $@;
+	    $subst{$patn} = qr/([$patn])/;
 	}
-	&{$subst{$patn}}($text);
+	$text =~ s/$subst{$patn}/$escapes{$1}/g;
     } else {
 	# Default unsafe characters. (RFC 2396 ^uric)
 	$text =~ s/([^;\/?:@&=+\$,A-Za-z0-9\-_.!~*'()])/$escapes{$1}/g;
--- t/old-base.t	2000/09/05 13:57:43	1.1
+++ t/old-base.t	2000/09/05 14:07:36	1.2
@@ -582,6 +582,11 @@
     $url->_expect('as_string',
 		  'http://web/try%20%25?%23%22%20those');
 
+    # try escaping a /
+    $url = new URI::URL uri_escape('http://web/try/this','/');
+    $url->_expect('as_string',
+                  'http:%2F%2Fweb%2Ftry%2Fthis');
+
     my $all = pack('c*',0..255);
     my $esc = uri_escape($all);
     my $new = uri_unescape($esc);