Buglet in lwp-rget
Sean Slattery (jslttery@gs148.sp.cs.cmu.edu)
Sat, 11 Jul 1998 14:26:30 -0400
Just came across the following buglet. If a page you're fetching has a
URL split over a line, like in this case:
<img src=
"pic
.jpg" align="right">
then lwp-rget will try to fetch '%22pic' rather than pic.jpg.
To fix this, edit line 264 and replace the final '\s' with ' \t\r\f',
so it should now read:
$doc =~ s/(<\s*(img|a|body|area|frame)\b[^>]+\b(?:src|href|background)\s*=\s*)(["']?)([^> \t\r\f]+)\3/new_link($1, lc($2), $3, $4, $base, $name, $depth+1)/gie; #"; # help emacs
\s is a synonym for [ \t\n\r\f], but it appears \n is allowed to appear
in URLs, so the character class should not stop at it.
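
For anyone who wants to see the difference without running lwp-rget, here
is a small standalone sketch. It is not the lwp-rget code itself: the two
patterns are copied from the substitution above with only the character
class changed, extract_url is a hypothetical helper, and stripping the
embedded newline at the end is my assumption about what a caller would
want, not something lwp-rget is known to do.

```perl
use strict;
use warnings;

# The example tag from above, with the URL split across lines.
my $html = qq{<img src=\n"pic\n.jpg" align="right">};

# Same pattern twice; only the URL character class differs.
my $old = qr/(<\s*(img|a|body|area|frame)\b[^>]+\b(?:src|href|background)\s*=\s*)(["']?)([^>\s]+)\3/i;
my $new = qr/(<\s*(img|a|body|area|frame)\b[^>]+\b(?:src|href|background)\s*=\s*)(["']?)([^> \t\r\f]+)\3/i;

# Hypothetical helper: return the captured URL ($4), or undef on no match.
sub extract_url {
    my ($re, $doc) = @_;
    return $doc =~ $re ? $4 : undef;
}

my $old_url = extract_url($old, $html);   # quote ends up inside the URL
my $new_url = extract_url($new, $html);   # URL still contains the newline

# Assumption: drop the embedded line break before using the URL.
(my $clean = $new_url) =~ s/[\r\n]+//g;

print "old pattern captured: $old_url\n";
print "new pattern captured (newline removed): $clean\n";
```

With \s in the class, the opening quote cannot be paired with a closing one
before the line break, so the optional quote group matches nothing and the
quote is swallowed into the URL. That is where the '%22' comes from.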
Thanks for a great tool,
S.