irritating bug in lwp-rget

unknown@riverstyx.net
Tue, 25 May 1999 18:55:55 -0700 (PDT)


This is a diff against lwp-rget.  It is against a different revision than
the current original (I don't have the matching version handy), but that
doesn't matter for this fix.

The problem was in parsing the following:

---
        <TD WIDTH=222 COLSPAN=3><A HREF="/members/" onclick="exit=false"
TARGET="_self" onMouseOver='if(document.images) { if (saveImage != null)
{undoDefault(); isMenuAct = true;} Members.src=aButtonMembers.src;
screen.src=toaMembers0.src; } window.status="Members"; return
true;'onMouseOut='if(document.images) { if (saveImage != null)
redoDefault(); Members.src=dButtonMembers.src; screen.src=taaMembers0.src;
} window.status=""; return true;'>


---

It would grab this bit here ( screen.src=taaMembers0.src; ) and turn it
into ( screen.src=http://www.domain.com/taaMembers0.src; ).  That's
because it was treating anything named "src" as an "img src" attribute and
rewriting it.  Unfortunately, JavaScript also uses "src", as a property of
image objects, and the value assigned to it is sometimes an image URL but
sometimes another variable.  In JavaScript the property always has a
leading ".", so I threw a not-dot ( [^.] ) in front of "src" in the
pattern and it's fine.  I guess the downside is that lwp-rget won't go and
fetch images that are used in mouseovers, but it never did that properly
anyhow, and at least it no longer eats the pages.

I'm not on this mailing list (I just found it in the pod), so please cc: me
on any replies.  Thanks.


[silvercash@mars7 cgi]$ diff lwp-rget lwp-rget.old
158c158
< $VERSION = sprintf("%d.%02d", q$Revision: 1.1 $ =~ /(\d+)\.(\d+)/);
---
> $VERSION = sprintf("%d.%02d", q$Revision: 1.19 $ =~ /(\d+)\.(\d+)/);
331c331
<     \b(?:[^.]src|href|background)         # some link attribute
---
>     \b(?:src|href|background)     # some link attribute
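
To make the effect of that one-line change easier to check, here is a
minimal sketch rather than the real lwp-rget code.  The two attribute
fragments "old" and "patched" are copied from the diff; everything else
(the cut-down tag list, the value capture, the helper name
rewritten_value, the sample tags, and the "lookbehind" fragment) is an
assumption made up for illustration.  The lookbehind variant is only one
possible alternative, not something lwp-rget ships.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Attribute-name fragments. 'old' and 'patched' are copied from the diff;
# 'lookbehind' is a hypothetical alternative, not part of lwp-rget.
my %attr = (
    old        => '\b(?:src|href|background)',
    patched    => '\b(?:[^.]src|href|background)',
    lookbehind => '(?<!\.)\b(?:src|href|background)',
);

# Cut-down stand-in for the link pattern (tag list and value capture are
# simplified assumptions). Returns the attribute value that would be
# rewritten, or undef when the tag is not matched at all.
sub rewritten_value {
    my ($fragment, $html) = @_;
    my $re = qr/
        < (?:img|a|td) \b                        # some interesting tag
        [^>]+                                    # still inside the tag
        $fragment                                # some link attribute
        \s* = \s*
        (?: "([^"]*)" | '([^']*)' | ([^\s>]+) )  # quoted or bare value
    /ix;
    return undef unless $html =~ $re;
    return defined $1 ? $1 : defined $2 ? $2 : $3;
}

# Made-up sample tags: a shortened version of the menu link from the
# post, and two plain image tags.
my @samples = (
    q{<A HREF="/members/" onMouseOut='screen.src=taaMembers0.src; return true;'>},
    q{<IMG BORDER=0 SRC="logo.gif">},
    q{<IMG SRC="logo.gif">},
);

for my $html (@samples) {
    print "$html\n";
    for my $name (qw(old patched lookbehind)) {
        my $value = rewritten_value($attr{$name}, $html);
        printf "  %-10s -> %s\n", $name, defined $value ? $value : '(no match)';
    }
}
```

In this sketch the old fragment picks up taaMembers0.src; from the
JavaScript, while the patched one falls back to the real HREF value,
/members/.  One thing it also suggests: when SRC is the first attribute,
directly after the tag name, the patched fragment appears not to match at
all, because the single space is already consumed by [^>]+ and nothing is
left for [^.] to match in front of "src".  A zero-width lookbehind avoids
consuming a character, which is why it is shown as a possible alternative.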


---
tani hosokawa
river styx internet