Re: HTML::LinkEx(trac)tor
Fred Douglis (douglis@research.att.com)
Fri, 12 Jul 1996 18:16:32 -0400
My apologies - my last note was apparently due to a red herring. There's still
a bug, or at least something that might be addressed, but most of my
description was way off.
Yes, I had just converted, and perhaps the new library is tripping over
something the old one didn't, but the idea that it wasn't parsing within the
anchor was my confusion based on the other mail and the results I was seeing.
In fact, after I sent the mail I kept at it (far too long :) because I hate
reporting bugs without also reporting the fix, if it's at all possible to
figure it out myself. And after some debugging I determined that the ... in
my example was actually the kicker: the text I was parsing had nonconforming
ALT=[ ]
in it!
Now, it's true that despite this not conforming to the HTML spec, it would be
better for the parser to handle it than to decide the whole thing isn't a tag
after all and then treat it as regular text (then converting it to &lt; form
when outputting it later).
However, I'm not going to try to figure out how to munge the regular
expression to handle this; I hope someone else can work that part out
(somewhere in the vicinity of Parser.pm line 232...)
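
To illustrate the kind of leniency described above, here is a minimal sketch
in Python. It is not the Perl regular expression in Parser.pm and makes no
claim about how that code is written; the names ATTR_VALUE, ATTR, START_TAG,
ATTR_PAIR and parse_start_tag are hypothetical. The assumption is simply that
an attribute value may be quoted, bracketed (as in the nonconforming ALT=[ ]),
or a bare run of characters, so that the tag is still recognized as a tag
instead of falling through to plain text.

```python
import re

# One attribute value: double-quoted, single-quoted, bracketed (the
# nonconforming ALT=[ ] case), or a bare run with no whitespace or '>'.
ATTR_VALUE = r'''(?:"[^"]*"|'[^']*'|\[[^\]>]*\]|[^\s>]+)'''

# One attribute: a name, optionally followed by '=' and a value.
ATTR = r'\s+[A-Za-z][\w.:-]*(?:\s*=\s*' + ATTR_VALUE + r')?'

# A whole start tag: <name attr attr ...>
START_TAG = re.compile(r'<([A-Za-z][\w.:-]*)((?:' + ATTR + r')*)\s*/?>')

# Used to pull name/value pairs back out of the matched attribute string.
ATTR_PAIR = re.compile(r'([A-Za-z][\w.:-]*)(?:\s*=\s*(' + ATTR_VALUE + r'))?')


def parse_start_tag(text):
    """Return (tag_name, attrs) if text begins with a start tag, else None."""
    m = START_TAG.match(text)
    if m is None:
        return None
    attrs = {}
    for name, value in ATTR_PAIR.findall(m.group(2)):
        # Strip surrounding quotes only; bracketed values are kept verbatim.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in '"\'':
            value = value[1:-1]
        attrs[name.lower()] = value
    return m.group(1).lower(), attrs
```

With this pattern, parse_start_tag('<IMG SRC="dot.gif" ALT=[ ]>') recognizes
the tag and keeps '[ ]' as the ALT value, while input that does not begin with
a start tag returns None. Whether a real parser should accept a bracketed
value containing a space is a design choice; the sketch only shows that it can
be done without giving up on the whole tag.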
Fred Douglis MIME accepted douglis@research.att.com
AT&T Research 908 582-3633 (office)
600 Mountain Ave., Rm. 2B-105 908 582-3063 (fax)
Murray Hill, NJ 07974 http://www.research.att.com/orgs/ssr/people/douglis/