Re: Parsing of comments in LWP
Andrew McRae (mcrae@internet.com)
Mon, 13 May 96 21:55:21 -0400
[Chet Murthy wrote: ]
>The idea I have is to create a new node, with tag "!", with a single
>attribute, "TEXT". Then, the "starttag" function would emit the text
>directly for this node.
[ ... ]
>How does that sound?
Very unpleasant indeed. I believe that putting these "comment elements"
in the parse tree is an unmitigated Bad Idea. It's also likely to break
code which can perfectly well handle a tree of ordinary HTML::Element
objects.
(Well, you asked. :-> )
A better approach would be to think of comment text as being a property of
the element which encloses the comment. You could subclass HTML::Element
to add a comment() method, which gets/sets comment text associated with
an element; then subclass HTML::Parse to make it store comment text in
this way. The text of multiple comments enclosed in the same element
could just be concatenated. (Or, if that's not good enough, just make
comment() return an array ref.)
How does that sound?
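To make the shape of this concrete, here is a minimal sketch. It is in Python rather than Perl, and the Element class is a hypothetical stand-in written for illustration, not the HTML::Element API; the names CommentedElement, comment() and comment_text() are likewise assumptions. The point is only that comments live beside the tree, so code walking the content list never meets them.

```python
class Element:
    """Hypothetical stand-in for a parse-tree node (not HTML::Element)."""

    def __init__(self, tag, **attrs):
        self.tag = tag
        self.attrs = dict(attrs)
        self.content = []  # child elements and text strings only

    def push_content(self, *items):
        self.content.extend(items)
        return self


class CommentedElement(Element):
    """Subclass that stores comment text as a property of the element."""

    def __init__(self, tag, **attrs):
        super().__init__(tag, **attrs)
        self._comments = []  # kept out of self.content on purpose

    def comment(self, text=None):
        """Return the comments; with an argument, record one more first."""
        if text is not None:
            self._comments.append(text)
        return list(self._comments)

    def comment_text(self):
        """Text of all comments enclosed by this element, concatenated."""
        return "".join(self._comments)


# A parser subclass would call comment() on the currently open element
# whenever it meets a comment in the input.
body = CommentedElement("body")
body.push_content("Hello, ", CommentedElement("b").push_content("world"))
body.comment(" first note ")
body.comment(" second note ")
```

Here comment() returns a list (the "array ref" variant) and comment_text() gives the concatenated form, so a caller can pick whichever suits. Existing tree-walking code sees only the two ordinary children of body.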
I haven't (yet) paid close attention to the guts of HTML::Parse, but two
other thoughts come to mind:
* You'd probably have to do some careful work to preserve its
"incremental parse" ability.
* This might be a good time to fix its parsing of tags which contain the
">" character.
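Both points can be sketched together. The following is a rough illustration in Python, not the way HTML::Parse works; find_tag_end is a hypothetical helper, and I am assuming the troublesome ">" sits inside a quoted attribute value. It scans for the ">" that really closes a tag, and returns None when the buffer runs out first, so an incremental parser can hold the partial tag and wait for the next chunk.

```python
def find_tag_end(buf, start=0):
    """Return the index of the '>' that closes the tag opening at
    buf[start], or None if the buffer ends before the tag does."""
    quote = None  # the quote character we are inside, if any
    for i in range(start + 1, len(buf)):
        ch = buf[i]
        if quote:
            if ch == quote:
                quote = None      # closing quote
        elif ch in "\"'":
            quote = ch            # opening quote
        elif ch == ">":
            return i              # unquoted '>' ends the tag
    return None                   # incomplete: caller should buffer more


# Input arriving in two chunks, split in the middle of a tag.
chunk1 = '<img alt="a > b" sr'
chunk2 = 'c="x.gif">tail'

first_try = find_tag_end(chunk1)   # None, so keep chunk1 buffered
buf = chunk1 + chunk2
end = find_tag_end(buf)
tag = buf[:end + 1]                # '<img alt="a > b" src="x.gif">'
```

Rescanning from the start of the tag on each new chunk is the simple option; keeping the quote state between calls would avoid the repeated work, at the cost of a little more bookkeeping.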
Cheers,
Andrew.
--
Andrew McRae <mcrae@internet.com>
The Internet Company