Re: HTML::Parse and comment removal
Chris Fedde (cfedde@loupe.ezsrc.mrg.uswest.com)
Thu, 28 Sep 1995 14:35:00 -0600
Gisle,
In message <199509281214.NAA06533@bergen.oslonett.no> you write:
>> Why does the HTML::Parse module throw away comments?
>
>Because they are comments! Comments should not have any semantic meaning.
>
My intention in using HTML::Parse was to rewrite an HTML document so that it
includes standard header/footer markup. Historically we have used server-side
includes to encapsulate these headers/footers in a file that
can be changed without having to touch all the content files.
I agree with you that the server-side mechanism is a rude violation
of the meaning of comments in HTML. Still, I am left with the problem:
how do I validate a large document base? My intention was to use
the callback from HTML::Element::traverse to perform my validation.
Of course, that assumed that comments were preserved in the parse tree.
As an alternative to the current model, where HTML::Parse::parse_html
returns a full tree, perhaps a method named HTML::Parse::parse
could take the raw HTML text as input and invoke a callback
for each HTML "event". In this way HTML::Parse::parse
would act as an interpreter of the HTML text, and a variety of
different callback functions could be used to translate the HTML
into many different end products. HTML::Parse::parse_html would then
become a wrapper around a call to HTML::Parse::parse with a callback
that uses HTML::Element::new and HTML::Element::pushContents to build
an HTML::Element tree.
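
To make the layering concrete, here is a minimal sketch of the idea. It is
written in Python rather than Perl, using the standard html.parser module,
purely for illustration. The names parse, parse_html, Element and
push_content are hypothetical stand-ins for the proposed HTML::Parse::parse
and for the HTML::Element methods; they are not the real signatures of
either library.

```python
from html.parser import HTMLParser

# Elements that never have content, so they are not pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class _EventParser(HTMLParser):
    """Forwards every parse event to one user-supplied callback."""

    def __init__(self, callback):
        super().__init__()
        self.callback = callback

    def handle_starttag(self, tag, attrs):
        self.callback("start", tag, dict(attrs))

    def handle_endtag(self, tag):
        self.callback("end", tag, None)

    def handle_data(self, data):
        self.callback("text", data, None)

    def handle_comment(self, data):
        self.callback("comment", data, None)


def parse(html_text, callback):
    """Hypothetical event interface: call callback(event, value, attrs)."""
    parser = _EventParser(callback)
    parser.feed(html_text)
    parser.close()


class Element:
    """Tiny stand-in for a tree node (hypothetical, not HTML::Element)."""

    def __init__(self, tag, attrs=None):
        self.tag = tag
        self.attrs = attrs or {}
        self.content = []

    def push_content(self, item):
        self.content.append(item)


def parse_html(html_text):
    """Tree builder written as a thin wrapper around parse()."""
    root = Element("#root")
    stack = [root]

    def build(event, value, attrs):
        if event == "start":
            node = Element(value, attrs)
            stack[-1].push_content(node)
            if value not in VOID_TAGS:
                stack.append(node)
        elif event == "end":
            # Close the nearest matching open element; ignore stray end tags.
            for i in range(len(stack) - 1, 0, -1):
                if stack[i].tag == value:
                    del stack[i:]
                    break
        elif event == "comment":
            # Comments are kept in the tree instead of being discarded.
            stack[-1].push_content(Element("#comment", {"text": value}))
        else:
            stack[-1].push_content(value)

    parse(html_text, build)
    return root
```

With this split, keeping or dropping comments becomes a decision made by
the callback rather than by the parser itself.
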
Other callbacks could be written that watch for specific HTML
"events" and perform translations in their own way. Such tasks
might include: direct translation to alternative markup languages,
augmentation of existing HTML by injecting ID or ALT attributes
into existing tags, rewriting URIs, translating to graph visualization
languages such as graph-vis or DaVinci, and, my favorite, processing
HTML comments that contain meta-info for server-side parsers and
such.
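
As a rough illustration of such a special-purpose callback, the sketch
below collects server-side include comments and notes IMG tags that lack
an ALT attribute. It is again in Python with the standard html.parser
module, only as an example. IncludeChecker and check are hypothetical
names, and the "#include" prefix is an assumption about how the include
directives are written.

```python
from html.parser import HTMLParser


class IncludeChecker(HTMLParser):
    """Watches only the events it cares about and ignores the rest."""

    def __init__(self):
        super().__init__()
        self.includes = []     # text of each include directive found
        self.missing_alt = []  # (line, column) of each IMG without ALT

    def handle_comment(self, data):
        text = data.strip()
        # Assumption: include directives look like <!--#include ... -->
        if text.startswith("#include"):
            self.includes.append(text)

    def handle_starttag(self, tag, attrs):
        if tag == "img" and "alt" not in dict(attrs):
            self.missing_alt.append(self.getpos())


def check(html_text):
    """Return (include directives, positions of IMG tags lacking ALT)."""
    checker = IncludeChecker()
    checker.feed(html_text)
    checker.close()
    return checker.includes, checker.missing_alt
```

Running check over every file in a document base would give a simple
validation report without building a tree at all.
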
I'm not really sure that these thoughts are even appropriate to your vision
for the HTML:: classes. Please take them in the spirit in which they are
offered. I have nothing but the highest respect for these tools and have
found them much more cohesive than any other set that I have yet used.
Have a Good Day
chris