Re: possible bug in HTML::Parser comment handler
Gisle Aas (gisle@activestate.com)
12 Jan 2001 10:45:31 -0800
"Sean M. Burke" <sburke@spinn.net> writes:
> At 11:21 PM 2001-01-11 +0100, Bjoern Hoehrmann wrote:
> >At 15:28 11.01.01 -0500, you wrote:
> >>It seems that the parser is not properly detecting multi-line HTML
> >>comments. I was trying to print out the dtext of a html document and
> >>noticed that comments kept showing up in the output. Upon further
> >>examination, the single line comments were being ignored but ones like
> >>this:
> >>
> >><!--
> >>td {font-family: Arial,Geneva,Helvetica,sans-serif; color: #000000;}
> >>-->
> >
> >Well, the content model of the style element is CDATA, your "comments"
> >may look like comments but they are no comments in HTML and SGML
> >terms. That's not a bug.
>
> I don't see what's wrong with that comment.
Judging from the CSS inside it, the original poster most likely left
out that this "comment" appeared inside a <style> element. His report
that single-line comments are ignored while this one shows up in the
text output points the same way.
This is probably what he parsed:

  <style>
  <!--
  td {font-family: Arial,Geneva,Helvetica,sans-serif; color: #000000;}
  -->
  </style>
A <style> element is parsed in literal (CDATA) mode. No tags or
comments are recognized inside it; everything up to the closing
</style> tag is reported as text.
The other elements that are parsed like this are <script>, <xmp> and
<plaintext>.
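A minimal sketch of how this shows up with the version 3 handler API.
The sample document and the variable names are made up for
illustration and are not taken from the original report. It sorts text
events by the is_cdata flag, so the <style> content should land in
$literal and never reach the comment handler.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical sample: one real comment, plus a "comment" wrapped
# around CSS inside a <style> element.
my $html = <<'HTML';
<!-- a real comment -->
<style>
<!--
td {font-family: Arial,Geneva,Helvetica,sans-serif; color: #000000;}
-->
</style>
<p>Hello</p>
HTML

my @comments;        # everything the comment handler is given
my $literal = '';    # text reported while in literal (CDATA) mode
my $plain   = '';    # ordinary document text

my $p = HTML::Parser->new(
    api_version => 3,
    comment_h   => [ sub { push @comments, $_[0] }, 'text' ],
    text_h      => [
        sub {
            my ($dtext, $is_cdata) = @_;
            # Text may arrive in several chunks, so append each one.
            if ($is_cdata) { $literal .= $dtext }
            else           { $plain   .= $dtext }
        },
        'dtext,is_cdata',
    ],
);
$p->parse($html);
$p->eof;

print "comments seen: ", scalar(@comments), "\n";
print "literal text:\n$literal";
print "plain text:\n$plain";
```

If the goal is to print only the ordinary document text, skipping the
events where is_cdata is true is one way to keep <style> and <script>
content out of the output. Note that this would also drop the content
of <xmp> and <plaintext>.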
Regards,
Gisle