Re: possible bug in HTML::Parser comment handler

Gisle Aas (gisle@activestate.com)
12 Jan 2001 11:09:39 -0800


Bjoern Hoehrmann <derhoermi@gmx.net> writes:

> "Although the STYLE and SCRIPT elements use CDATA for their data
> model, for these elements, CDATA must be handled differently by user
> agents. Markup and entities must be treated as raw text and passed to
> the application as is. The first occurrence of the character sequence
> "</" (end-tag open delimiter) is treated as terminating the end of the
> element's content. In valid documents, this would be the end tag for
> the element."

Note that HTML::Parser does in fact allow "</" inside these CDATA
elements.  You need the complete corresponding end tag to get out of
CDATA mode.  I would say that the "</" rule is pretty stupid.  I can't
find any browser around here that follows it.

Officially this should not work, because the "</" in "</h1>" would
already terminate the script content:

   <script language="Perl">
      print "<h1>Hello</h1>\n";
      print "<p>Bla, bla,....";
   </script>

To make this correct, the first print statement has to be written
something like:

      print "<h1>Hello<" . "/h1>\n";
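
The difference between the two rules can be sketched in a few lines.
This is only an illustration in Python, not how HTML::Parser is
implemented; the function names are made up for the example, and the
regex for the end tag is an assumption about what "complete
corresponding end tag" means.

```python
import re

def cdata_end_strict(text, start):
    """Rule from the quoted spec text: the first "</" ends the content."""
    return text.find("</", start)

def cdata_end_lenient(text, start, tag):
    """Lenient rule: only the complete matching end tag ends the content."""
    pattern = re.compile(r"</" + re.escape(tag) + r"\s*>", re.IGNORECASE)
    m = pattern.search(text, start)
    return m.start() if m else -1

# The script example from above, reduced to its first print statement.
doc = (
    '<script language="Perl">\n'
    '   print "<h1>Hello</h1>\\n";\n'
    '</script>'
)
start = doc.index(">") + 1  # content begins right after the start tag

# Strict rule: content is cut off in front of "</h1>".
strict_content = doc[start:cdata_end_strict(doc, start)]

# Lenient rule: content runs up to "</script>".
lenient_content = doc[start:cdata_end_lenient(doc, start, "script")]

print(repr(strict_content))
print(repr(lenient_content))
```

With the '"<" . "/h1>"' workaround the first "</" in the content is the
one in "</script>", so both rules end the element at the same place.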

Regards,
Gisle