Re: Entities in HTML::Parse
Gisle Aas (aas@oslonett.no)
Sun, 05 Nov 1995 11:17:16 +0100
> Hi,
>
> I introduced a new control variable for HTML::Parse. It is called
> $HTML::Parse::KEEP_ENTITIES. Its purpose is to disable the decoding
> of entities while parsing, so that you can do this more safely:
>
> print parse_htmlfile("something.html")->asHTML;
>
> What do you think?
Perhaps it would be better to fix HTML::Element::asHTML() instead, so that it encodes entities again on output?
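As a rough sketch of that idea only: the output side could escape the characters that are unsafe in HTML text, so that text the parser has already decoded is written out as valid markup. The helper name escape_text and the %ESCAPE table below are hypothetical and are not part of libwww; a real fix would presumably live inside asHTML() and might use the encoding routine from HTML::Entities instead of a hand-rolled table.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of libwww: map the characters that are
# unsafe in HTML text to their entity names.
my %ESCAPE = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
);

# Return a copy of $text with the unsafe characters escaped.
sub escape_text {
    my ($text) = @_;
    $text =~ s/([&<>"])/$ESCAPE{$1}/g;
    return $text;
}

# Text as the parser would hold it after decoding entities.
my $decoded = '<text in angle brackets>';

# Prints: &lt;text in angle brackets&gt;
print escape_text($decoded), "\n";
```

With the escaping done at output time, the parser can keep decoding entities as it does now and no extra control variable is needed for the round trip.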
Regards,
Gisle
>
> Test:
> ----
> #!/bin/perl
>
> use HTML::Element;
> use HTML::Parse;
>
> $text = <<EOT
> <HTML>
> <BODY>
> <P> &lt;text in angle brackets&gt;
> </BODY>
> </HTML>
> EOT
> ;
> # $HTML::Parse::KEEP_ENTITIES = 1;
> print "original:\n$text\n";
> print "as parsed and output by libwww:\n";
> print parse_html($text)->asHTML;
>
> Patch:
> ----
> *** Parse.pm.org Tue Oct 31 13:38:28 1995
> --- Parse.pm Wed Nov 1 16:51:58 1995
> ***************
> *** 61,66 ****
> --- 61,71 ----
> all you want is to examine the structure of the document. Default is
> false.
>
> + =item $HTML::Parse::KEEP_ENTITIES
> +
> + Setting this variable to true will disable the expansion of entities.
> + Default is false.
> +
> =back
>
> =head1 SEE ALSO
> ***************
> *** 96,101 ****
> --- 101,107 ----
> $IMPLICIT_TAGS = 1;
> $IGNORE_UNKNOWN = 1;
> $IGNORE_TEXT = 0;
> + $KEEP_ENTITIES = 0;
>
>
> # Elements that should only be present in the header
> ***************
> *** 249,255 ****
> die "This should not happen";
> }
> # expand entities
> ! HTML::Entities::decode($val);
> } else {
> # boolean attribute
> $val = $key;
> --- 255,261 ----
> die "This should not happen";
> }
> # expand entities
> ! HTML::Entities::decode($val) unless $KEEP_ENTITIES;
> } else {
> # boolean attribute
> $val = $key;
> ***************
> *** 401,407 ****
> $pos = $html unless defined($pos);
>
> my @text = @_;
> ! HTML::Entities::decode(@text) unless $IGNORE_TEXT;
>
> if ($pos->isInside(qw(pre xmp listing))) {
> return if $IGNORE_TEXT;
> --- 407,413 ----
> $pos = $html unless defined($pos);
>
> my @text = @_;
> ! HTML::Entities::decode(@text) unless ($IGNORE_TEXT || $KEEP_ENTITIES);
>
> if ($pos->isInside(qw(pre xmp listing))) {
> return if $IGNORE_TEXT;