Re: HTML::Entities
Gisle Aas (gisle@activestate.com)
11 Apr 2001 10:25:50 -0700
Gisle Aas <gisle@ActiveState.com> writes:
> Given this quick survey, I think it would be unwise to just add it to
> HTML::Entities unless we can make it so that it only affects decoding.
> It seems more correct to continue to encode ' as &#39;
FYI, I just checked in the following patch:
Index: lib/HTML/Entities.pm
===================================================================
RCS file: /cvsroot/libwww-perl/html-parser/lib/HTML/Entities.pm,v
retrieving revision 1.21
retrieving revision 1.22
diff -u -p -u -r1.21 -r1.22
--- lib/HTML/Entities.pm 2001/02/23 07:07:01 1.21
+++ lib/HTML/Entities.pm 2001/04/11 17:22:45 1.22
@@ -85,6 +85,7 @@ require HTML::Parser; # for fast XS imp
'gt' => '>', # greater than
'lt' => '<', # less than
quot => '"', # double quote
+ apos => "'", # single quote

# PUBLIC ISO 8879-1986//ENTITIES Added Latin 1//EN//HTML
AElig => 'Æ', # capital AE diphthong (ligature)
@@ -349,6 +350,7 @@ require HTML::Parser; # for fast XS imp
while (my($entity, $char) = each(%entity2char)) {
$char2entity{$char} = "&$entity;";
}
+delete $char2entity{"'"}; # only one-way decoding

# Fill inn missing entities
for (0 .. 255) {
Index: t/entities.t
===================================================================
RCS file: /cvsroot/libwww-perl/html-parser/t/entities.t,v
retrieving revision 1.3
retrieving revision 1.4
diff -u -p -u -r1.3 -r1.4
--- t/entities.t 1997/09/05 09:00:06 1.3
+++ t/entities.t 2001/04/11 17:22:46 1.4
@@ -1,6 +1,6 @@
use HTML::Entities qw(decode_entities encode_entities);

-print "1..8\n";
+print "1..9\n";

$a = "Våre norske tegn bør æres";

@@ -65,6 +65,10 @@ print "not " unless decode_entities("abc
"abc&def&ghi&abc;&def;";
print "ok 8\n";

+# Decoding of &apos;
+print "not " unless decode_entities("&apos;") eq "'" &&
+ encode_entities("'", "'") eq "&#39;";
+print "ok 9\n";


__END__
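
To illustrate the effect of the patch, here is a minimal standalone sketch. It is not the module's code: the five-entry table and the helpers encode_char() and decode_name() are hypothetical stand-ins, and it only assumes a Perl new enough for the // operator (5.10 or later). It shows the one-way idea: apos is known when decoding, but the reverse map no longer has an entry for the apostrophe, so encoding falls back to a numeric reference.

```perl
use strict;
use warnings;

# Hypothetical miniature of the entity table, for illustration only.
my %entity2char = (
    amp  => '&',
    gt   => '>',
    lt   => '<',
    quot => '"',
    apos => "'",
);

# Build the reverse map, then drop the apostrophe so that
# apos is recognised when decoding but never produced when encoding.
my %char2entity;
while (my ($entity, $char) = each %entity2char) {
    $char2entity{$char} = "&$entity;";
}
delete $char2entity{"'"};

# Encode one character: use the named entity if the reverse map has one,
# otherwise fall back to a numeric character reference.
sub encode_char {
    my ($char) = @_;
    return $char2entity{$char} // sprintf("&#%d;", ord $char);
}

# Decode one entity name (without the surrounding & and ;).
sub decode_name {
    my ($name) = @_;
    return $entity2char{$name};
}

print decode_name('apos'), "\n";   # prints '
print encode_char("'"), "\n";      # prints &#39;
print encode_char('"'), "\n";      # prints &quot;
```

With the real module the equivalent check is the new test 9 in t/entities.t above.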