Re: HTML::Entities
Gisle Aas (gisle@activestate.com)
11 Apr 2001 09:52:58 -0700
Robin Berjon <robin@knowscape.com> writes:
> I bumped into a problem today using the HTML::Entities module. I'm dealing
> with some XHTML into which I insert hidden input fields, no rocket science
> there. In order to protect the content of the fields, I'm encoding them.
> The problem occurs because the XHTML uses ' (&apos;) as attribute value
> delimiters -- legal in XML -- but HTML::Entities doesn't encode those by
> default. In fact, it doesn't seem to know about &apos;.
The reason HTML::Entities doesn't know about &apos; is that it's not
mentioned in the HTML specs:
http://www.w3.org/TR/html4/charset.html#entities
http://www.w3.org/TR/html4/sgml/entities.html
It is part of XHTML, because it is part of XML.
A quick test with some HTML browsers I had access to reveals:
Netscape 4.76 doesn't know about it
Netscape 6 does
Konqueror 1.9.8 doesn't know about it
Lynx 2.8.3 decoded it as ` instead of '
Given this quick survey, I think it would be unwise to just add it to
HTML::Entities unless we can make it so that it only affects decoding.
It seems more correct to continue to encode ' as &#39;.
> It's not a big problem for me as I know how to work around it, and I was
> inches away from submitting a patch, but I was wondering if there was a
> good reason why you hadn't included "'" in the list of default encoded
> characters? I believe it belongs there with '"', the latter being in the
> list precisely because of attribute values, which can be delimited by both.
But the HTML spec only mentions '"' so I think it makes sense to stick
with it for now. Especially if we continue to encode ' as &#39;.
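In the meantime, one possible workaround is to pass the set of unsafe
characters explicitly as the second argument to encode_entities(). The
sketch below assumes a made-up field value and field name; only
encode_entities() and its optional second argument come from the module
itself.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Hypothetical field value containing both kinds of quote.
my $value = q{it's "quoted" & <tagged>};

# The second argument lists the characters to treat as unsafe.
# Adding ' to the usual <>&" set asks for apostrophes to be encoded too;
# the module has no name for ' when encoding, so it emits a numeric
# character reference rather than &apos;.
my $encoded = encode_entities($value, q{<>&"'});

# The result can sit in an attribute delimited by either ' or ".
# The field name "field" is an assumption for illustration.
print "<input type='hidden' name='field' value='$encoded' />\n";
```

Because the numeric reference is understood by plain HTML browsers as
well as XML parsers, this avoids depending on &apos; support.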
Regards,
Gisle