Re: proposed HTTP changes for charset
Francois Yergeau (yergeau@alis.ca)
Thu, 4 Jul 1996 14:56:35 -0500
> From: hardie@merlot.arc.nasa.gov (Ted Hardie)
> Date: Wed, 3 Jul 1996 09:15:12 -0700 (PDT)
>
> As Harald made very clear at the meetings in Montreal, the
> group proposing UTF-8 as a target for new standards recognizes the
> problems associated with an installed base of clients and servers;
As I made very clear in Montreal, I don't care if UTF-8 is the
default or not. It is the blessing of ISO-8859 as a default that I
strongly object to, especially when done under the false pretenses of
"current practice" and "backward compatibility".
> Date: Wed, 03 Jul 1996 12:43:31 -0700
> From: "Roy T. Fielding" <fielding@liege.ICS.UCI.EDU>
>
> Note: The reason for "ISO-8859-1" being the default value when
> no charset parameter is provided is due to current practice and
> should not be interpreted as any sort of preference for that
> character set.
Current practice is that there is no default: everything is sent
unlabelled. This is a serious interoperability problem, and this
group's unwillingness to deal with it by simply recognizing that there
is no default is very disappointing.
> Larry's language is appropriate for a deployed protocol
> and for a reasonable transition.
Meaning that the problem will go unsolved for the foreseeable future.
HTTP/1.1 is a new protocol; it mandates a number of new things like
Host:, persistent connections, etc., so both servers and clients
will require updating anyway. Making charset mandatory is very minor
by comparison, especially since it was already there in 1.0 and hence
(in principle at least) understood by clients.
> HTTP 1.1
> is very far along the road at this point and it is not the place to
> consider a sudden, basic shift in assumed character sets.
There is no point in *assuming* character sets; assumption is the
source of the problem. They must be labelled.
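To illustrate what "labelled, not assumed" could mean for a recipient,
here is a minimal sketch in Python. The function names declared_charset
and decode_body are hypothetical and come from no specification or
existing implementation. The sketch only assumes that the charset
arrives as a parameter of the Content-Type header, and it refuses to
guess when that parameter is absent instead of falling back to
ISO-8859-1.

```python
from email.message import Message
from typing import Optional


def declared_charset(content_type: Optional[str]) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None.

    Hypothetical helper: it reports only what the sender declared
    and never substitutes a default.
    """
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the lowercased charset parameter,
    # or None when the header carries no charset parameter.
    return msg.get_content_charset()


def decode_body(body: bytes, content_type: Optional[str]) -> str:
    """Decode a body only when the sender labelled its charset."""
    charset = declared_charset(content_type)
    if charset is None:
        # Unlabelled content: the same bytes mean different text under
        # different character sets, so guessing is not safe.
        raise ValueError("unlabelled body: no charset parameter")
    return body.decode(charset)
```

The same byte, 0xE9, decodes to different characters under ISO-8859-1
and ISO-8859-5, which is why the sketch treats a missing label as an
error rather than picking one of them.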
I initially proposed UTF-8 as a default as one forward-looking way out
of the current brokenness; if the WG doesn't want it, fine. But
insisting on ISO-8859-1 is both wrong with respect to current practice
and misguided, IMHO, and *does* indicate a preference for this limited
character repertoire.
If that is the WG consensus, so be it. I have expressed my views,
and will live with a broken, biased standard if that is what comes
out.
I do hope, however, that at least the language on the Warning header
will be corrected. There cannot be the slightest pretense of prior
art in that case; the current language is purely parochial.
--
Francois Yergeau <yergeau@alis.com>
Alis Technologies Inc., Montreal
Tel : +1 (514) 747-2547
Fax : +1 (514) 747-2561