proposed HTTP changes for charset
Larry Masinter (masinter@parc.xerox.com)
Tue, 2 Jul 1996 18:38:02 PDT
I suggest making the following change, which is less controversial
than the "charset=unknown" proposal:
Current HTTP/1.1 spec:
> The "charset" parameter is used with some media types to define the
> character set (section 3.4) of the data. When no explicit charset
> parameter is provided by the sender, media subtypes of the "text" type
> are defined to have a default charset value of "ISO-8859-1" when
> received via HTTP. Data in character sets other than "ISO-8859-1" or its
> subsets MUST be labeled with an appropriate charset value.
My proposal:
< The "charset" parameter is used with some media types to define the
< character set (section 3.4) of the data. Origin servers SHOULD
< include an appropriate charset parameter for those media types which
< allow one (including text/html and text/plain) to avoid ambiguity.
< In the absence of a charset parameter, the default charset value MAY
< be assumed to be "ISO-8859-1" when received from an HTTP/1.1 server.
< Unfortunately, some HTTP/1.0 clients do not properly deal with
< explicit charset parameters for text/html data, and some HTTP/1.0
< server sites send no charset parameter, even when the charset of the
< data is not ISO-8859-1. For compatibility with older clients and
< servers, implementations may need to be careful when communicating
< with older versions, by not sending a charset parameter when the
< data is ISO-8859-1, and by allowing local configuration when
< receiving unlabelled data from HTTP/1.0 servers.
This establishes a convention that charset SHOULD be sent, but lays
out some of the compatibility constraints during the transition
period. Is this sufficient?
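
As a rough illustration of the behaviour the proposed text describes, here is a minimal Python sketch. It is not taken from any implementation. The function names (parse_charset, effective_charset, content_type_to_send), the (major, minor) version tuples, and the local_default setting are all hypothetical, chosen only to show the two compatibility rules: a recipient falls back to local configuration for unlabelled HTTP/1.0 data, and a sender omits the charset parameter for ISO-8859-1 data when talking to an older client.

```python
from typing import Optional, Tuple

DEFAULT_CHARSET = "ISO-8859-1"
Version = Tuple[int, int]  # hypothetical (major, minor) representation


def parse_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None."""
    # Parameters follow the media type, separated by ';'.
    for param in content_type.split(";")[1:]:
        name, sep, value = param.partition("=")
        if sep and name.strip().lower() == "charset":
            return value.strip().strip('"')
    return None


def effective_charset(content_type: str, server_version: Version,
                      local_default: Optional[str] = None) -> str:
    """Recipient side: decide which charset to decode with."""
    explicit = parse_charset(content_type)
    if explicit:
        return explicit
    if server_version >= (1, 1):
        # Unlabelled data from an HTTP/1.1 server MAY be assumed ISO-8859-1.
        return DEFAULT_CHARSET
    # Unlabelled data from an HTTP/1.0 server: allow local configuration
    # (assumed here to be a single site-wide setting).
    return local_default or DEFAULT_CHARSET


def content_type_to_send(media_type: str, charset: str,
                         client_version: Version) -> str:
    """Sender side: build the Content-Type value for a response."""
    # Some HTTP/1.0 clients mishandle an explicit charset, so omit it
    # for them when the data is ISO-8859-1 anyway.
    if client_version < (1, 1) and charset.upper() == DEFAULT_CHARSET:
        return media_type
    return f"{media_type}; charset={charset}"
```

For example, effective_charset("text/html", (1, 0), local_default="EUC-JP") yields "EUC-JP", while the same unlabelled response from an HTTP/1.1 server yields "ISO-8859-1".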
Larry