
Actually, UTF-32 (equivalently UCS-4) *is* fixed-width (as of the Unicode 5.0.0 standard). Page 31 of the standard (chapter 2) says: "UTF-32 is the simplest Unicode encoding form. Each Unicode code point is represented directly by a single 32-bit code unit. Because of this, UTF-32 has a one-to-one relationship between encoded character and code unit; it is a fixed-width character encoding form."

- James

Michael Marcin wrote:
James Porter wrote:
On a different note, does anyone see a practical use in having (mutable) strings with variable-width character encodings? I can't think of any practical use for them that wouldn't be equally well-served with an array of bytes (like the email MIME-type example).
What encoding would you propose we use that is not variable length?
UTF-8, UTF-16, and UTF-32 certainly are all variable length encodings.
- Michael Marcin
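
To make the fixed-width versus variable-width distinction in this thread concrete, here is a rough sketch that counts how many code units one code point needs in each encoding form. The function names (`utf8_units`, `utf16_units`, `utf32_units`) are made up for illustration and do not come from any library, and the sketch assumes its input is a valid Unicode scalar value (at most 0x10FFFF and not a surrogate).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, for illustration only.
// Each assumes cp is a valid Unicode scalar value.

// UTF-8: 1 to 4 eight-bit code units, depending on the code point.
constexpr int utf8_units(std::uint32_t cp) {
    return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
}

// UTF-16: 1 sixteen-bit code unit inside the BMP, 2 (a surrogate pair) above it.
constexpr int utf16_units(std::uint32_t cp) {
    return cp < 0x10000 ? 1 : 2;
}

// UTF-32: always exactly one 32-bit code unit per code point.
constexpr int utf32_units(std::uint32_t) {
    return 1;
}

// U+0041 LATIN CAPITAL LETTER A
static_assert(utf8_units(0x41) == 1 && utf16_units(0x41) == 1 && utf32_units(0x41) == 1, "ASCII");
// U+20AC EURO SIGN
static_assert(utf8_units(0x20AC) == 3 && utf16_units(0x20AC) == 1 && utf32_units(0x20AC) == 1, "BMP");
// U+1F600 GRINNING FACE
static_assert(utf8_units(0x1F600) == 4 && utf16_units(0x1F600) == 2 && utf32_units(0x1F600) == 1, "supplementary");
```

Only `utf32_units` is constant, which is the sense in which the standard calls UTF-32 fixed-width. Note that this is one code unit per code point; a user-perceived character built from combining sequences can still span several code points in any of the three forms.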