
On 01/19/2011 08:33 AM, Peter Dimov wrote:
> Edward Diener wrote:
>> Inevitably a Unicode standard will be adopted where every character of every language will be represented by a single fixed-length number of bits.
>
> This was the prevailing thinking once. First this number of bits was 16, which incorrect assumption claimed Microsoft and Java as victims, then it became 21 (or 22?). Eventually, people realized that this will never happen even if we allocate 32 bits per character, so here we are.

At 32 bits we can encode all current languages, all extinct languages, Klingon, and still have most of the space empty. You might want to read the Unicode spec, which talks clearly about this. If you just read through the end of Chapter 6 you'll have a great overall understanding of Unicode. It's available as a compressed PDF file at: http://www.unicode.org/versions/Unicode5.2.0/UnicodeStandard-5.2.zip

Patrick
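
For anyone who wants to see the two positions side by side, here is a minimal sketch in Python (not from the thread; the helper names `utf16_units` and `utf32_units` are made up for illustration). It counts how many fixed-width units a string needs: a code point outside the Basic Multilingual Plane takes two 16-bit units, while a letter followed by a combining mark still takes two 32-bit units, even though each code point on its own fits in one.

```python
def utf16_units(text):
    """Count the 16-bit code units needed to store text as UTF-16."""
    # "utf-16-le" emits no byte order mark, so bytes / 2 is the unit count.
    return len(text.encode("utf-16-le")) // 2


def utf32_units(text):
    """Count the 32-bit code units needed to store text as UTF-32."""
    # "utf-32-le" emits no byte order mark, so bytes / 4 is the unit count.
    return len(text.encode("utf-32-le")) // 4


precomposed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE, inside the BMP
g_clef = "\U0001D11E"       # MUSICAL SYMBOL G CLEF, outside the BMP
decomposed = "e\u0301"      # "e" followed by COMBINING ACUTE ACCENT

for label, text in [("precomposed", precomposed),
                    ("g_clef", g_clef),
                    ("decomposed", decomposed)]:
    print(label, utf16_units(text), utf32_units(text))
```

So whether 32 bits is "enough" depends on whether you are counting code points or what a reader would call a character.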