
On Thu, 20 Jan 2011 00:05:47 -0800 Patrick Horgan <phorgan1@gmail.com> wrote:
Inevitably a Unicode standard will be adopted where every character of every language will be represented by a single fixed-length number of bits. [...]
I'm no Unicode expert, but the reason this hasn't happened might be combinatorial explosion, in which case it might never happen. But I could well be wrong, and I hope I am: the design you outline is something I'd love to see.
It's already here and has been for a long time. That's just UCS encoded as UTF-32. [...]
The problem, in my uninformed view of it, is the idea of combining characters. Any time you can have a single character that requires more than one code-point, you can't assume that a fixed number of bits will be able to represent every character. I may be wrong, and I hope I am. If a character is guaranteed never to consist of more than X code-points, it would be simple to offer a fixed-width character type, even if the width is huge by comparison to the eight-bit char type. But from what I've seen, I don't think that's the case.

--
Chad Nelson
Oak Circle Software, Inc.

* * *
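
To make the combining-character point concrete, here is a minimal Python sketch (Python only because its standard library ships the Unicode tables). It shows that UTF-32 is fixed-width per code-point, not per displayed character. The helper `utf32_units` and the sample strings are my own illustrations, not anything from the thread, and "q with tilde" is used on the assumption that Unicode defines no precomposed form for it.

```python
import unicodedata

def utf32_units(text: str) -> int:
    """Count the 32-bit units needed to encode text as UTF-32 (no BOM).
    Hypothetical helper for illustration only."""
    return len(text.encode("utf-32-le")) // 4

# The same visible character "e with acute accent", written two ways.
precomposed = "\u00e9"    # single code-point
decomposed = "e\u0301"    # base letter + COMBINING ACUTE ACCENT

# "q with tilde": assumed to have no precomposed code-point,
# so normalization cannot shrink it to one unit.
no_precomposed = "q\u0303"

if __name__ == "__main__":
    for label, s in [("precomposed", precomposed),
                     ("decomposed", decomposed),
                     ("q + tilde (NFC)", unicodedata.normalize("NFC", no_precomposed))]:
        print(label, utf32_units(s), [hex(ord(c)) for c in s])
```

NFC normalization folds the decomposed "e" plus accent into one code-point, but it cannot do so for sequences that have no precomposed form, so a fixed-width type sized for one code-point still does not cover every character a reader would see.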