Re: [boost] [general] What will string handling in C++ look like in the future [was Always treat ... ]

20 Jan 2011


On Thu, 20 Jan 2011 00:05:47 -0800
Patrick Horgan <phorgan1@gmail.com> wrote:
>>> Inevitably a Unicode standard will be adopted where every character
>>> of every language will be represented by a single fixed length
>>> number of bits. [...]
>>
>> I'm no Unicode expert, but the reason this hasn't happened might be
>> combinatorial explosion. In which case it might never happen. But I
>> could well be wrong. And I hope I am; the design you outline is
>> something I'd love to see.
>
> It's already here and has been for a long time.  That's just UCS
> encoded as UTF-32. [...]

The problem, in my uninformed view, is combining characters. Any time
a single character can require more than one code point, you can't
assume that a fixed number of bits will be able to represent every
character.
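
A minimal sketch of the point above, assuming a compiler with the C++0x
char32_t and std::u32string types. The helper names is_combining_mark and
count_user_characters are made up for illustration, and the range check
covers only the Combining Diacritical Marks block (U+0300 to U+036F), which
is a drastic simplification of real grapheme segmentation:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true only for code points in the Combining
// Diacritical Marks block. Real Unicode has many more combining
// characters; this is an assumption made to keep the sketch short.
bool is_combining_mark(char32_t c) {
    return c >= 0x0300 && c <= 0x036F;
}

// Hypothetical helper: counts base code points, treating each following
// combining mark as part of the preceding character.
std::size_t count_user_characters(const std::u32string& s) {
    std::size_t n = 0;
    for (char32_t c : s) {
        if (!is_combining_mark(c)) {
            ++n;
        }
    }
    return n;
}
```

With this, U"\u00E9" (precomposed e-acute) has size() 1 and U"e\u0301"
('e' followed by a combining acute accent) has size() 2, yet
count_user_characters reports one character for both. So even in UTF-32,
one array element is one code point, not necessarily one character.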

I may be wrong, and I hope I am. If a character were guaranteed never to
consist of more than X code points, it would be simple to offer a
fixed-width character type, even if the width were huge compared to
the eight-bit char type. But from what I've seen, I don't think that's
the case.
-- 
Chad Nelson
Oak Circle Software, Inc.