[boost] Re: [Unicode strings] We're off

4 Apr 2005

      Daniel James wrote:
...
...
That was the idea behind the "character_set_traits" class in the 
current prototype. You could just implement the traits for some other 
encoding, and you'd be set. The problem though (and in my opinion it's 
a big one), is that for the encoded_string class (and any iostream 
implementation based on the same concepts) to be usable at all as a 
Unicode string class, we would have to include a lot of functionality 
that is Unicode-specific (normalization is one example). What would we 
do with this functionality for Shift-JIS?
I have no idea ;)
Neither do I. :) That's why I feel it's a dead end.
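
To make the traits idea concrete, here is a rough sketch of how such a design could look. It is not the prototype's actual code: only the names character_set_traits and encoded_string come from the discussion above, while the tag types, decode() and code_points() are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical encoding tags, invented for this sketch.
struct utf8_tag {};
struct latin1_tag {};

// Primary template is only declared; each supported encoding specializes it.
template <typename Encoding>
struct character_set_traits;

template <>
struct character_set_traits<latin1_tag> {
    // Latin-1: every byte is the code point with the same value.
    static std::vector<std::uint32_t> decode(const std::string& bytes) {
        std::vector<std::uint32_t> out;
        for (char c : bytes)
            out.push_back(static_cast<unsigned char>(c));
        return out;
    }
};

template <>
struct character_set_traits<utf8_tag> {
    // Assumes well-formed UTF-8; a real implementation would validate.
    static std::vector<std::uint32_t> decode(const std::string& bytes) {
        std::vector<std::uint32_t> out;
        std::size_t i = 0;
        while (i < bytes.size()) {
            const unsigned char lead = static_cast<unsigned char>(bytes[i]);
            // Number of continuation bytes implied by the lead byte.
            const std::size_t extra =
                lead < 0x80 ? 0 : lead < 0xE0 ? 1 : lead < 0xF0 ? 2 : 3;
            if (i + extra >= bytes.size()) break;  // truncated sequence: stop
            std::uint32_t cp = extra == 0 ? lead : (lead & (0x3F >> extra));
            for (std::size_t k = 1; k <= extra; ++k)
                cp = (cp << 6) | (static_cast<unsigned char>(bytes[i + k]) & 0x3F);
            out.push_back(cp);
            i += extra + 1;
        }
        return out;
    }
};

// The string class itself only knows about code points; the traits do the
// encoding-specific work.
template <typename Encoding>
class encoded_string {
public:
    explicit encoded_string(std::string bytes) : bytes_(std::move(bytes)) {}

    std::vector<std::uint32_t> code_points() const {
        return character_set_traits<Encoding>::decode(bytes_);
    }

private:
    std::string bytes_;
};
```

Adding another encoding would mean adding one more specialization. The sticking point is that anything like normalization would have to live in encoded_string itself, and it has no obvious meaning for a Shift-JIS specialization.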
...
I was writing about the suggested dynamic string, 'utf_string', possibly
better called 'any_string', or 'encoded_string'.
Actually, it *is* already called encoded_string. I think 
code_point_string would be a more descriptive name, given its function, 
though. I'm not sure what it will end up being.

  IMO your library should
...
concentrate on Unicode (and perhaps encodings that are close enough to
Unicode), and leave other encodings to other libraries. A dynamically
encoded string class would probably require a different interface,
partly for efficiency's sake and partly because of the differences
between encodings. Also, it will be more important that it interacts
well with other string implementations.
Yes, that is basically how I am beginning to feel too. After all, since 
Unicode is supported by all major players in the industry, it will (I 
hope) eventually take over from all the encodings in existence today. 
Support for those encodings will therefore not be as important in the 
future, making an exclusive focus on Unicode a more viable solution.

- Erik

Erik Wien