
Daniel James wrote:
That was the idea behind the "character_set_traits" class in the current prototype. You could just implement the traits for some other encoding, and you'd be set. The problem though (and in my opinion it's a big one) is that for the encoded_string class (and any iostream implementation based on the same concepts) to be usable at all as a Unicode string class, we would have to include a lot of functionality that is Unicode-specific (normalization is one example). What would we do with this functionality for Shift-JIS?
I have no idea ;)
Neither do I. :) That's why I feel it's a dead end.
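To make the traits idea and its limitation concrete, here is a minimal sketch. It is not the prototype's actual interface: utf8_traits, latin1_traits, encoded_string_sketch, sequence_length and is_unicode are all hypothetical names invented for illustration. Counting code points generalizes cleanly through the traits, whereas a Unicode-only operation such as normalization can only be switched on or off per encoding.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical traits: one struct per encoding (names are assumptions).
struct utf8_traits {
    static constexpr bool is_unicode = true;
    // Byte length of the sequence introduced by lead byte b (no validation).
    static std::size_t sequence_length(unsigned char b) {
        if (b < 0x80) return 1;
        if ((b >> 5) == 0x06) return 2;
        if ((b >> 4) == 0x0E) return 3;
        if ((b >> 3) == 0x1E) return 4;
        return 1;  // invalid lead byte: step over it as a single unit
    }
};

struct latin1_traits {
    static constexpr bool is_unicode = false;
    // Latin-1 is a fixed-width, one-byte-per-character encoding.
    static std::size_t sequence_length(unsigned char) { return 1; }
};

// Hypothetical string class parameterized on the encoding traits.
template <class Traits>
class encoded_string_sketch {
public:
    explicit encoded_string_sketch(std::string bytes)
        : bytes_(std::move(bytes)) {}

    // Generic across encodings: walk the bytes one sequence at a time.
    std::size_t code_point_count() const {
        std::size_t n = 0;
        for (std::size_t i = 0; i < bytes_.size(); ++n) {
            i += Traits::sequence_length(
                static_cast<unsigned char>(bytes_[i]));
        }
        return n;
    }

    // Unicode-specific functionality has no generic meaning; the best the
    // traits can do here is report whether it applies at all.
    static constexpr bool supports_normalization() {
        return Traits::is_unicode;
    }

private:
    std::string bytes_;
};
```

The same two bytes 0xC3 0xA9 count as one code point under utf8_traits and as two under latin1_traits, which is the part the traits handle well. Anything like normalization still needs a separate answer for every non-Unicode encoding.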
I was writing about the suggested dynamic string, 'utf_string', possibly better called 'any_string', or 'encoded_string'.
Actually, it *is* already called encoded_string. I think code_point_string would be a more descriptive name, given its function, though. I'm not sure what it will end up being.
IMO your library should concentrate on Unicode (and perhaps encodings that are close enough to Unicode), and leave other encodings to other libraries. A dynamically encoded string class would probably require a different interface, partly for efficiency's sake and partly because of the differences between encodings. Also, it will be more important that it interacts well with other string implementations.
Yes, that is basically how I am beginning to feel too. After all, since Unicode is supported by all major players in the industry, it will (I hope) eventually take over from all the encodings in existence today. Support for those encodings will therefore not be as important in the future, making concentrating exclusively on Unicode a more viable solution.
- Erik