
Erik Wien wrote:
Daniel James wrote:
Why should such a string class stop at Unicode? Wouldn't it be a good idea to support other encodings? It might be better to have such a class as part of a separate library, probably with 'pluggable' encodings, which would include Unicode.
That was the idea behind the "character_set_traits" class in the current prototype. You could just implement the traits for some other encoding, and you'd be set. The problem, though (and in my opinion it's a big one), is that for the encoded_string class (and any iostream implementation based on the same concepts) to be usable at all as a Unicode string class, we would have to include a lot of functionality that is Unicode-specific (normalization is one example). What would we do with this functionality for Shift-JIS?
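To make the traits idea concrete, here is a rough sketch of how pluggable encodings could look. It is not the prototype's actual code: the names utf8_traits, latin1_traits, decode() and is_unicode(), and this shape of encoded_string, are all assumptions made up for illustration. Latin-1 stands in for the non-Unicode encoding only because a Shift-JIS decoder needs lookup tables, and the UTF-8 decoder assumes well-formed input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical traits for UTF-8 (assumes well-formed input, no validation).
struct utf8_traits {
    static constexpr bool is_unicode() { return true; }

    // Decode the code point starting at s[i] and advance i past it.
    static char32_t decode(const std::string& s, std::size_t& i) {
        unsigned char lead = static_cast<unsigned char>(s[i++]);
        int extra = lead < 0x80 ? 0 : lead < 0xE0 ? 1 : lead < 0xF0 ? 2 : 3;
        char32_t cp = extra == 0 ? lead : lead & (0x3F >> extra);
        while (extra-- > 0) {
            assert(i < s.size());  // a continuation byte must follow
            cp = (cp << 6) | (static_cast<unsigned char>(s[i++]) & 0x3F);
        }
        return cp;
    }
};

// Hypothetical traits for a single-byte, non-Unicode encoding (Latin-1).
struct latin1_traits {
    static constexpr bool is_unicode() { return false; }

    // Every byte is exactly one character.
    static char32_t decode(const std::string& s, std::size_t& i) {
        return static_cast<unsigned char>(s[i++]);
    }
};

// Hypothetical string class: storage is raw bytes, interpretation comes
// entirely from the Traits parameter.
template <class Traits>
class encoded_string {
public:
    explicit encoded_string(std::string bytes) : bytes_(std::move(bytes)) {}

    // Decode the whole string into a sequence of character values.
    std::vector<char32_t> code_points() const {
        std::vector<char32_t> out;
        for (std::size_t i = 0; i < bytes_.size();) {
            out.push_back(Traits::decode(bytes_, i));
        }
        return out;
    }

private:
    std::string bytes_;
};
```

The same two bytes 0xC3 0xA9 decode to one character (U+00E9) under utf8_traits and to two characters under latin1_traits. Decoding plugs in easily; the open question is what a member like normalize() should do when is_unicode() is false.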
I have no idea ;) I know this is a complicated subject, and I'm far from an expert. I was writing about the suggested dynamic string, 'utf_string', possibly better called 'any_string' or 'encoded_string'. IMO your library should concentrate on Unicode (and perhaps encodings that are close enough to Unicode), and leave other encodings to other libraries.

A dynamically encoded string class would probably require a different interface, partly for efficiency's sake and partly because of the differences between encodings. Also, it will be more important that it interacts well with other string implementations.

Daniel