[boost] Re: [Unicode strings] We're off

17 Mar 2005


      Miro Jurisic wrote:
...
Here I also agree. Having multiple string classes would just force everyone to 
pick one for, in most cases, no good reason whatsoever. If I am writing code 
that uses C++ strings, which encoding should I choose? Why should I care? 
Particularly, if I don't care, why would I have to choose anyway? More than 
likely, I would just choose the same thing 99% of the time anyway.
If we went with an implementation templated on encoding, I would suggest 
simply having a typedef like today's std::string, let's say "typedef 
encoded_string<utf16_tag> unicode_string;", and marketing that as "the 
unicode string class". Users who don't care would use that and be 
happy, possibly without even knowing they are using a template 
instantiation. Advanced users could still easily use one of the other 
encodings, or even template their code to work with all of them if 
necessary. But then, as I have said, you wouldn't have 
functions/classes that are encoding independent without templating them.
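
A minimal sketch of that idea follows. Only encoded_string, utf16_tag and 
unicode_string are names from the suggestion above; utf8_tag, code_unit, 
push_back_unit, unit_count and storage_bytes are hypothetical, made up 
for illustration, and none of this is the library's actual interface.

```cpp
#include <cstddef>
#include <vector>

// Tag types naming an encoding. utf8_tag is a hypothetical addition.
struct utf8_tag {};
struct utf16_tag {};

// Hypothetical mapping from an encoding tag to its code unit type.
template <typename Tag> struct code_unit;
template <> struct code_unit<utf8_tag>  { typedef unsigned char type; };
template <> struct code_unit<utf16_tag> { typedef unsigned short type; }; // assumed 16-bit

// String class templated on encoding; the storage format is fixed at
// compile time, so there is nothing to set or query at runtime.
template <typename Encoding>
class encoded_string {
public:
    typedef typename code_unit<Encoding>::type unit_type;

    void push_back_unit(unit_type u) { units_.push_back(u); }
    std::size_t unit_count() const { return units_.size(); }

private:
    std::vector<unit_type> units_;
};

// What the casual user would see: no template syntax at all.
typedef encoded_string<utf16_tag> unicode_string;

// Code that should work with every encoding has to be a template itself.
template <typename Encoding>
std::size_t storage_bytes(const encoded_string<Encoding>& s) {
    return s.unit_count() * sizeof(typename encoded_string<Encoding>::unit_type);
}
```

A user who doesn't care writes "unicode_string s;" and never sees the 
tag, while storage_bytes shows the cost mentioned above: anything meant 
to be encoding independent must itself be templated.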
...
I believe that the ability to force a Unicode string to be in a particular 
encoding has some value -- especially for people doing low-level work such as 
serializing Unicode strings to XML, and for people who need to understand time 
and space complexity of various Unicode encodings -- but I do not believe that 
this justifiable demand for complexity means we should make the interface harder 
for everyone else.
I agree. But having a templated implementation would not mean a complex 
interface for the end user. It would probably be simpler than the 
current implementation, since you could lose all the encoding setting 
and getting. Especially if we go for the above-mentioned typedef, to 
remove the template syntax for the casual user.
...
I do, however, think that some people are going to feel that they need to 
eliminate the runtime overhead of generalized strings and explicitly instantiate 
strings in a particular encoding, and I don't know whether the library currently 
provides a facility to accomplish this.
It doesn't currently. But it would be pretty simple to create an 
implementation that allows that through use of the encoding_traits 
classes. I have done that before, and could probably use most of that 
code again if we were to include that.
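
As a rough illustration of what a traits-based version could look like 
(a sketch under assumptions, not the library's existing encoding_traits 
classes, whose real interface is not shown in this thread): the members 
unit_type and encode, the code_point typedef, and the append, unit_count 
and unit_at functions are all hypothetical. Input validation is left out.

```cpp
#include <cstddef>
#include <vector>

struct utf8_tag {};
struct utf16_tag {};

typedef unsigned long code_point;  // at least 32 bits wide

// Hypothetical traits: each specialization knows its code unit type and
// how to encode one code point. No validation of surrogates or range.
template <typename Encoding> struct encoding_traits;

template <> struct encoding_traits<utf8_tag> {
    typedef unsigned char unit_type;
    static void encode(code_point cp, std::vector<unit_type>& out) {
        if (cp < 0x80) {
            out.push_back(static_cast<unit_type>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<unit_type>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<unit_type>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<unit_type>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<unit_type>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<unit_type>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<unit_type>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<unit_type>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<unit_type>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<unit_type>(0x80 | (cp & 0x3F)));
        }
    }
};

template <> struct encoding_traits<utf16_tag> {
    typedef unsigned short unit_type;  // assumed 16-bit
    static void encode(code_point cp, std::vector<unit_type>& out) {
        if (cp < 0x10000) {
            out.push_back(static_cast<unit_type>(cp));
        } else {
            cp -= 0x10000;  // split into a surrogate pair
            out.push_back(static_cast<unit_type>(0xD800 | (cp >> 10)));
            out.push_back(static_cast<unit_type>(0xDC00 | (cp & 0x3FF)));
        }
    }
};

// The string forwards to the traits, so the encoding is resolved at
// compile time and there is no runtime dispatch on encoding.
template <typename Encoding>
class encoded_string {
public:
    typedef encoding_traits<Encoding> traits;
    typedef typename traits::unit_type unit_type;

    void append(code_point cp) { traits::encode(cp, units_); }
    std::size_t unit_count() const { return units_.size(); }
    unit_type unit_at(std::size_t i) const { return units_[i]; }

private:
    std::vector<unit_type> units_;
};
```

With this shape, supporting another encoding means adding one more 
traits specialization; the string class itself stays unchanged.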

- Erik

Erik Wien