Re: [boost] [General] Always treat std::strings as UTF-8

14 Jan 2011

Dave Abrahams wrote:
...
> Let me try asking it differently: how do I program in an environment that
> has both "right" and "wrong" libraries?

There's really no good answer to that; it's, basically, a mess. You could 
use UTF-8 everywhere in your code, pass that to "right" libraries as-is, and 
only pass wchar_t[] to "wrong" libraries and the OS. This doesn't work when 
the "wrong" libraries or the OS don't have a wide API though. And there's no 
standard way of being wrong; some libraries use the OS narrow API, some 
convert to wchar_t[] internally and use the wide API, using a variety of 
encodings - the OS default (and there can be more than one), the C locale, 
the C++ locale, or a global encoding that can be set per-library. It's even 
more fun when supposedly portable libraries use different decoding 
strategies depending on the platform.
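
To make the "UTF-8 everywhere, wchar_t[] only at the boundary" idea concrete, here is a rough sketch of the conversion step. The name utf8_to_wide is made up for illustration and is not from any library. It assumes wchar_t holds UTF-16 when it is 2 bytes wide and UTF-32 otherwise, and it skips some validation (overlong forms, encoded surrogates), so treat it as a starting point only.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical boundary helper: decode UTF-8 into a wide string.
// Assumption: 2-byte wchar_t means UTF-16, anything else means UTF-32.
// Simplified: overlong encodings and encoded surrogates are not rejected.
inline std::wstring utf8_to_wide(const std::string& in)
{
    std::wstring out;
    for (std::size_t i = 0; i < in.size(); ) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + extra >= in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        if (cp > 0x10FFFF)
            throw std::invalid_argument("code point out of range");

        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            // Outside the BMP: emit a UTF-16 surrogate pair.
            cp -= 0x10000;
            out += static_cast<wchar_t>(0xD800 + (cp >> 10));
            out += static_cast<wchar_t>(0xDC00 + (cp & 0x3FF));
        } else {
            out += static_cast<wchar_t>(cp);
        }
    }
    return out;
}
```

The application keeps std::string in UTF-8 throughout and calls something like this only at the point where it hands a string to a wide-only interface, passing the result's c_str(). As noted above, this does nothing for a "wrong" library that only has a narrow API.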
...
> Also, is there any use in trying to get the difference into the type
> system, e.g. by using some kind of wrapper over std::string that gives it
> a distinct "utf-8" type?

This could help; a hybrid right+wrong library ought probably to be able to take 
either utf8_string or non_utf8_string, with the latter using who-knows-what 
encoding. :-)
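
For what it's worth, a minimal sketch of how such a pair of types might look. The names utf8_string and non_utf8_string are the ones used above; the member bytes() and the function describe() are invented purely for illustration, and neither wrapper validates its contents here.

```cpp
#include <string>
#include <utility>

// Tagged wrapper: the bytes are claimed to be UTF-8 (not checked in this sketch).
class utf8_string {
public:
    explicit utf8_string(std::string bytes) : bytes_(std::move(bytes)) {}
    const std::string& bytes() const { return bytes_; }
private:
    std::string bytes_;
};

// Tagged wrapper: the bytes are in some other, unspecified encoding.
class non_utf8_string {
public:
    explicit non_utf8_string(std::string bytes) : bytes_(std::move(bytes)) {}
    const std::string& bytes() const { return bytes_; }
private:
    std::string bytes_;
};

// Hypothetical entry points of a hybrid "right+wrong" library: one overload
// per string type, so the caller has to state which encoding is being passed.
inline std::string describe(const utf8_string& s)
{
    return "utf-8: " + s.bytes();
}

inline std::string describe(const non_utf8_string& s)
{
    return "unknown encoding: " + s.bytes();
}
```

Because both constructors are explicit, a plain std::string does not convert silently to either wrapper, so describe(std::string("x")) fails to compile and the caller is forced to pick one.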

The "bite the bullet" solution is just to demand "right" libraries and use 
UTF-8 throughout.

Peter Dimov