
Dave Abrahams wrote:
Let me try asking it differently: how do I program in an environment that has both "right" and "wrong" libraries?
There's really no good answer to that; it's, basically, a mess. You could use UTF-8 everywhere in your code, pass that to "right" libraries as-is, and only pass wchar_t[] to "wrong" libraries and the OS. This doesn't work when the "wrong" libraries or the OS don't have a wide API, though. And there's no standard way of being wrong: some libraries use the OS narrow API, while others convert to wchar_t[] internally and use the wide API. The encoding they assume varies as well; it can be the OS default (and there can be more than one), the C locale, the C++ locale, or a global encoding that can be set per-library. It's even more fun when supposedly portable libraries use different decoding strategies depending on the platform.
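To make the "UTF-8 inside, wchar_t[] at the boundary" idea concrete, here is a minimal sketch of such a boundary conversion. The name utf8_to_wide is made up for illustration and doesn't come from any particular library; the sketch assumes that wchar_t holds UTF-16 when it is 16 bits wide and UTF-32 otherwise, and it simply throws on malformed input.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical boundary helper (not from any particular library):
// decodes UTF-8 and re-encodes it as UTF-16 or UTF-32, depending on
// the width of wchar_t on the current platform (an assumption).
inline std::wstring utf8_to_wide(const std::string& in)
{
    std::wstring out;
    std::size_t i = 0;
    const std::size_t n = in.size();

    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        unsigned long cp = 0;   // decoded code point
        unsigned long min = 0;  // smallest code point allowed for this length
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; min = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; min = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; min = 0x10000; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (n - i - 1 < extra)
            throw std::invalid_argument("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cont = static_cast<unsigned char>(in[i + k]);
            if ((cont & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (cont & 0x3F);
        }
        i += extra + 1;

        // Reject overlong forms, surrogates and out-of-range values.
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point in UTF-8 input");

        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            // 16-bit wchar_t: emit a UTF-16 surrogate pair.
            cp -= 0x10000;
            out += static_cast<wchar_t>(0xD800 + (cp >> 10));
            out += static_cast<wchar_t>(0xDC00 + (cp & 0x3FF));
        } else {
            out += static_cast<wchar_t>(cp);
        }
    }
    return out;
}
```

The result would then go to whatever wide entry point the "wrong" library or the OS offers (on Windows, something like _wfopen). When there is no wide API at all, a helper like this is of no use, which is exactly the mess described above.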
Also, is there any use in trying to get the difference into the type system, e.g. by using some kind of wrapper over std::string that gives it a distinct "utf-8" type?
This could help; a hybrid right+wrong library ought probably to be able to take either utf8_string or non_utf8_string, with the latter using who-knows-what encoding. :-) The "bite the bullet" solution is just to demand "right" libraries and use UTF-8 throughout.
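As a rough sketch of what getting the difference into the type system might look like: the type names utf8_string and non_utf8_string are taken from the text above, but everything else here (the validating constructor, is_valid_utf8, the open_file overloads and path_kind) is invented for illustration and is only one of several possible designs.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical check: true if `s` is well-formed UTF-8 (no overlong
// forms, no surrogates, nothing above U+10FFFF).
inline bool is_valid_utf8(const std::string& s)
{
    std::size_t i = 0;
    const std::size_t n = s.size();
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        unsigned long cp = 0, min = 0;
        std::size_t extra = 0;
        if (lead < 0x80)                { cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; min = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; min = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; min = 0x10000; }
        else return false;
        if (n - i - 1 < extra) return false;
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cont = static_cast<unsigned char>(s[i + k]);
            if ((cont & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (cont & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return false;
        i += extra + 1;
    }
    return true;
}

// Bytes that have been checked to be UTF-8.
class utf8_string {
public:
    explicit utf8_string(std::string bytes) : data_(std::move(bytes))
    {
        if (!is_valid_utf8(data_))
            throw std::invalid_argument("not valid UTF-8");
    }
    const std::string& str() const { return data_; }
private:
    std::string data_;
};

// Bytes in who-knows-what encoding; no validation is possible.
class non_utf8_string {
public:
    explicit non_utf8_string(std::string bytes) : data_(std::move(bytes)) {}
    const std::string& bytes() const { return data_; }
private:
    std::string data_;
};

// Hypothetical entry points of a hybrid right+wrong library. A real
// library would open the file; here each overload only reports which
// path was taken.
enum class path_kind { utf8, unknown_encoding };
inline path_kind open_file(const utf8_string&)     { return path_kind::utf8; }
inline path_kind open_file(const non_utf8_string&) { return path_kind::unknown_encoding; }
```

Because both constructors are explicit, a plain std::string can't be passed to open_file by accident; the caller has to say which kind of string it is, which is the whole point of the wrapper.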