Re: [boost] Any interest in adding unicode support to boost?

19 Oct 2004

      Rogier van Dalen wrote:
...
> An assumption I think is wrong is that wchar_t would be suitable for
> Unicode. Correct me if I'm wrong, but IIRC wchar_t has 16 bits on
> Microsoft compilers, for example. The utf8_codecvt_facet
> implementation will on these compilers cut off any codepoints over
> 0xFFFF. (U+1D12C will come out as U+D12C.)

This is because the Windows NT ABI is hardwired for 16-bit wide
characters.  I believe that means the wide characters are actually
UTF-16 code units, with code points above U+FFFF represented as
"surrogate pairs."  Regardless of whether this is a good thing or not,
Windows compilers need to follow suit, as the underlying
implementation of their wide characters is in Windows, not in the
compiler.
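
To make the arithmetic concrete, here is a minimal sketch of the two
behaviours for U+1D12C: what survives if a code point is simply cut
down to 16 bits, and what a UTF-16 surrogate pair would look like
instead.  The helper names are made up for illustration; they are not
part of utf8_codecvt_facet or of any Boost or Windows API.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper: the value left over when a code point is
// stored in a single 16-bit unit and the upper bits are discarded.
std::uint16_t truncate_to_16(std::uint32_t cp) {
    return static_cast<std::uint16_t>(cp & 0xFFFFu);
}

// Hypothetical helper: encode a code point in the range
// U+10000..U+10FFFF as a UTF-16 surrogate pair (high, low).
std::pair<std::uint16_t, std::uint16_t> to_surrogates(std::uint32_t cp) {
    assert(cp >= 0x10000u && cp <= 0x10FFFFu);
    const std::uint32_t v = cp - 0x10000u;  // 20 significant bits remain
    const std::uint16_t high =
        static_cast<std::uint16_t>(0xD800u + (v >> 10));     // top 10 bits
    const std::uint16_t low =
        static_cast<std::uint16_t>(0xDC00u + (v & 0x3FFu));  // bottom 10 bits
    return std::make_pair(high, low);
}
```

With these, truncate_to_16(0x1D12C) gives 0xD12C, which is the
corruption described above, while to_surrogates(0x1D12C) gives the
pair 0xD834, 0xDD2C, which fits in two 16-bit units without losing
information.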

It might be possible for a compiler to provide its own Unicode
implementation and map that to Windows' wide characters, but in the
user-visible situations where the two implementations disagreed, the
results might be surprising enough to make the compiler-provided
implementation unusable.

Aaron W. LaFramboise