
19 Oct 2004, 12:28 p.m.
On Tuesday 19 October 2004 12:37, Aaron W. LaFramboise wrote:
An assumption I think is wrong is that wchar_t would be suitable for Unicode. Correct me if I'm wrong, but IIRC wchar_t has 16 bits on Microsoft compilers, for example. The utf8_codecvt_facet implementation will on these compilers cut off any codepoints over 0xFFFF. (U+1D12C will come out as U+D12C.)
This is because the Windows NT ABI is hardwired for 16-bit wide characters. I believe that means the wide characters are actually UTF-16 code units, which use "surrogate pairs" for code points above U+FFFF.
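To make the two behaviours discussed here concrete, the following is a minimal sketch, not code from utf8_codecvt_facet. The helper names `truncate_to_16_bits` and `to_surrogate_pair` are hypothetical and exist only for illustration. The first shows what happens when a code point above U+FFFF is squeezed into a single 16-bit unit; the second shows how UTF-16 would instead represent it as a surrogate pair.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper: models storing a code point in one 16-bit unit,
// which silently drops everything above bit 15.
std::uint16_t truncate_to_16_bits(std::uint32_t code_point) {
    return static_cast<std::uint16_t>(code_point & 0xFFFFu);
}

// Hypothetical helper: encodes a supplementary code point
// (U+10000..U+10FFFF) as a UTF-16 high/low surrogate pair.
std::pair<std::uint16_t, std::uint16_t> to_surrogate_pair(std::uint32_t code_point) {
    assert(code_point >= 0x10000u && code_point <= 0x10FFFFu);
    const std::uint32_t offset = code_point - 0x10000u;  // 20 significant bits
    return { static_cast<std::uint16_t>(0xD800u + (offset >> 10)),      // top 10 bits
             static_cast<std::uint16_t>(0xDC00u + (offset & 0x3FFu)) }; // low 10 bits
}
```

For U+1D12C, `truncate_to_16_bits(0x1D12C)` yields 0xD12C, matching the truncation described above, while `to_surrogate_pair(0x1D12C)` yields the pair 0xD834, 0xDD2C.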
We tested at some point, and it is UCS-2 by default, and will switch to UTF-16 if support for East Asian languages is enabled.

Teemu