
19 Oct 2004, 6:02 p.m.
In article <e094f9eb041019032718d58d04@mail.gmail.com>, Rogier van Dalen <rogiervd@gmail.com> wrote:
An assumption I think is wrong is that wchar_t would be suitable for Unicode.
wchar_t is to be avoided at all costs. Its size differs from compiler to compiler, and even depends on compiler settings (typically 2 or 4 bytes). The encoding of wchar_t strings is not pinned down by the language standard and also varies from system to system (usually UCS-2 or UCS-4, but there is no guarantee that it is a Unicode encoding at all).
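To make the size point concrete, here is a minimal sketch that asks the compiler how wide wchar_t is and whether one wchar_t can hold every Unicode code point (up to U+10FFFF). The helper names wchar_bits and wchar_holds_any_code_point are made up for illustration and are not part of any proposed library.

```cpp
#include <climits>
#include <cstddef>
#include <limits>

// Width of wchar_t in bits on this compiler and platform.
// The C++ standard leaves this implementation-defined.
constexpr std::size_t wchar_bits() {
    return sizeof(wchar_t) * CHAR_BIT;
}

// True if a single wchar_t can represent any Unicode code point.
// The largest code point is U+10FFFF, which needs 21 bits.
constexpr bool wchar_holds_any_code_point() {
    return static_cast<unsigned long>(std::numeric_limits<wchar_t>::max())
           >= 0x10FFFFul;
}
```

On a typical Windows compiler wchar_t is 16 bits wide, so the second function would be expected to return false there, while most Unix-like toolchains use 32 bits. Even when the width is sufficient, that says nothing about which encoding a given wchar_t string actually uses.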
So unicode::string<unicode::codepoint_string<std::string> > would be a UTF8-encoded string that is manipulated using its characters.
Encoded characters or abstract characters? (See section 2.4 of the Unicode standard for definitions.)

meeroh
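The distinction in the question above matters in practice, and a small sketch may help. The function below counts code points (encoded characters) in a UTF-8 std::string by skipping continuation bytes. The name count_code_points is hypothetical, and the sketch assumes well-formed UTF-8 input and does no validation.

```cpp
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-8 string.
// Every byte that is not a continuation byte (bit pattern 10xxxxxx)
// starts a new code point. Assumes the input is well-formed UTF-8.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For example, "\xC3\xA9" (U+00E9, e with acute) is two bytes and one code point, while "e\xCC\x81" (the letter e followed by U+0301, combining acute accent) is three bytes and two code points. A reader would likely regard both as the same single character, so an interface that iterates "by character" has to say which of these units it means.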