
On 8/19/05 2:33 PM, "Graham" <Graham@system-development.co.uk> wrote:
> Both:
>
>   int_fast32_t char_to_Unicode( char c );
>   int_fast32_t wchar_to_Unicode( wchar_t c );
>
> will require processing of surrogates in order to be Unicode 4 compliant.
I thought of these functions while considering how Wave processes the various phases of C++ translation (see section 2.1 of the Standard). I wanted the conversion to be one native character to one code point because that is how Phase 1 implies it[1]. If you don't think that's right, maybe we should file a defect report with the Standard committee.
> A Unicode library is currently under development that will give access to the surrogate ranges directly from the UCD to allow this to be done properly.
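To make the surrogate issue concrete, here is a minimal sketch of combining a UTF-16 surrogate pair into a single code point. It is not code from Wave or from the Unicode library mentioned above; the helper names (is_high_surrogate, is_low_surrogate, combine_surrogates) and the -1 error return are assumptions made purely for illustration.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not part of any proposed interface.

// True if the UTF-16 code unit lies in the high (leading) surrogate range.
inline bool is_high_surrogate( std::uint_fast32_t u )
{
    return u >= 0xD800 && u <= 0xDBFF;
}

// True if the UTF-16 code unit lies in the low (trailing) surrogate range.
inline bool is_low_surrogate( std::uint_fast32_t u )
{
    return u >= 0xDC00 && u <= 0xDFFF;
}

// Combine a high/low surrogate pair into one code point (U+10000..U+10FFFF).
// Returns -1 (an assumed error convention) if the pair is not well formed.
inline std::int_fast32_t combine_surrogates( std::uint_fast32_t high,
                                             std::uint_fast32_t low )
{
    if ( !is_high_surrogate( high ) || !is_low_surrogate( low ) )
        return -1;
    return static_cast<std::int_fast32_t>(
        0x10000 + ( ( high - 0xD800 ) << 10 ) + ( low - 0xDC00 ) );
}
```

On a platform where wchar_t is 16 bits wide, a function taking a single wchar_t sees only one of the two code units, so a strict one-to-one mapping cannot produce code points above U+FFFF there.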
[1] In other words, any extended native character (i.e. not a character C++ uses for parsing) must be mapped to one C++ Unicode name, which maps to a single code point.

-- 
Daryle Walker
Mac, Internet, and Video Game Junkie
darylew AT hotmail DOT com