Re: [boost] Should we add two simple character-to-Unicode converters?

22 Aug 2005


The "Dataflow Iterators" section of the serialization library contains
iterators for this purpose.  It's been part of Boost since last November.

Robert Ramey

Daryle Walker wrote:
...
On 8/19/05 9:15 AM, "Vladimir Prus" <ghost@cs.msu.su> wrote:
...
Daryle Walker wrote:
...
Nothing fancy, just something like:
    int_fast32_t  char_to_Unicode( char c );
    int_fast32_t  wchar_to_Unicode( wchar_t c );
that converts a native character to a Unicode value.
Maybe, but it's hard to comment as you haven't even explained what
those functions will do. What is a "native character", what is a
"Unicode value", and how will the conversion be done? If the first
function does conversion from a local 8-bit encoding to Unicode then:
"Native characters" are the character set a particular platform uses.
Before the Unicode era, a platform could assume that all text files
used the platform's character set (e.g. MacRoman for pre-X Macs,
Cp-1251 for Windows, Latin-1 for UNIX).  My functions assume a
one-to-one mapping from a native character to a Unicode code-point,
because Phase 1 of C++ translation (see section 2.1 of the standard)
assumes that.
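
To make the one-to-one assumption concrete, here is a minimal sketch of what char_to_Unicode could look like on a platform whose native narrow character set is Latin-1. That choice is an assumption made only because Latin-1 values coincide with the first 256 Unicode code points; a native set such as MacRoman would need a lookup table instead. The use of std::int_fast32_t from <cstdint> is likewise an assumption, not something the thread specifies.

```cpp
#include <cstdint>

// Hypothetical sketch, not the proposed implementation.
// Assumes the native narrow character set is Latin-1, whose 256 values
// are identical to Unicode code points U+0000..U+00FF.
std::int_fast32_t char_to_Unicode(char c)
{
    // Convert through unsigned char first so that bytes >= 0x80 are not
    // sign-extended on platforms where plain char is signed.
    return static_cast<std::int_fast32_t>(static_cast<unsigned char>(c));
}
```

Under these assumptions, char_to_Unicode('A') yields 0x41, and a byte of 0xE9 yields 0xE9 (U+00E9) rather than a negative value.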
...
- do you have a working implementation?
No.  I'm just requesting comments.
...
- isn't dealing with individual characters too slow?
Probably.  Maybe we could add an iterator-copying version.
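
An iterator-copying version along those lines might look like the sketch below. The names copy_to_Unicode and to_code_points are hypothetical, the single-character converter again assumes a Latin-1 native character set, and nothing here is taken from the serialization library's dataflow iterators. It only shows the shape of a range interface; whether it is actually faster than per-character calls would have to be measured.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical single-character converter; assumes Latin-1 as the native
// set, so each byte value is already the Unicode code point.
inline std::int_fast32_t char_to_Unicode(char c)
{
    return static_cast<std::int_fast32_t>(static_cast<unsigned char>(c));
}

// Hypothetical range version: converts every character in [first, last)
// and writes the resulting code points through out.
template <typename InputIt, typename OutputIt>
OutputIt copy_to_Unicode(InputIt first, InputIt last, OutputIt out)
{
    return std::transform(first, last, out,
                          [](char c) { return char_to_Unicode(c); });
}

// Convenience wrapper (also hypothetical) that collects the code points
// of a whole string into a vector.
inline std::vector<std::int_fast32_t> to_code_points(const std::string& s)
{
    std::vector<std::int_fast32_t> result;
    result.reserve(s.size());
    copy_to_Unicode(s.begin(), s.end(), std::back_inserter(result));
    return result;
}
```

Taking an output iterator keeps the function usable with any destination, in the same style as std::copy.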