Should we add two simple character-to-Unicode converters? - Boost - lists.preview.boost.org


Daryle Walker

19 Aug 2005

1:02 p.m.

Nothing fancy, just something like:

int_fast32_t char_to_Unicode( char c );
int_fast32_t wchar_to_Unicode( wchar_t c );

that converts a native character to a Unicode value. They would need a separate source file that contains #if blocks for each platform. Maybe they can start a namespace with "utf8_codecvt_facet"?

--
Daryle Walker
Mac, Internet, and Video Game Junkie
darylew AT hotmail DOT com
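For illustration only, here is a minimal sketch of how the two proposed functions might look. It is not an implementation from this thread. It assumes the native narrow encoding is ISO 8859-1 (Latin-1), whose values coincide with Unicode code points U+0000..U+00FF, and that wchar_t already holds a Unicode code point. The macro EXAMPLE_NATIVE_ENCODING_LATIN1 is a hypothetical name standing in for the per-platform #if blocks mentioned above.

```cpp
#include <cstdint>

// Hypothetical configuration macro (not part of Boost): selects the
// native narrow encoding. A real version would have one branch per platform.
#ifndef EXAMPLE_NATIVE_ENCODING_LATIN1
#define EXAMPLE_NATIVE_ENCODING_LATIN1 1
#endif

std::int_fast32_t char_to_Unicode(char c)
{
#if EXAMPLE_NATIVE_ENCODING_LATIN1
    // Latin-1 values 0x00..0xFF map to U+0000..U+00FF unchanged.
    // Convert through unsigned char so that a signed char with the
    // high bit set does not become a negative value.
    return static_cast<std::int_fast32_t>(static_cast<unsigned char>(c));
#else
#error "No native-to-Unicode mapping defined for this platform"
#endif
}

std::int_fast32_t wchar_to_Unicode(wchar_t c)
{
    // Assumption: wchar_t already stores a Unicode code point. With a
    // 16-bit wchar_t this only covers non-surrogate BMP values.
    return static_cast<std::int_fast32_t>(c);
}
```

Encodings such as MacRoman would need a 128-entry lookup table for the upper half instead of the identity mapping used here.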


Vladimir Prus

19 Aug

1:15 p.m.


Daryle Walker wrote:

Nothing fancy, just something like:

int_fast32_t char_to_Unicode( char c );
int_fast32_t wchar_to_Unicode( wchar_t c );

that converts a native character to a Unicode value.

Maybe, but it's hard to comment, as you haven't explained what those functions will do. What is a "native character", what is a "Unicode value", and how will the conversion be done? If the first function converts from the local 8-bit encoding to Unicode, then:

- do you have a working implementation?
- isn't dealing with individual characters too slow?

- Volodya


Daryle Walker

22 Aug

10:57 a.m.


On 8/19/05 9:15 AM, "Vladimir Prus" <ghost@cs.msu.su> wrote:

Daryle Walker wrote:

...
Nothing fancy, just something like:

int_fast32_t char_to_Unicode( char c );
int_fast32_t wchar_to_Unicode( wchar_t c );

that converts a native character to a Unicode value.

Maybe, but it's hard to comment, as you haven't explained what those functions will do. What is a "native character", what is a "Unicode value", and how will the conversion be done? If the first function converts from the local 8-bit encoding to Unicode, then:

"Native characters" are the characters of the character set a particular platform uses. Before the Unicode era, a platform could assume that all text files used the platform's character set (e.g. MacRoman for pre-X Macs, Cp-1251 for Windows, Latin-1 for UNIX). My functions assume a one-to-one mapping from a native character to a Unicode code point, because Phase 1 of C++ translation (see section 2.1 of the standard) assumes that.

- do you have a working implementation?

No. I'm just requesting comments.

- isn't dealing with individual characters too slow?

Probably. Maybe we could add an iterator-copying version.

--
Daryle Walker
Mac, Internet, and Video Game Junkie
darylew AT hotmail DOT com
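As a rough sketch of what an iterator-copying version could look like (nothing here comes from an actual Boost component): the name copy_to_Unicode is hypothetical, and the per-character function is a stand-in that assumes a Latin-1 native encoding, where each byte value equals its Unicode code point.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <string>
#include <vector>

// Stand-in for the proposed per-character converter.
// Assumption: the native encoding is Latin-1 (identity mapping).
inline std::int_fast32_t char_to_Unicode(char c)
{
    return static_cast<std::int_fast32_t>(static_cast<unsigned char>(c));
}

// Hypothetical range version: converts every character in [first, last)
// and writes the code points to out. Returns the advanced output iterator.
template <typename InputIt, typename OutputIt>
OutputIt copy_to_Unicode(InputIt first, InputIt last, OutputIt out)
{
    return std::transform(first, last, out, char_to_Unicode);
}

// Convenience wrapper showing typical use with a std::string.
inline std::vector<std::int_fast32_t> to_code_points(const std::string& s)
{
    std::vector<std::int_fast32_t> result;
    result.reserve(s.size());
    copy_to_Unicode(s.begin(), s.end(), std::back_inserter(result));
    return result;
}
```

A range interface like this lets the platform check happen once per call rather than once per character, which is one way to address the speed concern raised above.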


Robert Ramey

3:30 p.m.


The "Dataflow Iterators" section of the serialization library contains iterators for this purpose. It's been part of Boost since last November.

Robert Ramey

Daryle Walker wrote:

On 8/19/05 9:15 AM, "Vladimir Prus" <ghost@cs.msu.su> wrote:

...
Daryle Walker wrote:

...
Nothing fancy, just something like:

int_fast32_t char_to_Unicode( char c );
int_fast32_t wchar_to_Unicode( wchar_t c );

that converts a native character to a Unicode value.

Maybe, but it's hard to comment, as you haven't explained what those functions will do. What is a "native character", what is a "Unicode value", and how will the conversion be done? If the first function converts from the local 8-bit encoding to Unicode, then:

"Native characters" are the characters of the character set a particular platform uses. Before the Unicode era, a platform could assume that all text files used the platform's character set (e.g. MacRoman for pre-X Macs, Cp-1251 for Windows, Latin-1 for UNIX). My functions assume a one-to-one mapping from a native character to a Unicode code point, because Phase 1 of C++ translation (see section 2.1 of the standard) assumes that.

...
- do you have a working implementation?

No. I'm just requesting comments.

...
- isn't dealing with individual characters too slow?

Probably. Maybe we could add an iterator-copying version.

