Re: [boost] Strings tagged with their character set

27 Sep 2007

On 9/27/07, Sebastian Redl <sebastian.redl@getdesigned.at> wrote:
> ...
> That has nothing to do with what basic_string<wchar_t> is, though,
> because that state is to be used when converting the string to an
> external encoding.

Well, clearly that state needs to know what the internal encoding is in the
first place, so they are related in some way. Ideally, I'd like to be able
to use the state_type for basic_string as one half of the shift state, with
the other half being the state_type for the target (say, an output stream).
Put those two together, and we could build a codecvt facet of the form:

   codecvt<internal_char, external_char,
           conversion_pair<internal_state_type, external_state_type> >
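
To make the shape of that facet type concrete, here is a minimal sketch. Everything except std::codecvt is hypothetical and invented for illustration: conversion_pair's members, the utf16_state/utf8_state structs, the two traits classes, and the paired_codecvt helper. It only names the facet type; the standard library provides just a handful of codecvt specializations, so an actual facet with this state type would still have to be written.

```cpp
#include <locale>       // std::codecvt (only named here, never instantiated)
#include <type_traits>  // std::is_same, used when checking the result

// Hypothetical: one half is the shift state of the internal (string)
// encoding, the other half is the shift state of the external (stream) one.
template <typename InternalState, typename ExternalState>
struct conversion_pair {
    InternalState internal;
    ExternalState external;
};

// Hypothetical per-encoding state types (assumed, not from any library).
struct utf16_state { char16_t pending_surrogate; };
struct utf8_state  { unsigned char bytes_remaining; };

// Hypothetical traits classes that tag a character type with its encoding
// by way of state_type, in the spirit of char_traits<Ch>::state_type.
struct utf16_traits {
    typedef char16_t    char_type;
    typedef utf16_state state_type;
};
struct utf8_traits {
    typedef char       char_type;
    typedef utf8_state state_type;
};

// Hypothetical helper: given the string's traits and the stream's traits,
// compute the codecvt type of the form described above.
template <typename InternalTraits, typename ExternalTraits>
struct paired_codecvt {
    typedef conversion_pair<typename InternalTraits::state_type,
                            typename ExternalTraits::state_type> state_type;
    typedef std::codecvt<typename InternalTraits::char_type,
                         typename ExternalTraits::char_type,
                         state_type> type;
};
```

With these assumed definitions, paired_codecvt<utf16_traits, utf8_traits>::type is codecvt<char16_t, char, conversion_pair<utf16_state, utf8_state> >, which a converting stream could look up or construct for a given string/stream pairing.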

Doing it this way obviously wouldn't work with any of the I/O streams now,
since they require the char_traits to be the same, but perhaps we could
define a converting_stream type that creates the above codecvt facet
automatically and handles conversion.

In some sense, I suppose this is an abuse of char_traits<Ch>::state_type,
but I think it's the most backwards-compatible way. That is, if the internal
(string) and external (stream) encodings were the same, I/O would behave
like it does now (only hopefully in a more predictable/useful fashion when
dealing with Unicode).

- James

James Porter