
Perhaps I'm misunderstanding the purpose of the state_type typedef in char_traits. It seems that it's used for two things: to specify the type that will hold the actual shift state for encodings that require it, and to specify a codecvt facet for the encoding in question (to read/write it from/to a stream of bytes). The latter part is what I'm focusing on.

Appendix D of "The C++ Programming Language" said of codecvt: "The State template argument is the type used to hold the shift state of the stream being converted. State can also be used to identify different conversions by specifying a specialization."

I probably should have been clearer that I was referring to the state type and not the shift state itself. What I meant was that, if you defined a shift state as class JISstate { ... };, you would need to specialize codecvt to convert a Shift JIS encoding on disk to a *particular* encoding in memory (say UTF-16). You'd need a different specialization of codecvt to convert to UTF-8. Hopefully this explains my position better, and I apologize if I caused needless confusion.

This may not even be the best way, but with a converting_stream class, we could do the following:

- create a converting_ifstream with char_traits<Ch>::state_type of JISstate
- create a string with char_traits<Ch>::state_type of UTF8
- (automatically) build a codecvt facet with a state_type of conversion_pair<JISstate,UTF8>
- the conversion_pair would take bytes encoded as Shift JIS, convert them to a Unicode code point, and convert that to UTF-8 byte(s)
- read data from the converting_ifstream to the string
- the codecvt facet would then run the conversion from conversion_pair, resulting in a UTF-8 encoded string from Shift JIS data on disk

This could then be extended to UTF-16 simply by creating a state_type class for it and specifying a conversion between Unicode code points and UTF-16.

Like I said, this may not be the best way, but hopefully it at least explains my idea better.
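To make the conversion_pair idea a bit more concrete, here is a rough sketch of the "pivot through a code point" step only. Everything in it is hypothetical: the tag types, the decode/encode member functions, and the convert() helper are names made up for illustration, not an existing Boost or standard library interface. Latin1 stands in for JISstate, since a real Shift JIS decoder would need a mapping table and multi-byte handling that don't fit in a short example. The stream and codecvt plumbing is left out.

```cpp
#include <cstddef>
#include <string>

// Hypothetical source-encoding tag (stand-in for JISstate).
// Latin-1 bytes map directly to code points U+0000..U+00FF.
struct Latin1 {
    using unit = char;
    // Decode one code point starting at in[pos] and advance pos.
    static char32_t decode(const std::string& in, std::size_t& pos) {
        return static_cast<unsigned char>(in[pos++]);
    }
};

// Hypothetical target-encoding tag: code point -> UTF-8 bytes.
struct UTF8 {
    using unit = char;
    static void encode(char32_t cp, std::string& out) {
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
};

// Hypothetical target-encoding tag: code point -> UTF-16 code units.
struct UTF16 {
    using unit = char16_t;
    static void encode(char32_t cp, std::u16string& out) {
        if (cp < 0x10000) {
            out += static_cast<char16_t>(cp);
        } else {
            cp -= 0x10000;  // split into a surrogate pair
            out += static_cast<char16_t>(0xD800 | (cp >> 10));
            out += static_cast<char16_t>(0xDC00 | (cp & 0x3FF));
        }
    }
};

// The pair only glues the two halves together: decode with From,
// encode with To, one code point at a time.
template <class From, class To>
struct conversion_pair {
    static std::basic_string<typename To::unit>
    convert(const std::basic_string<typename From::unit>& in) {
        std::basic_string<typename To::unit> out;
        std::size_t pos = 0;
        while (pos < in.size()) {
            To::encode(From::decode(in, pos), out);
        }
        return out;
    }
};
```

With this shape, conversion_pair<Latin1, UTF8>::convert(bytes) and conversion_pair<Latin1, UTF16>::convert(bytes) share the same decoder, and adding a new target only means writing one more encode function.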
- James

On 9/27/07, Sebastian Redl <sebastian.redl@getdesigned.at> wrote:
> James Porter wrote:
>>> That has nothing to do with what basic_string<wchar_t> is, though, because that state is to be used when converting the string to an external encoding.
>>
>> Well, clearly that state needs to know what the internal encoding is in the first place,
>
> No, why? What difference does it make to the shift state of Shift-JIS whether you convert to this encoding from UTF-8 or UTF-16?
>
> Sebastian Redl