
On 05/24/2010 11:15 PM, Artyom wrote:
Well, that's not exactly true. mbstate_t is defined by the C standard, and indeed, it says pretty much nothing about its nature, except that it's not an array. But on any platform I worked with (including Windows) it's an integer.
Under Linux it is a structure, and AFAIK gcc uses iconv for conversion.
So I'm not sure how safe it is to write anything to it.
Ah, right. I forgot about Linux. But still it's POD and can hold an integral value. How it is used by the standard facet is not relevant as long as you don't interchange states between your facet and the standard one.
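To make the "POD that can hold an integral value" point concrete, here is a minimal sketch. The helper names `store_state` and `load_state` are hypothetical, and it assumes that mbstate_t is at least as large as an unsigned int on the target platform (checked at compile time) and that the state is never passed to the standard facet or the C conversion functions afterwards.

```cpp
#include <cassert>
#include <cstring>
#include <cwchar>

// Hypothetical helper: stash a small integer in the raw bytes of the state.
// Assumption: mbstate_t is big enough; the static_assert rejects platforms
// where it is not.
inline void store_state(std::mbstate_t& st, unsigned value) {
    static_assert(sizeof(std::mbstate_t) >= sizeof(unsigned),
                  "mbstate_t too small to hold an unsigned int");
    std::memcpy(&st, &value, sizeof(value));
}

// Hypothetical helper: read the integer back from the same bytes.
inline unsigned load_state(const std::mbstate_t& st) {
    unsigned value;
    std::memcpy(&value, &st, sizeof(value));
    return value;
}
```

Copying through memcpy avoids depending on the member layout of mbstate_t, which differs between platforms.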
The C++ standard does give some hints regarding how the conversion state shall be handled by the stream. In particular, it specifies that the state will be value-initialized at the beginning of the conversion, and that the stream will call `unshift` at the end of the conversion in order to finalize the converted character sequence and return the state to its initial value.
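Roughly, the end-of-conversion step looks like the sketch below. It uses the standard codecvt<wchar_t, char, mbstate_t> facet from the classic locale; the helper name `finish_sequence` is hypothetical, and the result code returned by `unshift` (ok or noconv) may differ between standard library implementations.

```cpp
#include <cassert>
#include <cwchar>
#include <locale>

typedef std::codecvt<wchar_t, char, std::mbstate_t> codecvt_type;

// Hypothetical helper: ask the facet for whatever bytes are needed to
// terminate the sequence, as a stream would do at the end of output.
inline std::codecvt_base::result finish_sequence(std::mbstate_t& st,
                                                 char* buf, char* buf_end,
                                                 char*& next) {
    const codecvt_type& cvt =
        std::use_facet<codecvt_type>(std::locale::classic());
    return cvt.unshift(st, buf, buf_end, next);
}
```

A caller would start from `std::mbstate_t st = std::mbstate_t();`, run `out` over the data, and then call `finish_sequence` with the remaining space in its output buffer.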
I was thinking about it, but unfortunately the standard does not specify how mbstate_t is initialized. If I could assume that it is at least a POD filled with zeros I could do something, but I actually can't.
It is POD since it's defined by the C standard.
At least I didn't find any reference for this.
The C standard says that a zero-valued mbstate_t shall count as an initial state. From n1256:

    7.24.6 Extended multibyte/wide character conversion utilities
    ...
    3 The initial conversion state corresponds, for a conversion in either direction, to the beginning of a new multibyte character in the initial shift state. A zero-valued mbstate_t object is (at least) one way to describe an initial conversion state. A zero-valued mbstate_t object can be used to initiate conversion involving any multibyte character sequence, in any LC_CTYPE category setting.
    ...

Also, there is the mbsinit function, which lets you detect whether the state has an initial value (just in case there are initial values other than the zero-filled one).

Next, for do_in/do_out the C++ standard says (22.2.1.5.2):

    1 Preconditions: [...] state initialized, if at the beginning of a sequence, or else equal to the result of converting the preceding characters in the sequence.

Further on, in paragraph 5 (regarding do_unshift), there is a footnote explaining that the method is intended to return the state to the initial value (typically, stateT()).
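A quick way to check the zero-valued case on a given platform is something like the following sketch. The function name `zero_state_is_initial` is hypothetical; it relies only on value-initialization and on mbsinit from <cwchar>.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical helper: value-initialize a state (which zero-fills a POD)
// and ask the C library whether it describes an initial conversion state.
inline bool zero_state_is_initial() {
    std::mbstate_t st = std::mbstate_t();
    return std::mbsinit(&st) != 0;
}
```

Per the quoted 7.24.6 wording this should return true on a conforming implementation.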