
Hello,
Yes. Are you using the latest 2.x version from the SourceForge site, or did you take "/trunk"? I ask because the latest Boost.Locale sits in its own branch, "rework".
I didn't use your library.
Ahh, I see. I do the following: when I read, for example, 4 bytes of UTF-8 that decode to a code point > 0xFFFF:
1. I write the first surrogate of the pair to the output stream, update the state to reflect that the first half of the pair was written, and **I do not consume the input**.
2. I read the same 4 UTF-8 bytes again, see that the state marks the first half of the pair as already written, so I write the second surrogate and consume the input.
So do_in is actually called twice for the same input.
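To make the two steps above concrete, here is a minimal sketch of the idea outside of any codecvt facet. The names PairState, decode4 and step are hypothetical and are not taken from Boost.Locale; the sketch assumes a well-formed 4-byte sequence and handles nothing else.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical conversion state: 0 = nothing pending,
// 1 = the high surrogate of the current pair was already written.
struct PairState {
    std::uint16_t high_written = 0;
};

// Decode one 4-byte UTF-8 sequence (assumed well-formed) to a code point.
inline std::uint32_t decode4(const unsigned char* p) {
    assert((p[0] & 0xF8u) == 0xF0u);  // lead byte must look like 11110xxx
    return ((p[0] & 0x07u) << 18) | ((p[1] & 0x3Fu) << 12) |
           ((p[2] & 0x3Fu) << 6) | (p[3] & 0x3Fu);
}

// One conversion step. Writes exactly one UTF-16 code unit to *out and
// returns how many input bytes were consumed: 0 on the first pass, 4 on the second.
inline std::size_t step(const unsigned char* in, PairState& st, char16_t* out) {
    const std::uint32_t v = decode4(in) - 0x10000u;
    if (!st.high_written) {
        *out = static_cast<char16_t>(0xD800u + (v >> 10));  // high surrogate
        st.high_written = 1;  // remember that half of the pair is out
        return 0;             // input is NOT consumed
    }
    *out = static_cast<char16_t>(0xDC00u + (v & 0x3FFu));   // low surrogate
    st.high_written = 0;      // pair complete, back to the initial state
    return 4;                 // now the 4 bytes are consumed
}
```

Calling step twice on the same 4 bytes yields the high and then the low surrogate, and only the second call advances the input, which mirrors the "called twice for the same input" behaviour described above.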
But then, looking at your library, you seem to do some weird (and dangerous!) reinterpret casting,
which suggests you're not making the fstream interface directly with a std::codecvt<wchar_t, char, std::mbstate_t> facet.
Actually, mbstate_t is a POD type that should be initialized to 0. I must make sure that sizeof(mbstate_t) >= 2, and then I use it as temporary storage for the state, so it is fine to use it that way. The biggest problem is that the standard says nothing about mbstate_t except that it is a POD and is initialized to 0, and that is all I rely on.
Artyom
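As an illustration of the storage trick described above, a minimal sketch could look like this. The helpers load_state and store_state are hypothetical names, and the use of std::memcpy instead of a cast is a choice made for this sketch, not necessarily what the library does.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <cwchar>

// The approach only works if mbstate_t can hold the 2 bytes of state.
static_assert(sizeof(std::mbstate_t) >= sizeof(std::uint16_t),
              "mbstate_t is too small to hold a 2-byte state");

// Read the first 2 bytes of the mbstate_t as our own state value.
inline std::uint16_t load_state(const std::mbstate_t& st) {
    std::uint16_t v;
    std::memcpy(&v, &st, sizeof v);
    return v;
}

// Write our own state value into the first 2 bytes of the mbstate_t.
inline void store_state(std::mbstate_t& st, std::uint16_t v) {
    std::memcpy(&st, &v, sizeof v);
}
```

A zero-initialized mbstate_t then reads back as 0, which can serve as the "nothing pending" state, and any nonzero value can mark that the first surrogate was already written.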