On Wed, Oct 7, 2009 at 11:58 AM, Igor R
Hello,
I am trying to accomplish what the subject line describes with the help of Boost's utf8_codecvt_facet. I based my code on this example: http://www.boost.org/doc/libs/1_40_0/libs/serialization/doc/codecvt.html . The only difference is that my UTF-8 text resides in a std::string:
#include <sstream>
#include <iostream>
#include <locale>
#include <string>
#include "boost/archive/detail/utf8_codecvt_facet.hpp"
// link with boost/libs/serialization/src/utf8_codecvt_facet.cpp

int main()
{
    std::string utf;
    utf.resize(11);
    // hardcode some utf8 text
    utf[0] = 0xd7; utf[1] = 0x90;
    utf[2] = 0xd7; utf[3] = 0x99;
    utf[4] = 0xd7; utf[5] = 0x92;
    utf[6] = 0xd7; utf[7] = 0x95;
    utf[8] = 0xd7; utf[9] = 0xa8;
    utf[10] = 0x0;

    std::locale old_locale;
    std::locale utf8_locale(old_locale, new boost::archive::detail::utf8_codecvt_facet());
    std::locale::global(utf8_locale);

    std::stringstream in;
    in.imbue(utf8_locale);
    in.str(utf);

    std::wstringstream out;
    out << in;
    std::wcout << out.str() << std::endl;
}
The above code doesn't work: the "out" buffer doesn't contain the correct Unicode interpretation of the string. Actually, all I want is a C++ equivalent of the following WinAPI code:
#include <string>
#include "windows.h"

int main()
{
    std::string utf;
    utf.resize(11);
    // hardcode some utf8 text
    utf[0] = 0xd7; utf[1] = 0x90;
    utf[2] = 0xd7; utf[3] = 0x99;
    utf[4] = 0xd7; utf[5] = 0x92;
    utf[6] = 0xd7; utf[7] = 0x95;
    utf[8] = 0xd7; utf[9] = 0xa8;
    utf[10] = 0x0;

    wchar_t outBuff[11];
    MultiByteToWideChar(CP_UTF8, 0, utf.c_str(), -1, outBuff, 10);
}
...which works well.
Any idea would be greatly appreciated!
Er... I thought UTF8 *is* a form of Unicode? Looks like you are trying to convert UTF8 to UTF16, for what reason?
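
For reference, two things in the stream-based attempt stand out. A std::stringstream buffer does not run its contents through the imbued codecvt facet (that conversion is done by file stream buffers), and "out << in" inserts the stream object itself rather than its contents. A portable alternative is to decode the bytes directly. The following is only a minimal sketch under stated assumptions, not Boost's implementation: utf8_to_wstring and append_code_point are hypothetical helper names, and the result is UTF-16 where wchar_t is 16 bits (as on Windows, matching what MultiByteToWideChar produces) and UTF-32 where it is 32 bits.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: append one Unicode code point to a wide string.
// Where wchar_t is 16 bits, code points above U+FFFF become a surrogate pair.
inline void append_code_point(std::wstring& out, unsigned long cp)
{
    assert(cp <= 0x10FFFFul);
    if (sizeof(wchar_t) == 2 && cp > 0xFFFFul) {
        cp -= 0x10000ul;
        out.push_back(static_cast<wchar_t>(0xD800ul + (cp >> 10)));
        out.push_back(static_cast<wchar_t>(0xDC00ul + (cp & 0x3FFul)));
    } else {
        out.push_back(static_cast<wchar_t>(cp));
    }
}

// Hypothetical helper (not part of Boost or the standard library):
// decode UTF-8 bytes into a std::wstring, throwing on malformed input.
inline std::wstring utf8_to_wstring(const std::string& utf8)
{
    // Smallest code point allowed for each sequence length (rejects overlong forms).
    static const unsigned long min_cp[4] = { 0x0ul, 0x80ul, 0x800ul, 0x10000ul };

    std::wstring out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        unsigned long cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;         extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1Fu; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0Fu; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07u; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (extra > utf8.size() - i - 1)
            throw std::runtime_error("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3Fu);
        }

        if (cp < min_cp[extra] || cp > 0x10FFFFul ||
            (cp >= 0xD800ul && cp <= 0xDFFFul))
            throw std::runtime_error("invalid code point in UTF-8 input");

        append_code_point(out, cp);
        i += extra + 1;
    }
    return out;
}
```

With the ten bytes from the original post (0xd7 0x90 ... 0xd7 0xa8, without the trailing zero), utf8_to_wstring should yield the five code points U+05D0, U+05D9, U+05D2, U+05D5, U+05E8. Whether std::wcout then displays them correctly still depends on the console and the locale in use.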