
Hi,

I wrote a wrapper around John Maddock's unicode iterators (thanks to Tomas for the hint), which provides a std::string-like interface for accessing utf8, utf16 or utf32 encoded strings.

<example>
utf8_string u8("unicode string"); // construct from a utf8 encoded char[]
u8 += 0x0020; // append code points: space
u8 += 0x0391; // alpha
u8 += 0x0392; // beta
u8 += 0x0393; // gamma
std::cout << u8.raw() << std::endl; // access the encoded string
std::copy(u8.begin(), u8.end(), std::ostream_iterator<utf32_char>(std::cout, ", "));
std::cout << std::endl;
utf32_string u32 = u8; // assign and convert to utf32
std::copy(u32.begin(), u32.end(), std::ostream_iterator<utf32_char>(std::cout, ", "));
std::cout << std::endl;
</example>

The wrapper can be extended to support additional encodings like latin-1 or windows-1252 by providing encode and decode iterators.

The source for the wrapper:
http://opensource.nicai-systems.com/unicode/unicode.h

And some test code:
http://opensource.nicai-systems.com/unicode/test_unicode.cpp

If there is any interest, I can extend the code to support more std::basic_string methods and add additional encodings...

Regards,
Nils
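
P.S. To give a rough idea of what the decode and encode steps for an additional encoding such as latin-1 boil down to, here is a minimal standalone sketch. It does not use the wrapper or the iterator interface from unicode.h; the names append_utf8 and latin1_to_utf8 are made up for this illustration. Decoding latin-1 is trivial because every byte is its own code point, so the only real work is the utf8 encode step.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of unicode.h): append one code point as utf8.
void append_utf8(std::string& out, std::uint32_t cp)
{
    if (cp < 0x80) {
        // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // Surrogates are not valid code points on their own.
        if (cp >= 0xD800 && cp <= 0xDFFF)
            throw std::invalid_argument("surrogate code point");
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        throw std::invalid_argument("code point out of range");
    }
}

// Hypothetical helper: decode latin-1 (byte value == code point), re-encode as utf8.
std::string latin1_to_utf8(const std::string& latin1)
{
    std::string out;
    for (unsigned char byte : latin1)
        append_utf8(out, byte);
    return out;
}
```

For example, the single latin-1 byte 0xE9 (e with acute accent) comes out as the two utf8 bytes 0xC3 0xA9. An encoding like windows-1252 would need a small lookup table in the decode step instead of the identity mapping.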