
I was talking about the **codecvt facet only**, i.e. conversion via a locale imbued into a file stream. That restriction comes from a limitation in the definition of the codecvt facet, that's it.
What is the concrete limitation of the codecvt specification that prevents creating a codecvt facet that converts UTF-16 to and from UTF-8? I just re-read 22.2.1.5 but wasn't able to see it.
Direct conversion functions like to_utf/from_utf do not have this limitation.
All of Boost.Locale supports wide characters, and it supports characters outside the BMP in UTF-16 encoded strings. If it didn't, it would be absolutely useless software.
Good to hear. Yes, I agree it's very important. boost::detail::utf8_codecvt_facet fails that test, at least on Windows, but I'm wondering: what is the fundamental restriction that keeps it from being patched to support them?
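For readers following the BMP point: a code point above U+FFFF takes two UTF-16 code units (a surrogate pair), so a converter that treats each 16-bit unit as one complete character cannot represent it. Below is a minimal sketch of the standard surrogate-pair arithmetic. The function name `encode_utf16` is hypothetical and is not taken from Boost.Locale or from utf8_codecvt_facet.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: encode one Unicode scalar value as UTF-16.
// Code points up to U+FFFF use one unit; higher ones use a surrogate pair.
std::u16string encode_utf16(char32_t cp) {
    // Reject values above U+10FFFF and the surrogate range itself.
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        throw std::invalid_argument("not a Unicode scalar value");
    }
    std::u16string out;
    if (cp < 0x10000) {
        out.push_back(static_cast<char16_t>(cp));
    } else {
        char32_t v = cp - 0x10000;  // 20 significant bits remain
        out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));    // high surrogate
        out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));  // low surrogate
    }
    return out;
}
```

With this sketch, U+1F600 comes out as the two units D83D DE00, while U+20AC stays a single unit.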
So don't worry...
As a matter of fact, Boost.Locale supports:
- narrow (normal) characters: char for 8-bit locales like ISO-8859-8.
- narrow (normal) characters: char for variable-length locales like UTF-8 or even Shift-JIS.
- wide characters: wchar_t for both UTF-16 (Windows) and UTF-32 (POSIX) encodings.
- C++0x char16_t/char32_t for UTF-16/UTF-32, if available.
For the original (non-compliance) point I raised, it would be interesting to see how well codecvt<char32_t, char, std::mbstate_t> is going to be implemented on Windows :) BTW, I see some interesting additions to the codecvt facets in N3090, 22.5. Any plans to implement them in Boost.Locale?
And of course all of Unicode, from 0 to 10FFFF, is supported. Everything is fully supported.
The only issue that exists is codepage conversion via the standard std::codecvt facet, due to its limitation.
And BTW, I do not recommend using it widely, as it has some other issues as well.
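To make the "0 to 10FFFF" range concrete, here is a minimal sketch of UTF-8 encoding for a single code point. It shows that the full range needs between one and four bytes. The name `encode_utf8` is hypothetical; this illustrates the encoding itself, not how Boost.Locale implements it.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: encode one Unicode scalar value (0..10FFFF) as UTF-8.
std::string encode_utf8(char32_t cp) {
    // Reject values above U+10FFFF and UTF-16 surrogate code points.
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        throw std::invalid_argument("not a Unicode scalar value");
    }
    std::string out;
    if (cp < 0x80) {                 // 1 byte: 0xxxxxxx
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {         // 2 bytes: 110xxxxx 10xxxxxx
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {       // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {                         // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
    return out;
}
```

The four-byte branch is the one that covers everything outside the BMP, up to and including U+10FFFF.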
The non-iterator interface is a real pain when using codecvt, I admit.
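As an illustration of that pointer-based interface, here is a rough sketch that calls std::codecvt<wchar_t, char, std::mbstate_t>::in directly on the classic "C" locale. The wrapper name `widen_with_codecvt` is hypothetical, and the sketch assumes the output never needs more elements than the input has bytes. Only plain ASCII input is assumed to round-trip here, since what the "C" locale does with other bytes is implementation-specific.

```cpp
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical wrapper showing how many pointers a single codecvt::in call needs.
std::wstring widen_with_codecvt(const std::string& in) {
    typedef std::codecvt<wchar_t, char, std::mbstate_t> facet_type;
    const facet_type& cvt = std::use_facet<facet_type>(std::locale::classic());

    std::mbstate_t state = std::mbstate_t();  // initial conversion state
    std::wstring out(in.size(), L'\0');       // assumption: at most one wchar_t per byte

    const char* from = in.data();
    const char* from_end = from + in.size();
    const char* from_next = from;             // set to the first unconsumed byte
    wchar_t* to = &out[0];
    wchar_t* to_end = to + out.size();
    wchar_t* to_next = to;                    // set to one past the last written element

    std::codecvt_base::result r =
        cvt.in(state, from, from_end, from_next, to, to_end, to_next);
    if (r == std::codecvt_base::error) {
        throw std::runtime_error("codecvt::in reported a conversion error");
    }
    out.resize(static_cast<std::size_t>(to_next - to));
    return out;
}
```

Three pairs of raw pointers plus an explicit state object for one conversion is what makes the interface awkward compared with an iterator-based or string-returning function.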
Best, Artyom
Best Regards, Gevorg