
________________________________
From: Andrey Semashev <andrey.semashev@gmail.com>
To: boost@lists.boost.org
Sent: Saturday, March 2, 2013 12:56 PM
Subject: [boost] [locale] Composing asymmetric locale for character encoding conversion
Hi,
Suppose I have a logging application that writes log records in wide (wchar_t, UTF-16) and narrow (char, UTF-8) encodings and I want these logs to be stored in a UTF-16LE encoded file. For simplicity, let's assume that I write log files with std::wofstream. Now, the standard says that the file stream buffer is supposed to convert wide characters to byte sequences using the locale imbued into the buffer.
Generally this is done by the codecvt facet, but it is designed to convert wide characters to an 8-bit encoding and vice versa.
However, it seems that the locale should be the same as the one imbued into the stream (basic_ostream::imbue makes sure of that).
No, you can install your own codecvt facet into an existing locale object and then imbue that locale into the stream.
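For illustration, installing a facet into an existing locale and imbuing it could look roughly like the sketch below. It assumes that the standard std::codecvt_utf16 facet from <codecvt> is good enough for the job (it is not mentioned above and has been deprecated since C++17), and the helper name write_utf16le is made up for the example.

```cpp
#include <cassert>
#include <codecvt>   // std::codecvt_utf16 (assumption: still shipped by your standard library)
#include <fstream>
#include <locale>
#include <string>

// Hypothetical helper: writes `text` to `path` as UTF-16LE (no BOM) by
// replacing the codecvt facet of the file stream's locale.
// Returns false if the file could not be opened or written.
inline bool write_utf16le(const std::string& path, const std::wstring& text)
{
    assert(!path.empty());

    typedef std::codecvt_utf16<wchar_t, 0x10ffff, std::little_endian> facet_type;

    // Copy an existing locale and add the facet; the locale owns the facet
    // and deletes it when the last copy of the locale goes away.
    std::locale loc(std::locale::classic(), new facet_type);

    std::wofstream out;
    out.imbue(loc);  // imbue before opening, i.e. before any I/O has happened

    // Binary mode keeps the runtime from rewriting 0x0A bytes inside code units.
    out.open(path.c_str(), std::ios::out | std::ios::binary | std::ios::trunc);
    if (!out)
        return false;

    out << text;
    out.flush();
    return out.good();
}
```

Calling write_utf16le("log.bin", L"Hi") should leave the four bytes 48 00 69 00 in the file. This only covers the wide side: narrow UTF-8 input still has to be widened by something else before it reaches the stream.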
What this leads to is that in order to achieve my goal the locale should be able to convert narrow characters of UTF-8 to wide characters of UTF-16 and wide characters of UTF-16 to narrow characters representing byte sequence of UTF16LE. Is it possible to make such an asymmetric locale with Boost.Locale? Or maybe there is another way of doing this?
No. What you are probably looking for is an interface that provides both `std::basic_ostream<char>` and `std::basic_ostream<wchar_t>`, and then you implement your own stream buffer that does the conversion.
An additional question. Is it possible to achieve my goal with std::ofstream (as opposed to std::wofstream)?
No, you will need:
1. Two different streams, one wide and one narrow.
2. Your own custom stream buffer that converts the input characters to your arbitrary encoding.
You'd better start from Boost.Iostreams (boost::iostreams) and use the boost::locale::utf::* functions for character set manipulation.
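To make the two points above a bit more concrete, here is a rough sketch in plain standard C++. It is not built on Boost.Iostreams, and a small hand-written UTF-8 decoder stands in for the boost::locale::utf helpers so that it has no dependencies. All names (put_utf16le, utf8_front_buf, wide_front_buf, utf16le_log) are invented for the example, and a std::string plays the role of the UTF-16LE file.

```cpp
#include <cassert>
#include <ostream>
#include <streambuf>
#include <string>

// Appends one code point (or a lone UTF-16 code unit) as UTF-16LE bytes.
inline void put_utf16le(std::string& out, unsigned long cp)
{
    assert(cp <= 0x10FFFF);  // callers replace anything larger with U+FFFD
    unsigned long units[2];
    int n = 1;
    if (cp >= 0x10000) {     // outside the BMP: encode as a surrogate pair
        cp -= 0x10000;
        units[0] = 0xD800 + (cp >> 10);
        units[1] = 0xDC00 + (cp & 0x3FF);
        n = 2;
    } else {
        units[0] = cp;
    }
    for (int i = 0; i < n; ++i) {  // low byte first: little endian
        out += static_cast<char>(static_cast<unsigned char>(units[i] & 0xFF));
        out += static_cast<char>(static_cast<unsigned char>(units[i] >> 8));
    }
}

// Narrow front end: decodes UTF-8 one byte at a time. Malformed input becomes
// U+FFFD (the offending byte is dropped); overlong forms are not rejected.
class utf8_front_buf : public std::streambuf {
public:
    explicit utf8_front_buf(std::string& sink) : sink_(sink), cp_(0), need_(0) {}
protected:
    int_type overflow(int_type ch) override
    {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);
        const unsigned char b =
            static_cast<unsigned char>(traits_type::to_char_type(ch));
        if (need_ > 0 && (b & 0xC0) == 0x80) {       // continuation byte
            cp_ = (cp_ << 6) | (b & 0x3F);
            if (--need_ == 0)
                put_utf16le(sink_, cp_ <= 0x10FFFF ? cp_ : 0xFFFD);
        } else if (need_ > 0) {                      // sequence cut short
            need_ = 0;
            put_utf16le(sink_, 0xFFFD);
        } else if (b < 0x80) {                       // ASCII
            put_utf16le(sink_, b);
        } else if ((b & 0xE0) == 0xC0) { cp_ = b & 0x1F; need_ = 1; }
        else if ((b & 0xF0) == 0xE0) { cp_ = b & 0x0F; need_ = 2; }
        else if ((b & 0xF8) == 0xF0) { cp_ = b & 0x07; need_ = 3; }
        else put_utf16le(sink_, 0xFFFD);             // stray byte
        return ch;
    }
private:
    std::string& sink_;
    unsigned long cp_;   // code point being assembled
    int need_;           // continuation bytes still expected
};

// Wide front end: a 16-bit wchar_t is passed through as a UTF-16 code unit,
// a 32-bit wchar_t is treated as a code point and split if necessary.
class wide_front_buf : public std::wstreambuf {
public:
    explicit wide_front_buf(std::string& sink) : sink_(sink) {}
protected:
    int_type overflow(int_type ch) override
    {
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            const unsigned long v =
                static_cast<unsigned long>(traits_type::to_char_type(ch));
            put_utf16le(sink_, v <= 0x10FFFF ? v : 0xFFFD);
        }
        return traits_type::not_eof(ch);
    }
private:
    std::string& sink_;
};

// One object exposing both a narrow and a wide stream over the same sink.
struct utf16le_log {
    std::string bytes;           // stand-in for the UTF-16LE file contents
    utf8_front_buf narrow_buf;
    wide_front_buf wide_buf;
    std::ostream narrow;
    std::wostream wide;
    utf16le_log()
        : narrow_buf(bytes), wide_buf(bytes),
          narrow(&narrow_buf), wide(&wide_buf) {}
};
```

Both buffers are unbuffered on purpose, so every character goes through overflow() and records written to log.narrow and log.wide land in the sink in call order. A real logger would most likely forward the bytes to a std::filebuf opened in binary mode instead of a std::string.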
I have a very strong suspicion that the answer is no because the narrow characters will pass on unconverted to the file instead of being translated from UTF-8 to UTF-16LE, but maybe I'm missing something.
Yes, you are correct: codecvt<char,char> is a no-op.
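If you want to see this for yourself, the facet can be asked directly. A minimal check, with narrow_codecvt_is_noop being a made-up name:

```cpp
#include <cassert>  // only needed for the check mentioned below
#include <cwchar>   // std::mbstate_t
#include <locale>

// Hypothetical helper: true when the narrow codecvt facet of `loc` performs
// no conversion, in which case a std::ofstream imbued with `loc` writes the
// chars it is given byte for byte.
inline bool narrow_codecvt_is_noop(const std::locale& loc = std::locale::classic())
{
    typedef std::codecvt<char, char, std::mbstate_t> facet_type;
    return std::use_facet<facet_type>(loc).always_noconv();
}
```

assert(narrow_codecvt_is_noop()) should hold, since the standard codecvt<char, char, mbstate_t> facet is specified to do no conversion. UTF-8 written to a std::ofstream therefore stays UTF-8 in the file.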
Thank you.
Artyom Beilis
--------------
CppCMS - C++ Web Framework: http://cppcms.com/
CppDB - C++ SQL Connectivity: http://cppcms.com/sql/cppdb/