
Beman Dawes wrote:
Hm... why not remove the dependency on std::basic_string altogether and make it a template parameter?
Jonathan Turkanis' original comment was:
(One thing I don't understand is why the character type of wbuffer_convert is allowed to be specified as the second template argument. It seems to me that the character type should always be equal to Codecvt::intern_type.)
But I think that you are closer to the real problem with the proposal; the full string type rather than just the character type should be a template parameter. That allows any std::basic_string to be used.
I was talking about wbuffer_convert; at the time I hadn't looked at wstring_convert very closely. Since then I have started to factor the code conversion routines out of the iostreams library to make them more useful for string conversion. I haven't worked on it much since I finished the iostreams revision, but I was leaning toward an interface something like this for string conversion:

    template<typename Codecvt = use_default>
    struct string_converter {  // Nice name ;-)
        // typedefs

        template<typename InIt, typename OutIt>
        OutIt narrow(InIt first, InIt last, OutIt dest);

        template<typename InIt, typename OutIt>
        OutIt widen(InIt first, InIt last, OutIt dest);

        // Convenience functions:

        template<typename WideStr>  // Version of Thorsten's suggestion
        basic_string<typename Codecvt::extern_type> narrow(const WideStr&);

        template<typename NarrowStr>  // Version of Thorsten's suggestion
        basic_string<typename Codecvt::intern_type> widen(const NarrowStr&);
    };

    // Convenience functions:

    template<typename InIt, typename OutIt>
    OutIt narrow(InIt first, InIt last, OutIt dest)
    {
        string_converter<> cvt;
        return cvt.narrow(first, last, dest);
    }

    template<typename InIt, typename OutIt>
    OutIt widen(InIt first, InIt last, OutIt dest)
    {
        string_converter<> cvt;
        return cvt.widen(first, last, dest);
    }

    // The default codecvt has extern_type char and intern_type wchar_t.
    template<typename WideStr>
    basic_string<char> narrow(const WideStr& str)
    {
        string_converter<> cvt;
        return cvt.narrow(str);
    }

    template<typename NarrowStr>
    basic_string<wchar_t> widen(const NarrowStr& str)
    {
        string_converter<> cvt;
        return cvt.widen(str);
    }

Remarks:

1. The names 'narrow' and 'widen' could be confused with the ctype members of the same name, which do not perform code conversion, but I like them better than 'to_bytes' and 'from_bytes' (since extern_type may not represent a byte) and 'wide_to_multi_char' and 'multi_char_to_wide' (too long).

2. The narrow and widen overloads which take iterators have the same signature as std::copy.

3. If no Codecvt template parameter is specified, an instance of codecvt<wchar_t, char, mbstate_t> is fetched from the global locale. The non-member versions of narrow and widen use this option.

4. Thorsten asks why the widening and narrowing functions shouldn't be non-member functions. One answer is that code conversion can be (slightly) more efficient if a large buffer is used. Making the core conversion functions member functions allows the same buffer to be reused for several string conversions. A second answer is that it's a bit awkward to specify a codecvt in a non-member function:

    narrow< utf8_codecvt_facet<wchar_t> >
        (str.begin(), str.end(), back_inserter(dest));

or

    narrow( str.begin(), str.end(), back_inserter(dest),
            utf8_codecvt_facet<wchar_t>() );

When a non-default codecvt is being used, I think it's reasonable to ask people to use a member function, to keep the non-member usage simple.
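To make the iterator-based interface above a little more concrete, here is a rough sketch of how the member functions might sit on top of std::codecvt. It is only an illustration under stated assumptions, not the code factored out of the iostreams library: the class is a non-template stand-in fixed to codecvt<wchar_t, char, mbstate_t>, the locale constructor argument and the 64-element scratch buffers are made up for the example, and the final unshift() step needed by state-dependent encodings is left out.

```cpp
#include <algorithm>
#include <cwchar>
#include <iterator>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical sketch: a simplified, non-template string_converter that
// always uses codecvt<wchar_t, char, mbstate_t> from a given locale.
class string_converter {
public:
    typedef std::codecvt<wchar_t, char, std::mbstate_t> codecvt_type;

    // By default the facet is fetched from a copy of the global locale.
    explicit string_converter(const std::locale& loc = std::locale())
        : loc_(loc), cvt_(&std::use_facet<codecvt_type>(loc_)) {}

    // Wide to narrow; same shape as std::copy.
    template<typename InIt, typename OutIt>
    OutIt narrow(InIt first, InIt last, OutIt dest) const {
        const std::wstring src(first, last);
        const wchar_t* from = src.data();
        const wchar_t* const from_end = from + src.size();
        std::mbstate_t state = std::mbstate_t();
        char buf[64];
        while (from != from_end) {
            const wchar_t* from_next = from;
            char* to_next = buf;
            const std::codecvt_base::result r = cvt_->out(
                state, from, from_end, from_next, buf, buf + 64, to_next);
            // Treat errors, unexpected noconv, and lack of progress as failure.
            if (r == std::codecvt_base::error || r == std::codecvt_base::noconv ||
                (from_next == from && to_next == buf))
                throw std::runtime_error("narrow: conversion failed");
            dest = std::copy(buf, to_next, dest);
            from = from_next;
        }
        return dest;
    }

    // Narrow to wide; same shape as std::copy.
    template<typename InIt, typename OutIt>
    OutIt widen(InIt first, InIt last, OutIt dest) const {
        const std::string src(first, last);
        const char* from = src.data();
        const char* const from_end = from + src.size();
        std::mbstate_t state = std::mbstate_t();
        wchar_t buf[64];
        while (from != from_end) {
            const char* from_next = from;
            wchar_t* to_next = buf;
            const std::codecvt_base::result r = cvt_->in(
                state, from, from_end, from_next, buf, buf + 64, to_next);
            if (r == std::codecvt_base::error || r == std::codecvt_base::noconv ||
                (from_next == from && to_next == buf))
                throw std::runtime_error("widen: conversion failed");
            dest = std::copy(buf, to_next, dest);
            from = from_next;
        }
        return dest;
    }

    // Convenience overloads built on the iterator versions.
    std::string narrow(const std::wstring& s) const {
        std::string result;
        narrow(s.begin(), s.end(), std::back_inserter(result));
        return result;
    }

    std::wstring widen(const std::string& s) const {
        std::wstring result;
        widen(s.begin(), s.end(), std::back_inserter(result));
        return result;
    }

private:
    std::locale loc_;          // keeps the facet alive
    const codecvt_type* cvt_;  // facet owned by loc_
};
```

With this sketch, a call such as cvt.narrow(w.begin(), w.end(), back_inserter(dest)) reads like std::copy, and a single converter object can be reused across many conversions. The scratch buffers here are local to each call; keeping one larger buffer as a member would be closer to the buffer-reuse argument in remark 4.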
--Beman
Jonathan