
Toward the end of a thread with the subject "std::string <-> std::wstring conversion" there was some discussion of how the C++ committee N1683 proposal could be improved. I volunteered to write up our discussions.

See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1683.html for a copy of the proposal.

Here is a draft of what I have written so far. Comments and improvements welcome.

--Beman

Critique of Code Conversion Proposal (N1683)
--------------------------------------------

N1683==04-0123, Proposed Library Additions for Code Conversion, proposes sorely needed code conversion facilities for the standard library. (See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1683.html)

Without these facilities, programmers concerned with internationalization are forced to reinvent the wheel; Boost has run into that problem two or three times in existing libraries, and additional times in libraries currently in the Boost pipeline.

The proposal should be accepted by the LWG as a high-priority need. That being said, this paper describes several concerns which may indicate that the proposal can be further refined and improved.

1. Hard-wired byte_string type in wstring_convert
-------------------------------------------------

The underlying wstring_convert design seems flexible enough to cope with conversion between any two character types which meet std::basic_string requirements. Conversion is actually performed by std::codecvt, which is already parameterized by both internT and externT types. It seems artificial to restrict wstring_convert::byte_string to std::basic_string<char>.

New character types such as the proposed char16_t and char32_t will need conversions to and from other wide types, yet with the current restriction wstring_convert could not be used for that purpose.
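
The mismatch can be checked at compile time. The following is only a sketch and assumes a C++11 compiler (for static_assert, char16_t and char32_t). The names narrow_facet, utf32_utf16_facet and hard_wired_byte_string are invented for this illustration; utf32_utf16_facet is a stand-in that carries the member typedefs only and performs no conversion.

```cpp
#include <cwchar>       // std::mbstate_t
#include <locale>       // std::codecvt
#include <string>
#include <type_traits>  // std::is_same

// A standard facet: it already names both of its character types.
typedef std::codecvt<wchar_t, char, std::mbstate_t> narrow_facet;

static_assert(std::is_same<narrow_facet::intern_type, wchar_t>::value,
              "the internal type is available from the facet");
static_assert(std::is_same<narrow_facet::extern_type, char>::value,
              "the external type is available from the facet");

// Hypothetical stand-in for a facet that converts between two wide types.
// Only the member typedefs matter here; it is not a real codecvt facet.
struct utf32_utf16_facet {
    typedef char32_t intern_type;
    typedef char16_t extern_type;
};

// The hard-wired typedef, as in the proposal.
typedef std::basic_string<char> hard_wired_byte_string;

// For the standard facet the hard-wired type happens to match ...
static_assert(std::is_same<hard_wired_byte_string,
                           std::basic_string<narrow_facet::extern_type> >::value,
              "char-based external strings fit the hard-wired typedef");

// ... but for a facet whose external type is char16_t it does not.
static_assert(!std::is_same<hard_wired_byte_string,
                            std::basic_string<utf32_utf16_facet::extern_type> >::value,
              "a char16_t external string does not fit the hard-wired typedef");
```

All of the checks are compile-time only, so the fragment has no runtime behavior.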

Suggested change: replace

  typedef std::basic_string<char> byte_string;

with:

  typedef std::basic_string<typename Codecvt::extern_type> byte_string;

and change from_bytes argument types accordingly.

If this suggested change is accepted, it will probably make sense to rename some wstring_convert members.

2. wstring_convert template parameter Elem seems unneeded
---------------------------------------------------------

The wstring_convert template parameter Elem seems unneeded. Isn't it always Codecvt::intern_type?

Suggested change: remove the Elem parameter and replace Elem with Codecvt::intern_type.

3. Need target-argument form for wstring_convert conversion functions
---------------------------------------------------------------------

wstring_convert's conversion functions are in the form:

  byte_string to_bytes(const wide_string& wstr) const;

While this form is often useful and should be retained, it may imply an extra copy of the result if a compiler is not smart enough to optimize the copy away.

Suggested change: add additional functions in the form:

  void to_bytes(const wide_string& wstr, byte_string& target) const;

4. More explicit name for wstring_convert
-----------------------------------------

"wstring" might be misleading, depending on the actual types involved. "convert" is a verb, yet nouns make better class names.

Suggested change: rename wstring_convert to string_converter.

5. Standardese needed
---------------------

The proposal needs improved standardese. For example, the requirements on the template parameters need to be specified and the function descriptions converted to canonical form.

6. Comparable changes need to be made for wbuffer_convert
---------------------------------------------------------

Any of the above changes which are accepted need to be folded into wbuffer_convert.

Acknowledgements
----------------

This critique is based on discussions with Thorsten Ottosen, Stefan Slapeta, and Jonathan Turkanis.

Revised: 05 January 2005
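
As a rough illustration of how suggestions 1 through 4 might fit together, here is a minimal sketch. It is not the interface proposed in N1683. It assumes that the facet is looked up in a std::locale rather than owned by the converter, it shows only to_bytes (from_bytes would be analogous, using the facet's in() member), and the names loc_ and example() are invented for the illustration.

```cpp
#include <cassert>
#include <cwchar>     // std::mbstate_t
#include <locale>     // std::codecvt, std::use_facet
#include <stdexcept>  // std::range_error
#include <string>

// Hypothetical sketch only; not the interface proposed in N1683.
// Assumption: the facet is obtained from a std::locale, not owned.
template <class Codecvt>  // suggestion 2: no separate Elem parameter
class string_converter {  // suggestion 4: noun-style name
public:
    typedef typename Codecvt::intern_type intern_type;
    typedef typename Codecvt::extern_type extern_type;
    typedef typename Codecvt::state_type  state_type;
    typedef std::basic_string<intern_type> wide_string;
    typedef std::basic_string<extern_type> byte_string;  // suggestion 1

    explicit string_converter(const std::locale& loc = std::locale::classic())
        : loc_(loc) {}

    // Suggestion 3: target-argument form; the result is written to target.
    void to_bytes(const wide_string& wstr, byte_string& target) const {
        const Codecvt& cvt = std::use_facet<Codecvt>(loc_);
        state_type state = state_type();
        const intern_type* from = wstr.data();
        const intern_type* const from_end = from + wstr.size();
        extern_type buf[64];
        target.clear();
        while (from != from_end) {
            const intern_type* from_next = from;
            extern_type* to_next = buf;
            const std::codecvt_base::result r =
                cvt.out(state, from, from_end, from_next, buf, buf + 64, to_next);
            if (r == std::codecvt_base::noconv) {
                // The facet performs no conversion: copy elements unchanged.
                for (; from != from_end; ++from)
                    target.push_back(static_cast<extern_type>(*from));
                break;
            }
            // Fail on a conversion error, or if a pass made no progress.
            if (r == std::codecvt_base::error ||
                (from_next == from && to_next == buf))
                throw std::range_error("string_converter: conversion failed");
            target.append(buf, to_next);
            from = from_next;
        }
        // A complete version would also call cvt.unshift() for
        // state-dependent encodings; that step is omitted here.
    }

    // Return-value form, retained alongside the target-argument form.
    byte_string to_bytes(const wide_string& wstr) const {
        byte_string result;
        to_bytes(wstr, result);
        return result;
    }

private:
    std::locale loc_;
};

// Usage with the standard wchar_t/char facet of the classic locale.
inline void example() {
    string_converter<std::codecvt<wchar_t, char, std::mbstate_t> > conv;
    std::string out;
    conv.to_bytes(L"abc", out);            // target-argument form
    assert(out == "abc");
    assert(conv.to_bytes(L"abc") == out);  // return-value form
}
```

The target-argument overload lets a caller reuse one string object across many conversions, which is the saving suggestion 3 is after; the return-value overload is just a thin wrapper over it.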