
On Wednesday 23 March 2005 18:07, Beman Dawes wrote:
> CVS now contains a branch "i18n" of the filesystem directories:
>
> * Class templates basic_path, basic_directory_iterator, etc, support narrow, wide, and user-defined path types. Typedefs path, directory_iterator, etc, are provided, so most existing code continues to work.
I recall we had a long discussion concerning basic_path vs. a single path type. I don't think the results of that discussion are present in i18n.html -- essentially, there's no rationale for going with basic_path. There were several distinct issues.

The first is that if you have a single path type that stores Unicode, then exists(path("foo")) will perform a char -> wchar_t conversion inside the path constructor, and that conversion might not be exactly the same as the one the OS would have performed. One issue is that the program might not have initialized the global locale with locale(""). Another is that the conversion performed by the OS might be different from that of locale(""). I must admit I don't know when that might be the case on Windows (and POSIX doesn't do such conversions). So, I'd really like to know about real use cases. After all, QFile + QString works on Windows. See the docs at http://doc.trolltech.com/3.3/qfile.html

The second issue, only relevant if the above one is real, is mixing different types of path. With a single path type:

    path p("a"), p2(L"b");
    p /= p2; // must do conversion, might not do what's desired

With basic_path:

    path p("a");
    wpath p2(L"b");
    p /= p2;        // won't compile
    p /= path(p2);  // explicit conversion is clearly seen

This again relies on the assumption that the conversion from char to wchar_t might not do exactly the same as the OS conversion would do.

The third issue is that I don't like the templated implementation of all functions. There's already a compiled library, so why not move all the code there? For example:

    class common_path {
    public:
        char* data;
        bool is_wide;
    };

    bool exists(const common_path& p)
    {
        if (p.is_wide)
            return SomeOSFunctionW((wchar_t*)p.data);
        else
            return SomeOSFunctionA(p.data);
    }

Also, I note that there's no conversion from basic_path<char> to basic_path<wchar_t> or vice versa, as far as I can tell. To recall my argument for conversion: say I have a library which exposes paths in its interface. Should I use path or wpath in it? If I use path, then due to the missing conversion, the library is unusable with other code that uses wpath. So I need to use wpath. And so basically, all libraries need to use wpath everywhere. So, why do you need path at all?
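
To make the locale dependence of the char -> wchar_t conversion concrete, here is a minimal sketch. It is not code from the i18n branch: widen() is a hypothetical helper, and it only assumes the standard codecvt facet of whatever locale is passed in. The point is that the same bytes can widen differently (or fail) depending on whether the caller passes the classic "C" locale or locale(""), which is exactly why a conversion between path and wpath would have to say which locale it uses.

```cpp
#include <cassert>
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Boost.Filesystem): widen a narrow
// string using the codecvt facet of the locale the caller chooses.
std::wstring widen(const std::string& narrow, const std::locale& loc)
{
    typedef std::codecvt<wchar_t, char, std::mbstate_t> facet_type;
    const facet_type& cvt = std::use_facet<facet_type>(loc);

    if (narrow.empty())
        return std::wstring();

    std::mbstate_t state = std::mbstate_t();
    // A multibyte sequence never yields more wide characters than bytes.
    std::wstring wide(narrow.size(), L'\0');
    const char* from_next = 0;
    wchar_t* to_next = 0;

    std::codecvt_base::result r = cvt.in(
        state,
        narrow.data(), narrow.data() + narrow.size(), from_next,
        &wide[0], &wide[0] + wide.size(), to_next);

    if (r == std::codecvt_base::noconv)
        return std::wstring(narrow.begin(), narrow.end());
    if (r != std::codecvt_base::ok)
        throw std::runtime_error("bytes are not valid in this locale's encoding");

    // On success the facet has consumed the whole input.
    assert(from_next == narrow.data() + narrow.size());
    wide.resize(to_next - &wide[0]);
    return wide;
}
```

A caller would write widen(s, std::locale("")) to follow the user's environment, or widen(s, std::locale::classic()) to get the behaviour of a program that never set the global locale. Only plain ASCII input is guaranteed to give the same result in both cases.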
> * The POSIX wpath implementation assumes that UTF-8 is always the operating system's preferred external path encoding. If any Boost users are concerned about other encodings, please let me know.
I certainly am. The standard encoding for Russian on Linux is KOI8-R. We probably need to use the conversion facet that's part of the global locale. Qt uses

    char *charset = nl_langinfo(CODESET);

and the values of the LC_ALL, LC_CTYPE and LANG variables. But then it contains its own translation tables. So using locale("") is the best guess, I think.

- Volodya
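
For reference, the nl_langinfo(CODESET) query mentioned above can be sketched as follows. This is only an illustration under POSIX assumptions: preferred_codeset() is a hypothetical name, and it is neither Qt's nor Boost's code. It temporarily switches LC_CTYPE to the environment's locale, reads the codeset name, and restores the previous setting. On a Russian Linux system of the kind described above, one would expect it to report KOI8-R rather than UTF-8.

```cpp
#include <cassert>
#include <clocale>
#include <string>
#include <langinfo.h>  // POSIX only: nl_langinfo, CODESET

// Hypothetical helper: name of the character encoding selected by the
// user's environment (LC_ALL, LC_CTYPE, LANG), leaving the program's
// current LC_CTYPE setting unchanged on return.
std::string preferred_codeset()
{
    // Remember the current LC_CTYPE locale so it can be restored.
    const char* current = std::setlocale(LC_CTYPE, 0);
    std::string saved = current ? current : "C";

    // Select the environment's locale. If this fails, the locale is
    // left as it was and the query below reports that locale's codeset.
    std::setlocale(LC_CTYPE, "");

    const char* name = nl_langinfo(CODESET);
    assert(name != 0);  // POSIX returns an empty string, never null
    std::string codeset = name;  // copy before the locale changes again

    std::setlocale(LC_CTYPE, saved.c_str());
    return codeset;
}
```

The returned name could then be used to pick a conversion facet instead of assuming UTF-8.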