
Me too.
I'm saying that Filesystem v3 on Windows doesn't interpret narrow strings as UTF-8 by default. Berman said that it did but I beg to differ. Here's what the comments say:
// For Windows, wchar_t strings do not undergo conversion. char strings
// are converted using the "ANSI" or "OEM" code pages, as determined by
// the AreFileApisANSI() function, or, if a conversion argument is given,
// using a conversion object modeled on std::wstring_convert.
In other words "שלום.txt" would be interpreted as being in whatever encoding the local code page is set to and would, therefore, produce a path containing gibberish for most people. This is standard Windows behaviour :P
This standard Windows behavior is exactly **the** problem. To be honest, have you seen anybody using a "wide path" outside of Windows? Do you actually need such a "wide path" on POSIX platforms? The answer is no. In fact, a POSIX OS does not care about the filename charset at all. I can create a file with

std::ofstream f("\xf9\xec\xe5\xed.txt");

The name is שלום encoded in ISO-8859-8, which is invalid UTF-8, yet it is a valid file name even when the locale is a UTF-8 locale.
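To make the "invalid UTF-8" claim checkable, here is a minimal sketch of a UTF-8 validator. The function names is_valid_utf8 and check_examples are hypothetical and are not part of Boost.Filesystem or nowide. It only inspects bytes and does not touch the file system.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if s is well-formed UTF-8.
// Rejects bad lead bytes, truncated sequences, overlong forms,
// surrogates, and code points above U+10FFFF.
bool is_valid_utf8(const std::string& s) {
    static const unsigned long min_cp[] = {0, 0, 0x80, 0x800, 0x10000};
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        unsigned long cp = 0;
        if (c < 0x80)                { len = 1; cp = c; }
        else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }
        else return false;               // continuation byte or invalid lead byte
        if (i + len > n) return false;   // sequence runs past the end
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (len > 1 && cp < min_cp[len]) return false;     // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;    // surrogate
        if (cp > 0x10FFFF) return false;                   // out of range
        i += len;
    }
    return true;
}

// The same Hebrew name in two encodings.
void check_examples() {
    assert(!is_valid_utf8("\xf9\xec\xe5\xed.txt"));                 // ISO-8859-8
    assert(is_valid_utf8("\xd7\xa9\xd7\x9c\xd7\x95\xd7\x9d.txt"));  // UTF-8
}
```

The ISO-8859-8 name fails on its first byte: 0xF9 is not a legal UTF-8 lead byte. A POSIX kernel would still accept it as a file name, because it treats names as opaque byte strings.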
Your problem is yet another step further than this. Assuming fs3 correctly converted "שלום.txt" to the UTF-16 equivalent, how do you then open a file using this wide-char name? Well, MSVC has wchar_t overloads so this works fine. You're right about glibc++/MinGW though. fs::fstream will fail there. Rather than introducing a nowide library, why don't we just try to fix this in Boost.Filesystem?
I think that this can be fixed (the way I fixed it in nowide, by implementing fstreambuf over stdio + _wfopen):
http://art-blog.no-ip.info/files/nowide.zip

But this is one particular problem. There are more. What about filesystem::remove and others? From what I see in the code, it supports only path and not wpath.

---------------------

But this is part of one bigger problem. When I develop cross-platform applications and want to remove, rename, or create a file, I have the following options: write standard platform-independent C++, write for POSIX operating systems, or write for MS Windows.

OS \ Str | std::string | std::wstring
-----------------------------------------------
Std C++  | Ok          | Not Defined!
POSIX    | Ok          | Not Defined!
WinAPI   | Not UTF-8   | Ok

What I can see is this. Either I use wide strings, which work only on Windows and force me to convert to another encoding for file operations everywhere else. Or I use normal strings, as the standard requires, and have problems on Windows, where they are not fully supported. Or I write two kinds of code:

- One for Windows, using "wide" strings
- One for everything else, using normal strings

All because Windows does not support a UTF-8 code page. But why? Why do you need all this if you can just create a tiny layer that makes Windows support a UTF-8 code page, by converting std::string to std::wstring and calling the appropriate API?

My Opinion:
-----------

- There is neither use nor need for "wide" strings in file system operations on any platform but Windows.
- Introducing boost::filesystem::wpath does not help, as it is meaningless on other OSes.
- Using wide strings is extremely error prone in cross-platform applications, as on Windows they are UTF-16 and on POSIX they are UTF-32.

Wide path support just makes our applications more complicated and error prone.

So... just create an API that is friendly to UTF-8 strings and forget about this hell.
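The "tiny layer" idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual nowide code. The namespace utf8fs and the functions to_utf16, open_file and remove_file are hypothetical names. _wfopen and _wremove are the Microsoft CRT wide-character functions. The decoder is simplified and does not reject overlong encodings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>
#ifdef _WIN32
#include <wchar.h>  // _wfopen, _wremove (Microsoft CRT extensions)
#endif

namespace utf8fs {  // hypothetical namespace, for illustration only

// Decode UTF-8 and re-encode it as UTF-16 code units.
// Throws std::invalid_argument on malformed input.
std::u16string to_utf16(const std::string& s) {
    std::u16string out;
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        char32_t cp = 0;
        std::size_t len = 0;
        if (c < 0x80)                { cp = c;        len = 1; }
        else if ((c & 0xE0) == 0xC0) { cp = c & 0x1F; len = 2; }
        else if ((c & 0xF0) == 0xE0) { cp = c & 0x0F; len = 3; }
        else if ((c & 0xF8) == 0xF0) { cp = c & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + len > s.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point");
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {  // encode as a surrogate pair
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += len;
    }
    return out;
}

#ifdef _WIN32
// On Windows wchar_t is 16 bits wide, so UTF-16 code units copy over directly.
inline std::wstring widen(const std::string& s) {
    const std::u16string u = to_utf16(s);
    return std::wstring(u.begin(), u.end());
}
#endif

// Open a file whose name is given as UTF-8 on every platform.
std::FILE* open_file(const std::string& path, const char* mode) {
    assert(mode != nullptr);
#ifdef _WIN32
    return _wfopen(widen(path).c_str(), widen(mode).c_str());
#else
    return std::fopen(path.c_str(), mode);  // POSIX: bytes pass through as-is
#endif
}

// Remove a file whose name is given as UTF-8; returns true on success.
bool remove_file(const std::string& path) {
#ifdef _WIN32
    return _wremove(widen(path).c_str()) == 0;
#else
    return std::remove(path.c_str()) == 0;
#endif
}

}  // namespace utf8fs
```

Application code then calls utf8fs::open_file("\xd7\xa9\xd7\x9c\xd7\x95\xd7\x9d.txt", "w") the same way on every platform, and only this layer knows about wide strings.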
-------------

But from what I see, this will never happen in Boost, as it is too Windows-centric, and Windows is too ignorant of the basic needs of programmers who want to write portable programs.

Regards,
Artyom

P.S.: The title of this mail is "request for interest". It is OK not to have one.