
2010/6/16 Alexander Lamaison <awl03@doc.ic.ac.uk>:
I'm saying that Filesystem v3 on Windows doesn't interpret narrow strings as UTF-8 by default. Beman said that it did...
There is a misunderstanding here. V3, like any Windows program, by default interprets narrow strings according to the code page used by the file APIs (ANSI or OEM). You have to configure that yourself if you want it to be UTF-8. Since that is a pain, and you are using the Microsoft compiler or one of the other compilers that support wide opens, it seems easier just to convert from the narrow string to the wide string yourself. But if you want to fool around getting the code page support in place, V3 should handle it AFAIK.
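For what it's worth, "convert from the narrow string to the wide string yourself" could look roughly like the sketch below. It is only an illustration: utf8_to_utf16 is a made-up helper name, not anything in Boost.Filesystem, and it returns std::u16string so that it stays portable (on Windows, where wchar_t is 16 bits, one would more likely call MultiByteToWideChar with CP_UTF8). It does not reject overlong encodings or encoded surrogates.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode a UTF-8 byte string into UTF-16 code units.
// Minimal validation only: lead bytes, continuation bytes and truncation.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (b0 < 0x80) { cp = b0; extra = 0; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; extra = 1; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; extra = 2; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + extra >= in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (b & 0x3F);
        }
        i += extra + 1;

        if (cp >= 0x10000) {
            // Code points outside the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
    }
    return out;
}
```

The resulting wide string can then be handed to whatever wide open the compiler provides.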
but I beg to differ. Here's what the comments say:
// For Windows, wchar_t strings do not undergo conversion. char strings
// are converted using the "ANSI" or "OEM" code pages, as determined by
// the AreFileApisANSI() function, or, if a conversion argument is given,
// using a conversion object modeled on std::wstring_convert.
In other words "שלום.txt" would be interpreted as being in whatever encoding the local code page is set to and would, therefore, produce a path containing gibberish for most people. This is standard Windows behaviour :P
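To make the "gibberish" concrete, here is a small sketch of what happens when the UTF-8 bytes of that name are decoded one byte per character. As a simplifying assumption it treats the code page as Latin-1, where byte value equals code point; real Windows code pages such as 1252 differ for bytes 0x80 to 0x9F, but the effect is the same. decode_as_latin1 is a hypothetical name used only for this illustration.

```cpp
#include <string>

// Hypothetical illustration: decode bytes as if each one were a single
// Latin-1 character. This stands in for a single-byte "ANSI" code page.
std::u16string decode_as_latin1(const std::string& bytes) {
    std::u16string out;
    for (char c : bytes) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        out.push_back(static_cast<char16_t>(static_cast<unsigned char>(c)));
    }
    return out;
}

// UTF-8 encoding of the four Hebrew letters in the example file name
// (U+05E9 U+05DC U+05D5 U+05DD), two bytes per letter.
const std::string shalom_utf8 = "\xD7\xA9\xD7\x9C\xD7\x95\xD7\x9D";
```

Decoding shalom_utf8 this way gives eight code units starting with U+00D7 and U+00A9 instead of the four Hebrew letters, so the path that reaches the file system is not the one the programmer typed.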
Your problem is yet another step further than this. Assuming fs3 correctly converted "שלום.txt" to the UTF-16 equivalent, how do you then open a file using this wide-char name? Well, MSVC has wchar_t overloads so this works fine. You're right about glibc++/MinGW though. fs::fstream will fail there. Rather than introducing a nowide library, why don't we just try to fix this in Boost.Filesystem?
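As a rough idea of what "use the wide Windows API" might mean for a standard library without wchar_t overloads, the sketch below opens a C stream from a UTF-8 path. open_utf8 and widen_utf8 are hypothetical names, not Boost.Filesystem functions, and this is not the patch being asked for, just the general shape. It assumes MultiByteToWideChar and _wfopen are available on the Windows toolchain (some MinGW configurations hide _wfopen in strict ANSI mode), and on other platforms it simply passes the bytes through to fopen.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <wchar.h>
#include <windows.h>

// Hypothetical helper: UTF-8 to UTF-16 via the Windows API.
// Returns an empty string on failure or empty input.
static std::wstring widen_utf8(const std::string& s) {
    if (s.empty()) return std::wstring();
    const int len = static_cast<int>(s.size());
    const int n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                      s.data(), len, nullptr, 0);
    if (n <= 0) return std::wstring();
    std::wstring w(static_cast<std::size_t>(n), L'\0');
    MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                        s.data(), len, &w[0], n);
    return w;
}

// Hypothetical helper: open a file whose name is given in UTF-8.
std::FILE* open_utf8(const std::string& path, const std::string& mode) {
    const std::wstring wpath = widen_utf8(path);
    if (wpath.empty()) return nullptr;
    // The mode string is plain ASCII, so widening char by char is enough.
    const std::wstring wmode(mode.begin(), mode.end());
    return _wfopen(wpath.c_str(), wmode.c_str());
}
#else
// Elsewhere the narrow path is handed to the OS unchanged.
std::FILE* open_utf8(const std::string& path, const std::string& mode) {
    return std::fopen(path.c_str(), mode.c_str());
}
#endif
```

An fstream replacement would need more than this (a stream buffer on top of the FILE* or the native handle), which is presumably where the real work in such a patch lies.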
Agreed. If anyone wants to submit a patch for glibc++/MinGW that uses the wide Windows API, that would be a better solution. --Beman