
On 27.10.2011 20:01, Peter Dimov wrote:
Alf P. Steinbach wrote:
On 27.10.2011 18:47, Peter Dimov wrote:
Alf P. Steinbach wrote:
However, I still ask:
why FORCE INEFFICIENCY & AWKWARDNESS on Boost users -- why not just do it right, using the platforms' native encodings?
Comment out the imbue line.
But that line is much of the point, isn't it?
There wouldn't be much point in calling imbue if you didn't want a change in the boost::filesystem default behavior, which is to convert using the ANSI CP (or the OEM CP if AreFileApisANSI() returns false, if I'm not mistaken).
Oh there is. It is a level of indirection. You want Boost.Filesystem to assume /the same/ narrow character encoding as Boost.Locale, whatever it is. And to quote the docs where I found that program, "Boost Locale fully supports both narrow and wide API. The default character encoding is assumed to be UTF-8 on Windows."
(The platform's native encoding is UTF-16. The "ANSI" code page, which is not necessarily ANSI or ANSI-like at all, despite your assertion,
The article you responded to did not contain the word "ANSI".
Thus, when you refer to an assertion about "ANSI", you have fantasized something.
http://boost.2283326.n4.nabble.com/Making-Boost-Filesystem-work-with-GENERAL...
That's a different context and a different discussion, where it was neither necessary nor natural to dot the i's and cross the t's to perfection. Talk about dragging in things from out of the blue. If you wanted to point out the possibility of e.g. a Japanese codepage as ANSI, then you should have done that over there, in that thread. I mean in the context where it could make sense and where it could help prevent readers getting a wrong impression. If it was that important. [snippety]
Under Windows (NT+ and NTFS), the narrow character API is a wrapper over the wide character API. The system converts from/to the ANSI code page as needed. The narrowing conversion may lose data.
OK, we're just talking about two different meanings of "native", for two different contexts: windows internals, and windows apps. The relevant context for discussing Boost.Locale's treatment of narrow strings, is the application level.
[the program] will work fine until it's given a file name that is not representable in the ANSI CP.)
Nope, sorry, for any /reasonable interpretation/ of what you're writing.
File names on NTFS are not necessarily representable in the ANSI code page. A program that uses narrow strings in the ANSI code page to represent paths will not necessarily be able to open all files on the system.
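To make the data-loss point concrete, here is a minimal portable sketch. It is not the actual Windows conversion: `narrow_to_latin1` is a hypothetical helper, and Latin-1 is used only as a simplified stand-in for an "ANSI" code page. The idea it illustrates is that any character outside the narrow code page is replaced, so the original name cannot be recovered from the narrow string.

```cpp
#include <string>
#include <utility>

// Hypothetical helper (not a Windows or Boost API): narrows UTF-16 text to
// Latin-1, used here as a simplified stand-in for an "ANSI" code page.
// Returns the narrow string plus a flag that is true when at least one
// UTF-16 code unit had no narrow representation and was replaced by '?'.
// Assumption: surrogate pairs are not decoded; each half becomes its own '?'.
std::pair<std::string, bool> narrow_to_latin1(const std::u16string& wide)
{
    std::string narrow;
    bool lost = false;
    for (char16_t unit : wide) {
        if (unit <= 0xFF) {
            // Latin-1 maps code points 0..255 directly to single bytes.
            narrow.push_back(static_cast<char>(static_cast<unsigned char>(unit)));
        } else {
            // No representation in the narrow code page: information is lost.
            narrow.push_back('?');
            lost = true;
        }
    }
    return {narrow, lost};
}
```

With this sketch, a name such as `u"\u65E5\u672C.txt"` narrows to `"??.txt"` with the loss flag set, which is the kind of path a narrow-string program could no longer pass back to the file system.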
Right, that's one reason why modern Windows programs should best be wchar_t based. Other reasons include efficiency (avoiding conversions) and simple convenience. Some API functions do not have narrow wrappers.

However, a default assumption of UTF-8 encoding for narrow strings, as in Boost.Locale, seems to me to clash with most uses of narrow strings.

For example, if you output UTF-8 on standard output, and then try to pipe that through `more` in Windows' [cmd.exe], you get this:

<example>
d:\dave> chcp 65001
Active code page: 65001

d:\dave> echo "imagine this is utf8" | more
Not enough memory.

d:\dave> _
</example>

So UTF-8 is, to put it less than strongly, not very practical as a general narrow-character encoding in Windows.

The example that I gave at top of the thread was passing a `main` argument further on, when using Boost.Locale. It causes trouble because in Windows `main` arguments are by convention encoded as ANSI, while Boost.Locale has UTF-8 as default. Treating ANSI as UTF-8 generally yields gobbledygook, except for the pure ASCII common subset.

But with ANSI as Boost.Locale default, with that more reasonable choice of default, the imbue call would not cause trouble, but would instead help to avoid trouble -- which is surely the original intention.

Cheers & hth.,

- Alf
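As a rough illustration of why ANSI-encoded `main` arguments and a UTF-8 default only agree on the ASCII subset, the following sketch checks whether a byte string is well-formed UTF-8. `is_valid_utf8` is a hypothetical helper written for this example, not part of Boost.Locale or the Windows API, and the sample bytes assume Windows-1252 as the ANSI code page.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if `bytes` is well-formed UTF-8.
// It checks lead/continuation byte structure and rejects overlong forms,
// surrogate code points and values above U+10FFFF.
bool is_valid_utf8(const std::string& bytes)
{
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes.
    static const unsigned long min_cp[] = {0x0, 0x80, 0x800, 0x10000};

    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;  // number of continuation bytes expected
        unsigned long cp = 0;   // decoded code point
        if (lead < 0x80)                { extra = 0; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        else return false;  // stray continuation byte or invalid lead byte

        if (extra > n - i - 1) return false;  // sequence is truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min_cp[extra]) return false;            // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate
        if (cp > 0x10FFFF) return false;                 // out of range
        i += extra + 1;
    }
    return true;
}
```

Under the Windows-1252 assumption, the argument `"caf\xE9.txt"` (with 0xE9 for the accented e) fails this check, while the same name encoded as `"caf\xC3\xA9.txt"` and any pure ASCII argument pass it.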