
Beman Dawes wrote:
"Peter Dimov" <pdimov@mmltd.net> wrote in message news:00af01c6334a$b9ed2200$6407a8c0@pdimov2...
At the end I reverted the changes and just encoded the wide path into UTF-8 at the very start, passed the UTF-8 string through the existing code, then decoded the UTF-8 into a wstring at the very end, immediately before calling the Windows API. It worked.
Seems like a reasonable and practical approach. I've wondered several times if we wouldn't have been better off if Microsoft had chosen UTF-8 as their Windows external representation, too.
I don't think that they could have done that because of legacy FAT filesystems that could have been using narrow paths with an arbitrary encoding. But my point is that the library can use UTF-8 as its _internal portable encoding_, encoding into UTF-8 when it is given a path as a wstring or a (string, encoding) pair, and decoding into the appropriate (string, encoding) or wstring when it passes a path to the OS. Everything else can be string-based. With this approach, we can have a single path class that handles everything. No need to choose between a narrow path and a wide path, and no need to encode the character encoding into the path type. I've tried to communicate this via code, apparently with mixed success. :-)
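
To make the idea concrete, here is a minimal sketch of such a single path class. It is not the code I posted earlier and not the Boost.Filesystem interface: the namespace `sketch`, the class layout, and the helpers `wide_to_utf8` and `utf8_to_wide` are all made up for illustration. It assumes wchar_t holds UTF-16 when it is 16 bits wide and UTF-32 otherwise, and it leaves out the (string, encoding) constructor, since converting from an arbitrary narrow encoding needs an OS or library facility.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch {  // hypothetical names throughout, for illustration only

// Append one code point as UTF-8. Callers guarantee cp <= 0x10FFFF.
inline void append_utf8(std::string& out, char32_t cp) {
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Encode a wide string (UTF-16 or UTF-32, by wchar_t size) into UTF-8.
// Unpaired surrogates are passed through unchanged in this sketch.
inline std::string wide_to_utf8(const std::wstring& w) {
    std::string out;
    for (std::size_t i = 0; i < w.size(); ++i) {
        char32_t cp = static_cast<char32_t>(w[i]);
        if (sizeof(wchar_t) == 2 && cp >= 0xD800 && cp <= 0xDBFF && i + 1 < w.size()) {
            char32_t lo = static_cast<char32_t>(w[i + 1]);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {  // combine a surrogate pair
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            }
        }
        if (cp > 0x10FFFF) cp = 0xFFFD;  // out of range: replacement character
        append_utf8(out, cp);
    }
    return out;
}

// Decode UTF-8 into a wide string. Malformed bytes become U+FFFD;
// overlong forms are not rejected in this sketch.
inline std::wstring utf8_to_wide(const std::string& s) {
    std::wstring out;
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char b = static_cast<unsigned char>(s[i]);
        char32_t cp;
        std::size_t len;
        if (b < 0x80)                { cp = b;        len = 1; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; len = 2; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; len = 3; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; len = 4; }
        else                         { cp = 0xFFFD;   len = 1; }  // invalid lead byte
        if (i + len > s.size())      { cp = 0xFFFD;   len = 1; }  // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) { cp = 0xFFFD; len = k; break; }
            cp = (cp << 6) | (c & 0x3F);
        }
        i += len;
        if (cp > 0x10FFFF) cp = 0xFFFD;
        if (sizeof(wchar_t) == 2 && cp >= 0x10000) {  // emit a surrogate pair
            cp -= 0x10000;
            out += static_cast<wchar_t>(0xD800 + (cp >> 10));
            out += static_cast<wchar_t>(0xDC00 + (cp & 0x3FF));
        } else {
            out += static_cast<wchar_t>(cp);
        }
    }
    return out;
}

// One path type: UTF-8 inside, conversions only at the boundaries.
class path {
public:
    path(const std::string& utf8) : utf8_(utf8) {}                // assumed to be UTF-8 already
    path(const std::wstring& wide) : utf8_(wide_to_utf8(wide)) {}  // encode on the way in

    // Everything in between is plain string manipulation.
    path& operator/=(const path& rhs) {
        if (!utf8_.empty() && utf8_.back() != '/') utf8_ += '/';
        utf8_ += rhs.utf8_;
        return *this;
    }

    const std::string& utf8() const { return utf8_; }
    std::wstring to_wide() const { return utf8_to_wide(utf8_); }   // decode just before the OS call

private:
    std::string utf8_;
};

}  // namespace sketch
```

With this, `sketch::path(std::wstring(L"dir"))` and `sketch::path(std::string("dir"))` are the same type holding the same bytes, and only the code that finally calls the OS needs to care whether it wants `utf8()` or `to_wide()`.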