Re: [boost] [filesystem] Thoughts on wide character paths

17 Feb 2006

Beman Dawes wrote:
...
"Peter Dimov" <pdimov@mmltd.net> wrote in message
news:00af01c6334a$b9ed2200$6407a8c0@pdimov2...
...
...
>> At the end I reverted the changes and just encoded the wide path into
>> UTF-8 at the very start, passed the UTF-8 string through the existing
>> code, then decoded the UTF-8 into a wstring at the very end,
>> immediately before calling the Windows API. It worked.
>
> Seems like a reasonable and practical approach. I've wondered several
> times if we wouldn't have been better off if Microsoft had chosen
> UTF-8 as their Windows external representation, too.

I don't think that they could have done that because of legacy FAT 
filesystems that could have been using narrow paths with an arbitrary 
encoding.

But my point is that the library can use UTF-8 as its _internal portable 
encoding_, encoding into UTF-8 when it is given a path as a wstring or a 
(string, encoding) pair, and decoding into the appropriate (string, 
encoding) or wstring when it passes a path to the OS. Everything else can be 
string-based.
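
To make the encode-on-entry, decode-on-exit idea concrete, here is a minimal sketch of the two conversions. The names to_utf8, from_utf8 and append_utf8 are hypothetical and are not part of Boost.Filesystem. The sketch assumes well-formed input and does no validation; it treats wchar_t as UTF-16 when it is 2 bytes wide and as UTF-32 otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: append one Unicode code point as UTF-8.
inline void append_utf8(std::string& out, std::uint32_t cp) {
    assert(cp <= 0x10FFFF);  // precondition: a valid code point
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Encode on entry: wide path -> internal UTF-8 string.
inline std::string to_utf8(const std::wstring& w) {
    std::string out;
    for (std::size_t i = 0; i < w.size(); ++i) {
        std::uint32_t cp = static_cast<std::uint32_t>(w[i]);
        // With a 16-bit wchar_t, combine a surrogate pair into one code point.
        if (sizeof(wchar_t) == 2 && cp >= 0xD800 && cp <= 0xDBFF &&
            i + 1 < w.size()) {
            std::uint32_t lo = static_cast<std::uint32_t>(w[i + 1]);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            }
        }
        append_utf8(out, cp);
    }
    return out;
}

// Decode on exit: internal UTF-8 string -> wide path for the OS call.
inline std::wstring from_utf8(const std::string& s) {
    std::wstring out;
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char b = static_cast<unsigned char>(s[i++]);
        std::uint32_t cp;
        int extra;  // number of continuation bytes that follow the lead byte
        if (b < 0x80)      { cp = b;        extra = 0; }
        else if (b < 0xE0) { cp = b & 0x1F; extra = 1; }
        else if (b < 0xF0) { cp = b & 0x0F; extra = 2; }
        else               { cp = b & 0x07; extra = 3; }
        for (; extra > 0 && i < s.size(); --extra)
            cp = (cp << 6) | (static_cast<unsigned char>(s[i++]) & 0x3F);
        if (sizeof(wchar_t) == 2 && cp >= 0x10000) {
            // Split into a UTF-16 surrogate pair.
            cp -= 0x10000;
            out += static_cast<wchar_t>(0xD800 + (cp >> 10));
            out += static_cast<wchar_t>(0xDC00 + (cp & 0x3FF));
        } else {
            out += static_cast<wchar_t>(cp);
        }
    }
    return out;
}
```

Everything between the two calls only ever sees a std::string, so the existing narrow code can stay as it is. On Windows the result of from_utf8 would be handed to the wide API at the last moment.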

With this approach, we can have a single path class that handles everything. 
No need to choose between a narrow path and a wide path, and no need to 
encode the character encoding into the path type.
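
A rough sketch of what such a single path class could look like follows. All names here (encoding, path, utf8, native) are hypothetical and this is not the actual Boost.Filesystem interface. Only UTF-8 and Latin-1 are handled, to keep the example short; a wstring constructor would convert to UTF-8 in the same way and is omitted. The point is that the encoding is a run-time argument at the boundary, not part of the type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical: the encodings this sketch knows about.
enum class encoding { utf8, latin1 };

class path {
public:
    // Accept a (string, encoding) pair and normalize to UTF-8 on entry.
    path(const std::string& s, encoding e = encoding::utf8) {
        if (e == encoding::utf8) { utf8_ = s; return; }
        for (unsigned char b : s) {
            if (b < 0x80) {
                utf8_ += static_cast<char>(b);
            } else {  // U+0080..U+00FF need two UTF-8 bytes
                utf8_ += static_cast<char>(0xC0 | (b >> 6));
                utf8_ += static_cast<char>(0x80 | (b & 0x3F));
            }
        }
    }

    // Path manipulation is plain string work on the UTF-8 form.
    path& operator/=(const path& rhs) {
        if (!utf8_.empty() && utf8_.back() != '/') utf8_ += '/';
        utf8_ += rhs.utf8_;
        return *this;
    }

    const std::string& utf8() const { return utf8_; }

    // Decode on exit, for example just before passing the path to the OS.
    std::string native(encoding e) const {
        if (e == encoding::utf8) return utf8_;
        assert(e == encoding::latin1);  // the only other encoding in this sketch
        std::string out;
        for (std::size_t i = 0; i < utf8_.size(); ++i) {
            unsigned char b = static_cast<unsigned char>(utf8_[i]);
            if (b < 0x80) {
                out += static_cast<char>(b);
            } else if ((b == 0xC2 || b == 0xC3) && i + 1 < utf8_.size()) {
                // Two-byte sequence for U+0080..U+00FF maps back to one byte.
                unsigned char c = static_cast<unsigned char>(utf8_[++i]);
                out += static_cast<char>(((b & 0x03) << 6) | (c & 0x3F));
            } else {
                out += '?';  // code point not representable in Latin-1
                // Skip the continuation bytes of the unrepresentable sequence.
                while (i + 1 < utf8_.size() &&
                       (static_cast<unsigned char>(utf8_[i + 1]) & 0xC0) == 0x80)
                    ++i;
            }
        }
        return out;
    }

private:
    std::string utf8_;  // internal portable representation
};
```

Two paths that came in with different encodings can be combined freely, because by the time operator/= runs both are UTF-8.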

I've tried to communicate this via code, apparently with mixed success. :-)