Re: [boost] status of wchar_t support in filesystem::path?

16 Feb 2004

At 12:49 AM 2/16/2004, Walter Landry wrote:
...
Beman Dawes <bdawes@acm.org> wrote:
...
At 12:19 PM 2/15/2004, Paul Miller wrote:
...
Presumably Linux still works with multi-byte characters.
Is there progress toward a wchar_t-aware path?
Yes. I now have the outline of a design for the internationalization of
Boost.Filesystem paths.
Care to share?  I'm curious how you handle some of the legacy Japanese
encodings.
The framework looks something like this:

There are internal representation types like char, wchar_t, or user-defined 
character types meeting std::string requirements. Those are handled by 
path, wpath, or a basic_path class template respectively. The encodings of 
char and wchar_t are, of course, defined by the compiler. The encodings of 
UDTs are defined by their implementations.
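
As a rough illustration of the relationship between path, wpath, and basic_path described above, here is a minimal sketch. Only those three names come from this message; the members shown (string_type, operator/=, string()) and the choice to parameterize on the string type are assumptions made for the example, not the eventual Boost.Filesystem interface.

```cpp
#include <string>

// Hypothetical sketch: one class template parameterized on the internal
// string type. path and wpath are simply the char and wchar_t instances.
template <class String>
class basic_path {
public:
    typedef String string_type;
    typedef typename String::value_type value_type;

    basic_path() {}
    basic_path(const String& s) : str_(s) {}
    basic_path(const value_type* s) : str_(s) {}

    // Append a name, inserting a '/' separator when one is needed.
    // Assumes value_type can be constructed from a plain char; a UDT
    // character type would have to provide that conversion.
    basic_path& operator/=(const basic_path& rhs) {
        const value_type slash = static_cast<value_type>('/');
        if (!str_.empty() && !rhs.str_.empty() &&
            str_[str_.size() - 1] != slash) {
            str_ += slash;
        }
        str_ += rhs.str_;
        return *this;
    }

    // The path in its internal representation type and encoding.
    const String& string() const { return str_; }

private:
    String str_;
};

typedef basic_path<std::string>  path;   // char internal representation
typedef basic_path<std::wstring> wpath;  // wchar_t internal representation
```

A user-defined character type would get its own instance, e.g. basic_path<my_string>, where my_string is whatever string type the user supplies.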

There is one (usually, but with exceptions) external representation type. 
Each representation type may support multiple external path name encodings, 
including user defined encodings, subject to the operating system's 
encoding limitations.

There will be a locale-based (i.e. codecvt) mechanism for converting between 
the internal representation type and encoding, and the external 
representation type and encoding. The mechanisms for default and explicit 
locale operations will presumably be modeled on those of I/O streams.
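
To make the codecvt idea concrete, the sketch below converts between a wchar_t internal representation and a char external representation using the standard std::codecvt facet of a locale. The function names to_external and to_internal and the typedef path_codecvt are hypothetical; they only illustrate the kind of conversion described above and are not part of any proposed interface.

```cpp
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical name for the facet that maps wchar_t <-> char.
typedef std::codecvt<wchar_t, char, std::mbstate_t> path_codecvt;

// Internal (wchar_t) to external (char), using the codecvt facet of loc.
// The default argument is a copy of the current global locale.
std::string to_external(const std::wstring& internal,
                        const std::locale& loc = std::locale())
{
    if (internal.empty()) return std::string();
    const path_codecvt& cvt = std::use_facet<path_codecvt>(loc);
    std::mbstate_t state = std::mbstate_t();
    // max_length() bounds the external chars produced per wide char.
    std::vector<char> buf(internal.size() * cvt.max_length() + 1);
    const wchar_t* from_next = 0;
    char* to_next = 0;
    std::codecvt_base::result r = cvt.out(
        state,
        internal.data(), internal.data() + internal.size(), from_next,
        &buf[0], &buf[0] + buf.size(), to_next);
    if (r != std::codecvt_base::ok)
        throw std::runtime_error("name not representable externally");
    return std::string(&buf[0], to_next);
}

// External (char) to internal (wchar_t), using the codecvt facet of loc.
std::wstring to_internal(const std::string& external,
                         const std::locale& loc = std::locale())
{
    if (external.empty()) return std::wstring();
    const path_codecvt& cvt = std::use_facet<path_codecvt>(loc);
    std::mbstate_t state = std::mbstate_t();
    // Each wide char consumes at least one external char.
    std::vector<wchar_t> buf(external.size());
    const char* from_next = 0;
    wchar_t* to_next = 0;
    std::codecvt_base::result r = cvt.in(
        state,
        external.data(), external.data() + external.size(), from_next,
        &buf[0], &buf[0] + buf.size(), to_next);
    if (r != std::codecvt_base::ok)
        throw std::runtime_error("invalid external path name");
    return std::wstring(&buf[0], to_next);
}
```

Calling to_external(name) uses the global locale, while to_external(name, some_locale) selects the conversion explicitly, much as a stream uses its imbued locale.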

So handling the legacy Japanese encodings works like this:

The programmer selects an internal type and encoding that can represent 
those external types and encodings. Perhaps wchar_t, but perhaps some UDT.

The external type and encoding is presumably the operating system's 
default. The default locale mechanism will provide the codecvt facet to 
handle the conversions. So on a Japanese O/S, the external representation 
may be one of the legacy encodings, and if so the correct conversions will 
take place.
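
One way the default could be obtained is sketched below. The helper name default_path_locale is hypothetical, and the fallback to the classic "C" locale is an assumption of this example, not part of the design outlined above. std::locale("") asks the implementation for the user's preferred locale, which on a Japanese system may name one of the legacy encodings.

```cpp
#include <cwchar>
#include <locale>
#include <stdexcept>

// Hypothetical helper: the locale whose codecvt facet describes the
// external path name encoding when the program does not choose one.
std::locale default_path_locale()
{
    try {
        // The user's preferred locale, taken from the environment.
        return std::locale("");
    } catch (const std::runtime_error&) {
        // Assumption for this sketch: fall back to the "C" locale if
        // the environment names a locale the library cannot construct.
        return std::locale::classic();
    }
}

// True if loc carries the standard wchar_t <-> char conversion facet.
// Every std::locale is required to have it, so this is only a sanity check.
bool has_path_codecvt(const std::locale& loc)
{
    return std::has_facet<
        std::codecvt<wchar_t, char, std::mbstate_t> >(loc);
}
```

Whether the resulting facet actually implements a particular legacy encoding depends on the locales installed on the system.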

Does that make sense?

--Beman