
In-Reply-To: <6.0.3.0.2.20041115095103.028e91e0@mailhost.esva.net>
bdawes@acm.org (Beman Dawes) wrote (abridged):
Your strongest argument IMO is the point about conversions not necessarily being value preserving.
I believe conversions from char to wchar_t are always value preserving, or can be made to be. I've known MultiByteToWideChar() to fail with CP_SYMBOL on Win98, but it's easy to handle that as a special case (and replicate the XP behaviour). I would be interested to hear of any other counter-examples.
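To make "value preserving" concrete, here is a minimal sketch of a round-trip check. It assumes an ISO 8859-1 (Latin-1) style code page in which byte value N maps directly to code point N. The helper names widen_latin1, narrow_latin1 and round_trips are hypothetical, invented for this illustration. A real Windows code page would need a lookup table or MultiByteToWideChar() instead.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers, assuming a Latin-1 style code page where byte value N
// maps directly to code point N. Not the Windows API, just an illustration.

// Widen: each of the 256 byte values has a wide equivalent, so this
// direction cannot fail.
std::wstring widen_latin1(const std::string& narrow)
{
    std::wstring wide;
    wide.reserve(narrow.size());
    for (std::string::size_type i = 0; i < narrow.size(); ++i)
        wide += static_cast<wchar_t>(static_cast<unsigned char>(narrow[i]));
    assert(wide.size() == narrow.size());   // one wide character per byte
    return wide;
}

// Narrow: returns false as soon as a character has no equivalent, rather
// than silently substituting '?'. On failure 'narrow' is left untouched.
bool narrow_latin1(const std::wstring& wide, std::string& narrow)
{
    std::string result;
    result.reserve(wide.size());
    for (std::wstring::size_type i = 0; i < wide.size(); ++i)
    {
        const unsigned long code = static_cast<unsigned long>(wide[i]);
        if (code > 0xFF)
            return false;                   // not representable in this code page
        result += static_cast<char>(static_cast<unsigned char>(code));
    }
    narrow.swap(result);
    return true;
}

// Value preserving means narrow -> wide -> narrow gives back the same bytes.
bool round_trips(const std::string& original)
{
    std::string back;
    return narrow_latin1(widen_latin1(original), back) && back == original;
}
```

Under that assumption round_trips() holds for every byte string, while narrow_latin1() reports failure for any wide character above 0xFF, so the caller can decide what to do instead of ending up with a question mark in a file name.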
(I guess we could tell Windows users that they should not expect such conversions to work unless supported by the applicable codepage. But that seems spin rather than a real solution.)
I don't think there's any magic here. The "MultiByte" in WideCharToMultiByte() is usually a misnomer, as most code pages are single-byte and support only 256 characters. Even the double-byte ones cannot support the whole of 16-bit Unicode, let alone surrogate pairs. Typically converting an unsupported character will yield a question mark, which is not valid in a file name. So if we do the conversion, the unsupported characters will fail in our code, and if we don't, they will fail in the OS. There's no magic to make it work.
When "OS" means "Win98+MSLU", in my experience it is better to handle the conversion explicitly, as MSLU doesn't always do what I'd want. I have sometimes found the best way to convert a Unicode path to ANSI is via GetShortPathName().
This isn't an argument against using two path classes. It is an argument against wpath relying on MSLU.
In my Unicode apps an important use case was passing Unicode filenames to other, ANSI apps and libraries. So if we do have two path classes we will still need to offer conversions between them. Probably this conversion should match the one you get when making OS calls, which is another argument for doing the conversion ourselves. Having 2 classes is probably the clearest way to manage such cases.
-- Dave Harris, Nottingham, UK