Re: [boost] [C++0x] Emulate C++0x char16_t, char32_t, std::u16string, and std::u32string

21 Jul 2011


      On Wed, Jul 20, 2011 at 6:12 PM, Artyom Beilis <artyomtnk@yahoo.com> wrote:
...
...
...
Does this make sense?
--Beman
No, you can't emulate them.
Sorry, but people have been doing this for at least ten years,
although the specific type names used vary.
...
Emulation of char16_t/char32_t is useless for any real use.
People have been using char16_t/char32_t or equivalent to handle
UTF-16/UTF-32 for years. It does work, it is useful, it is used in
production code, and Boost.Filesystem users are asking for it. So
there are existence proofs that this sort of emulation does work for
enough purposes to be quite useful.
...
You can't create a working
 std::basic_ostringstream<char16_t> stream;
because stream << 1245 would not work due to the lack of std::locale facets.
You can't create the required facets either since, for example, they are
specialized in many standard libraries.
Emulation via simple uint16_t and uint32_t typedefs doesn't work for
all use cases. So only use it when it does work.
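
To make the facet objection above concrete, here is a minimal sketch that asks the classic "C" locale whether it holds a std::num_put facet for a given character type. The helper name has_num_put_facet is made up for illustration. The standard guarantees the facet only for char and wchar_t; what a given library reports for char16_t is implementation-specific, so no result is claimed for it here.

```cpp
#include <locale>

// Hypothetical helper (not from the thread or from Boost): report whether
// the classic "C" locale contains a numeric-output facet for CharT.
// Formatting an integer with operator<< looks this facet up at run time,
// which is why a stream over a character type without it cannot print 1245.
template <typename CharT>
bool has_num_put_facet()
{
    return std::has_facet<std::num_put<CharT> >(std::locale::classic());
}
```

Calling has_num_put_facet<char>() and has_num_put_facet<wchar_t>() should both return true on a conforming library. Calling it with char16_t (or with a uint16_t stand-in) is the interesting case for this discussion, and is a quick way to find out whether stream insertion of numbers can be expected to work on a particular implementation.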
...
Even Microsoft's existing VC2010 does not work if you compile the application
with /MD or /MDd.
I'll retest just to be sure, but I'm fairly sure that some of my tests
have used those switches.
...
Note: char, wchar_t, char16_t and char32_t are much more than basic types
that can be distinguished; they bring character information with them.
If you want to represent a UTF-16 or UTF-32 code unit, just use uint16_t
or uint32_t, as for example ICU does for UChar and Qt does for QChar,
but this isn't something that is supposed to work with the standard library
in places where character types are expected.
Ummmm? Did you look at the attachment?  That's what it does. It uses
uint16_t and uint32_t if the compiler does not supply the new
character types, and otherwise just uses the supplied standard library
unchanged.
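
The attachment itself is not reproduced here, so the following is only a sketch of the general approach just described, not the actual code. The macro HYPOTHETICAL_NO_UNICODE_CHAR_TYPES and the namespace emu are invented for illustration; a real library would use its own configuration macro, and a pre-C++0x compiler might need boost/cstdint.hpp instead of <cstdint>.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// HYPOTHETICAL_NO_UNICODE_CHAR_TYPES is an illustrative macro, assumed to be
// defined by the build when the compiler lacks char16_t and char32_t.
namespace emu {

#if defined(HYPOTHETICAL_NO_UNICODE_CHAR_TYPES)
// Fallback: plain integer typedefs. These are not distinct character types,
// and basic_string over them relies on the library's generic char_traits,
// which not every standard library provides.
typedef std::uint16_t char16;
typedef std::uint32_t char32;
#else
// The compiler supplies the new character types: use them unchanged.
typedef char16_t char16;
typedef char32_t char32;
#endif

typedef std::basic_string<char16> u16string;
typedef std::basic_string<char32> u32string;

// Number of UTF-16 code units (not code points) held in the string.
inline std::size_t code_unit_count(const u16string& s)
{
    return s.size();
}

} // namespace emu
```

With either branch, user code can store and pass around UTF-16 or UTF-32 code units through emu::u16string and emu::u32string. What the fallback branch cannot provide is overloading on a distinct character type or stream formatting, which is the limitation discussed earlier in the thread.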
...
Also for the filesystem? Please, don't try to make it more complicated than it
is now.
There is no change to the filesystem interface; char16_t and char32_t
were designed into V3 right from the beginning. It is just a case of
adding char16_t/char32_t overloads to some implementation code. (That
may not be entirely correct for POSIX systems when the native char
encoding for filenames is not UTF-8. I'm just about to work that out.)
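
As a rough illustration of what such an implementation-level overload has to do on a POSIX system, here is a sketch that encodes a UTF-32 string as UTF-8. It is not Boost.Filesystem's code: the function name utf32_to_utf8 is hypothetical, and it assumes the native filename encoding is UTF-8, which is exactly the case the caveat above says may not always hold.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, assuming the native narrow encoding is UTF-8.
// Encodes each UTF-32 code point as one to four UTF-8 bytes.
std::string utf32_to_utf8(const std::u32string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        // Surrogates and values above U+10FFFF are not valid code points;
        // substitute U+FFFD (the replacement character) for them.
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            cp = 0xFFFD;

        if (cp < 0x80) {                      // 1 byte: 0xxxxxxx
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {              // 2 bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {            // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                              // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

A char32_t overload of a path operation could convert its argument this way and forward to the existing char implementation. When the native encoding is something other than UTF-8, a locale-aware conversion would be needed instead.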
...
You want to make boost.filesystem better? Make it use UTF-8 on Windows
by default and drop all "wide-crap" (sorry windows users).
Well, that certainly would be exciting:-)  But more seriously,
Boost.Filesystem and the C++ standard library are designed to work
with native encodings as well as UTF-8, UTF-16, and UTF-32. Users will
do what best serves their interests, and that means a plethora of encodings
for years and years to come. Get over it.
...
All operating systems around (with one exception) use a char * API, and
one operating system uses a UTF-16/wchar_t API.
So adding an arbitrary character type that no operating system uses seems
to be a waste of effort.
The issue isn't what the operating system uses, it is what users want
and the standard library demands. We are moving from a C++ world that
only supports char and wchar_t to a C++ world that supports char,
wchar_t, char16_t, and char32_t.
...
I **personally** don't see any benefit in adding char16_t/char32_t emulation
to Boost, and especially to Boost.Filesystem.
Today Boost.Filesystem has enough problems besides char16_t/char32_t.
Lack of char16_t/char32_t support is seen as a problem by some users,
plus I'm working on that portion of the code now in an effort to clear
tickets related to locale, codecvt, and character encoding issues.

--Beman

Beman Dawes