[boost] Re: Any interest in adding unicode support to boost?

19 Oct 2004

      "Vladimir Prus" <ghost@cs.msu.su> wrote in message
news:cl2d2p$7a3$1@sea.gmane.org...
...
> This was discussed extensively before. For example, Miro has pointed out
> that even plain "find" is not suitable for Unicode strings because some
> characters can be represented with several wchar_t values.
> Then, there's an issue of proper collation. Given that Unicode can contain
> accents and various other "marks", it is not obvious that string::operator<
> will always do the right thing.

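
To make the "find" point concrete, here is a minimal sketch of one way it can go wrong: the same visible text spelled with a precomposed character versus a base letter plus a combining mark. The helper names (contains, cafe_precomposed, cafe_decomposed) are made up for illustration, and surrogate pairs on platforms with a 16-bit wchar_t would be another instance of the same problem.

```cpp
#include <string>

// Plain substring search: compares wchar_t values one by one and knows
// nothing about Unicode canonical equivalence.
bool contains(const std::wstring& haystack, const std::wstring& needle) {
    return haystack.find(needle) != std::wstring::npos;
}

// "cafe" with an acute accent, spelled with the precomposed character U+00E9.
std::wstring cafe_precomposed() {
    std::wstring s = L"caf";
    s += static_cast<wchar_t>(0x00E9);
    return s;
}

// The same text spelled as 'e' followed by U+0301 COMBINING ACUTE ACCENT.
std::wstring cafe_decomposed() {
    std::wstring s = L"cafe";
    s += static_cast<wchar_t>(0x0301);
    return s;
}
```

Searching the decomposed spelling for U+00E9 finds nothing, and the two spellings compare unequal, even though a reader would see the same word in both.
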
My reference (Stroustrup, The C++ Programming Language) shows the locale
class containing a function

template<class Ch, class Tr, class A> // compare strings using this locale
bool operator()(const basic_string<Ch, Tr, A>&,
                const basic_string<Ch, Tr, A>&) const;
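
Because of that operator(), a std::locale object can be passed directly as the comparison predicate to an algorithm. A minimal sketch, where sorted_with is a hypothetical helper name and the caller chooses the locale:

```cpp
#include <algorithm>
#include <locale>
#include <string>
#include <vector>

// Sort wide strings using the given locale as the predicate.
// std::locale::operator() compares two strings through the locale's
// std::collate facet for the string's character type.
std::vector<std::wstring> sorted_with(std::vector<std::wstring> v,
                                      const std::locale& loc) {
    std::sort(v.begin(), v.end(), loc);
    return v;
}
```

With std::locale::classic() this just gives code-value order. Whether a named locale with Unicode-aware collation rules exists at all depends on the platform and the library, which is exactly the part I never checked.
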

So I always presumed that there was a "unicode" locale that implemented this
as well as all other required information.  Now that I think about it I realize
that it was only a presumption that I never really checked.  Now I wonder
what facilities most libraries provide for Unicode facets.  I know
there are ANSI functions for translating between multi-byte and wide
character strings.  I've used these functions and they did what I expected
them to do.  I presumed they worked in accordance with the currently
selected locale and its related facets.  If
basic_string<wchar_t>::operator<(...) isn't doing "the right thing", wouldn't
it be just a bug in the implementation of the standard library rather than a
candidate for a boost library?
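
For comparison, a rough sketch of the two kinds of ordering side by side. As far as I can tell, basic_string's operator< is specified in terms of char_traits<...>::compare and never consults a locale, while a locale-aware comparison has to go through the collate facet explicitly. The function names below are hypothetical.

```cpp
#include <locale>
#include <string>

// What operator< does: lexicographic comparison of wchar_t values via
// char_traits<wchar_t>; no locale is consulted.
bool less_by_code_unit(const std::wstring& a, const std::wstring& b) {
    return a < b;
}

// A locale-aware comparison: ask the collate facet of the given locale.
bool less_by_locale(const std::wstring& a, const std::wstring& b,
                    const std::locale& loc) {
    const std::collate<wchar_t>& coll =
        std::use_facet<std::collate<wchar_t> >(loc);
    return coll.compare(a.data(), a.data() + a.size(),
                        b.data(), b.data() + b.size()) < 0;
}
```

By code value, L"Z" sorts before L"a", and a string starting with U+00E9 sorts after L"z". Whether less_by_locale gives a linguistically sensible answer depends entirely on which locale is passed in and what its collate facet implements.
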

Robert Ramey