
Emery De Nuccio wrote:
I have been using a set of Unicode classes that I wrote, which follow the common STL conventions, for a while. These include unicode::ustring to replace std::string, unicode::uifstream to replace std::ifstream, and unicode::uofstream to replace std::ofstream. Is there any interest in adding them to Boost? I was pretty shocked to find that Boost doesn't already have something like this (or maybe it does and I just haven't found it).
Each of the classes mimics the design of its standard equivalent, including member names and iterator sub-classes. They also support, in both directions, all of UTF-8, UTF-16, UTF-32/UCS-4, UCS-2, and traditional ASCII.
The only thing that I think is lacking is regex support; I'm not sure how to go about that. Maybe collaborate with the current Boost.Regex author(s) to add non-PCRE Unicode support?
See http://www.boost.org/libs/regex/doc/icu_strings.html for the existing Unicode support. Adding support for more string/character types is the easy part; providing Unicode-aware character classification and sorting is much harder, which is why I currently rely on ICU for this.

John.
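
To make the "easy part" concrete, here is a minimal sketch of decoding UTF-8 into UTF-32 code points using only the standard library. The function name utf8_to_utf32 is hypothetical; it is not taken from the classes proposed above, from Boost.Regex, or from ICU, and it only illustrates the kind of mechanical transcoding involved.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper for illustration only (not part of Boost, ICU, or the
// proposed unicode:: classes): decode a UTF-8 byte string into UTF-32.
std::u32string utf8_to_utf32(const std::string& in) {
    // Smallest code point that legitimately needs a sequence of each length;
    // anything below it is an overlong encoding.
    static const char32_t min_cp[] = {0, 0, 0x80, 0x800, 0x10000};

    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t len;

        // The lead byte determines the sequence length and the first payload bits.
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + len > in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Each continuation byte has the form 10xxxxxx and adds six payload bits.
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }

        // Reject overlong encodings, surrogates, and values beyond U+10FFFF.
        if (cp < min_cp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point");

        out.push_back(cp);
        i += len;
    }
    return out;
}
```

Nothing here needs any knowledge of what a code point means. Classification and sorting, by contrast, depend on the Unicode character property and collation data, which this sketch does not touch.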