Re: [Boost-users] [regex] Working with wchar_t on older UNIX platforms
As I understand using this feature requires ICU. Unfortunately, this is not an option for us :-(

Andrei

-----Original Message-----
From: boost-users-bounces@lists.boost.org [mailto:boost-users-bounces@lists.boost.org] On Behalf Of John Maddock
Sent: Tuesday, March 21, 2006 20:54
To: boost-users@lists.boost.org
Subject: Re: [Boost-users] [regex] Working with wchar_t on older UNIX platforms
I am intrigued by what you said about converting data from UTF-8 to UTF-32 on the fly. It is absolutely not a problem to convert my Unicode strings to UTF-8 encoded strings. Where could I read about those on-the-fly conversions, and what limitations do they have (e.g. how are locale settings handled)?
What locale settings? UTF-8 is mostly locale-independent (as an encoding); the only locale-specific code is in the traits class, to handle collation, and it only sees UTF-32 code points.

The on-the-fly conversions are performed by iterator adapters in boost/regex/pending/unicode_iterator.hpp, and the docs for the Unicode-aware code are here: http://www.boost.org/libs/regex/doc/icu_strings.html

John.
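To illustrate what an on-the-fly conversion of this kind looks like, here is a minimal sketch of an iterator adapter that decodes UTF-8 to UTF-32 lazily, one code point per dereference. It is not the Boost code: the names `utf8_to_utf32_iterator` and `decode_utf8` are made up for this example, it assumes well-formed input (no validation at all), and it relies on C++11's `char32_t`, which older compilers will not have. The adapters in unicode_iterator.hpp are the ones to use in practice.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical adapter (not Boost's implementation): wraps an iterator over
// UTF-8 bytes and yields one UTF-32 code point per dereference.
// Assumption: the input is well-formed UTF-8; nothing is validated.
template <class ByteIt>
class utf8_to_utf32_iterator {
public:
    typedef std::forward_iterator_tag iterator_category;
    typedef char32_t value_type;
    typedef std::ptrdiff_t difference_type;
    typedef const char32_t* pointer;
    typedef char32_t reference;  // values are computed, so returned by value

    explicit utf8_to_utf32_iterator(ByteIt pos) : pos_(pos) {}

    // Decode the code point that starts at the current byte position.
    char32_t operator*() const {
        ByteIt it = pos_;
        const unsigned char lead = static_cast<unsigned char>(*it);
        const std::size_t n = sequence_length(lead);
        // Payload bits of the lead byte: all 7 for ASCII, else 5, 4 or 3.
        char32_t cp = (n == 1) ? lead : (lead & (0x7Fu >> n));
        for (std::size_t i = 1; i < n; ++i) {
            ++it;
            // Each continuation byte contributes its low 6 bits.
            cp = (cp << 6) | (static_cast<unsigned char>(*it) & 0x3Fu);
        }
        return cp;
    }

    // Skip over every byte of the current code point.
    utf8_to_utf32_iterator& operator++() {
        const std::size_t n = sequence_length(static_cast<unsigned char>(*pos_));
        std::advance(pos_, static_cast<std::ptrdiff_t>(n));
        return *this;
    }

    utf8_to_utf32_iterator operator++(int) {
        utf8_to_utf32_iterator old(*this);
        ++*this;
        return old;
    }

    bool operator==(const utf8_to_utf32_iterator& o) const { return pos_ == o.pos_; }
    bool operator!=(const utf8_to_utf32_iterator& o) const { return pos_ != o.pos_; }

private:
    // Number of bytes in the sequence introduced by this lead byte.
    static std::size_t sequence_length(unsigned char lead) {
        if (lead < 0x80u) return 1;
        if ((lead & 0xE0u) == 0xC0u) return 2;
        if ((lead & 0xF0u) == 0xE0u) return 3;
        return 4;
    }

    ByteIt pos_;
};

// Convenience helper for the example: materialise the decoded code points.
std::vector<char32_t> decode_utf8(const std::string& s) {
    typedef utf8_to_utf32_iterator<std::string::const_iterator> it_t;
    return std::vector<char32_t>(it_t(s.begin()), it_t(s.end()));
}
```

Because the adapter only reads bytes as it is advanced, the UTF-32 text never has to exist as a separate buffer; `decode_utf8` copies it out only to make the result easy to inspect.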
As I understand using this feature requires ICU. Unfortunately, this is not an option for us :-(
Yes, sorry, understood. You're probably back to writing a traits class then; it's honestly not that hard :-) I suggest you take c_regex_traits as a starting point, change basic_string to vector, and work from there.

John.
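For a rough idea of the shape such a traits class takes, here is a skeleton with `std::vector` as its `string_type`. It is only a sketch under several assumptions: the name `u32_traits_sketch` is invented, `unsigned int` is assumed to be 32 bits wide, all behaviour is ASCII-only, and the member list follows my reading of the Boost.Regex traits class requirements, so it should be checked against the documentation for the Boost version actually in use. It has not been compiled against Boost.Regex itself.

```cpp
#include <cassert>
#include <cstddef>
#include <locale>
#include <string>
#include <vector>

// Hypothetical traits skeleton for a 32-bit character type.
// Assumptions: unsigned int is 32 bits; only ASCII is classified or folded.
struct u32_traits_sketch {
    typedef unsigned int char_type;
    typedef std::size_t size_type;
    typedef std::vector<char_type> string_type;  // vector instead of basic_string
    typedef std::locale locale_type;
    typedef unsigned int char_class_type;        // bitmask of the classes below

    enum { class_none = 0, class_alpha = 1, class_digit = 2, class_space = 4 };

    // Length of a zero-terminated sequence of char_type.
    static size_type length(const char_type* p) {
        size_type n = 0;
        while (p[n] != 0) ++n;
        return n;
    }

    char_type translate(char_type c) const { return c; }

    // ASCII-only case folding; a real class would handle far more.
    char_type translate_nocase(char_type c) const {
        return (c >= 0x41u && c <= 0x5Au) ? c + 0x20u : c;
    }

    // Sort key: here simply the code points themselves.
    template <class It>
    string_type transform(It first, It last) const {
        return string_type(first, last);
    }

    // Primary (case-insensitive) sort key.
    template <class It>
    string_type transform_primary(It first, It last) const {
        string_type key;
        for (; first != last; ++first) key.push_back(translate_nocase(*first));
        return key;
    }

    // Map a class name such as "digit" to a bitmask; 0 means unknown.
    template <class It>
    char_class_type lookup_classname(It first, It last) const {
        std::string name;
        for (; first != last; ++first) name.push_back(static_cast<char>(*first));
        if (name == "alpha") return class_alpha;
        if (name == "digit") return class_digit;
        if (name == "space") return class_space;
        return class_none;
    }

    // No named collating elements are supported in this sketch.
    template <class It>
    string_type lookup_collatename(It, It) const { return string_type(); }

    bool isctype(char_type c, char_class_type mask) const {
        if ((mask & class_digit) && c >= 0x30u && c <= 0x39u) return true;
        if ((mask & class_alpha) &&
            ((c >= 0x41u && c <= 0x5Au) || (c >= 0x61u && c <= 0x7Au))) return true;
        if ((mask & class_space) && (c == 0x20u || (c >= 0x09u && c <= 0x0Du))) return true;
        return false;
    }

    // Numeric value of c in the given radix, or -1 if c is not a digit in it.
    int value(char_type c, int radix) const {
        int d = -1;
        if (c >= 0x30u && c <= 0x39u) d = static_cast<int>(c - 0x30u);
        else if (c >= 0x61u && c <= 0x66u) d = 10 + static_cast<int>(c - 0x61u);
        else if (c >= 0x41u && c <= 0x46u) d = 10 + static_cast<int>(c - 0x41u);
        return (d >= 0 && d < radix) ? d : -1;
    }

    // Store a new locale and return the previous one.
    locale_type imbue(locale_type l) {
        locale_type old = loc_;
        loc_ = l;
        return old;
    }
    locale_type getloc() const { return loc_; }

private:
    locale_type loc_;
};
```

The real work is in replacing the ASCII-only bodies with whatever case folding, classification and collation your data needs; the typedefs and signatures are the part that stays fixed.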
Participants (2): Andrei Tarassov, John Maddock