
Sebastian Pipping wrote:
It seems to me like Boost.Regex is much more powerful than QRegExp of QT. Can Boost.Regex work with QChar [1] as charT? The text "i.e. either char or wchar_t" on [2] made me unsure of this. Any limitations I should know about?
This is going to be a "yes but you will need to do some small amount of
work" kind of answer, sorry:-(
Basically, Boost.Regex needs to know some things about the character type,
like which code points are upper case and how to convert between cases, in
order to do its stuff. If QChar is a typedef for unsigned short, then
Boost.Regex won't have that information: by default it will try to get it
from the std::locale facets, which won't be specialised for unsigned
short.
So... if you don't mind using IBM's ICU library for Unicode support
(maybe QT uses this already for its Unicode support?), then you can use
boost::u32regex to scan UTF-8, UTF-16, or UTF-32 encoded text, see
http://www.boost.org/libs/regex/doc/icu_strings.html
However, if QT doesn't use ICU already, that's probably a dumb idea: having
two Unicode libs doesn't sound sensible to me :-( So that
leaves you needing to write your own traits class for QChar so you can use
basic_regex
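
To give a feel for the amount of work, here is a rough sketch of such a
traits class. The member names follow the traits class requirements in the
Boost.Regex docs as far as I recall them, so check the list against the
docs before relying on it. Everything else is an assumption made for
illustration: the name ascii_only_traits is made up, char16_t stands in for
QChar so the sketch compiles without Qt, and only ASCII is handled, which
is exactly the part a real QChar version would replace with Qt's own
character queries.

```cpp
#include <cstddef>
#include <locale>
#include <string>

// Hypothetical sketch: char16_t stands in for QChar, ASCII rules only.
struct ascii_only_traits {
    typedef char16_t       char_type;
    typedef std::size_t    size_type;
    typedef std::u16string string_type;
    typedef std::locale    locale_type;
    typedef unsigned       char_class_type;   // bitmask of the classes below

    enum : char_class_type {
        cls_digit = 1, cls_alpha = 2, cls_space = 4, cls_word = 8
    };

    // Length of a null-terminated pattern string.
    static size_type length(const char_type* p) {
        size_type n = 0;
        while (p[n] != 0) ++n;
        return n;
    }

    // Case-sensitive and case-insensitive character translation.
    char_type translate(char_type c) const { return c; }
    char_type translate_nocase(char_type c) const {
        return (c >= u'A' && c <= u'Z') ? char_type(c - u'A' + u'a') : c;
    }

    // Sort keys: plain code unit order, and a case-folded "primary" key.
    string_type transform(const char_type* f, const char_type* l) const {
        return string_type(f, l);
    }
    string_type transform_primary(const char_type* f, const char_type* l) const {
        string_type s(f, l);
        for (char_type& c : s) c = translate_nocase(c);
        return s;
    }

    // Map a class name such as "digit" to a bitmask; 0 means unknown.
    char_class_type lookup_classname(const char_type* f, const char_type* l) const {
        const string_type name(f, l);
        if (name == u"digit" || name == u"d") return cls_digit;
        if (name == u"alpha")                 return cls_alpha;
        if (name == u"space" || name == u"s") return cls_space;
        if (name == u"word"  || name == u"w") return cls_word;
        return 0;
    }

    // Only single characters are accepted as collating elements here.
    string_type lookup_collatename(const char_type* f, const char_type* l) const {
        return (l - f == 1) ? string_type(f, l) : string_type();
    }

    // Does character c belong to any class in mask m?
    bool isctype(char_type c, char_class_type m) const {
        const bool digit = c >= u'0' && c <= u'9';
        const bool alpha = (c >= u'a' && c <= u'z') || (c >= u'A' && c <= u'Z');
        const bool space = c == u' ' || (c >= u'\t' && c <= u'\r');
        return ((m & cls_digit) && digit) || ((m & cls_alpha) && alpha) ||
               ((m & cls_space) && space) ||
               ((m & cls_word) && (digit || alpha || c == u'_'));
    }

    // Numeric value of digit c in the given radix, or -1 if it is not one.
    int value(char_type c, int radix) const {
        int v = -1;
        if (c >= u'0' && c <= u'9')      v = c - u'0';
        else if (c >= u'a' && c <= u'z') v = c - u'a' + 10;
        else if (c >= u'A' && c <= u'Z') v = c - u'A' + 10;
        return (v >= 0 && v < radix) ? v : -1;
    }

    // Locale is stored but otherwise ignored by this sketch.
    locale_type imbue(locale_type l) {
        locale_type old = loc_;
        loc_ = l;
        return old;
    }
    locale_type getloc() const { return loc_; }

private:
    locale_type loc_;
};
```

With Boost available, the intent is that this gets passed as the second
template argument, i.e. boost::basic_regex<char16_t, ascii_only_traits>. I
have not compiled the sketch against Boost, so treat it as a starting
point rather than a finished traits class.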