
What do you think? Could Boost.Regex make use of such a traits_class, or would you rather not include it in the distribution?
I don't know; it depends on what it does. How do you plan to handle character classification in a portable manner for unsigned short?
There are too many developers involved in the process for us to force them all to recompile Xerces-C with specific settings. I don't think this would be an option for us. In our case it could also lead to unpredictable results if someone replaces Xerces-C with a freshly compiled Xerces-C built without ICU support. I am a little bit sceptical about this.
OK, let me try one more time: if you compile regex *only* with ICU support, and use the iterator-based u32regex_match/u32regex_search algorithms (or their equivalent regex iterators), then it doesn't matter what character type Xerces or anything else uses, as long as one of the following holds:

- It's an 8-bit type: then it'll be treated as an [unsigned] UTF-8 encoded string.
- It's a 16-bit type: then it'll be treated as an [unsigned] UTF-16 encoded string.
- It's a 32-bit type: then it'll be treated as an [unsigned] UTF-32 encoded string.

Is that generic enough for you? :-)

John.
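
The width rule above can be sketched in standalone C++ as follows. This is only an illustration of the selection logic, not Boost.Regex's actual implementation: the names Encoding and encoding_of are hypothetical, and the sketch assumes 8-bit bytes, so that sizeof values of 1, 2 and 4 correspond to 8-, 16- and 32-bit types.

```cpp
#include <iterator>
#include <string>

// Hypothetical names, used for illustration only.
enum class Encoding { utf8, utf16, utf32 };

// Choose the assumed encoding purely from the width of the iterator's
// value type; signedness and the exact type name are ignored.
// Assumes 8-bit bytes (CHAR_BIT == 8).
template <class Iterator>
constexpr Encoding encoding_of()
{
    using char_type = typename std::iterator_traits<Iterator>::value_type;
    static_assert(sizeof(char_type) == 1 || sizeof(char_type) == 2 ||
                      sizeof(char_type) == 4,
                  "only 8-, 16- and 32-bit character types are handled");
    return sizeof(char_type) == 1   ? Encoding::utf8
           : sizeof(char_type) == 2 ? Encoding::utf16
                                    : Encoding::utf32;
}
```

Under this rule, a pointer to unsigned short (a 16-bit type on common platforms) and a pointer to char16_t both select UTF-16, so the caller never has to name the library's character type. In Boost itself, the corresponding entry points are the u32regex algorithms declared in <boost/regex/icu.hpp>.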