
The most pressing point for level 1 support is section 1.5, Caseless Matching: "Supported, note that at this level, case transformations are 1:1, many to many case folding operations are not supported (for example "ß" to "SS")."
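To make the 1:1 versus many-to-many distinction concrete, here is a minimal sketch. It is not how Boost.Regex or ICU implement caseless matching. The function names (simple_fold, full_fold, equal_simple, equal_full) are hypothetical, and the fold tables are deliberately tiny: ASCII letters plus the single expansion of U+00DF ("ß") to "ss".

```cpp
#include <cstddef>
#include <string>

// Hypothetical 1:1 fold: one code point in, one code point out.
// Only ASCII is handled here; a real table would cover far more.
char32_t simple_fold(char32_t c) {
    if (c >= U'A' && c <= U'Z')
        return static_cast<char32_t>(c - U'A' + U'a');
    return c;  // U+00DF has no 1:1 fold, so it is left unchanged
}

// Hypothetical full fold: one code point may expand to several.
std::u32string full_fold(char32_t c) {
    if (c == 0x00DF)  // LATIN SMALL LETTER SHARP S
        return U"ss";
    return std::u32string(1, simple_fold(c));
}

// Caseless comparison under 1:1 folding: lengths must already agree.
bool equal_simple(const std::u32string& a, const std::u32string& b) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (simple_fold(a[i]) != simple_fold(b[i])) return false;
    return true;
}

// Fold a whole string; the result may be longer than the input.
std::u32string fold_string(const std::u32string& s) {
    std::u32string out;
    for (char32_t c : s) out += full_fold(c);
    return out;
}

// Caseless comparison under full folding: compare the folded forms.
bool equal_full(const std::u32string& a, const std::u32string& b) {
    return fold_string(a) == fold_string(b);
}
```

With these definitions, equal_simple(U"stra\u00DFe", U"STRASSE") is false because the strings differ in length before any folding happens, while equal_full on the same pair is true because both sides fold to "strasse".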
I forgot to mention: this is part of a larger digraph problem. In some languages more than one character may collate as a single unit. In some cases Unicode provides predefined ligatures for these, but not for every case combination of every ligature. Boost.Regex supports things like [[.ae.]-[.ll.]] (match anything that collates in the range "ae" to "ll"), and currently this should work reasonably well in case-insensitive mode as well (it fails where a many-to-one case transformation is required).

Also, since there is no way to tell which digraphs (if any) are supported by the current locale, expressions such as [a-z] will only ever match one character, and will never match, say, "ae", even if the current locale does regard "ae" as a single unit. I believe this is the only sensible option, particularly as in many cases whether the next two characters are regarded as a digraph depends on the meaning of the word (which is to say you need a dictionary to work it out, as Martin Bonner pointed out).

Re ICU: this appears to use case folding (converting everything to a case-insensitive form) for caseless comparisons. I would assume their regex component does the same, but I haven't had a chance to try it out.

John.