
John Maddock wrote:
4) Punt the decision to the traits type :-) For xpressive, I added an in_range_nocase(Char a, Char b) member to the traits concept. By default, the traits provided by xpressive do *not* do proper case folding. They just use toupper and tolower, and are documented as such. An ambitious person can write their own traits to do proper Unicode case folding and get the right behavior.
Right, but the question is: is it actually *possible* to do proper Unicode case folding with this interface?
Trivially, yes. (The actual interface is in_range_nocase(Char from, Char to, Char ch) -- there's a typo above.) The algorithm is:

1) Build a table such that for every Unicode character, you can get a list of its case-folded equivalents. I wrote a script that does this, using http://www.unicode.org/Public/UNIDATA/CaseFolding.txt as input.
2) In in_range_nocase(), look up the list of case-folded equivalent chars for ch, the char to test.
3) For each char in the list, see if it's in the range specified.

--
Eric Niebler
Boost Consulting
www.boost-consulting.com
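
The three steps above can be sketched in C++ roughly as follows. This is only an illustration, not xpressive's actual traits code: the struct name folding_traits_sketch, the helper build_table(), and the use of char32_t are assumptions made for the example, and the table is a tiny hand-written excerpt (the K and S groups) rather than the output of a script run over CaseFolding.txt.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Code points used in the hypothetical table excerpt below.
constexpr char32_t kelvin_sign = 0x212A; // KELVIN SIGN, case-folds to 'k'
constexpr char32_t long_s      = 0x017F; // LATIN SMALL LETTER LONG S, case-folds to 's'

// Step 1 (assumed shape): map every character to the list of characters
// it is equivalent to under case folding. A real table would be generated
// from CaseFolding.txt; this excerpt only covers two groups.
inline std::map<char32_t, std::vector<char32_t>> build_table()
{
    const std::vector<std::vector<char32_t>> groups = {
        {U'K', U'k', kelvin_sign},
        {U'S', U's', long_s},
    };
    std::map<char32_t, std::vector<char32_t>> table;
    for (const auto& group : groups)
        for (char32_t c : group)
            table[c] = group; // every member maps to its whole group
    return table;
}

// Hypothetical traits type; only the member discussed in the thread is shown.
struct folding_traits_sketch
{
    typedef char32_t char_type;

    static bool in_range(char_type from, char_type to, char_type ch)
    {
        return from <= ch && ch <= to;
    }

    static bool in_range_nocase(char_type from, char_type to, char_type ch)
    {
        assert(from <= to);
        static const std::map<char_type, std::vector<char_type>> table = build_table();

        if (in_range(from, to, ch))
            return true;

        // Step 2: look up the case-folded equivalents of ch.
        auto it = table.find(ch);
        if (it == table.end())
            return false; // not covered by this small excerpt

        // Step 3: test each equivalent against the range.
        for (char_type equivalent : it->second)
            if (in_range(from, to, equivalent))
                return true;
        return false;
    }
};
```

With this excerpt, folding_traits_sketch::in_range_nocase(U'a', U'z', kelvin_sign) is true because 'k' is in the Kelvin sign's equivalence list, which is the kind of match a plain toupper/tolower comparison would miss.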