
Hugh Wimberly wrote:
My advise: implement a simple DFA and use it only for patterns without captures (regex_constants::nosubs) or fancy assertions (eg. look-ahead). If you need captures and leftmost-biased semantics, just use the backtracking NFA.
My intention all along was to see what functionality I could get the DFA to achieve while still being very fast; I hadn't even intended to add look-ahead assertions unless time looked favorable. I'd certainly like to see if I can make captures fast, and from what I've seen, for common regexes without assertions, the DFA really is faster, often considerably. Could you send me the tests you performed? I'd be really interested in it.
Oh, my poor addled brain. Too long ago, I promised to make my Boost port of Russ Cox's O(N) Perl regex engine -- a modified Thompson NFA -- available. It does captures and Perl's leftmost biased rule (as opposed to the POSIX leftmost longest rule). As I mentioned, I found the performance to be pretty bad in the common case, but without exponential blow-up in the pathological cases. YMMV. You can find the code in thompson-nfa-perl-regex.cpp at http://www.boost-consulting.com/vault/index.php?directory=Strings%20-%20Text%20Processing& -- Eric Niebler Boost Consulting www.boost-consulting.com