
-----Original Message----- From: boost-bounces@lists.boost.org [mailto:boost-bounces@lists.boost.org] On Behalf Of Vladimir Pozdyayev Sent: Monday, October 25, 2004 2:09 AM To: boost@lists.boost.org Subject: [boost] [regex] any interest in finite automata-based regex engine?
Hello.
Are there any plans to add finite automata-based subengine to the regex library? If there are, I can submit the algorithm/implementation of a modified nondeterministic FA which takes care of * greedy and lazy quantifiers, * eagerness of the alternation operator, * non-longest matches (`best' in the Perl-ish sense with respect to greediness/laziness/eagerness), * substring capturing. All these features being included, the time cost of a match is O(N(E+C)L), where N is the number of states, E is the average epsilon-closure size per state, C is the number of capturing parentheses, and L is the input length. As far as I know that's the lowest order of time complexity in this area (corrections are welcome)---though, of course, deterministic FA would be even faster.
I suppose there's no point in adding another regex library, but being able to guarantee linear time cost in an existing one would be nice.
Very nice. I've banged up against that one painfully recently (not with Boost's regex engine, be it noted <g>), and frankly, that's quite a guarantee given the features you mention. Reid