
Vladimir Pozdyayev wrote:
Hello.
Are there any plans to add finite automata-based subengine to the regex library? If there are, I can submit the algorithm/implementation of a modified nondeterministic FA which takes care of * greedy and lazy quantifiers, * eagerness of the alternation operator, * non-longest matches (`best' in the Perl-ish sense with respect to greediness/laziness/eagerness), * substring capturing. All these features being included, the time cost of a match is O(N(E+C)L), where N is the number of states, E is the average epsilon-closure size per state, C is the number of capturing parentheses, and L is the input length.
Interesting. Is the algorithm from some published paper that you could link/post? Or do you have a formal description? - Volodya