data:image/s3,"s3://crabby-images/0bff5/0bff58daf84157819fd0f9270ef1f0120effc590" alt=""
Hello, The current "incremental search" code needs to resume search where the a possible match starts, and for certain expressions the worst-case time (i.e. input bytes come in one at a time) complexity becomes O(n^2) where n is the number of characters that match the expression: Regex: omg.*wtf Pass 1 input: o Pass 1 input to matching: o Pass 1 verdict: Partial, resume at offset 0. Pass 2 input: m Pass 2 input to matching: om Pass 2 verdict: Partial, resume at offset 0. Pass 3 input: g Pass 3 input to matching: omg Pass 3 verdict: Partial, resume at offset 0. ... Pass n-1 input: t Pass n-1 input to matching: omg...wt (length n-1) Pass n-1 verdict: Partial, resume at offset 0. Pass n input: f Pass n input to matching: omg...wtf (length n) Pass n verdict: Match found and starts at offset 0. Total number of byte matches = n*(n+1)/2, i.e. O(n^2) I am looking for a way to feed the input bytes only once, in order to reduce the time complexity to approximately O(n). Of course I will lose the ability to recover submatches, but the ability is unnecessary for this particular application that I am considering. I cannot break the regex into two parts ("omg" and "wtf") either because it is specified as an input to the program. Eugene