
John Maddock wrote:
And, of course, there are additional speed reserves in KMP for search problems.
Actually, I think I may remove that code as a dis-optimisation, a simple scan for first character is probably faster in the *average* case, and of course if you can do a Boyer-search that's even better.
Like you, more research is needed!
I have experimented a bit in this area, and you're right. I've had best results with a three-pronged approach to searches: (1) Boyer-Moore is a *huge* win if you can do it. xpressive and GRETA both use BM to great effect when the pattern begins with a string literal. (2) As a fall-back, use the first character if you can. This handles patterns like (a...|b.|c..), where I will scan forward for one of {abc}. (3) Fall back on brute force search at every position. -- Eric Niebler Boost Consulting www.boost-consulting.com