Re: [boost] [regex, xpressive] interesting(?) perf benchmark

7 Jun 2010

      ...
I'm always suspect of benchmarks, but I figured I'd post this anyway.
http://astl.sourceforge.net/bench.7.html.
LOL, nice to see my own benchmark suite used against me!
...
Anyway, it's not my intention to kick off yet another costly escalation
in the regex war. God help me, I don't have the time. I post it here
because I haven't come across a comparison of these four libraries by a
third party before. Some quick observations:
- I'm pleased that on the balance, xpressive fares well here relative to
the other NFAs. In the end, though, not much separates best from worst.
- All the NFA libraries seem to have large overhead that becomes very
noticeable when matching short strings. I don't know why.
I spent some time trying mostly in vain to hammer this down lower with 
current Boost.Regex - it turns out that most of it is down to initialisation 
of state - Perl-style re's typically have a lot of state to initialise.  The 
thing to remember is for that short matches against short texts the cost of 
a match is basically the cost of a function call for DFA's.  Or to put it 
another way, even if the NFA's are 10x slower on these short tests, 10x 
almost nothing is still almost nothing (though I accept there are certain 
outlyer-use-cases where this overhead becomes an issue).
...
- I'm gobsmacked that my VERY OLD library greta handily beats pcre,
boost.regex /and/ xpressive on short matches. Failing to beat yourself
in any perf benchmark is kind of embarrassing.
Probably more features in xpressive to set up?
...
- I'm tickled that for long matches, both xpressive and boost.regex
(NFAs) beat his library (DFA). How'd that happen?
Maybe reflects the preocupations of their authors?
...
- I find it interesting that both xpressive and boost.regex have large
discontinuities in the chart at the bottom; the others don't. Huh.
Weird indeed, can't really imagine why that would happen, unless there's 
some kind of cache miss effect going on?

Cheers, John.

Re: [boost] [regex, xpressive] interesting(?) perf benchmark

John Maddock