
John Maddock wrote:
I wonder why your results are so different from mine. Please post the code for your modified test so I can run it locally. Thanks.
It's in cvs now under the usual libs/regex/performance path: there are still some problems with the html output that I don't understand yet, but I haven't had a chance to look at those.
As for the results being different from yours, I have noticed that the results can differ quite a bit from run to run, particularly for the ftp response expression "^([0-9]+)(\-| |$)(.*)$": I've seen either Boost.Regex or xpressive win out by some margin depending on the machine setup.
Right off, I spotted a couple of problems with your performance test:

1) The xpressive functions all have try/catch blocks in them, but the boost functions do not.
2) You are not passing the "optimize" syntax_option_type flag to xpressive's regex constructor.
3) People who care about performance *will* take the time to rewrite their patterns as static regexes, so a perf test that excludes static xpressive is less interesting.

I fixed the first two problems and took the liberty of committing my changes. (In retrospect, a patch would have been the polite thing to do. Sorry.) I'll work on adding a test for static xpressive, too.

After fixing these problems, the numbers for the short matches come out as I expected -- dynamic xpressive is ahead, in some cases by 2x or more. The HTML search surprises me a bit -- xpressive does poorly. It could be related to the fact that this is a case-insensitive search. It's possible I have a bug, or it's possible that the silly things I had to do to make Boyer-Moore work with the regex traits interface make Boyer-Moore more trouble than it's worth for case-insensitive matches. I haven't yet run the other tests.
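To illustrate points 1 and 2, here is a minimal sketch of a timing wrapper. It is not the code in libs/regex/performance. It uses std::regex as a stand-in so that it compiles without Boost, and the names seconds_per_match and time_std_regex are hypothetical. The idea is that every library under test goes through the same try/catch wrapper, so none of them pays for exception handling that the others skip, and that the pattern is compiled with the optimize flag outside the timed loop. std::regex::optimize is the standard library's counterpart of the xpressive flag mentioned above.

```cpp
#include <chrono>
#include <exception>
#include <regex>
#include <string>

// Hypothetical helper: average seconds per call of `match`.
// `compile` builds the regex once, outside the timed loop.
// Every candidate is routed through this one function, so each
// pays for the same try/catch block. Returns -1 if anything throws
// (for example, when the pattern fails to compile).
template <class Compile, class Match>
double seconds_per_match(Compile compile, Match match, int iterations)
{
    try {
        auto re = compile();
        auto start = std::chrono::steady_clock::now();
        for (int i = 0; i < iterations; ++i)
            match(re);
        std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start;
        return elapsed.count() / iterations;
    } catch (const std::exception&) {
        return -1.0;
    }
}

// Stand-in candidate (assumption: std::regex instead of Boost.Regex
// or xpressive). The optimize flag asks the library to favor match
// speed over construction speed.
double time_std_regex(const std::string& pattern,
                      const std::string& text,
                      int iterations)
{
    std::smatch what;
    return seconds_per_match(
        [&] {
            return std::regex(pattern,
                              std::regex::ECMAScript | std::regex::optimize);
        },
        [&](const std::regex& re) { std::regex_match(text, what, re); },
        iterations);
}
```

Calling time_std_regex("abc", "abc", 100000) gives an average time per match, and a pattern that fails to compile yields -1 rather than aborting the run. A second candidate would be added as another pair of lambdas passed to the same seconds_per_match, rather than as a separately written function.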
Testing: "abc" against "abc"
Boost regex (C++ locale): 3.6478e-007s
xpressive regex: 1.49012e-007s

Testing: "^([0-9]+)(\-| |$)(.*)$" against "100- this is a line of ftp response which contains a message string"
Boost regex (C++ locale): 7.59125e-007s
xpressive regex: 4.46796e-007s

Testing: "([[:digit:]]{4}[- ]){3}[[:digit:]]{3,4}" against "1234-5678-1234-456"
Boost regex (C++ locale): 1.13106e-006s
xpressive regex: 7.00951e-007s

Testing: "^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$" against "john@johnmaddock.co.uk"
Boost regex (C++ locale): 1.75667e-006s
xpressive regex: 1.34087e-006s

Testing: "^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$" against "foo12@foo.edu"
Boost regex (C++ locale): 1.48964e-006s
xpressive regex: 1.19209e-006s

Testing: "^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$" against "bob.smith@foo.tv"
Boost regex (C++ locale): 1.52016e-006s
xpressive regex: 1.16158e-006s

Testing: "^[a-zA-Z]{1,2}[0-9][0-9A-Za-z]{0,1} {0,1}[0-9][A-Za-z]{2}$" against "EH10 2QQ"
Boost regex (C++ locale): 5.96046e-007s
xpressive regex: 3.20435e-007s

Testing: "^[a-zA-Z]{1,2}[0-9][0-9A-Za-z]{0,1} {0,1}[0-9][A-Za-z]{2}$" against "G1 1AA"
Boost regex (C++ locale): 5.80788e-007s
xpressive regex: 3.20435e-007s

Testing: "^[a-zA-Z]{1,2}[0-9][0-9A-Za-z]{0,1} {0,1}[0-9][A-Za-z]{2}$" against "SW1 1ZZ"
Boost regex (C++ locale): 5.96046e-007s
xpressive regex: 3.35217e-007s

Testing: "^[[:digit:]]{1,2}/[[:digit:]]{1,2}/[[:digit:]]{4}$" against "4/1/2001"
Boost regex (C++ locale): 5.36919e-007s
xpressive regex: 3.12805e-007s

Testing: "^[[:digit:]]{1,2}/[[:digit:]]{1,2}/[[:digit:]]{4}$" against "12/12/2001"
Boost regex (C++ locale): 5.51224e-007s
xpressive regex: 3.12805e-007s

Testing: "^[-+]?[[:digit:]]*\.?[[:digit:]]*$" against "123"
Boost regex (C++ locale): 5.65529e-007s
xpressive regex: 2.98023e-007s

Testing: "^[-+]?[[:digit:]]*\.?[[:digit:]]*$" against "+3.14159"
Boost regex (C++ locale): 5.96046e-007s
xpressive regex: 3.49998e-007s

Testing: "^[-+]?[[:digit:]]*\.?[[:digit:]]*$" against "-3.14159"
Boost regex (C++ locale): 5.96046e-007s
xpressive regex: 3.42369e-007s

Testing: ^(template[[:space:]]*<[^;:{]+>[[:space:]]*)?(class|struct)[[:space:]]*(\<\w+\>([ ]*\( [^)]*\))?[[:space:]]*)*(\<\w*\>)[[:space:]]*(<[^;:{]+>[[:space:]]*)?(\{|:[^;\{()]*\{)
Boost regex (C++ locale): 0.00012207s
xpressive regex: 0.000217529s

Testing: (^[ ]*#(?:[^\\\n]|\\[^\n_[:punct:][:alnum:]]*[\n[:punct:][:word:]])*)|(//[^\n]*|/\*.*?\*/)|\<([+-]?(?:(?:0x[[:xdigit:]]+)|(?:(?:[[:digit:]]*\.)?[[:digit:]]+(?:[eE][+-]?[[:digit:]]+)?))u?(?:(?:int(?:8|16|32|64))|L)?)\>|('(?:[^\\']|\\.)*'|"(?:[^\\"]|\\.)*")|\<(__asm|__cdecl|__declspec|__export|__far16|__fastcall|__fortran|__import|__pascal|__rtti|__stdcall|_asm|_cdecl|__except|_export|_far16|_fastcall|__finally|_fortran|_import|_pascal|_stdcall|__thread|__try|asm|auto|bool|break|case|catch|cdecl|char|class|const|const_cast|continue|default|delete|do|double|dynamic_cast|else|enum|explicit|extern|false|float|for|friend|goto|if|inline|int|long|mutable|namespace|new|operator|pascal|private|protected|public|register|reinterpret_cast|return|short|signed|sizeof|static|static_cast|struct|switch|template|this|throw|true|try|typedef|typeid|typename|union|unsigned|using|virtual|void|volatile|wchar_t|while)\>
Boost regex (C++ locale): 0.00426563s
Exception: mismatched parenthesis
xpressive regex: -1s

Testing: ^[ ]*#[ ]*include[ ]+("[^"]+"|<[^>]+>)
Boost regex (C++ locale): 0.000183105s
xpressive regex: 0.000213623s

Testing: ^[ ]*#[ ]*include[ ]+("boost/[^"]+"|<boost/[^>]+>)
Boost regex (C++ locale): 0.000183105s
xpressive regex: 0.000213623s

Testing: beman|john|dave
Boost regex (C++ locale): 0.000251465s
xpressive regex: 0.0003125s

Testing: <p>.*?</p>
Boost regex (C++ locale): 0.00019458s
xpressive regex: 0.00074707s

Testing: <a[^>]+href=("[^"]*"|[^[:space:]]+)[^>]*>
Boost regex (C++ locale): 0.000716797s
xpressive regex: 0.00167773s

Testing: <h[12345678][^>]*>.*?</h[12345678]>
Boost regex (C++ locale): 0.000202148s
xpressive regex: 0.00103711s

Testing: <img[^>]+src=("[^"]*"|[^[:space:]]+)[^>]*>
Boost regex (C++ locale): 0.000206055s
xpressive regex: 0.000533203s

Testing: <font[^>]+face=("[^"]*"|[^[:space:]]+)[^>]*>.*?</font>
Boost regex (C++ locale): 0.000213623s
xpressive regex: 0.000465332s

--
Eric Niebler
Boost Consulting
www.boost-consulting.com