
regex = "(Sunday|Sun)|(Monday|Mon) etc.(Saturday|Sat)";
By-hand parser: Elapsed for 10000 loops Ms: 110.583
By-hand parser: Elapsed for 100000 loops Ms: 1107.81
re2c generator Elapsed for 10000 loops Ms: 69.4683
re2c generator Elapsed for 100000 loops Ms: 700.546
Boost::xpressive-static-iterator: 10000 loops Ms: 410.492
Boost::xpressive-static-iterator: 100000 loops Ms: 4164.45
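
For anyone wondering what a "by-hand" recognizer for the day-of-week expression might look like, here is a minimal sketch in C. It is not the parser that was timed above; the names k_day_names, match_day_at and count_days are made up for illustration, and it assumes case-sensitive matching with the full name preferred over the three-letter abbreviation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table: the abbreviation of each day is its first 3 letters. */
static const char *const k_day_names[7] = {
    "Sunday", "Monday", "Tuesday", "Wednesday",
    "Thursday", "Friday", "Saturday"
};

/* Try to match a day name at s. Returns the matched length (full name
 * if present, otherwise 3 for the abbreviation) and stores 0..6 in
 * *day_index. Returns 0 when nothing matches. */
static size_t match_day_at(const char *s, int *day_index)
{
    int d;
    for (d = 0; d < 7; ++d) {
        const char *name = k_day_names[d];
        size_t full = strlen(name);
        if (strncmp(s, name, 3) != 0)
            continue;               /* abbreviation does not match */
        *day_index = d;
        return strncmp(s, name, full) == 0 ? full : 3;
    }
    return 0;
}

/* Scan text left to right and count non-overlapping day names. */
static size_t count_days(const char *text)
{
    size_t count = 0;
    const char *p = text;
    while (*p != '\0') {
        int day;
        size_t len = match_day_at(p, &day);
        if (len > 0) {
            ++count;
            p += len;               /* skip past the match */
        } else {
            ++p;
        }
    }
    return count;
}
```

Calling count_days on the benchmark input inside a timed loop would give the same shape of measurement as the numbers above, though the absolute times would of course differ.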
Eric Niebler wrote:
Interesting!
I did some HiResTimer comparisons to strstr and wonder if the results are credible ... the re2c numbers are much closer to strstr than I expected. (Also, these reflect some "tweaking" since the previous email that recognized DayOfWeek rather than ZipCode, and the numbers below are about 40% faster than previous.)

const char* pzStrToScan_Re2cSearch =
    "12345 at pos=0 and "
    "another zip-code 12345-6789 at pos=36 and "
    "another 98765-4321 at pos=69 and "
    "another at end=113 11223-3445";

const char* pzStrToScan_strstr =
    "12345 at pos=0 and "
    "another zip-code 12345-6789 at pos=36 and "
    "another 12345-4321 at pos=69 and "
    "another at end=113 12345-3445";

WinXp-Sp2 vc7.1 on AMD-3700

10,000 loops thru above to find 40,000 matches
    strstr just looking for 12345: 3.2 milliseconds
    Re2cSearch looking for [0-9]{5}(-[0-9]{4})? : 5.3 ms

100,000 loops thru above to find 400,000 matches (mostly to verify that optimizer isn't distorting the results)
    strstr just looking for 12345: 31.8 milliseconds
    Re2cSearch looking for [0-9]{5}(-[0-9]{4})? : 53.8 ms

The re2c generated code also passes a relatively extensive cppunit-like test.
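
To make the comparison concrete, here is a rough sketch in C of the two kinds of scan being timed: a strstr loop that counts a fixed substring, and a hand-written scanner for [0-9]{5}(-[0-9]{4})?. This is not the re2c-generated code and not the code in the .zip; count_strstr, match_zip_at and find_zips are hypothetical names, and the scanner just takes the leftmost match and resumes after it.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Count non-overlapping occurrences of needle in haystack via strstr. */
static size_t count_strstr(const char *haystack, const char *needle)
{
    size_t count = 0;
    size_t step = strlen(needle);
    const char *p = haystack;
    if (step == 0)
        return 0;                   /* avoid looping forever on "" */
    while ((p = strstr(p, needle)) != NULL) {
        ++count;
        p += step;
    }
    return count;
}

/* Length of a match of [0-9]{5}(-[0-9]{4})? starting at s, or 0.
 * Characters are tested in order, so the terminating NUL stops the
 * scan before anything past the end of the string is read. */
static size_t match_zip_at(const char *s)
{
    size_t i;
    for (i = 0; i < 5; ++i)
        if (!isdigit((unsigned char)s[i]))
            return 0;
    if (s[5] != '-')
        return 5;
    for (i = 6; i < 10; ++i)
        if (!isdigit((unsigned char)s[i]))
            return 5;               /* optional group failed: 5 digits only */
    return 10;
}

/* Scan text left to right, storing the offset of each match in
 * positions (up to max_positions). Returns the total match count. */
static size_t find_zips(const char *text, size_t *positions,
                        size_t max_positions)
{
    size_t count = 0;
    const char *p = text;
    assert(text != NULL);
    while (*p != '\0') {
        size_t len = match_zip_at(p);
        if (len > 0) {
            if (count < max_positions)
                positions[count] = (size_t)(p - text);
            ++count;
            p += len;
        } else {
            ++p;
        }
    }
    return count;
}
```

On the two test strings above, both functions find four matches per pass, which is where the 40,000 matches for 10,000 loops comes from; find_zips reports them at offsets 0, 36, 69 and 113, the positions named in the strings themselves.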
Also, do you think you could send around the code that re2c is generating for this expression?
Here is a link to a .zip with the vc6 and vc7.1 projects (vc8 to follow):

http://cleanspeech.sf.net/misc/re2c_ZipCodeRe_060419.zip

Feedback appreciated, especially if I've messed up and the numbers are flawed (not unlikely).

(Also, there is some "hold your nose" code in this C prototype. The intent is to eventually be able to use the relatively simple ZipCodeRe as a "getting up to speed" example for re2c newbies (like myself), and perhaps as a template/clone for other 'recognizers'.)