[regex_spirit_xpressive] Are timings of search algo's available vs "by-hand"?

<alert comment="boost newbie">
I've been impressed by the functionality of the regex-related Boost libraries I've looked at so far. However, before trekking to far-distant "grok-land" off in the mists, I wanted to get some idea of whether the performance trade-offs are negligible, minor, or major.

I've seen comparisons between regex, spirit, and xpressive, but they were done by the library developers several years ago and are probably obsolete. I'm wondering how these libraries would compare to a "hand-tuned" state-machine routine, and to a state machine generated automatically by an FSM utility (these tend to be bloated but can be fast).

My interest is specialized: finding which pattern in a "group" of patterns was detected, and its offset within testStr. To illustrate, the regex would be something like this one, which detects the full or abbreviated day of the week:

    ((Sunday|Sun)|(Monday|Mon)|(Tuesday|Tue)|(Wednesday|Wed)|(Thursday|Thu)|(Friday|Fri)|(Saturday|Sat))

The testStr is something like:

    std::string testStr =
        "Alternate days of the week are Tue and Thursday and Sat and Monday. "
        "And then Monday and Wed and Friday and Sun. "
        "Near misses are WeD TuE ThU SuN SaT MoN FrI ";

The real application reads a batch of 2 MB files and generates SGML-like output with embedded tags (e.g. Tue becomes <dow=2>Tue</dow> and Thursday becomes <dow=4>Thursday</dow>).

This seems like the kind of task for which regex libraries are appropriate: it is beyond strstr, but it wouldn't be excessively difficult to accomplish "by hand". The unknown is whether using a regex library carries a performance trade-off, and whether that trade-off is positive, negative, minor, or major.

I've started some preliminary timings with a vc7.1 release /O2 build, using a HiResTimer based on QueryPerformanceCounter and running in ABOVE_NORMAL_PRIORITY_CLASS.

Before proceeding much further, this newbie thought it would be good to check whether people with real Boost experience have already done this kind of benchmarking. A search for "benchmark" and "timings" in boost-user and boost-devel didn't turn up much.

Preview: so far the preliminary results look VERY GOOD, but "consider the source".
</alert>
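
For readers who want to try the comparison themselves, here is a minimal sketch of the two approaches described above. It is not the code that was timed. The names DowHit, findByHand, findByRegex and tagDays are hypothetical, and std::regex is used only as a stand-in so the sketch needs no external library; it is not one of the libraries asked about. Like the regex in the post, neither function checks word boundaries, so "Sunny" would yield a "Sun" hit.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// One detected day name: which day (0 = Sunday ... 6 = Saturday, so that
// Tue is 2 and Thursday is 4, as in the post), plus offset and length.
struct DowHit {
    int day;
    std::size_t offset;
    std::size_t length;
};

// "By hand": at each offset try every full name, then its 3-letter
// abbreviation. Case-sensitive, so "TuE" is a near miss.
std::vector<DowHit> findByHand(const std::string& text) {
    static const std::string kFull[7] = {
        "Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"};
    std::vector<DowHit> hits;
    std::size_t pos = 0;
    while (pos < text.size()) {
        std::size_t len = 0;
        int day = 0;
        for (; day < 7; ++day) {
            const std::string& full = kFull[day];
            if (text.compare(pos, full.size(), full) == 0) {
                len = full.size();
                break;
            }
            if (text.compare(pos, 3, full, 0, 3) == 0) {
                len = 3;
                break;
            }
        }
        if (len != 0) {
            hits.push_back({day, pos, len});
            pos += len;   // continue after the match
        } else {
            ++pos;        // no day name starts here
        }
    }
    return hits;
}

// Regex version (std::regex as a stand-in): one capture group per day, so
// the index of the group that matched identifies the day.
std::vector<DowHit> findByRegex(const std::string& text) {
    static const std::regex re(
        "(Sunday|Sun)|(Monday|Mon)|(Tuesday|Tue)|(Wednesday|Wed)|"
        "(Thursday|Thu)|(Friday|Fri)|(Saturday|Sat)");
    std::vector<DowHit> hits;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(text.begin(), text.end(), re); it != end; ++it) {
        const std::smatch& m = *it;
        for (int day = 0; day < 7; ++day) {
            if (m[day + 1].matched) {
                hits.push_back({day,
                                static_cast<std::size_t>(m.position(0)),
                                static_cast<std::size_t>(m.length(0))});
                break;
            }
        }
    }
    return hits;
}

// Wrap each hit in SGML-like tags, e.g. <dow=2>Tue</dow>.
// Expects hits sorted by offset and non-overlapping.
std::string tagDays(const std::string& text, const std::vector<DowHit>& hits) {
    std::string out;
    std::size_t last = 0;
    for (const DowHit& h : hits) {
        assert(h.offset >= last && h.offset + h.length <= text.size());
        out.append(text, last, h.offset - last);
        out += "<dow=" + std::to_string(h.day) + ">";
        out.append(text, h.offset, h.length);
        out += "</dow>";
        last = h.offset + h.length;
    }
    out.append(text, last, std::string::npos);
    return out;
}
```

Both functions return the same kind of list, so a timing harness can call each one in a loop over the same input and compare elapsed time. Passing either result to tagDays gives the tagged output described above.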
participants (1)
-
Lynn Allan