Adding to what John has said, if I may, I'll offer some anecdotal advice.

Efficient use of regular expressions with /any/ implementation of a regex library has a lot more to do with writing an intelligent regular expression, and knowing when it's appropriate to use one, than with the engine. Certainly the engine helps, but if you write a stupid regular expression, or attempt to use an RE for a task REs aren't well-suited for, you won't get great results. You can find corner cases where Perl-compatible REs perform poorly, but those engines offer features that let you tweak the regular expression to get better performance.

I've used regular expressions from Perl, Ruby, Python, C#, PCRE, boost::regex, grep, awk, sed, MySQL, Postgres, and probably a few others. Most commonly I prefer grep for one-liners and Perl-compatible engines in code. I don't run into the corner cases in my work for the simple fact that they're corner cases: you don't run into them unless you're looking for them or doing something otherwise out of the ordinary.

-Brian

On Fri, Oct 28, 2011 at 2:50 AM, John Maddock <boost.regex@virgin.net> wrote:
Thanks John. I would be interested in seeing comparisons of boost::regex with other regex libraries.
It's wildly out of date, but how about: http://www.boost.org/doc/libs/1_47_0/libs/regex/doc/vc71-performance.html
Yesterday I found a fuzzy logic string regex library. Does the boost::regex library support this?
No, sorry.
HTH, John.
On Fri, Oct 28, 2011 at 5:10 AM, John Maddock <boost.regex@virgin.net> wrote:
I am looking for the most efficient open-source C++ regex library.
Reading this article: http://swtch.com/~rsc/regexp/regexp1.html - It seems that GNU awk is the best overall: http://pdos.csail.mit.edu/~rsc/regexp-img/grep1p.png
This is all true, but also completely irrelevant. DFAs have good worst-case behaviour, but can be many times slower for common cases. It's also impossible to implement a DFA matcher that offers the full range of Perl regular expression features (if you think it can be done, congratulations, you've just proved that P==NP).
It's also possible to protect the regex engine against runaway "bad" expressions and bail out in those cases (this is what Boost.Regex does: it throws an exception if the complexity of obtaining a match grows too fast).
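To make the bail-out idea concrete, here is a minimal sketch of a backtracking matcher that counts its own steps and throws once a budget is exceeded. It is a toy written for illustration, not how Boost.Regex is implemented: the class name BudgetedMatcher, the step budget, and the tiny syntax (literals, '.', and a postfix '*') are all assumptions made up for this example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical toy matcher, for illustration only. It supports literal
// characters, '.' (any character) and a postfix '*' (zero or more).
// It is NOT the Boost.Regex algorithm.
class BudgetedMatcher {
public:
    explicit BudgetedMatcher(std::size_t max_steps) : max_steps_(max_steps) {}

    // Returns true if the whole of `text` matches `pattern`.
    // Throws std::runtime_error once the step budget is used up.
    bool full_match(const std::string& pattern, const std::string& text) {
        steps_ = 0;
        return match_here(pattern, 0, text, 0);
    }

    // Number of backtracking steps taken by the last call.
    std::size_t steps() const { return steps_; }

private:
    bool match_here(const std::string& p, std::size_t pi,
                    const std::string& t, std::size_t ti) {
        // Every recursive call counts as one step against the budget.
        if (++steps_ > max_steps_)
            throw std::runtime_error("regex step budget exceeded");

        if (pi == p.size()) return ti == t.size();

        const bool starred = pi + 1 < p.size() && p[pi + 1] == '*';
        const bool first_ok =
            ti < t.size() && (p[pi] == '.' || p[pi] == t[ti]);

        if (starred) {
            // Try zero occurrences first, then consume one character
            // and retry the same starred item.
            return match_here(p, pi + 2, t, ti) ||
                   (first_ok && match_here(p, pi, t, ti + 1));
        }
        return first_ok && match_here(p, pi + 1, t, ti + 1);
    }

    std::size_t max_steps_;
    std::size_t steps_ = 0;
};
```

With a budget of 100000 steps, BudgetedMatcher(100000).full_match("a*b", "aaab") returns true after a handful of steps, while matching "a*a*a*a*a*a*b" against thirty 'a' characters exhausts the budget and throws. With the real library the caller-side pattern is the same: wrap the match call in a try/catch, since the Boost.Regex documentation describes the matching functions as throwing std::runtime_error when the complexity limit is hit.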
How does the boost::regex library compare?
Would you recommend boost::regex as the most efficient one, or would you suggest another?
There's no such thing as best: it all depends on the data being searched and the particular regular expression. In addition, since most Perl-compatible libraries use much the same algorithm, they're all broadly similar, albeit with different quirks.
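Since the answer depends on the data and the expression, one practical approach is to time candidate expressions on your own input. The following is a rough sketch of such a harness. It uses std::regex purely as a stand-in so the example has no external dependencies; the names SearchTiming and time_search are invented for this example, and the same loop should carry over to boost::regex and boost::sregex_iterator with minimal changes.

```cpp
#include <chrono>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Result of one timed search (hypothetical helper type).
struct SearchTiming {
    std::size_t matches;                // non-overlapping matches found
    std::chrono::microseconds elapsed;  // wall-clock time spent searching
};

// Counts all matches of `pattern` in `text` and reports how long the
// search took. Compilation of the expression is deliberately excluded
// from the timed region.
SearchTiming time_search(const std::string& pattern, const std::string& text) {
    const std::regex re(pattern);

    const auto start = std::chrono::steady_clock::now();
    const auto first = std::sregex_iterator(text.begin(), text.end(), re);
    const auto count = static_cast<std::size_t>(
        std::distance(first, std::sregex_iterator()));
    const auto stop = std::chrono::steady_clock::now();

    return {count,
            std::chrono::duration_cast<std::chrono::microseconds>(stop - start)};
}
```

Calling time_search with two differently written expressions over the same representative text gives a like-for-like comparison; a single run is noisy, so repeat the measurement several times before drawing conclusions.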
HTH, John.
_______________________________________________
Boost-users mailing list
Boost-users@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users