Re: [Boost-users] How efficient is the boost::regex library?

28 Oct 2011

      Adding to what John has said, if I may I'll offer some anecdotal advice.

 Efficient use of regular expressions with /any/ implementation of a
regex library has a lot more to do with writing intelligent regular
expressions, and knowing when it's appropriate to use them, than with
the engine. Certainly the engine helps, but if you write a stupid
regular expression, or attempt to use an RE for a task REs aren't
well-suited for, you won't get great results.

 You can find corner cases where Perl-compatible REs perform poorly,
but those engines offer features that let you tweak the offending
expression to get better performance.
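
As a small illustration of that kind of tweak, here is a minimal sketch. It assumes std::regex (whose interface closely mirrors boost::regex) so that it stays self-contained; the helper first_capture and the two patterns are made up for the example. Replacing a greedy ".*" with a negated character class keeps the engine from running to the end of the line and backtracking, and it also captures what was probably intended.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns capture group 1 of the first match of `re`
// in `text`, or an empty string if there is no match.
std::string first_capture(const std::string& text, const std::regex& re) {
    std::smatch m;
    return std::regex_search(text, m, re) ? m[1].str() : std::string();
}

// Greedy version: ".*" consumes the rest of the line, then backtracks
// until it finds the LAST double quote.
const std::regex greedy_quoted("\"(.*)\"");

// Tighter version: "[^\"]*" cannot run past the next double quote,
// so there is nothing to backtrack over.
const std::regex tight_quoted("\"([^\"]*)\"");
```

On a line such as name="alpha" kind="beta", the tight pattern captures alpha, while the greedy one captures everything up to the final quote. Perl-syntax engines offer further tools for the same job (atomic groups, for instance), which std::regex's default grammar does not support.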

 I've used regular expressions from Perl, Ruby, Python, C#, PCRE,
boost::regex, grep, awk, sed, MySQL, Postgres, and probably a few
others.  Most commonly I prefer grep for one-liners, and
Perl-compatible engines in code; I don't run into the corner cases in
my work for the simple fact that they're corner cases -- that is to
say, you don't run into them unless you're looking for them, or you do
something otherwise out-of-the-ordinary.

-Brian

On Fri, Oct 28, 2011 at 2:50 AM, John Maddock <boost.regex@virgin.net> wrote:
...
...
Thanks John. I would be interested in seeing comparisons of
boost::regex with other regex libraries.
It's wildly out of date, but how about:
http://www.boost.org/doc/libs/1_47_0/libs/regex/doc/vc71-performance.html
...
Yesterday I found a fuzzy logic string regex library. Does the
boost::regex library support this?
No sorry,
HTH, John.
On Fri, Oct 28, 2011 at 5:10 AM, John Maddock <boost.regex@virgin.net>
wrote:
...
...
I am looking for the most efficient open-source C++ regex library.
Reading this article: http://swtch.com/~rsc/regexp/regexp1.html - It
seems that GNU awk is the best overall:
http://pdos.csail.mit.edu/~rsc/regexp-img/grep1p.png
This is all true, but also completely irrelevant. DFAs have good
worst-case behaviour, but can be many times slower for common cases. It's
also impossible to implement a DFA matcher that offers the full range of Perl
regular expression features (if you think it can be done, congratulations,
you've just proved that P==NP).
It's also possible to protect the regex engine against runaway "bad"
expressions and bail out in those cases (this is what Boost.Regex does: it
throws an exception if the complexity of obtaining a match grows too
fast).
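
A calling program can treat that bail-out as a third outcome alongside "matched" and "did not match". The following is only a sketch under stated assumptions: it uses std::regex so that it compiles without Boost, and the names MatchOutcome and guarded_match are hypothetical. To my knowledge the Boost.Regex documentation lists std::runtime_error as what regex_match throws when complexity grows too fast, and std::regex_error derives from std::runtime_error as well, so the same catch clause should cover both libraries.

```cpp
#include <regex>
#include <stdexcept>
#include <string>

// Hypothetical three-way result for a guarded match attempt.
enum class MatchOutcome { Matched, NotMatched, GaveUp };

// Hypothetical wrapper: runs a full match and reports GaveUp if the
// engine aborts with an exception instead of returning an answer.
MatchOutcome guarded_match(const std::string& text, const std::regex& re) {
    try {
        return std::regex_match(text, re) ? MatchOutcome::Matched
                                          : MatchOutcome::NotMatched;
    } catch (const std::runtime_error&) {
        // Assumption: the engine signals "too complex" via an exception
        // derived from std::runtime_error.
        return MatchOutcome::GaveUp;
    }
}
```

Whether a given standard library implementation actually throws on a runaway expression varies, so the GaveUp branch is best read as the Boost.Regex behaviour described above rather than a guarantee for every engine.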
...
How does the boost::regex library compare?
Would you recommend boost::regex as the most efficient one, or would
you suggest another?
There's no such thing as best - it all depends on the data being searched
and the particular regular expression. In addition, since most
Perl-compatible libraries use much the same algorithm, they're all broadly
similar, albeit with different quirks.
HTH, John.
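
Since the answer depends on the data and the expression, the practical route is to time candidate expressions on your own input. Below is a rough sketch of such a measurement, assuming std::regex for self-containment; the Timing struct and time_search function are invented for this example, and a serious benchmark would need warm-up runs and many more repetitions.

```cpp
#include <chrono>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical result record: matches found in one pass over the data,
// and total wall-clock time for all repetitions.
struct Timing {
    std::size_t matches;
    double milliseconds;
};

// Hypothetical helper: counts all non-overlapping matches of `re` in
// `data`, repeating the scan `repetitions` times and timing the whole loop.
Timing time_search(const std::string& data, const std::regex& re, int repetitions) {
    std::size_t matches = 0;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repetitions; ++i) {
        matches = static_cast<std::size_t>(std::distance(
            std::sregex_iterator(data.begin(), data.end(), re),
            std::sregex_iterator()));
    }
    const auto stop = std::chrono::steady_clock::now();
    return {matches,
            std::chrono::duration<double, std::milli>(stop - start).count()};
}
```

Calling time_search with the same data and two different expressions (or, with the types swapped, two different libraries) gives a comparison that reflects your actual workload rather than a published benchmark.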
_______________________________________________
Boost-users mailing list
Boost-users@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users

Brian Vandenberg