
My reply to Darren seems to have been eaten by the GMane monster. Resending... Eric Niebler wrote:
Answers inline...
Darren Cook wrote:
Do you think the library should be accepted as a Boost library? Yes, but conditional on having benchmarks showing a worthwhile speed improvement over boost.regex. (Or alternatively over spirit.) Without that there is no strong reason to have it in Boost along with both Spirit and boost.regex.
There are results of performance benchmarks of static xpressive vs. dynamic xpressive vs. Boost.Regex in the appendix of xpressive's documentation. You must have missed them. See:
http://boost-sandbox.sf.net/libs/xpressive/doc/html/xpressive/perf.html
In short, xpressive comes out consistently ahead of Boost.Regex on short matches, and roughly on par for longer matches (with wide variation). Results are shown for both gcc 3.4 and vc7.1. The xpressive download includes the code for the perf test, so you can run it yourself, if you like.
* user_s_guide.html As I read I assumed "sregex" meant static (compile-time) regex. I then thought compile() must be very clever and wondered why bother with the alternative ">>" syntax. So I think you need to make it clearer on this page that sregex means std::string regex, and that compile() is for a run-time regex, and the ">>" syntax is for a compile-time regex.
Agreed.
* creating_a_regex_object.html 1. Either the meaning of Perl's /s modifier needs to be defined clearly, or the difference between "_" and "~_n" needs to be shown with an example (incidentally none of your examples at examples.html match strings with carriage-returns).
Agreed. FYI, "_" matches any one character. ~_n matches any character that is not '\n'. I also need to describe _ln, which matches a logical newline (e.g., "\n" or "\r" or "\r\n" or other line separators), and ~_ln, which matches any one character that is not a line separator. This all needs to be documented better.
2. I see I can use icase("Abc") but is there a way to say the whole regex should be case-insensitive? I.e. the equivalent of: "/match something/i"
You can just wrap the whole regex in icase(). I need to show an example of that.
* grammars_and_nested_matches.html 1. In the example that starts "sregex parentheses; parens = '('", should "parens" actually be "parentheses"?
Yes. My bad.
2. In Filtering Nested Results, I wasn't clear what the purpose was. Is it to show all the name matches before all the id matches? If so, choosing a less regular example string would help, e.g. with more names than ids, names following names some of the time, etc.
I'm not at all sure of the utility of the nested results filter, and I may just cut it. After matching a regex that contains nested regexes, the match_results object contains nested results. Figuring out which results correspond to which regex can be difficult. The filter lets you see only those results corresponding to a particular nested regex. But I've yet to need it in practice. *shrug*
3. "See the definition of output_nested_results in the Examples section." I think that function should be moved to grammars_and_nested_matches.html; it seemed out of place in examples.html.
You're right, it doesn't belong in Examples. But I didn't want to clutter the user doc with what is really an implementation detail. I'll think about it.
* Other 1. I'd like to see some fuller examples, that show the I/O part as well. E.g. a full program that takes a list of email addresses on stdin, one per line, and spits out a list of the illegal ones.
Haha! Have you /seen/ the regex that matches email addresses? It's 5 pages long. But I get the idea -- examples are important. I'll see what I can come up with.
2. Benchmarking. I wanted to see the relative speed of compile-time vs. run-time vs. boost::regex (and ideally vs. PCRE or a scripting language) on some realistic application.
It's in there.
-- Eric Niebler Boost Consulting www.boost-consulting.com