
Thanks for the feedback. Answers inline...

Rob Stewart wrote:
From: "Eric Niebler" <eric@boost-consulting.com>
I was reading through a portion of the docs and a few issues came to mind.
This one applies to Boost.RegEx, too, but I'll ask you: Why have both regex_match() and regex_search() when the latter can behave like the former by adding two anchors?
This is true. I'm following the lead of the regex std proposal here, but I've never felt comfortable with regex_match, to be honest. A common noobie mistake is to use regex_match instead of regex_search. Perl, for instance, doesn't distinguish between "search" and "match" operations, and "search" is the default. What makes it worse is that in Perl circles, the semantic equivalent of regex_search is called /matching/, hence the disconnect. Not sure what to do. Perhaps John could comment.
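To make the distinction concrete, here is a minimal sketch using std::regex, which ended up with the same regex_match/regex_search pair. It is not xpressive code, the helper names are made up for illustration, and the anchored variant is only claimed to be equivalent for single-line input.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true only if the WHOLE input is a run of digits.
bool whole_input_is_digits(const std::string& s) {
    static const std::regex digits("[0-9]+");
    return std::regex_match(s, digits);
}

// Hypothetical helper: true if ANY part of the input is a run of digits.
// This is what Perl users tend to expect from a "match".
bool contains_digits(const std::string& s) {
    static const std::regex digits("[0-9]+");
    return std::regex_search(s, digits);
}

// Hypothetical helper: regex_search with both anchors added, which
// behaves like regex_match for input without line breaks.
bool anchored_search(const std::string& s) {
    static const std::regex anchored("^[0-9]+$");
    return std::regex_search(s, anchored);
}
```

With "abc123" as input, whole_input_is_digits and anchored_search both reject it while contains_digits accepts it, which is exactly the surprise a newcomer hits when reaching for regex_match.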
Why does the regex_token_iterator<> ctor use a magic number like -1 to indicate behavior rather than a named value? (I just clicked through to the reference and see that it takes a regex_constants::match_flag_type, but http://boost-sandbox.sourceforge.net/libs/xpressive/doc/html/xpressive/examp... shows passing -1 -- with an explanatory comment -- instead. This leads to confusion.)
Again, I'm just following the standard here, but providing a named constant would be a nice addition. The -1 is an optional 4th parameter, and the match_flag_type is an optional 5th parameter -- so there should be no confusion.
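As a sketch of what a named constant could look like, the following uses std::sregex_token_iterator rather than xpressive. The name tokens_between_matches and the split helper are assumptions for illustration only; neither exists in the standard proposal or in xpressive.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical named constant standing in for the magic -1, which asks
// the token iterator for the text BETWEEN matches instead of the matches.
constexpr int tokens_between_matches = -1;

// Hypothetical helper: split text on every match of delim.
std::vector<std::string> split(const std::string& text, const std::regex& delim) {
    std::vector<std::string> parts;
    std::sregex_token_iterator it(text.begin(), text.end(), delim,
                                  tokens_between_matches);
    const std::sregex_token_iterator end;
    for (; it != end; ++it) {
        parts.push_back(it->str());
    }
    return parts;
}
```

The submatch selector is the fourth constructor argument here; match flags, if any, would follow it as a fifth argument.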
The following items are from the "Perl syntax vs. Static xpressive syntax" table in http://boost-sandbox.sourceforge.net/libs/xpressive/doc/html/xpressive/creat...:
You seem to suggest that the xpressive equivalent of Perl's "a|b" must be spelled "a | b" but as far as I can see, the whitespace is irrelevant, so calling attention to it suggests a difference that doesn't exist.
Naturally whitespace is irrelevant. That's how C++ works. I don't think this should be a source of confusion for people.
"bos" and "eos" are a little odd. First, it seems like "sequence" should be "input." Second, I usually think of SOF/EOF and SOL/EOL pairs rather than BOF/EOF and BOL/EOL. Thus, I'd have gone with "soi" and "eoi" at the least. Unfortunately, in an effort to keep them short, they aren't terribly mnemonic. How about "start" and "end" (or "beg" and "end" if you want to go with just three letters)?
The regex std proposal has match flags match_not_bol and match_not_eol, so I'm reusing this terminology. Boost.Regex also has match_not_bob for "beginning of buffer". This is not proposed for standardization, and I don't think the term "buffer" is appropriate anyway. You like "input" but I prefer "sequence". I dislike "input" because it might suggest to people that input iterators are acceptable to the regex algorithms, whereas a bidirectional sequence is what is required.
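The iterator-category point can be illustrated with a small compile-time check. This is only a sketch: the is_bidirectional trait is a made-up name, not something xpressive or the regex proposal provides.

```cpp
#include <iterator>
#include <list>
#include <type_traits>

// Hypothetical trait: true when It is at least a bidirectional iterator.
template <class It>
struct is_bidirectional
    : std::is_base_of<std::bidirectional_iterator_tag,
                      typename std::iterator_traits<It>::iterator_category> {};

// A std::list can be walked in both directions, so it qualifies.
static_assert(is_bidirectional<std::list<char>::iterator>::value,
              "list iterators are bidirectional");

// A stream iterator is a single-pass input iterator, so it does not.
static_assert(!is_bidirectional<std::istream_iterator<char>>::value,
              "stream iterators are input iterators only");
```

A name built on "input" could be read as promising that the second kind of iterator works, which it would not.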
. appears twice in the table with two different equivalences. It may be that the two are effectively the same, but they aren't grouped and the "Meaning" doesn't point out their equivalence.
Yes, the docs are misleading here. In Perl, . can have two meanings, depending on the /s modifier. xpressive's docs should be more specific.
Considering how much you compare xpressive to Perl's REs, I'm surprised you opted for ~_d instead of _D, for example. I'm not saying that would be better, but the disconnect from Perl didn't seem necessary in this case.
It is necessary. _D is an illegal identifier, reserved to the implementation. All identifiers that begin with an underscore and a capital letter are illegal in user code. Even if that were not the case, ALL CAPS is reserved for macros by convention. That's how I ended up with ~_d.
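For readers wondering how a spelling like ~_d can work at all, here is a minimal sketch of the idea. The char_class type and the sketch namespace are hypothetical and are not xpressive's actual implementation; the only point is that a lowercase placeholder plus operator~ can stand in for Perl's \D without using a reserved name.

```cpp
namespace sketch {

// Hypothetical stand-in for a character-class placeholder.
struct char_class {
    bool negated;

    // True if c is a digit, or a non-digit when the class is negated.
    constexpr bool matches(char c) const {
        return (c >= '0' && c <= '9') != negated;
    }
};

// Complement operator: ~_d plays the role of Perl's \D.
constexpr char_class operator~(char_class c) {
    return char_class{!c.negated};
}

// An underscore followed by a lowercase letter is usable inside a
// namespace; an underscore followed by a capital letter (_D) is not.
constexpr char_class _d{false};

}  // namespace sketch
```

Here sketch::_d matches '7' and (~sketch::_d) matches 'x', mirroring \d and \D.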
For "[abc]," you show two different xpressive equivalents, each in its own row of the table. Why not combine them into a single row? (Same for any other cases like that.)
Sure.
A tool that converts a Perl-style RE to xpressive (static notation certainly, and dynamic if there are any differences) would be quite helpful (for those that know Perl's REs).
Total agreement. It's on my list, but reaching v1.0 is a higher priority for me right now.

Thanks!

--
Eric Niebler
Boost Consulting
www.boost-consulting.com