Thank you very much. The two points below cleared a lot up for me, though I should have realized the construction issue on my own.

The construction and negation of sets from characters does raise one other inconvenient point, which is perhaps misleading in the documentation. If I am understanding correctly, all of the examples that refer to matching "tags", whether parentheses or HTML, either:

a) are limited to single-character "tags", or
b) will not work as desired in the case of nested tags.

Am I correct about this? And if so, can you suggest any purely regex magic to deal with multi-character nested tags?

Thanks,
Zak
I see how my docs led you astray. Although they match the same strings, there actually is a difference between (set='a','b','c') and (as_xpr('a')|'b'|'c'). The first is a set and can have its complement taken with operator~. The second is just a bunch of regexes in alternation, and cannot have its complement taken.
Consider this a doc bug, which I'll fix.
Additionally, combining the original lines into one construction, like so:
sregex parentheses = '(' >> *(keep(+~(boost::xpressive::set='(',')'))|by_ref(parentheses)) >> ')';
compiles but running the code fails with:
<snip>
This is an invalid use. It's no different than:
int i = i + 1; // oops!
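You may be surprised to find that this compiles, but it invokes undefined behavior. It's a sad fact of C++ that a name is in scope inside its own initializer, before the object is initialized, leading to these sorts of nasty situations. Here, by_ref(parentheses) refers to a regex that has not been constructed yet.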
If you would like to put this all in one line, use xpressive::self, as:
sregex parentheses = '(' >> *(keep(+~(set='(',')'))|self) >> ')';