xpressive: set vs standard alternation compile problems
Hi,
I'm a new user of xpressive and I'd like to first give a big thank
you for this wonderful tool.
I'm having problems with the syntax for static regexes. The cheat
sheet in the documentation says that (set= 'a','b','c') and
(as_xpr('a') | 'b' | 'c') are equivalent, yet the example
sregex parentheses;
parentheses = '(' >> *(keep(+~(boost::xpressive::set='(',')'))|by_ref(parentheses)) >> ')';
compiles while
sregex parentheses;
parentheses = '(' >> *(keep(+~(as_xpr('(')|')'))|by_ref(parentheses)) >> ')';
fails to do so (code and abbreviated errors attached). Additionally,
combining the original lines into one construction, like so:
sregex parentheses = '(' >> *(keep(+~(boost::xpressive::set='(',')'))|by_ref(parentheses)) >> ')';
compiles but running the code fails with:
regnestest: /usr/include/boost/xpressive/detail/utility/tracking_ptr.hpp:457:
const boost::shared_ptr<T>& boost::xpressive::detail::tracking_ptr<Type>::get_(bool) const [with Type = boost::xpressive::detail::regex_impl<__gnu_cxx::__normal_iterator
Zachary wrote:
Hi, I'm a new user of xpressive and I'd like to first give a big thank you for this wonderful tool.
Thanks!
I'm having problems with the syntax for static regexes. The cheat sheet in the documentation says that (set= 'a','b','c') and (as_xpr('a') | 'b' | 'c') are equivalent, yet the example
sregex parentheses;
parentheses = '(' >> *(keep(+~(boost::xpressive::set='(',')'))|by_ref(parentheses)) >> ')';
compiles while
sregex parentheses;
parentheses = '(' >> *(keep(+~(as_xpr('(')|')'))|by_ref(parentheses)) >> ')';
fails to do so (code and abbreviated errors attached).
I see how my docs led you astray. Although they match the same strings, there actually is a difference between (set='a','b','c') and (as_xpr('a')|'b'|'c'). The first is a set and can have its complement taken (e.g., operator~). The second is just a bunch of regexes in alternate, and cannot have its complement taken. Consider this a doc bug, which I'll fix.
Additionally, combining the original lines into one construction, like so:
sregex parentheses = '(' >> *(keep(+~(boost::xpressive::set='(',')'))|by_ref(parentheses)) >> ')';
compiles but running the code fails with: <snip>
This is an invalid use. It's no different than:
int i = i + 1; // oops!
You may be surprised to find this compiles, but invokes undefined behavior. It's a sad fact in C++ that objects are in scope before they are initialized, leading to these sorts of nasty situations.
If you would like to put this all in one line, use xpressive::self, as:
sregex parentheses = '(' >> *(keep(+~(set='(',')'))|self) >> ')';
HTH,
--
Eric Niebler
Boost Consulting
www.boost-consulting.com
Thank you very much, the two points below cleared a lot up for me, though I should have realized the construction issue on my own.
The construction and negation of sets from characters does raise one other inconvenient point, which is maybe misleading in the documentation: all of the examples which refer to matching "tags", whether parentheses or HTML, have the following limitations, if I am understanding correctly. They either:
a) are limited to single-character "tags", or
b) will not work as desired in the case of nested tags.
Am I correct about this? And if so, can you suggest any purely regex magic to deal with multi-character nested tags?
Thanks,
Zak
Eric Niebler wrote:
I see how my docs led you astray. Although they match the same strings, there actually is a difference between (set='a','b','c') and (as_xpr('a')|'b'|'c'). The first is a set and can have its complement taken (e.g., operator~). The second is just a bunch of regexes in alternate, and cannot have its complement taken.
Consider this a doc bug, which I'll fix.
Additionally, combining the original lines into one construction, like so:
sregex parentheses = '(' >> *(keep(+~(boost::xpressive::set='(',')'))|by_ref(parentheses)) >> ')';
compiles but running the code fails with:
<snip>
This is an invalid use. It's no different than:
int i = i + 1; // oops!
You may be surprised to find this compiles, but invokes undefined behavior. It's a sad fact in C++ that objects are in scope before they are initialized, leading to these sorts of nasty situations.
If you would like to put this all in one line, use xpressive::self, as:
sregex parentheses = '(' >> *(keep(+~(set='(',')'))|self) >> ')';
participants (2)
- Eric Niebler
- Zachary