xpressive: set vs standard alternation compile problems

Hi, I'm a new user of xpressive and I'd like to first give a big thank you for this wonderful tool.
I'm having problems with the syntax for static regexes. The cheat sheet in the documentation says that (set='a','b','c') and as_xpr('a') | 'b' | 'c' are equivalent, yet the example
sregex parentheses;
parentheses = '(' >> *(keep(+~(boost::xpressive::set='(',')'))|by_ref(parentheses)) >> ')';
compiles while
sregex parentheses;
parentheses = '(' >> *(keep(+~(as_xpr('(')|')'))|by_ref(parentheses)) >> ')';
fails to do so (code and abbreviated errors attached).
Additionally, combining the original lines into one construction, like so:
sregex parentheses = '(' >> *(keep(+~(boost::xpressive::set='(',')'))|by_ref(parentheses)) >> ')';
compiles but running the code fails with:
regnestest: /usr/include/boost/xpressive/detail/utility/tracking_ptr.hpp:457: const boost::shared_ptr<T>& boost::xpressive::detail::tracking_ptr<Type>::get_(bool) const [with Type = boost::xpressive::detail::regex_impl<__gnu_cxx::__normal_iterator<const char*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > > >]: Assertion `!this->has_deps_()' failed.
and using as_xpr for the first term:
sregex parentheses = as_xpr('(') >> *(keep(+~(boost::xpressive::set='(',')'))|by_ref(parentheses)) >> ')';
also compiles but causes a seg fault when run!
All this is using g++ 4.2.3 and boost 1.34 as packaged with Debian testing. These seem like library errors to me, but like I said, I'm new to xpressive, and I'd be even happier to have my understanding improved.
Zachary
g++ -MMD regnestest.cpp -o regnestest
/usr/include/boost/xpressive/detail/static/productions/complement_compiler.hpp: In instantiation of ‘boost::xpressive::detail::complement<boost::proto::binary_op<boost::proto::unary_op<boost::xpressive::detail::literal_placeholder<char, false>, boost::proto::noop_tag>, boost::proto::unary_op<char, boost::proto::noop_tag>, boost::proto::bitor_tag>, boost::xpressive::detail::xpression_visitor<__gnu_cxx::__normal_iterator<const char*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, mpl_::bool_<false>, boost::xpressive::cpp_regex_traits<char> > >’:
/usr/include/boost/xpressive/detail/static/productions/complement_compiler.hpp:227: instantiated from ‘boost::xpressive::detail::complement_transform::apply<boost::proto::unary_op<boost::proto::binary_op<boost::proto::unary_op<boost::xpressive::detail::literal_placeholder<char, false>, boost::proto::noop_tag>, boost::proto::unary_op<char, boost::proto::noop_tag>, boost::proto::bitor_tag>, boost::proto::complement_tag>, boost::xpressive::detail::static_xpression<boost::xpressive::detail::true_matcher, boost::xpressive::detail::no_next>, boost::xpressive::detail::xpression_visitor<__gnu_cxx::__normal_iterator<const char*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, mpl_::bool_<false>, boost::xpressive::cpp_regex_traits<char> > >’
/usr/include/boost/xpressive/proto/compiler/transform.hpp:65: instantiated from ‘boost::proto::transform_compiler<boost::xpressive::detail::complement_transform, boost::xpressive::detail::seq_tag, void>::apply<boost::proto::unary_op<boost::proto::binary_op<boost::proto::unary_op<boost::xpressive::detail::literal_placeholder<char, false>, boost::proto::noop_tag>, boost::proto::unary_op<char, boost::proto::noop_tag>, boost::proto::bitor_tag>, boost::proto::complement_tag>, boost::xpressive::detail::static_xpression<boost::xpressive::detail::true_matcher, boost::xpressive::detail::no_next>, boost::xpressive::detail::xpression_visitor<__gnu_cxx::__normal_iterator<const char*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, mpl_::bool_<false>, boost::xpressive::cpp_regex_traits<char> > >’
/usr/include/boost/xpressive/proto/compiler/transform.hpp:74: instantiated from ‘boost::proto::transform_compiler<boost::proto::arg_transform, boost::xpressive::detail::seq_tag, void>::apply<boost::proto::unary_op<boost::proto::unary_op<boost::proto::binary_op<boost::proto::unary_op<boost::xpressive::detail::literal_placeholder<char, false>, boost::proto::noop_tag>, boost::proto::unary_op<char, boost::proto::noop_tag>, boost::proto::bitor_tag>, boost::proto::complement_tag>, boost::proto::unary_plus_tag>, boost::xpressive::detail::static_xpression<boost::xpressive::detail::true_matcher, boost::xpressive::detail::no_next>, boost::xpressive::detail::xpression_visitor<__gnu_cxx::__normal_iterator<const char*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, mpl_::bool_<false>, boost::xpressive::cpp_regex_traits<char> > >’
/usr/include/boost/xpressive/proto/compiler/branch.hpp:37: instantiated from ‘boost::proto::branch_compiler<boost::xpressive::detail::simple_repeat_branch<true, 1u, 4294967294u>, boost::xpressive::detail::ind_tag>::apply<boost::proto::unary_op<boost::proto::unary_op<boost::proto::binary_op<boost::proto::unary_op<boost::xpressive::detail::literal_placeholder<char, false>, boost::proto::noop_tag>, boost::proto::unary_op<char, boost::proto::noop_tag>, boost::proto::bitor_tag>, boost::proto::complement_tag>, boost::proto::unary_plus_tag>, boost::xpressive::detail::static_xpression<boost::xpressive::detail::true_matcher, boost::xpressive::detail::no_next>, boost::xpressive::detail::xpression_visitor<__gnu_cxx::__normal_iterator<const char*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, mpl_::bool_<false>, boost::xpressive::cpp_regex_traits<char> > >’
/usr/include/boost/xpressive/proto/compiler/conditional.hpp:40: instantiated from ‘boost::proto::conditional_compiler<boost::xpressive::detail::use_simple_repeat_predicate, boost::proto::branch_compiler<boost::xpressive::detail::simple_repeat_branch<true, 1u, 4294967294u>, boost::xpressive::detail::ind_tag>, boost::proto::transform_compiler<boost::proto::compose_transforms<boost::proto::arg_transform, boost::xpressive::detail::repeater_if_transform<true, 1u, 4294967294u> >, boost::xpressive::detail::seq_tag, void> >::apply<boost::proto::unary_op<boost::proto::unary_op<boost::proto::binary_op<boost::proto::unary_op<boost::xpressive::detail::literal_placeholder<char, false>, boost::proto::noop_tag>, boost::proto::unary_op<char, boost::proto::noop_tag>, boost::proto::bitor_tag>, boost::proto::complement_tag>, boost::proto::unary_plus_tag>, boost::xpressive::detail::static_xpression<boost::xpressive::detail::true_matcher, boost::xpressive::detail::no_next>, boost::xpressive::detail::xpression_visitor<__gnu_cxx::__normal_iterator<const char*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, mpl_::bool_<false>, boost::xpressive::cpp_regex_traits<char> > >’
--snipped for brevity--
regnestest.cpp:10: instantiated from here
--subsequent errors omitted for brevity--

Zachary wrote:
Hi, I'm a new user of xpressive and I'd like to first give a big thank you for this wonderful tool.
Thanks!
I'm having problems with the syntax for static regexes. The cheat sheet in the documentation says that (set='a','b','c') and as_xpr('a') | 'b' | 'c' are equivalent, yet the example
sregex parentheses;
parentheses = '(' >> *(keep(+~(boost::xpressive::set='(',')'))|by_ref(parentheses)) >> ')';
compiles while
sregex parentheses;
parentheses = '(' >> *(keep(+~(as_xpr('(')|')'))|by_ref(parentheses)) >> ')';
fails to do so (code and abbreviated errors attached).
I see how my docs led you astray. Although they match the same strings, there actually is a difference between (set='a','b','c') and (as_xpr('a')|'b'|'c'). The first is a set and can have its complement taken (with operator~). The second is just an alternation of regexes, and cannot have its complement taken. Consider this a doc bug, which I'll fix.
Additionally, combining the original lines into one construction, like so:
sregex parentheses = '(' >> *(keep(+~(boost::xpressive::set='(',')'))|by_ref(parentheses)) >> ')';
compiles but running the code fails with: <snip>
This is an invalid use. It's no different than:
int i = i + 1; // oops!
You may be surprised to find this compiles, but invokes undefined behavior. It's a sad fact in C++ that objects are in scope before they are initialized, leading to these sorts of nasty situations.
If you would like to put this all in one line, use xpressive::self, as:
sregex parentheses = '(' >> *(keep(+~(set='(',')'))|self) >> ')';
HTH,
--
Eric Niebler
Boost Consulting
www.boost-consulting.com

Thank you very much; your two points cleared a lot up for me, though I should have realized the construction issue on my own.
The construction and negation of sets from characters does raise one other inconvenient point, which may be misleading in the documentation. All of the examples that match "tags", whether parentheses or HTML, have the following limitations, if I am understanding correctly. They either:
a) are limited to single-character "tags", or
b) will not work as desired in the case of nested tags.
Am I correct about this? And if so, can you suggest any purely regex magic to deal with multi-character nested tags?
Thanks,
Zak