Dilts, Daniel D. wrote:
I'm trying to tokenize lines of a file using the included static regex. I only care about the tokens indicated by s* = ... When I use the sregex_token_iterator to parse the lines, I only get the last match for s2 and s3.
How should I change things so that I can get every match for s2 and s3 rather than just the last match?
    sregex whitespace_regex = *_s;
    sregex line_regex =
        whitespace_regex >> (s1 = +_d) >> whitespace_regex >>
        (
            +(
                '\"' >> (s2 = *~as_xpr('\"')) >> '\"' >> whitespace_regex >>
                ':' >> whitespace_regex >>
                '\"' >> (s3 = *~as_xpr('\"')) >> '\"' >> whitespace_regex
            )
          | +(
                '\"' >> (s2 = *~as_xpr('\"')) >> '\"' >> whitespace_regex
            )
        );
Hi, Daniel. If you are using the latest version of xpressive from the Boost File Vault, I would solve the problem this way:

    #include <string>
    #include <vector>
    #include <iostream>
    #include <boost/foreach.hpp>
    #include <boost/xpressive/xpressive.hpp>
    #include <boost/xpressive/regex_actions.hpp>

    using namespace boost;
    using namespace xpressive;

    int main()
    {
        // Collects every s2/s3 sub-match as the regex matches
        local<std::vector<ssub_match> > strings;

        sregex line_regex =
            skip(_s) // skip whitespace
            (
                (s1 = +_d) >>
                (
                    // one or more "key" : "value" pairs ...
                    +(
                        '\"' >> (s2 = *~as_xpr('\"'))[push_back(strings, s2)] >> '\"' >>
                        ':' >>
                        '\"' >> (s3 = *~as_xpr('\"'))[push_back(strings, s3)] >> '\"'
                    )
                    // ... or one or more bare quoted strings
                  | +(
                        '\"' >> (s2 = *~as_xpr('\"'))[push_back(strings, s2)] >> '\"'
                    )
                )
            )
            ;

        std::string input(" 42 \"The answer to\" : \"Life\" \"The Universe\" : \"And Everything!\" ");

        if(regex_match(input, line_regex))
        {
            BOOST_FOREACH(ssub_match s, strings.get())
            {
                std::cout << s << std::endl;
            }
        }
    }

The above uses semantic actions (the parts in []) to push each sub-match into a vector for reference later. It also uses skip(_s) to skip whitespace.

If you are using xpressive 1.0, which is part of Boost 1.34.1, it is a little trickier: there is no skip() and there are no semantic actions. In that case, you can define a nested regex, sregex quoted_string = *~as_xpr('\"');, and use it in your line_regex. Every quoted string that matches then causes a nested result to be added to your match_results. See below:

    sregex quoted_string = *~as_xpr('\"');

    sregex line_regex =
        keep(*_s) >> (s1 = +_d) >> keep(*_s) >>
        (
            +(
                '\"' >> quoted_string >> '\"' >> keep(*_s) >>
                ':' >> keep(*_s) >>
                '\"' >> quoted_string >> '\"' >> keep(*_s)
            )
          | +(
                '\"' >> quoted_string >> '\"' >> keep(*_s)
            )
        );

    std::string input(" 42 \"The answer to\" : \"Life\" \"The Universe\" : \"And Everything!\" ");

    smatch what;
    if(regex_match(input, what, line_regex))
    {
        // Each match of the nested quoted_string regex shows up as a nested result
        BOOST_FOREACH(smatch const &str, what.nested_results())
        {
            std::cout << str[0] << std::endl;
        }
    }

This is less efficient, but gets the job done.
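For readers without xpressive at hand, the underlying issue (a capture group inside a repetition is overwritten on every iteration, so only the last value survives) can be reproduced with std::regex. Note that std::regex is a different library, swapped in here purely for illustration, and the helper names last_capture and all_quoted are hypothetical. This is a minimal sketch, assuming a C++11 compiler, and it is not how xpressive works internally:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: matches a whole line of quoted strings with ONE regex
// and returns what capture group 1 holds afterwards.
std::string last_capture(const std::string& input) {
    // The group sits inside +, so each repetition overwrites the capture.
    static const std::regex repeated(R"re((?:\s*"([^"]*)"\s*)+)re");
    std::smatch what;
    if (!std::regex_match(input, what, repeated)) {
        return std::string();
    }
    return what[1].str();
}

// Hypothetical helper: collects EVERY quoted string by matching one item at a
// time and storing each result, similar in spirit to pushing sub-matches
// into a vector as they are found.
std::vector<std::string> all_quoted(const std::string& input) {
    static const std::regex quoted(R"re("([^"]*)")re");
    std::vector<std::string> out;
    for (std::sregex_iterator it(input.begin(), input.end(), quoted), end;
         it != end; ++it) {
        out.push_back((*it)[1].str());
    }
    return out;
}
```

Called on the sample input used above, all_quoted should return the four quoted strings in order, whereas last_capture on a line of several quoted strings returns only the final one.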
HTH,

--
Eric Niebler
Boost Consulting
www.boost-consulting.com