Dilts, Daniel D. wrote:
I'm trying to tokenize lines of a file using the included static regex. I only care about the tokens indicated by s* = ... When I use the sregex_token_iterator to parse the lines, I only get the last match for s2 and s3.
How should I change things so that I can get every match for s2 and s3 rather than just the last match?
sregex whitespace_regex = *_s;
sregex line_regex =
    whitespace_regex >>
    (s1 = +_d) >> whitespace_regex >>
    (
        +(
            '\"' >> (s2 = *~as_xpr('\"')) >> '\"' >>
            whitespace_regex >> ':' >> whitespace_regex >>
            '\"' >> (s3 = *~as_xpr('\"')) >> '\"' >>
            whitespace_regex
        )
        |
        +(
            '\"' >> (s2 = *~as_xpr('\"')) >> '\"' >>
            whitespace_regex
        )
    );
Hi, Daniel. If you are using the latest version of xpressive from the
Boost File Vault, I would solve the problem this way:
#include <string>
#include <vector>
#include <iostream>
#include <boost/foreach.hpp>
#include <boost/xpressive/xpressive.hpp>
#include <boost/xpressive/regex_actions.hpp>
using namespace boost;
using namespace xpressive;

int main()
{
    // Storage that the semantic actions below push sub-matches into.
    local<std::vector<ssub_match> > strings;

    sregex line_regex =
        skip(_s) // skip whitespace
        (
            (s1 = +_d) >>
            ( // group the alternatives so that s1 prefixes both of them
                +(
                    '\"' >> (s2 = *~as_xpr('\"'))[push_back(strings, s2)] >> '\"' >>
                    ':' >>
                    '\"' >> (s3 = *~as_xpr('\"'))[push_back(strings, s3)] >> '\"'
                )
                |
                +(
                    '\"' >> (s2 = *~as_xpr('\"'))[push_back(strings, s2)] >> '\"'
                )
            )
        );

    std::string input(" 42 \"The answer to\" : \"Life\" \"The Universe\" : \"And Everything!\" ");
    if(regex_match(input, line_regex))
    {
        // Print every sub-match collected by the semantic actions.
        BOOST_FOREACH(ssub_match s, strings.get())
        {
            std::cout << s << std::endl;
        }
    }
}
The above uses semantic actions (the parts in []) to push sub-matches
into a vector for reference later. (It also uses skip(_s) to skip
whitespace.)
If you are using xpressive 1.0, which is part of Boost 1.34.1, it is a
little trickier: there is no skip() and there are no semantic actions.
In that case, you can define a nested regex,
sregex quoted_string = *~as_xpr('\"'), and use it in your line_regex.
Every quoted string that matches then adds a nested result to your
match_results. See below:
sregex quoted_string = *~as_xpr('\"');
sregex line_regex =
    keep(*_s) >>
    (s1 = +_d) >> keep(*_s) >>
    (
        +(
            '\"' >> quoted_string >> '\"' >>
            keep(*_s) >> ':' >> keep(*_s) >>
            '\"' >> quoted_string >> '\"' >>
            keep(*_s)
        )
        |
        +(
            '\"' >> quoted_string >> '\"' >>
            keep(*_s)
        )
    );
std::string input(" 42 \"The answer to\" : \"Life\" \"The Universe\" : \"And Everything!\" ");
smatch what;
if(regex_match(input, what, line_regex))
{
    // Each match of the nested quoted_string regex shows up as a nested result.
    BOOST_FOREACH(smatch const &str, what.nested_results())
    {
        std::cout << str[0] << std::endl;
    }
}
This is less efficient, but gets the job done.
HTH,
--
Eric Niebler
Boost Consulting
www.boost-consulting.com