boost::regex - I do not understand why this test fails.

Hi,
Below is a self-contained example and the output it produces.
I thought I would get a match for REGEX2 but not for REGEX1 in string s.
However, REGEX1 matches and produces weird (to me) results.
I would expect REGEX1 to match only strings with 6 terms (3 words
and 3 numbers).
I know this is probably just a (lack of) regex knowledge problem on
my part and not a boost::regex problem, but I could not find a better
forum than this one to ask for help.
Thank you in advance for your attention!
Mau.
#define BOOST_TEST_MODULE test_module_example
#include

AMDG

Mauricio Gomes wrote:
I thought I would get a match for REGEX2 but not for REGEX1 in string s. However, REGEX1 matches and produces weird (to me) results. I would expect REGEX1 to match only strings with 6 terms (3 words and 3 numbers).
I know this is probably just a (lack of) regex knowledge problem on my part and not a boost::regex problem, but I could not find a better forum than this one to ask for help.
static const std::string& REGEX1 = "\\s*(\\w+)\\s+([\\d,\\.]+)\\s+([\\d,\\.]+)\\s([\\d,\\.]+)\\s*$";
static const std::string& REGEX2 = "\\s*(\\w+){1}\\s*(\\w+){1}\\s*(\\w+){1}\\s+([\\d,\\.]+){1}\\s+([\\d,\\.]+){1}\\s([\\d,\\.]+){1}\\s*$";

It would probably work better if you used \\s+ consistently for internal spaces.
In Christ, Steven Watanabe
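
A minimal sketch of why the separators matter. It uses std::regex (default ECMAScript grammar) as a stand-in for boost::regex so that it builds without Boost; that swap, the helper name captures() and the function check_separators() are assumptions made for illustration and do not come from the thread. The {1} quantifiers are dropped because repeating a group exactly once does not change what it matches.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns capture groups 1..n if `pattern` matches the
// whole of `text`, or an empty vector if there is no match.
std::vector<std::string> captures(const std::string& pattern,
                                  const std::string& text) {
    const std::regex rx(pattern);
    std::smatch m;
    std::vector<std::string> groups;
    if (std::regex_match(text, m, rx)) {
        for (std::size_t i = 1; i < m.size(); ++i) {
            groups.push_back(m[i].str());
        }
    }
    return groups;
}

void check_separators() {
    const std::string line = " JETIX 2,957,081 643,225.51 0.00";
    const std::string numbers =
        "\\s+([\\d,\\.]+)\\s+([\\d,\\.]+)\\s+([\\d,\\.]+)\\s*$";

    // \s* between the words allows an empty separator, so the single word
    // "JETIX" is shared out among the three (\w+) groups.
    const auto loose =
        captures("\\s*(\\w+)\\s*(\\w+)\\s*(\\w+)" + numbers, line);
    assert(loose.size() == 6);
    assert(loose[0] + loose[1] + loose[2] == "JETIX");

    // \s+ demands at least one whitespace character between words, so a
    // line with only one word no longer matches the three-word pattern.
    assert(captures("\\s*(\\w+)\\s+(\\w+)\\s+(\\w+)" + numbers, line).empty());

    // A bare \s matches exactly one whitespace character.
    assert(captures("(\\w+)\\s([\\d,\\.]+)", "JETIX   0.00").empty());
    assert(captures("(\\w+)\\s+([\\d,\\.]+)", "JETIX   0.00").size() == 2);
}
```

Calling check_separators() from main() runs the checks. With greedy backtracking the loose pattern splits the word as "JET", "I", "X", which is the kind of surprising capture an optional separator can produce.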
Thank you Steven, the test below works fine now.
Best regards,
Mau.

BOOST_AUTO_TEST_CASE(example_test)
{
    boost::cmatch res;
    std::string s = " JETIX 2,957,081 643,225.51 0.00";
    static const std::string& REGEX1 =
        "\\s*(\\w+){1}\\s+([\\d,\\.]+){1}\\s+([\\d,\\.]+){1}\\s+([\\d,\\.]+)\\s*$";
    static const std::string& REGEX2 =
        "\\s*(\\w+){1}\\s+(\\w+){1}\\s+(\\w+){1}\\s+([\\d,\\.]+){1}\\s+([\\d,\\.]+){1}\\s+([\\d,\\.]+){1}\\s*$";

    boost::regex rx(REGEX1);
    if (boost::regex_match (s.c_str(), res, rx))
    {
        BOOST_CHECK_EQUAL ("JETIX"     , res[1]);
        BOOST_CHECK_EQUAL ("2,957,081" , res[2]);
        BOOST_CHECK_EQUAL ("643,225.51", res[3]);
        BOOST_CHECK_EQUAL ("0.00"      , res[4]);
    }
    else
    {
        BOOST_FAIL ("It was supposed to match.");
    }

    boost::cmatch res2;
    boost::regex rx2(REGEX2);
    bool match = boost::regex_match (s.c_str(), res2, rx2);
    for (unsigned i = 0; i < res2.size (); ++i)
        std::cout << res2[i] << std::endl;
    if (match)
        BOOST_FAIL ("It was NOT supposed to match.");
}

Below is a self-contained example and the output it produces.
I thought I would get a match for REGEX2 but not for REGEX1 in string s. However, REGEX1 matches and produces weird (to me) results. I would expect REGEX1 to match only strings with 6 terms (3 words and 3 numbers).
static const std::string& REGEX1 = "\\s*(\\w+)\\s+([\\d,\\.]+)\\s+([\\d,\\.]+)\\s([\\d,\\.]+)\\s*$";
Ugh? REGEX1 will match a string containing *one* word (the "(\\w+)" part), followed by three numbers, each of which may also contain any number of "."s or ","s. And that's exactly what it's doing, isn't it?
Maybe I've missed something?
John.
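
To make the point about the numeric fields concrete, here is a small sketch of what the character class [\d,\.]+ accepts on its own. It uses std::regex as a stand-in for boost::regex, and the names loose_number() and check_loose_number() are hypothetical, chosen only for this example.

```cpp
#include <cassert>
#include <regex>
#include <string>

// True if `text` is a non-empty run of digits, commas and periods in any
// order, i.e. what each numeric group in REGEX1 is able to capture.
bool loose_number(const std::string& text) {
    static const std::regex rx("[\\d,\\.]+");
    return std::regex_match(text, rx);
}

void check_loose_number() {
    // The fields from the thread are accepted...
    assert(loose_number("2,957,081"));
    assert(loose_number("643,225.51"));
    assert(loose_number("0.00"));
    // ...but so is anything else built from the same characters.
    assert(loose_number("1,,..2"));
    assert(loose_number(",,,"));
    // Other characters, or an empty field, are rejected.
    assert(!loose_number("12a"));
    assert(!loose_number(""));
}
```

The class only restricts which characters may appear, not how they are arranged, so it cannot tell a well-formed amount from a run of punctuation.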

static const std::string& REGEX1 =
"\\s*(\\w+){1}\\s+([\\d,\\.]+){1}\\s+([\\d,\\.]+){1}\\s+([\\d,\\.]+)\\s*$";
2010/6/23 John Maddock
Ugh? REGEX1 will match a string containing *one* word (the "(\\w+)" part), followed by three numbers which may also contain any number of "."'s or ","'s. And that's exactly what it's doing isn't it?
Maybe I've missed something? John.
Hi John,
my idea was to match numbers with US and Brazilian decimal and grouping characters, so it should match both 2,000,000.00 and 2.000.000,00.
Again, I am a beginner with regex, so if you have a suggestion for doing that in a more efficient or simpler way, please feel free to share it.
Note: in the previous message I posted the code as fixed following Steven's suggestion.
Thank you for your attention,
Mau.
--
Mau
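
One possible way to tighten the numeric fields is sketched below. It is not taken from the thread. It assumes that digits are always grouped in threes and that the fraction is optional, and it uses std::regex in place of boost::regex. The function names are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// US style: 1-3 leading digits, then ",ddd" groups, then an optional
// ".fraction" part, e.g. 2,000,000.00
bool is_us_number(const std::string& text) {
    static const std::regex rx("\\d{1,3}(?:,\\d{3})*(?:\\.\\d+)?");
    return std::regex_match(text, rx);
}

// Brazilian style: the same shape with the two separators swapped,
// e.g. 2.000.000,00
bool is_br_number(const std::string& text) {
    static const std::regex rx("\\d{1,3}(?:\\.\\d{3})*(?:,\\d+)?");
    return std::regex_match(text, rx);
}

// Accepts a number written in either convention.
bool is_grouped_number(const std::string& text) {
    return is_us_number(text) || is_br_number(text);
}

void check_grouped_number() {
    assert(is_grouped_number("2,000,000.00"));
    assert(is_grouped_number("2.000.000,00"));
    assert(is_grouped_number("643,225.51"));
    assert(is_grouped_number("0.00"));
    // Mixed or repeated separators are rejected.
    assert(!is_grouped_number("2,000.000,00"));
    assert(!is_grouped_number("1,,2"));
}
```

Two caveats apply to this sketch. An ungrouped value such as 1234.56 is rejected because of the grouping assumption. A string like 1,234 fits both conventions, so the patterns validate the shape of a number without saying which convention was meant.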