Re: [Boost-users] [regex] matching info with merged regex

12 Jul 2006

      Line Oddskool wrote:
...
Hi boost.regex gurus,
I'm stuck with a problem dealing with some kind of regex merging
(using boost 1.33). I don't know if the way I took is viable, so any
ideas and advice will be appreciated.
To give you some insight, I have a set of a hundred matching (ei) and
formatting (ri) "rules", e.g.
e1 : (a)(?=ll)
r1 : (?1o)
e1/r1 should mean "matching 'a' of some string like 'all' should be
replaced by 'o'"
I merge all my e/r into one big regex using regex_merge (for
performance), so the resulting matching/formatting regex is like :
e : e1|e2|...|en
r: r1r2...rn
I'm getting weird behaviour with this, as the resulting string is
sometimes filled with sequences like 'u4u5u6u7u8u9u' or other "trash".
So to debug this, I'd like to know which rule (i.e. which ei) matched
on what part of the string.
I'm unsure if it's possible to get some kind of iterator on the rules
that have matched using regex_merge ?
I also looked at the match_results returned by the simpler method
regex_match(), but I can't figure out how to know which part of my
matching regex matched (i.e. which ei) ?
Unless you really meant regex_match (which succeeds only when the whole
string matches), regex_search is the analogue of regex_replace (the new
name for regex_merge).

The way to find out which sub-expression matched is simply:

match_results<something> what;
...
for(unsigned i = 1; i < what.size(); ++i)
{
  if(what[i].matched)
    std::cout << "sub-expression " << i << " matched " << what[i] << std::endl;
}
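
As a fuller illustration of the loop above, here is a minimal sketch that walks every match in a string and records which capture groups took part. It is written against std::regex rather than Boost.Regex so that it builds without Boost; the match_results interface used here (size(), operator[], .matched) is the same in both, but that substitution, the helper name matched_groups and the sample rules are assumptions made for illustration, not part of the original question.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: for each successive match of `re` in `text`,
// collect (group index, matched text) for every capture group that
// actually participated in that match.
std::vector<std::pair<std::size_t, std::string>>
matched_groups(const std::string& text, const std::regex& re)
{
    std::vector<std::pair<std::size_t, std::string>> hits;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end;
         it != end; ++it)
    {
        const std::smatch& what = *it;
        // Index 0 is the whole match, so the sub-expressions start at 1.
        for (std::size_t i = 1; i < what.size(); ++i)
        {
            if (what[i].matched)
                hits.emplace_back(i, what[i].str());
        }
    }
    return hits;
}
```

With a merged expression such as "(a)(?=ll)|(e)(?=gg)" and the input "all eggs", the helper reports group 1 for the 'a' and group 2 for the 'e'. Note that capture groups are numbered left to right across the whole merged expression, so group i corresponds to rule i only if every rule contributes exactly one capture group.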
...
Otherwise, is there a way to analyse or dump the matching/replacing
behaviour of such a complex regex ?
'Fraid not, you would likely be swamped with so much data that it probably
wouldn't be that useful in any case :-(

You could also try a binary-search-reduction on the problem: split the regex 
in two and find which half has the issue, then split again and so on...
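
The halving procedure can be sketched roughly as follows. This assumes there is a single offending rule and that it still misbehaves when only part of the rule set is merged; if the problem comes from two rules interacting, the search will not isolate it. The name find_bad_rule and the is_bad callback are hypothetical: is_bad(first, last) stands for "merge rules [first, last), run the replacement, and report whether the trash still appears", which you would have to supply yourself.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical helper: narrows the half-open range [0, rule_count) down
// to a single rule index by repeated halving.
// Assumptions: rule_count >= 1, exactly one rule is at fault, and
// is_bad(first, last) returns true when the rules in [first, last),
// merged on their own, still reproduce the problem.
std::size_t find_bad_rule(
    std::size_t rule_count,
    const std::function<bool(std::size_t, std::size_t)>& is_bad)
{
    std::size_t first = 0;
    std::size_t last = rule_count;
    while (last - first > 1)
    {
        const std::size_t mid = first + (last - first) / 2;
        if (is_bad(first, mid))
            last = mid;   // the problem is in the lower half
        else
            first = mid;  // otherwise assume it is in the upper half
    }
    return first;
}
```

For a hundred rules this needs about seven test runs rather than a hundred, which is the main attraction of the approach.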

HTH,
John.

John Maddock