Hi all,

I'm sure this is just a configuration issue, but I'm not sure where my config is wrong. I'm using boost_regex_search, and capturing subgroups. When I use this regular expression:

^(*.)a(*.)

On this string: "Management"

I end up with the following matches:

0==> Management
1==> Man
2==> gement

Should this be returning the following matches?

0==>Management
1==>M
2==>nagement
3==>Management
4==>Man
5==>gement

Or am I getting exactly what I'm supposed to be getting here? Also, the default syntax is Perl syntax, correct?

Thanks!
admin@geocodenet.com wrote :
When I use this regular expression:
^(*.)a(*.)
Shouldn't that be ^(.*)a(.*) instead?
Or am I getting exactly what I'm supposed to be getting here?
This is the normal behaviour. You don't get every way the string could be matched; you only get one. In your case you have two pairs of parentheses, so you capture two strings. Your "shouldn't I get 5 strings" doesn't really make sense.
1==>M 2==>nagement
This is what you get if the first group is (.*?) (non-greedy)
1==>Man 2==>gement
This is what you get if you use (.*) (greedy)
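The greedy and non-greedy results above can be reproduced with a short sketch. It uses std::regex from the C++ standard library as a stand-in for Boost.Regex (an assumption made so the example has no external dependency; boost::regex, boost::smatch and boost::regex_search are used the same way). The helper name captures is hypothetical. Only the first group is made non-greedy here, because a non-greedy second group would match as little as possible under a search and come back empty.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the whole match followed by each capture
// group, or an empty vector when the pattern does not match.
// std::regex stands in for boost::regex (assumption, see the note above).
std::vector<std::string> captures(const std::string& pattern,
                                  const std::string& text) {
    std::smatch m;
    std::vector<std::string> out;
    if (std::regex_search(text, m, std::regex(pattern))) {
        // Entry 0 is the whole match; entries 1..N are the parenthesised groups.
        for (const auto& sub : m) {
            out.push_back(sub.str());
        }
    }
    return out;
}
```

Calling captures("^(.*)a(.*)", "Management") gives Management, Man, gement, while captures("^(.*?)a(.*)", "Management") gives Management, M, nagement. Either way there are exactly three entries: the whole match plus one per pair of parentheses.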
Yes, it was supposed to be ^(.*)a(.*), my bad. I was typing that as I was leaving work and not paying attention. Ah, I forgot that without the trailing question marks it's greedy. Thanks for the explanation.

-----Original Message-----
From: boost-users-bounces@lists.boost.org [mailto:boost-users-bounces@lists.boost.org] On Behalf Of loufoque
Sent: Monday, April 03, 2006 6:13 PM
To: boost-users@lists.boost.org
Subject: Re: [Boost-users] Perl Regex Question

admin@geocodenet.com wrote :
When I use this regular expression:
^(*.)a(*.)
Shouldn't that be ^(.*)a(.*) instead?
Or am I getting exactly what I'm supposed to be getting here?
This is the normal behaviour. You don't get every way the string could be matched; you only get one. In your case you have two pairs of parentheses, so you capture two strings. Your "shouldn't I get 5 strings" doesn't really make sense.
1==>M 2==>nagement
This is what you get if the first group is (.*?) (non-greedy)
1==>Man 2==>gement
This is what you get if you use (.*) (greedy)
_______________________________________________
Boost-users mailing list
Boost-users@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users
admin@geocodenet.com wrote:
Hi all,
I'm sure this is just a configuration issue, but I'm not sure where my config is wrong. I'm using boost_regex_search, and capturing subgroups. When I use this regular expression:
^(*.)a(*.)
On this string: "Management"
You'll end up with a syntax error, because the asterisk goes after the dot. But that's probably just a typo in the post.
I end up with the following matches:
0==> Management 1==> Man 2==> gement
Should this be returning the following matches?
0==>Management 1==>M 2==>nagement 3==>Management 4==>Man 5==>gement
Or am I getting exactly what I'm supposed to be getting here?
You're getting exactly what you're supposed to get. Neither regex_search nor any other regex function attempts to find more than one way a given string could match the regex. Thus, since * is greedy, the first .* will always match "Man", never just "M" unless the remainder of the regex cannot be satisfied otherwise.
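The "unless the remainder of the regex cannot be satisfied otherwise" case can be seen by requiring a second "a" after the first group. This is only a sketch: it uses std::regex as a stand-in for Boost.Regex (an assumption, to avoid an external dependency), the helper name group is hypothetical, and the pattern ^(.*)a(.*)a is made up for illustration.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: returns capture group n of the first match found by
// regex_search, or an empty string when there is no match or no such group.
// std::regex stands in for boost::regex (assumption, see the note above).
std::string group(const std::string& pattern, const std::string& text,
                  std::size_t n) {
    std::smatch m;
    if (std::regex_search(text, m, std::regex(pattern)) && n < m.size()) {
        return m[n].str();
    }
    return std::string();
}
```

Here group("^(.*)a(.*)", "Management", 1) gives "Man", while group("^(.*)a(.*)a", "Management", 1) gives "M": with the extra "a" in the pattern, the first group has to back off to the earlier "a" so that the rest of the regex can still match.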
Also, the default syntax is Perl syntax, correct?
Correct.

Sebastian Redl
participants (4)
- admin@geocodenet.com
- loufoque
- Michael Coles, MCDBA
- Sebastian Redl