
Hi all,

I'm using libboost-regex1.34.1 from gutsy. Here is a sample that shows my problem:

------- CUT -------
#include <boost/regex.hpp>
#include <string>
#include <iostream>

using namespace std;
using namespace boost;

int main()
{
    string input = "1111";
    string a = "(.*)";
    string b = "0$1";

    regex reg(a, regex::perl);
    string out = regex_replace(input, reg, b);

    cout << out << endl;
    return 0;
}
------- CUT -------

The expected result would be "01111", but it gives "011110". Why does it work like this?

If I use anchors (^ and $) it works well, but I will not get anchors from the other applications :( The problem is that I get the regexes from different Perl and PHP applications.

Will upgrading to 1.35 solve my problem?

Thanks,
Misi

Mihaly Zachar wrote:
I'm using libboost-regex1.34.1 from gutsy.
here is a sample for my problem:
------- CUT -------
#include <boost/regex.hpp>
#include <string>
#include <iostream>

using namespace std;
using namespace boost;

int main()
{
    string input = "1111";
    string a = "(.*)";
    string b = "0$1";

    regex reg(a, regex::perl);
    string out = regex_replace(input, reg, b);

    cout << out << endl;
    return 0;
}
------- CUT -------
the expected result would be "01111" but it gives "011110"... why does it work like this ?
By design: Perl does the same thing. After the first match against "1111" there is a second match against the empty string at the end of the text.
if I use anchors (^ and $) it works well, but the problem is I will not get anchors from the other applications :(
the problem is that I get regexes from different perl and php applications...
will the upgrade to 1.35 solve my problem ?
Nope, and as I say, Perl does the same thing. You could pass the match flag match_not_null as the last argument to regex_replace, but that would disable matches against all zero-length strings, which may not be what you want.

Was this a real use case, or just a "getting to know the library" test case?

HTH, John.
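A minimal sketch of the behaviour described above, not code from the thread. It uses std::regex from the C++11 standard library instead of Boost.Regex so that it builds without Boost, on the assumption that both libraries step through successive matches the same way for this pattern. The helper names list_matches and replace_all are made up for illustration. With Boost.Regex, the flag discussed above is spelled boost::match_not_null.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Collect (offset, text) for every match that the regex iterator reports.
// Hypothetical helper, used only to make the second, empty match visible.
std::vector<std::pair<std::size_t, std::string>>
list_matches(const std::string& input, const std::regex& re)
{
    std::vector<std::pair<std::size_t, std::string>> found;
    for (std::sregex_iterator it(input.begin(), input.end(), re), end;
         it != end; ++it) {
        found.emplace_back(static_cast<std::size_t>(it->position(0)),
                           it->str(0));
    }
    return found;
}

// Replace every match of `pattern` in `input` with `fmt`.
// When allow_empty_matches is false, match_not_null rejects all
// zero-length matches, including ones the caller might have wanted.
std::string replace_all(const std::string& input,
                        const std::string& pattern,
                        const std::string& fmt,
                        bool allow_empty_matches)
{
    const std::regex re(pattern);  // ECMAScript syntax, the std::regex default
    const auto flags = allow_empty_matches
        ? std::regex_constants::match_default
        : std::regex_constants::match_not_null;
    return std::regex_replace(input, re, fmt, flags);
}
```

For the input "1111" and the pattern "(.*)", list_matches reports two matches: "1111" at offset 0 and an empty string at offset 4. replace_all with the format "0$1" then yields "011110" when empty matches are allowed and "01111" when they are not.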

John Maddock wrote:
Was this a real use case, or just a "getting to know the library" test case?
Unfortunately this is a real problem... I'm interworking with an existing Perl system, and it provides me with the regexes that I should use. We should get the same result for the same regexes.

mitya@stamford:~$ perl -e '$in = "1111"; $in =~ s/(.*)/0$1/; print $in."\n"'
01111
mitya@stamford:~$

Setting match_not_null is not good for me :( For now I think I will put an anchor in front of the regex in some cases, but it would be nice if there were a "more official" way to do this. If you have any other ideas, please let me know.

Thanks,
Misi

mitya@stamford:~$ perl -e '$in = "1111"; $in =~ s/(.*)/0$1/; print $in."\n"'
01111
mitya@stamford:~$

The s/... construct by default replaces only the first match. You need to set the corresponding flag in your code:

http://www.boost.org/doc/libs/1_35_0/libs/regex/doc/html/boost_regex/ref/regex_replace.html

"If the flag format_first_only is set then only the first occurrence is replaced [...]"

Mihaly Zachar wrote:
Unfortunately this is a real problem... I'm interworking with an existing Perl system, and it provides me with the regexes that I should use.
We should get the same result for the same regexes.
mitya@stamford:~$ perl -e '$in = "1111"; $in =~ s/(.*)/0$1/; print $in."\n"'
01111
mitya@stamford:~$
to set match_not_null is not good for me :(
OK, so the Perl code doesn't set the /g modifier, so it only replaces the first match found. You can get the same behaviour in Boost.Regex by passing the format_first_only flag as the last argument to regex_replace.

In other words the defaults are different: Boost.Regex defaults to "replace all occurrences" but you can choose to replace just the first one if you want, and Perl is the other way around.

HTH, John.
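The difference in defaults can be sketched as follows. This is an illustration under stated assumptions, not code from the thread: it uses std::regex from the C++11 standard library so that it builds without Boost, and the helper names substitute_first and substitute_all are hypothetical. In Boost.Regex the corresponding flag is boost::format_first_only, passed as the last argument to boost::regex_replace.

```cpp
#include <regex>
#include <string>

// Rough counterpart of Perl's s/PATTERN/FMT/ (no /g modifier):
// only the first match is replaced, the rest of the input is copied as-is.
std::string substitute_first(const std::string& input,
                             const std::string& pattern,
                             const std::string& fmt)
{
    const std::regex re(pattern);  // ECMAScript syntax, the std::regex default
    return std::regex_replace(input, re, fmt,
                              std::regex_constants::format_first_only);
}

// Rough counterpart of Perl's s/PATTERN/FMT/g:
// every match is replaced, which is what regex_replace does by default.
std::string substitute_all(const std::string& input,
                           const std::string& pattern,
                           const std::string& fmt)
{
    const std::regex re(pattern);
    return std::regex_replace(input, re, fmt);
}
```

With the input "1111", the pattern "(.*)" and the format "0$1", substitute_first returns "01111" (the result of the Perl one-liner) and substitute_all returns "011110" (the result of the original C++ sample). Only the flag differs between the two calls, so regexes received from Perl code that does not use /g can be passed through unchanged.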

John Maddock wrote:
OK so the Perl code doesn't set the /g modifier so it only replaces the first match found: you can get the same behaviour in Boost.Regex by passing the format_first_only flag as the last argument to regex_replace.
In other words the defaults are different: Boost.Regex defaults to "replace all occurrences" but you can choose to replace just the first one if you want, and Perl is the other way around.
Great, now I know why it works this way!!! Thanks very much to you and to Stephen Nuchia too.

Regards,
Misi

participants (3)
- John Maddock
- Mihaly Zachar
- Stephen Nuchia