libboost-regex-1.33.1 problem with regex_merge

Hi all,

I have a problem with an application built with the current debian/etch libboost-regex lib. I am not sure if the problem results from an intended behaviour change of the regex lib or if it's a bug in either the regex lib or the debian package.

The following test app worked flawlessly with several older boost regex versions, and I would like to understand why it doesn't work anymore using the lib from debian/etch.

Do I have to adjust the invocation parameters of regex_merge for the current version? I would rather not force the users of my production apps to update their match patterns across dozens of cfg files from "(.*)" to "^(.*)$".

TIA for any hints,
Bruno

-- 
bruno.voigt@ic3s.de

// debian/etch: dpkg -l *boost*
// ii libboost-dev          1.33.1-10 Boost C++ Libraries development files
// ii libboost-regex-dev    1.33.1-10 regular expression library for C++
// ii libboost-regex1.33.1  1.33.1-10 regular expression library for C++
// ii libboost-thread-dev   1.33.1-10 portable C++ multi-threading
// ii libboost-thread1.33.1 1.33.1-10 portable C++ multi-threading
//
// build: g++ -lboost_regex regex_merge_test.cpp
//
#include <cstdlib>
#include <string>
#include <iostream>
#include <boost/shared_ptr.hpp>
#include <boost/regex.hpp>

using namespace std;

int main(void)
{
    string my_regex("(.*)");      // doesn't work (worked with previous boost regex versions)
    //string my_regex("^(.*)$");  // works
    string my_replace("\\+$1");
    string my_string("123");
    try {
        boost::regex my_regex_comp(my_regex);
        try {
            string my_result = boost::regex_merge(my_string, my_regex_comp, my_replace);
            cout << "boost::regex_merge result: " << my_result << endl;
            // I get a trailing + character: +123+
            // I expected, as in earlier libboost-regex versions: +123
            exit(0);
        } catch(...) {
            cout << "boost::regex_merge exception" << endl;
        }
    } catch(...) {
        cout << "boost::regex()exception" << endl;
    }
    exit(1);
}

Bruno.Voigt@ic3s.de wrote:
Hi all, I have a problem with an application built with the current debian/etch libboost-regex lib. I am not sure if the problem results from an intended behaviour change of the regex lib or if it's a bug in either the regex lib or the debian package.
The following test app worked flawlessly with several older boost regex versions, and I would like to understand why it doesn't work anymore using the lib from debian/etch.
Do I have to adjust the invocation parameters of regex_merge for the current version?
I would rather not force the users of my production apps to update their match patterns across dozens of cfg files from "(.*)" to "^(.*)$".
TIA for any hints,
I'm not sure you're going to like the answer: this is a deliberate change and a bug fix to make the library do the same thing as Perl etc.

When matching "(.*)" against "abc" there are *two* matches found: one matches "abc", then a second match is possible against the zero-length string at the end of the text.

You can prevent matching null strings altogether by passing match_not_null to regex_merge:

boost::regex_merge(my_string, my_regex_comp, my_replace, boost::regex_constants::match_not_null);

But that prevents all zero-length matches, which your users might not appreciate either :-(

I can't think of an easy way to emulate the previous buggy behaviour I'm afraid.

Regards,
John.

John Maddock wrote:
Do I have to adjust the invocation parameters of regex_merge for the current version?
I would rather not force the users of my production apps to update their match patterns across dozens of cfg files from "(.*)" to "^(.*)$".
TIA for any hints,
I'm not sure you're going to like the answer: this is a deliberate change and a bug fix to make the library do the same thing as Perl etc. When matching "(.*)" against "abc" there are *two* matches found: one matches "abc" then a second match is possible against the zero-length-string at the end of the text. You can prevent matching null-strings altogether by passing match_not_null to regex_merge:
boost::regex_merge(my_string, my_regex_comp, my_replace, boost::regex_constants::match_not_null);
But that prevents all zero-length matches, which your users might not appreciate either :-(
I can't think of an easy way to emulate the previous buggy behaviour I'm afraid.
Hi John,

First, thanks again for the fast response; your lib is one of my most precious tools.

To follow your argument I wrote two perl one-liners to reproduce it, and saw that only the first version, with the greedy flag, matches the boost-1.33 behaviour, whilst the second version matches the behaviour of boost-regex versions before 1.33. Is there a way to get the non-greedy style with boost-regex-1.33?

$ perl -e '$s="123";$s=~s/(.*)/+$1/g;print $s;'
+123+
$ perl -e '$s="123";$s=~s/(.*)/+$1/;print $s;'
+123

TIA,
Bruno

Bruno Voigt wrote:
Hi John, First, thanks again for the fast response; your lib is one of my most precious tools.
To follow your argument I wrote two perl one-liners to reproduce it, and saw that only the first version, with the greedy flag, matches the boost-1.33 behaviour, whilst the second version matches the behaviour of boost-regex versions before 1.33. Is there a way to get the non-greedy style with boost-regex-1.33?
$ perl -e '$s="123";$s=~s/(.*)/+$1/g;print $s;'
+123+
$ perl -e '$s="123";$s=~s/(.*)/+$1/;print $s;'
+123
The /g modifier signifies that search and replace should be global (replace all occurrences); without it only the first occurrence is replaced. If that's what you want to happen, pass "format_first_only" as a match flag to regex_replace/regex_merge.

HTH,
John.