
Hi,

I'm seeking help with the following problem using the Boost.Regex library. The following line of code is meant to assist in parsing URLs:

boost::regex exp("(?is)(https?)://([^:/?#]+):?(\\d{1,5})?/?([^?#]*)?\ \??([^#]+)?#?(\\w*)");

It runs fine in a Linux environment; however, if compiled with VC9 on Windows, it fails with:

First-chance exception at 0x7c812aeb in test.exe: Microsoft C++ exception: boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::regex_error> > at memory location 0x0012f768..

I thought that the expression might be too long, and indeed, if I shorten the above line to:

boost::regex exp("(?is)(https?)://([^:/?#]+):?(\\d{1,5})?/?");

it runs fine on Windows as well. Any help on what the cause might be would be appreciated.

Regards,
==Adam

My experience with Boost.Regex on Win32 is that crashes like that come from either initialization ordering or syntax errors. I would look at the section with "?\ \??" -- is that right? Shouldn't that be "?\\ \\??"?

P.S. You didn't say, but I presume the crash happens on startup. You might also look at it in the VC9 debugger and see if the exception object has any further information.
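As a rough illustration of the suggestion to inspect the exception object, the sketch below catches the error at the point of construction and keeps its message. It uses std::regex so that it has no external dependency, which is a substitution and not the code from the thread; boost::regex_error can be caught the same way and also offers what(). The helper name compiles and the sample patterns are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: try to compile a pattern and keep the library's
// diagnostic text instead of letting the exception escape (for example
// from a static initializer, where it shows up as a crash on startup).
bool compiles(const std::string& pattern, std::string& diagnostic) {
    try {
        std::regex re(pattern);      // throws std::regex_error on a syntax error
        diagnostic.clear();
        return true;
    } catch (const std::regex_error& e) {
        diagnostic = e.what();       // implementation-defined description
        return false;
    }
}
```

Calling compiles("([^#]+", msg) on a pattern with an unclosed group returns false and leaves the library's message in msg, which is usually more telling than the first-chance exception line in the debugger output.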

The character sequence '??(' is a trigraph that VC++ replaces with '[' - see http://en.wikipedia.org/wiki/C_trigraph - and this of course breaks your regex :-(

I'm trying to think of an alternative and failing at present, as splitting into 2 strings doesn't help (apparently VC++ performs trigraph substitution after string concatenation). Ah, this page: http://msdn.microsoft.com/en-us/library/bt0y4awe.aspx describes the workaround.

HTH, John.
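A common way to defuse an accidental trigraph, and as far as I can tell the kind of workaround the linked page has in mind, is to write one of the two question marks as the escape sequence \?, which still yields a plain '?' in the string. The sketch below applies that to just the query part of the expression. It uses std::regex in place of Boost.Regex so it is self-contained, and the function names are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Wanted regex text: a backslash-escaped '?', made optional by a second '?',
// followed by an optional capture group for the query string.
// In the literal below the second '?' is spelled as the escape sequence \?
// so the source never contains two adjacent question marks before '('.
std::string query_pattern() {
    return "\\?\?([^#]+)?";
}

// Returns true if the pattern is accepted by the regex engine.
bool query_pattern_compiles() {
    try {
        std::regex re(query_pattern());
        return true;
    } catch (const std::regex_error&) {
        return false;
    }
}
```

The resulting string holds the characters \ ? ? ( in that order, so the regex engine sees the intended expression whether or not the compiler performs trigraph replacement.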

Alan, Steven, John: indeed, the trigraph was the source of the problem. The compiler on Linux even warned me, but at that time (a few months ago) I didn't take note of it, since it did not cause any trouble on Linux. Anyway, a little restructuring:

boost::regex const rex("(?is)(https?)://([^:/?#]+):?(\\d{1,5})?(/[^? #]*)?(\\?[^#]+)?(#.*)?");

got me around the problem. Thanks for the help!

Regards,
==Adam
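For readers who want to try the restructured expression, here is a minimal sketch of how its six capture groups might be used. It is an assumption-laden adaptation, not the original code: it uses std::regex in place of Boost.Regex, drops the leading "(?is)" because the default std::regex grammar has no inline modifiers, and passes std::regex::icase for the case-insensitive part. UrlParts and parse_url are hypothetical names.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical container for the captured pieces; not part of the original post.
struct UrlParts {
    std::string scheme, host, port, path, query, fragment;
};

// Returns true and fills `out` when `url` matches the restructured expression.
// Groups that did not participate in the match come back as empty strings.
bool parse_url(const std::string& url, UrlParts& out) {
    static const std::regex rex(
        "(https?)://([^:/?#]+):?(\\d{1,5})?(/[^? #]*)?(\\?[^#]+)?(#.*)?",
        std::regex::icase);
    std::smatch m;
    if (!std::regex_match(url, m, rex)) {
        return false;
    }
    assert(m.size() == 7);  // whole match plus six capture groups
    out.scheme   = m[1].str();
    out.host     = m[2].str();
    out.port     = m[3].str();
    out.path     = m[4].str();  // includes the leading '/'
    out.query    = m[5].str();  // includes the leading '?'
    out.fragment = m[6].str();  // includes the leading '#'
    return true;
}
```

Note that, unlike the first version of the expression, the restructured one keeps the '/', '?' and '#' delimiters inside their groups, which is what removes the adjacent question marks from the string literal.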
participants (4): Adam Kornafeld, Alan M. Carroll, John Maddock, Steven Watanabe