question on boost::regex expression(query);
Hi,
I'm new to boost and not all that familiar with Perl regex. However, I have my own code for using Perl regex libraries for finding interesting features in DNA ( http://www.mail-archive.com/bio_bulletin_board@bioinformatics.org/msg01382.h... )
I recently added boost to work along with Microsoft's greta code. However, I now have a problem with regex exploding. The questionable regex is presumably,
(?<=GU.*?TACTAAC.{20,40}AG|^).*?(?=GU.*?TACTAAC.{20,40}AG|$)
(which doesn't explode with greta AFAIK), and the code in question is (MM_MARK is a debug macro, es is an error stream, other stuff omitted):
try {
es<
Mike Marchywka wrote:
> Hi, I'm new to boost and not all that familiar with Perl regex. However, I have my own code for using Perl regex libraries for finding interesting features in DNA ( http://www.mail-archive.com/bio_bulletin_board@bioinformatics.org/msg01382.h... )
> I recently added boost to work along with Microsoft's greta code. However, I now have a problem with regex exploding. The questionable regex is presumably,
> (?<=GU.*?TACTAAC.{20,40}AG|^).*?(?=GU.*?TACTAAC.{20,40}AG|$)
Variable length look-behind isn't supported by Boost.Regex (or by Perl for that matter).
Sorry I can't be more helpful at present: although it seems as though lookbehind isn't really needed in this case - you could remove the lookbehind and use a marked sub-expression to identify the section you want instead.
So I think (?:GU.*?TACTAAC.{20,40}AG|^)(.*?(?=GU.*?TACTAAC.{20,40}AG|$)) would be equivalent, with $1 containing the section you're interested in?
HTH John.
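A minimal sketch of the marked sub-expression approach suggested above. It uses std::regex from the C++11 standard library as a stand-in so that it builds without Boost; that substitution and the function name find_segments are assumptions for illustration, not something from the thread. boost::regex and boost::sregex_iterator offer a very similar interface, but this has not been checked against Boost 1.33.1.

```cpp
#include <regex>
#include <string>
#include <vector>

// Collects capture group 1 (the text between motif hits) for every match.
// Assumption: std::regex stands in for boost::regex here.
std::vector<std::string> find_segments(const std::string& seq)
{
    // No look-behind: the leading motif is consumed by a non-capturing
    // group, and the section of interest is captured as $1.
    static const std::regex re(
        "(?:GU.*?TACTAAC.{20,40}AG|^)"
        "(.*?(?=GU.*?TACTAAC.{20,40}AG|$))");

    std::vector<std::string> segments;
    for (std::sregex_iterator it(seq.begin(), seq.end(), re), end; it != end; ++it)
    {
        if ((*it)[1].length() > 0)              // skip empty segments
            segments.push_back((*it)[1].str());
    }
    return segments;
}
```

For a toy sequence made of AAAA, one motif hit (GU, CC, TACTAAC, twenty C's, AG) and UUUU, this returns the two flanking segments AAAA and UUUU.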
> Variable length look-behind isn't supported by Boost.Regex (or by Perl for that matter).
> Sorry I can't be more helpful at present: although it seems as though lookbehind isn't really needed in this case - you could remove the lookbehind and use a marked sub-expression to identify the section you want instead.
The subexpression was the whole point of trying boost so I will assume you are right and continue. However, the inability to catch the problem with (...) combined with the linker comment made me a little concerned about the build accuracy. Thanks.
From: "John Maddock"
Reply-To: boost-users@lists.boost.org
Subject: Re: [Boost-users] question on boost::regex expression(query);
Date: Tue, 2 Oct 2007 10:09:56 +0100
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
Mike Marchywka wrote:
> The subexpression was the whole point of trying boost so I will assume you are right and continue. However, the inability to catch the problem with (...) combined with the linker comment made me a little concerned about the build accuracy.
Sorry, what linker comment?
The thrown boost::regex_error is caught OK for me (on VC8), which compiler are you using? You could try catching by const reference instead to see if that helps at all...
HTH, John.
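The catch-by-const-reference suggestion can be sketched as follows. std::regex_error is used as a stand-in for boost::regex_error so the snippet builds without Boost; that substitution and the helper name compile_status are assumptions. The handler ordering and the const references are the point of the example.

```cpp
#include <exception>
#include <regex>
#include <string>

// Tries to compile a pattern and reports what happened.
// Assumption: std::regex / std::regex_error stand in for the Boost types.
std::string compile_status(const std::string& pattern)
{
    try
    {
        std::regex re(pattern);
        return "ok";
    }
    catch (const std::regex_error& e)   // most-derived type first, by const reference
    {
        return std::string("regex_error: ") + e.what();
    }
    catch (const std::exception& e)     // by reference, so the object is not sliced
    {
        return std::string("std::exception: ") + e.what();
    }
}
```

A well-formed pattern such as a+b yields "ok", while a pattern with an unbalanced parenthesis is reported through the first handler.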
Sorry, I posted this later:
Info: resolving vtable for boost::regex_errorby linking to __imp___ZTVN5boost11regex_errorE (auto-import)
while trying to link against boost_regex-gcc-mt-1_33_1 and cygboost_regex-gcc-mt-1_33_1.dll using
$ g++ -v
Reading specs from /usr/lib/gcc-lib/i686-pc-cygwin/3.3.3/specs
Configured with: /gcc/gcc-3.3.3-3/configure --verbose --prefix=/usr --exec-prefix=/usr --sysconfdir=/etc --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --enable-languages=c,ada,c++,d,f77,java,objc,pascal --enable-nls --without-included-gettext --enable-libgcj --with-system-zlib --enable-interpreter --enable-threads=posix --enable-java-gc=boehm --enable-sjlj-exceptions --disable-version-specific-runtime-libs --disable-win32-registry
Thread model: posix
gcc version 3.3.3 (cygwin special)
From: "John Maddock"
Reply-To: boost-users@lists.boost.org
Subject: Re: [Boost-users] question on boost::regex expression(query);
Date: Tue, 2 Oct 2007 15:44:38 +0100
Mike Marchywka wrote:
> Sorry, I posted this later:
> Info: resolving vtable for boost::regex_errorby linking to __imp___ZTVN5boost11regex_errorE (auto-import)
> while trying to link against boost_regex-gcc-mt-1_33_1 and cygboost_regex-gcc-mt-1_33_1.dll
Why are you linking against two different regex lib versions? I suspect that may be the cause of the trouble.
John.
Well, I wanted a static build but couldn't get that to work, and then I realized it needed a dll only when I tried to run in a different location. These are the libs I have.
$ ls ./boost/lib | grep regex
libboost_regex-gcc-mt-1_33_1.a
libboost_regex-gcc-mt-s-1_33_1.a
libboost_regex-gcc-mt-s.a
libboost_regex-gcc-mt.a
Linking against the static produced undefines such as
./boost/lib/libboost_regex-gcc-mt-s-1_33_1.a(cpp_regex_traits.o):cpp_regex_traits.cpp:(.text+0x4cf): undefined reference to `__gnu_cxx::__exchange_and_add(int volatile*, int)'
so it looked like the non-static was working. I'm also trying to use all the prebuilds as I didn't want to try to build this from scratch, but maybe it is worth giving it a shot.
http://www.boost.org/more/getting_started/unix-variants.html#link-your-progr...
Thanks. I'm sure this is something simple in a FAQ list but I haven't found it yet.
From: "John Maddock"
Reply-To: boost-users@lists.boost.org
Subject: Re: [Boost-users] question on boost::regex expression(query);
Date: Tue, 2 Oct 2007 17:29:14 +0100
Mike Marchywka wrote:
> Well, I wanted a static build but couldn't get that to work and then I realized it needed a dll only when I tried to run in different location. These are the lib's I have.
> $ ls ./boost/lib | grep regex
> libboost_regex-gcc-mt-1_33_1.a
> libboost_regex-gcc-mt-s-1_33_1.a
> libboost_regex-gcc-mt-s.a
> libboost_regex-gcc-mt.a
> Linking against the static produced undefines such as
> ./boost/lib/libboost_regex-gcc-mt-s-1_33_1.a(cpp_regex_traits.o):cpp_regex_traits.cpp:(.text+0x4cf): undefined reference to `__gnu_cxx::__exchange_and_add(int volatile*, int)'
Looks like you need -lpthread or maybe -lrt (I can't remember which symbols are in which cygwin dll).
> so it looked like the non-static was working.
It probably should as long as you're only linking to *one* regex library :-)
John.
Same problem during link and runtime with only dll. The static choice should avoid dll problems so maybe I'll give that another look. Thanks again.
$ make boostd_rules_annotater
g++ -Wall -ggdb -mthreads -msse -O3 -I . \
-o rules_annotater boost_rules_annotater.o \
-lpthread -L./boost -lmyboost -L./boost/lib -L. -lcygboost_regex-gcc-mt-1_33_1 -L./greta -lmygreta -ljpeg -lm -Lgreta bio_lo_level_string.o
Info: resolving vtable for boost::regex_errorby linking to __imp___ZTVN5boost11regex_errorE (auto-import)
Administrator@TESTBED01 /cygdrive/c/mydocs/scripts/cc/affx
$ /cygdrive/e/new/exp/brew/msbin/DUMPBIN /EXPORTS cygboost_regex-gcc-mt-1_33_1.dll | grep regex_error
33 20 0001B2E0 _ZN5boost11regex_errorC1ENS_15regex_constants10error_typeE
34 21 0001AE60 _ZN5boost11regex_errorC1ERKSsNS_15regex_constants10error_typeEi
35 22 0001B450 _ZN5boost11regex_errorC2ENS_15regex_constants10error_typeE
36 23 0001AE20 _ZN5boost11regex_errorC2ERKSsNS_15regex_constants10error_typeEi
37 24 0001AEE0 _ZN5boost11regex_errorD0Ev
38 25 0001AEC0 _ZN5boost11regex_errorD1Ev
39 26 0001AEA0 _ZN5boost11regex_errorD2Ev
444 1BB 0001B1F0 _ZNK5boost11regex_error5raiseEv
545 220 00093950 _ZTIN5boost11regex_errorE
558 22D 00093DA8 _ZTSN5boost11regex_errorE
571 23A 00094868 _ZTVN5boost11regex_errorE
$ cygcheck rules_annotater.exe
Found: c:\mydocs\scripts\cc\affx\rules_annotater.exe
c:/mydocs/scripts/cc/affx/rules_annotater.exe
C:\WINNT\cygwin1.dll
C:\WINNT\system32\ADVAPI32.DLL
C:\WINNT\system32\KERNEL32.dll
C:\WINNT\system32\ntdll.dll
C:\WINNT\system32\RPCRT4.dll
c:\mydocs\scripts\cc\affx\cygboost_regex-gcc-mt-1_33_1.dll
http://www.cygwin.com/ml/cygwin/2004-08/msg00753.html
--enable-auto-import
Do sophisticated linking of _symbol to __imp__symbol for DATA imports from DLLs, and create the necessary thunking symbols when building the import libraries with those DATA exports. Note: Use of the 'auto-import' extension will cause the text section of the image file to be made writable. This does not conform to the PE-COFF format specification published by Microsoft.
Using 'auto-import' generally will 'just work' -- but sometimes you may see this message:
"variable '<var>' can't be auto-imported. Please read the documentation for ld's --enable-auto-import for details."
This message occurs when some (sub)expression accesses an address ultimately given by the sum of two constants (Win32 import tables only allow one). Instances where this may occur include accesses to member fields of struct variables imported from a DLL, as well as using a constant index into an array variable imported from a DLL. Any multiword variable (arrays, structs, long long, etc) may trigger this error condition. However, regardless of the exact data type of the offending exported variable, ld will always detect it, issue the warning, and exit.
There are several ways to address this difficulty, regardless of the data type of the exported variable:
From: "John Maddock"
Reply-To: boost-users@lists.boost.org
Subject: Re: [Boost-users] question on boost::regex expression(query);
Date: Tue, 2 Oct 2007 18:45:30 +0100
Mike Marchywka wrote:
> Same problem during link and runtime with only dll. The static choice should avoid dll problems so maybe I'll give that another look. Thanks again.
Well it works for me with the Boost-1.33.1 install supplied with current cygwin, the test program is:
#include <boost/regex.hpp>
#include <iostream>
int main()
{
   try{
      boost::regex e("(?<=GU.*?TACTAAC.{20,40}AG|^)?(GU.*?TACTAAC.{20,40}AG|$)");
   }
   catch(boost::regex_error e)
   {
      std::cout << "Caught a regex_error" << std::endl;
      std::cout << e.what() << std::endl;
   }
   catch(std::exception e)
   {
      std::cout << "Caught a std::exception" << std::endl;
      std::cout << e.what() << std::endl;
   }
}
and then:
$ g++ -I /usr/include/boost-1_33_1 t.cpp -lboost_regex-gcc-mt
Info: resolving vtable for boost::regex_errorby linking to __imp___ZTVN5boost11regex_errorE (auto-import)
$ ./a
Caught a regex_error
Invalid regular expression
Which is what I would expect in that case.
HTH, John.
Thanks. If I have to guess at this point, I think it is a quirk of my cygwin install or version - it is also possible I have intermixed boost files, but it is still a bit confusing. AFAIK, everything is working right now except the try/catch so maybe I can let it go until I can safely upgrade my tools.
$ g++ -v
Reading specs from /usr/lib/gcc-lib/i686-pc-cygwin/3.3.3/specs
Configured with: /gcc/gcc-3.3.3-3/configure --verbose --prefix=/usr --exec-prefix=/usr --sysconfdir=/etc --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --enable-languages=c,ada,c++,d,f77,java,objc,pascal --enable-nls --without-included-gettext --enable-libgcj --with-system-zlib --enable-interpreter --enable-threads=posix --enable-java-gc=boehm --enable-sjlj-exceptions --disable-version-specific-runtime-libs --disable-win32-registry
Thread model: posix
gcc version 3.3.3 (cygwin special)
$ g++ -I. t.cpp -L./lib -lboost_regex-gcc-mt
Info: resolving vtable for boost::regex_errorby linking to __imp___ZTVN5boost11regex_errorE (auto-import)
Administrator@TESTBED01 /cygdrive/c/mydocs/scripts/cc/affx/boost
$ ./a
[ <<<<<<<<<< This pops up a dialog box about missing dll ]
Administrator@TESTBED01 /cygdrive/c/mydocs/scripts/cc/affx/boost
$ cygcheck a.exe
Found: c:\mydocs\scripts\cc\affx\boost\a.exe
c:/mydocs/scripts/cc/affx/boost/a.exe
Error: could not find cygboost_regex-gcc-mt-1_33_1.dll
C:\WINNT\cygwin1.dll
C:\WINNT\system32\ADVAPI32.DLL
C:\WINNT\system32\KERNEL32.dll
C:\WINNT\system32\ntdll.dll
C:\WINNT\system32\RPCRT4.dll
Error: could not find cygboost_regex-gcc-mt-1_33_1.dll
Administrator@TESTBED01 /cygdrive/c/mydocs/scripts/cc/affx/boost
$ cp ../cygboost_regex-gcc-mt-1_33_1.dll .
Administrator@TESTBED01 /cygdrive/c/mydocs/scripts/cc/affx/boost
$ ./a
Aborted (core dumped)
Administrator@TESTBED01 /cygdrive/c/mydocs/scripts/cc/affx/boost
$
From: "John Maddock"
Reply-To: boost-users@lists.boost.org
Subject: Re: [Boost-users] question on boost::regex expression(query);
Date: Wed, 3 Oct 2007 09:46:35 +0100
Mike Marchywka wrote:
> Thanks. If I have to guess at this point, I think it is a quirk of my cygwin install or version- it is also possible I have intermixed boost files but still a bit confusing. AFAIK, everything is working right now except the try/catch so maybe I can let it go until I can safely upgrade my tools.
> $ g++ -I. t.cpp -L./lib -lboost_regex-gcc-mt
What Boost version are you #including: if you want to link with the Cygwin binaries then it should be /usr/include/boost-1_33_1.
> Administrator@TESTBED01 /cygdrive/c/mydocs/scripts/cc/affx/boost
> $ cygcheck a.exe
> Found: c:\mydocs\scripts\cc\affx\boost\a.exe
> c:/mydocs/scripts/cc/affx/boost/a.exe
> Error: could not find cygboost_regex-gcc-mt-1_33_1.dll
> C:\WINNT\cygwin1.dll
> C:\WINNT\system32\ADVAPI32.DLL
> C:\WINNT\system32\KERNEL32.dll
> C:\WINNT\system32\ntdll.dll
> C:\WINNT\system32\RPCRT4.dll
> Error: could not find cygboost_regex-gcc-mt-1_33_1.dll
For me that dll is in /bin/ in the cygwin install.
HTH, John.
Thanks. Do you have the 3.3.3 g++? I'll try to upgrade my tools when I can as there have been a few bug fixes etc but I don't want to risk making a big mess right now. I downloaded the most recent boost packages on my machine but didn't install them, just grabbed what I thought I needed.
> What Boost version are you #including: if you want to link with the Cygwin binaries then it should be /usr/include/boost-1_33_1.
$ more boost/version.hpp
//  Boost version.hpp configuration header file  ------------------------------//
//  (C) Copyright John maddock 1999. Distributed under the Boost
//  Software License, Version 1.0. (See accompanying file
//  LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
//  See http://www.boost.org/libs/config for documentation
#ifndef BOOST_VERSION_HPP
#define BOOST_VERSION_HPP
//
//  Caution, this is the only boost header that is guarenteed
//  to change with every boost release, including this header
//  will cause a recompile every time a new boost version is
//  released.
//
//  BOOST_VERSION % 100 is the sub-minor version
//  BOOST_VERSION / 100 % 1000 is the minor version
//  BOOST_VERSION / 100000 is the major version
#define BOOST_VERSION 103301
$ ld -version
GNU ld version 2.16.90 20050520
Copyright 2005 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of the GNU General Public License. This program has absolutely no warranty.
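The encoding described in the version.hpp comments can be checked with a few lines of arithmetic. The helper name decode_boost_version is hypothetical; only the three formulas come from the header listing.

```cpp
#include <sstream>
#include <string>

// Splits a BOOST_VERSION-style integer into "major.minor.subminor"
// using the formulas quoted in the version.hpp comments.
std::string decode_boost_version(int v)
{
    const int major_part    = v / 100000;      // BOOST_VERSION / 100000
    const int minor_part    = v / 100 % 1000;  // BOOST_VERSION / 100 % 1000
    const int subminor_part = v % 100;         // BOOST_VERSION % 100

    std::ostringstream out;
    out << major_part << '.' << minor_part << '.' << subminor_part;
    return out.str();
}
```

With the value from the listing, decode_boost_version(103301) gives 1.33.1, which agrees with the 1_33_1 suffix on the library names.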
From: "John Maddock"
Reply-To: boost-users@lists.boost.org
Subject: Re: [Boost-users] question on boost::regex expression(query);
Date: Wed, 3 Oct 2007 10:27:22 +0100
Participants (2): John Maddock, Mike Marchywka