Problems matching parentheses in a string using boost::regex 1.42.0 on an amd64 Debian system
Hi,

I'm fairly new to the Boost libraries. I searched the documentation and googled for an answer to my problem but couldn't come up with anything. Please bear with me.

I am using boost::regex (version 1.42.0 on a Debian amd64 machine) to parse a VHDL-like file. To this end, I am trying to use the regular expression

    boost::regex("^SIGNAL ([A-Z0-9_]*?): STD_LOGIC_VECTOR\( ([0-9]*?) DOWNTO ([0-9]*?)\);$");

to match strings like:

    SIGNAL W07: STD_LOGIC_VECTOR( 255 DOWNTO 0);

My problem is that matching those parentheses in the string doesn't work. I've tested the regexp under Perl and it works fine, but with boost::regex and C++ it works only if I remove the brackets from both the string and the regular expression.

Issues I observed & tested:

1) '\(' ... '\)': no match, and g++ complains about an unknown escape sequence '\)'.
1a) '\(' ... ')': no match. (I would have expected an exception to be thrown, but there was none whatsoever.)
2) '0x28' ... '0x29': no match. (0x28 = ASCII '(', 0x29 = ASCII ')')
3) '\Q(\E' ... '\Q)\E': no match, and g++ complains about an unknown escape sequence '\Q'.
4) If I try to match only one of the brackets and remove the other from the string & regex, I get one of the following.
   '\(' ... gives me:
       terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::regex_error> >'
       what(): Found a closing ) with no corresponding openening parenthesis. The error occured while parsing the regular expression fragment: '?([0-9]*?)>>>HERE>>>);$'.
   ... '\)' gives me:
       terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::regex_error> >'
       what(): Unmatched marking parenthesis ( or \(. The error occured while parsing the regular expression fragment: '[0-9]*?);$>>>HERE>>>'.
5) Even
       boost::regex("^SIGNAL W07: STD_LOGIC_VECTOR\( 255 DOWNTO 0\);$");
   fails to match
       SIGNAL W07: STD_LOGIC_VECTOR( 255 DOWNTO 0);

Points 1a and 4 suggest that boost::regex treats even escaped parentheses as special characters, which is quite surprising to me. I am sure a lot of people have successfully used boost::regex to match parentheses, so I do wonder which part of the documentation I have missed.

I'd be happy to provide more debugging information if you tell me how to produce it.

Many thanks,
Simon Hoerder
On 8/22/2011 11:56 AM, Simon Hoerder wrote:
> [...]
> boost::regex("^SIGNAL ([A-Z0-9_]*?): STD_LOGIC_VECTOR\( ([0-9]*?) DOWNTO ([0-9]*?)\);$");
> to match strings like:
> SIGNAL W07: STD_LOGIC_VECTOR( 255 DOWNTO 0);
> [...]
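You need to escape the backslash character for normal C++ processing, i.e. double it. A single backslash in a string literal is consumed by the compiler (hence the g++ "unknown escape sequence" warning), so the regex engine only ever sees a bare parenthesis. It should be "\\(" and "\\)" to catch a parenthesis.

-- Bill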
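For readers who want to try the doubled-backslash fix, here is a minimal sketch rather than a definitive implementation. It uses std::regex from the C++11 standard library as a stand-in so that it builds without Boost. The assumption is that boost::regex, boost::smatch and boost::regex_match can be substituted, since the standard interface was modelled on Boost.Regex. The names SignalDecl, parse_signal, kEscapedPattern and kRawPattern are hypothetical and exist only for this illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical holder for the three captured fields of one SIGNAL line.
struct SignalDecl {
    std::string name;  // e.g. "W07"
    std::string high;  // e.g. "255"
    std::string low;   // e.g. "0"
};

// Doubled backslashes: the compiler turns "\\(" into the two characters \(
// so the regex engine receives an escaped (literal) parenthesis.
const char* const kEscapedPattern =
    "^SIGNAL ([A-Z0-9_]*?): STD_LOGIC_VECTOR\\( ([0-9]*?) DOWNTO ([0-9]*?)\\);$";

// The same pattern as a C++11 raw string literal, where backslashes are
// passed through untouched (only usable if the compiler supports C++11).
const char* const kRawPattern =
    R"(^SIGNAL ([A-Z0-9_]*?): STD_LOGIC_VECTOR\( ([0-9]*?) DOWNTO ([0-9]*?)\);$)";

// Returns true and fills `out` if `line` is a SIGNAL declaration of the
// form shown in the question; returns false otherwise.
bool parse_signal(const std::string& line, SignalDecl& out) {
    static const std::regex pattern(kEscapedPattern);
    std::smatch m;
    if (!std::regex_match(line, m, pattern)) {
        return false;
    }
    out.name = m[1].str();
    out.high = m[2].str();
    out.low = m[3].str();
    return true;
}
```

Calling parse_signal with the line from the question, "SIGNAL W07: STD_LOGIC_VECTOR( 255 DOWNTO 0);", should fill the three fields with W07, 255 and 0. The raw string form avoids the double escaping entirely, at the cost of requiring a newer compiler than the one that shipped with Debian in 2011.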
On 22/08/11 18:11, Bill Buklis wrote:
> [...]
> It should be "\\(" and "\\)" to catch a parenthesis.

Works now, thanks. :-)

Cheers,
Simon
Participants (2): Bill Buklis, Simon Hoerder