regexp: New problem with trailing - in character class
Hi,

I've been using Boost regexp library forever, and it's great. But I just grabbed 1.33.0 (trying to get it to compile on 64-bit Windows), and I'm getting an error I never got before: it refuses to allow this regular expression:

    [a-z-]

which is supposed to mean "a lower case letter or a dash". Previous versions of Boost regexp allowed this, and grep allows it, and version 1.33.0 does allow this one:

    [-a-z]

Is this a deliberate restriction? It's causing me lots of problems, because my application has a lot of existing regular expressions which use this syntax.

This occurs both on x64 with VC8, and on a pretty standard (and old) 32-bit Linux system.

I've hacked around it horribly for now by adding this to the top of regcompA (I use the A posix interface exclusively):

    // HACK by GMF to fix problem that [a-z-] is not accepted as valid, but [-a-z] is.
    // Fix it by moving the dash.
    char *p = (char*) ptr;
    while (*p) {
      printf("p: %c\n", *p);
      if (*p == '\\')
        p++;
      else if (*p == '[') {
        char *q = p+1;
        while (*q && (*q != ']')) {
          printf("*q: %c\n", *q);
          q++;
        }
        if (*q == ']') {
          q--;
          if (*q == '-') {
            memmove(p+2, p+1, (q-p)-1);
            *(p+1) = '-';
            p = q;
          }
        }
      }
      p++;
    }

but that's very nasty, and probably doesn't work properly anyway, and I'd sure like to get that out of my production code.

Help!

Greg
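For anyone who needs a stopgap of the same kind, here is a more cautious sketch of the dash-moving idea. It is not part of Boost; `move_trailing_dash` is a hypothetical helper name. It assumes POSIX bracket-expression rules (a `]` right after `[` or `[^` is a literal, backslash is not special inside a set, and `[:class:]`, `[.coll.]` and `[=equiv=]` may appear inside a set). Unlike the hack above, it works on a copy instead of writing through a cast pointer, keeps a leading `^` in front of the moved dash, and leaves a set alone when moving the dash could change its meaning.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of Boost): returns a malloc'd copy of
 * `pattern` in which every bracket expression ending in "-]" has that dash
 * moved to the front, e.g. "[a-z-]" becomes "[-a-z]". Returns NULL if the
 * allocation fails. The caller frees the result. */
char *move_trailing_dash(const char *pattern)
{
    size_t len = strlen(pattern);
    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, pattern, len + 1);

    size_t i = 0;
    while (i < len) {
        if (out[i] == '\\') {            /* escaped character outside a set */
            i += (i + 1 < len) ? 2 : 1;
            continue;
        }
        if (out[i] != '[') {
            i++;
            continue;
        }

        size_t first = i + 1;            /* index of the first set member */
        if (first < len && out[first] == '^')
            first++;                     /* the negation marker stays in front */

        size_t j = first;
        if (j < len && out[j] == ']')
            j++;                         /* a leading ']' is a literal member */

        /* Find the closing ']', skipping [:class:], [.coll.] and [=equiv=]. */
        while (j < len && out[j] != ']') {
            if (out[j] == '[' && j + 1 < len &&
                (out[j + 1] == ':' || out[j + 1] == '.' || out[j + 1] == '=')) {
                char kind = out[j + 1];
                j += 2;
                while (j + 1 < len && !(out[j] == kind && out[j + 1] == ']'))
                    j++;
                j += 2;                  /* step past the closing ":]" etc. */
            } else {
                j++;
            }
        }
        if (j >= len)
            break;                       /* unterminated set: leave the rest alone */

        size_t close = j;                /* index of the closing ']' */
        if (close - first >= 2 &&        /* something besides the dash */
            out[close - 1] == '-' &&     /* the set ends in a dash */
            out[first] != ']' &&         /* moving it would break a leading ']' */
            out[close - 2] != '-') {     /* could be a range ending at '-', e.g. "+--" */
            memmove(out + first + 1, out + first, close - 1 - first);
            out[first] = '-';
        }
        i = close + 1;
    }
    return out;
}
```

Under those assumptions, `move_trailing_dash("[^a-z-]")` gives `[^-a-z]`, whereas the original hack would put the dash before the `^` and change what the set matches. Once a release with the fix is available, a shim like this should be unnecessary.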
> I've been using Boost regexp library forever, and it's great. But I just grabbed 1.33.0 (trying to get it to compile on 64-bit Windows), and I'm getting an error I never got before: it refuses to allow this regular expression:
> [a-z-]
> which is supposed to mean "a lower case letter or a dash". Previous versions of Boost regexp allowed this, and grep allows it, and version 1.33.0 does allow this one:
> [-a-z]
> Is this a deliberate restriction? It's causing me lots of problems, because my application has a lot of existing regular expressions which use this syntax.
Confirmed as a bug, here's the patch going into cvs for 1.33.1:

Index: boost/regex/v4/basic_regex_parser.hpp
===================================================================
RCS file: /cvsroot/boost/boost/boost/regex/v4/basic_regex_parser.hpp,v
retrieving revision 1.9.2.4
diff -u -r1.9.2.4 basic_regex_parser.hpp
--- boost/regex/v4/basic_regex_parser.hpp	16 Oct 2005 18:12:58 -0000	1.9.2.4
+++ boost/regex/v4/basic_regex_parser.hpp	31 Oct 2005 11:07:09 -0000
@@ -1220,6 +1220,17 @@
             char_set.add_range(start_range, end_range);
             if(this->m_traits.syntax_type(*m_position) == regex_constants::syntax_dash)
             {
+               if(m_end == ++m_position)
+               {
+                  fail(regex_constants::error_brack, m_position - m_base);
+                  return;
+               }
+               if(this->m_traits.syntax_type(*m_position) == regex_constants::syntax_close_set)
+               {
+                  // trailing - :
+                  --m_position;
+                  return;
+               }
                fail(regex_constants::error_range, m_position - m_base);
                return;
             }

John.
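The decision the patch adds can be read as a three-way classification of a dash that directly follows a completed range. The following is a standalone C sketch of that rule only, not Boost's code: the enum and function names are hypothetical, and it compares against a literal `]` where the real parser asks its traits class for the syntax type.

```c
/* Hypothetical names for illustration. In the patch these three outcomes
 * correspond to fail(error_brack), a plain return, and fail(error_range). */
enum dash_after_range {
    DASH_UNTERMINATED_SET,   /* pattern ends right after the dash */
    DASH_TRAILING_LITERAL,   /* dash is the last member before ']' */
    DASH_INVALID_RANGE       /* dash tries to extend a finished range */
};

/* `dash` points at a '-' that immediately follows a completed range, such as
 * the second dash in "a-z-]"; `end` is one past the last pattern character. */
enum dash_after_range classify_dash_after_range(const char *dash, const char *end)
{
    const char *next = dash + 1;
    if (next == end)
        return DASH_UNTERMINATED_SET;
    if (*next == ']')
        return DASH_TRAILING_LITERAL;
    return DASH_INVALID_RANGE;
}
```

With this rule, the second dash in `a-z-]` is a trailing literal, while the one in `a-z-9]` is still rejected as a bad range. In the trailing case the patch steps back onto the dash (`--m_position`) before returning, presumably so that the enclosing set-parsing code picks it up again as an ordinary literal.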
participants (2)
- Greg Ferrar
- John Maddock