[Regex] This pattern compiles fine for Perl and PCRE but not for Boost Regex
Boost Regex fails to compile a regular expression such as "{abc}" with the error "Invalid content of repeat range". However, both Perl and PCRE accept the regular expression, treating the "{" as a literal character.

The PCRE man page at http://www.pcre.org/pcre.txt clearly states:

    An opening curly bracket that appears in a position where a quantifier is not allowed, or one that does not match the syntax of a quantifier, is taken as a literal character.

The Perl regular expression man page at http://perldoc.perl.org/perlre.html is similar, though not quite as clear:

    If a curly bracket occurs in any other context, it is treated as a regular character. In particular, the lower bound is not optional.

Nevertheless, the behaviour of PCRE and Perl is the same.

I am currently using Boost 1.39, but I have seen no mention of a change in this area for more recent versions of Boost Regex.

And finally ... my question ...

Is there a way in Boost Regex to interpret a "{" as a literal character in the above context in a Perl regular expression, while still allowing the "{n,m}" bounded repeat in places where it is syntactically valid?

Thanks.
On 10/18/2010 1:23 PM, David Dawe wrote:
Boost Regex fails to compile a regular expression such as "{abc}" with the error
"Invalid content of repeat range". However, both Perl and PCRE accept the regular
expression, treating the "{" as a literal character.
The PCRE man page at http://www.pcre.org/pcre.txt clearly states:
An opening curly bracket that appears in a position where a quantifier
is not allowed, or one that does not match the syntax of a quantifier,
is taken as a literal character.
The Perl regular expression man page at http://perldoc.perl.org/perlre.html
is similar, though not quite as clear:
If a curly bracket occurs in any other context, it is treated as a regular
character. In particular, the lower bound is not optional.
Nevertheless, the behaviour of PCRE and Perl is the same.
I am currently using Boost 1.39, but I have seen no mention of a change in
this area for more recent versions of Boost Regex.
And finally … my question …
Is there a way in Boost Regex to interpret a "{" as a literal character in
the above context in a Perl regular expression, while still allowing the
"{n,m}" bounded repeat in places where it is syntactically valid?
Use the backslash escape character, as in "\{" to match "{" as a literal character.
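For illustration, here is a minimal sketch of the escaping approach. It uses std::regex from the C++ standard library as a stand-in so that it has no external dependencies; this is an assumption made for the example, not what the thread used. boost::regex and boost::regex_match are expected to accept the same escaped pattern. The function name matches_braced_run is hypothetical. Note that inside a C++ string literal the backslash itself must be doubled, or a raw string literal used as below.

```cpp
#include <regex>
#include <string>

// The outer braces are escaped, so they match literal '{' and '}'.
// The unescaped {2,3} after 'a' is still parsed as a bounded repeat.
// Stand-in: std::regex (ECMAScript syntax) rather than boost::regex.
bool matches_braced_run(const std::string& text) {
    static const std::regex re(R"(\{a{2,3}\})");
    return std::regex_match(text, re);
}
```

With this pattern, "{aa}" and "{aaa}" match as whole strings, while "{a}" and "aa" do not.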
On 10/18/2010 10:23 AM, David Dawe wrote:
Boost Regex fails to compile a regular expression such as "{abc}" with the error "Invalid content of repeat range". However, both Perl and PCRE accept the regular expression, treating the "{" as a literal character.
FYI, xpressive (the other regex engine in boost) compiles this pattern just fine.

--
Eric Niebler
BoostPro Computing
http://www.boostpro.com
Is there a way in Boost Regex to interpret a "{" as a literal character in the above context in a Perl regular expression, while still allowing the "{n,m}" bounded repeat in places where it is syntactically valid?
Yep, just escape the { and } and use \{abc\}, which will work the same in Perl/PCRE and Boost.

I hadn't realised Perl worked like that. Can you file a bug report at svn.boost.org so I don't forget this?

Thanks, John.
On 10/19/2010 5:48 AM, John Maddock wrote:
Is there a way in Boost Regex to interpret a "{" as a literal character in the above context in a Perl regular expression, while still allowing the "{n,m}" bounded repeat in places where it is syntactically valid?
Yep, just escape the { and } and use \{abc\} which will work the same in Perl/PCRE and Boost.
I had already been using \{ and \} to temporarily work around the difference with PCRE (and Perl) but was hoping to use Boost Regex as a drop-in replacement. This is because some of the regular expressions are specified outside the software, in existing configuration files.
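When the patterns come from external configuration files, one possible workaround is to rewrite each pattern before handing it to the regex engine, escaping only those braces that do not form a well-formed "{n}", "{n,}" or "{n,m}" quantifier. The following is a rough sketch under stated assumptions, not part of Boost and not the approach anyone in this thread used. The names quantifier_length and escape_literal_braces are hypothetical. It deliberately ignores constructs such as \p{...} and \x{...}, and it does not check whether a quantifier is actually allowed at that position (for example at the very start of the pattern).

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Returns the length of a quantifier "{n}", "{n,}" or "{n,m}" that starts at
// pattern[pos] (which must be '{'), or 0 if the text there is not one.
std::size_t quantifier_length(const std::string& pattern, std::size_t pos) {
    std::size_t i = pos + 1;  // skip the '{'
    const std::size_t digits_start = i;
    while (i < pattern.size() &&
           std::isdigit(static_cast<unsigned char>(pattern[i]))) {
        ++i;
    }
    if (i == digits_start) return 0;  // the lower bound is mandatory
    if (i < pattern.size() && pattern[i] == ',') {
        ++i;
        while (i < pattern.size() &&
               std::isdigit(static_cast<unsigned char>(pattern[i]))) {
            ++i;  // optional upper bound
        }
    }
    if (i < pattern.size() && pattern[i] == '}') return i - pos + 1;
    return 0;
}

// Escapes every '{' and '}' that is not part of a well-formed quantifier.
// Existing backslash escapes are copied through unchanged.
std::string escape_literal_braces(const std::string& pattern) {
    std::string out;
    out.reserve(pattern.size() + 8);
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        const char c = pattern[i];
        if (c == '\\' && i + 1 < pattern.size()) {
            out += c;
            out += pattern[++i];  // keep the escaped character as-is
        } else if (c == '{') {
            const std::size_t len = quantifier_length(pattern, i);
            if (len != 0) {
                out.append(pattern, i, len);  // genuine quantifier: keep it
                i += len - 1;
            } else {
                out += "\\{";
            }
        } else if (c == '}') {
            out += "\\}";  // a '}' reached here closes no quantifier
        } else {
            out += c;
        }
    }
    return out;
}
```

For example, escape_literal_braces("{abc}") yields \{abc\}, while "a{2,3}" and an already escaped \{abc\} pass through unchanged. Braces inside a character class such as [{}] are also escaped, which should be harmless in Perl-style syntax but is worth checking against the engine in use.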
I hadn't realised Perl worked like that, can you file a bug report at svn.boost.org so I don't forget this?
Certainly. Will do.

As suggested by Eric Niebler, I will also try out Boost Xpressive.

Thanks everyone for your timely responses.

DD
participants (4)
- David Dawe
- Edward Diener
- Eric Niebler
- John Maddock