Regex Documentation Issue (and two pings)

I was just bitten by an extremely nasty bug, caused in part by my own foolishness, but also by a subtlety of Boost Regex that probably should be documented explicitly.

http://boost.org/libs/regex/doc/format_syntax.html gives a list of "The following Perl like expressions" and says that $N "Expands to the text that matched sub-expression N." HOWEVER, Boost Regex is sufficiently general that $10 will match the tenth subexpression, $25 the twenty fifth, and so forth. Perl is not that general - it recognizes only $1 through $9 here. I believed that Boost behaved like Perl. I concatenated "$1" with a random hexadecimal string, occasionally producing "$10", "$15", etc. before the letter hexits appeared. This led to many hours of misery.

The documentation should note Boost Regex's generality. The generality is a good thing - $1 can be separated from later digits with empty () parentheses - but it needs to be made clear.

Does this also affect the regex standardization proposal? It'd be bad if some implementers thought they only had to support $1 - $9.

On an unrelated note, the following issues have not yet been addressed:

Lambda Doc Bug, http://lists.boost.org/MailArchives/boost/msg64450.php
Boost Array Documentation, http://lists.boost.org/MailArchives/boost/msg64502.php

Stephan T. Lavavej
http://nuwen.net
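For anyone who wants to see the pitfall in isolation, here is a minimal sketch. It assumes std::regex with its default ECMAScript-style format rules rather than Boost Regex itself, and the pattern and both function names (tag_via_format, tag_via_match) are made up for illustration. The pattern has ten groups so that "$10" names a real sub-expression.

```cpp
#include <regex>
#include <string>

// Hypothetical pattern with ten capture groups, so that "$10" names a real group.
static const std::regex kTenGroups("(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)");

// Risky: the suffix is pasted straight after "$1", so a suffix that starts
// with a digit changes which group the format string refers to.
std::string tag_via_format(const std::string& input, const std::string& suffix) {
    return std::regex_replace(input, kTenGroups, "$1" + suffix);
}

// Safer: take capture 1 from the match object and concatenate in C++,
// so the suffix never passes through the format-string parser.
std::string tag_via_match(const std::string& input, const std::string& suffix) {
    std::smatch m;
    if (!std::regex_match(input, m, kTenGroups)) {
        return input;  // no match: leave the input unchanged
    }
    return m[1].str() + suffix;
}
```

With the input "abcdefghij", a suffix such as "f0" behaves as intended in both functions, while a suffix such as "0f" makes tag_via_format expand group 10 instead of group 1. tag_via_match is not affected by what the suffix starts with.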

Will do.
Does this also affect the regex standardization proposal? It'd be bad if some implementers thought they only had to support $1 - $9.
Yep, the proposal uses the ECMA standard by reference - and that allows for $n or $nn to be recognised as referring to sub-expressions, so you can access $99, but $100 is really ${10}0.

John.
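A small sketch of the two-digit rule, assuming std::regex's default ECMAScript-style format handling as a stand-in for the proposal (this is not Boost Regex, and the helper name expand and its fixed ten-group pattern are hypothetical):

```cpp
#include <regex>
#include <string>

// Hypothetical helper: matches "abcdefghij" against ten single-letter
// groups and returns the result of applying the format string `fmt`.
std::string expand(const std::string& fmt) {
    static const std::regex ten_groups("(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)");
    return std::regex_replace(std::string("abcdefghij"), ten_groups, fmt);
}
```

Under that assumption, expand("$100") reads at most two digits after the dollar sign, so it expands group 10 and then emits a literal 0. Writing the reference with a leading zero, as in expand("$010"), selects group 1 followed by a literal 0.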
participants (2): John Maddock, Stephan T. Lavavej