[regex] ECMAScript conformance
[boost 1.34.0, with last update to regex sources on 28th February 2007]

Hello,

first of all I must say that I am very happy with boost.regex, it's fantastic. There are, though, a few inconsistencies with the ECMAScript spec.

1. It seems that re-matching an atom does not first clear its sub-matches, as "step 4" on page 135 of ECMA 262 3rd edition requires. That is, boost.regex does what is in the 'and not' part here:

/(z)((a+)?(b+)?(c))*/.exec("zaacbbbcac")

which returns the array

["zaacbbbcac", "z", "ac", "a", undefined, "c"]

and not

["zaacbbbcac", "z", "ac", "a", "bbb", "c"]

2. The {n,m} notation should not accept spaces in ECMAScript mode (though it's practical, it's not conformant). I readily admit that it's not really important.

Best regards
Armel
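For reference, the expected result in item 1 can be checked in any JavaScript engine. The following is only a minimal sketch, assuming Node.js or a browser console; the variable names are illustrative and not part of the original report.

```javascript
// Pattern and input are the ECMA-262 example quoted in item 1.
const pattern = /(z)((a+)?(b+)?(c))*/;
const result = pattern.exec("zaacbbbcac");

// Copy into a plain array to drop the extra `index` and `input` properties.
const captures = Array.from(result);

// The outer group iterates three times: "aac", "bbbc", "ac".
// Group 4, (b+)?, matched "bbb" in the second iteration but did not take part
// in the third, so an engine that clears sub-matches on each iteration
// reports it as undefined rather than the stale "bbb".
console.log(captures);
```

An implementation that keeps the capture from the earlier iteration would show "bbb" at index 4 instead.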
Armel Asselin wrote:
> There are, though, a few inconsistencies with the ECMAScript spec.
> 1. It seems that re-matching an atom does not first clear its sub-matches, as "step 4" on page 135 of ECMA 262 3rd edition requires.
> That is, boost.regex does what is in the 'and not' part here:
> /(z)((a+)?(b+)?(c))*/.exec("zaacbbbcac")
> which returns the array ["zaacbbbcac", "z", "ac", "a", undefined, "c"]
> and not ["zaacbbbcac", "z", "ac", "a", "bbb", "c"]
I'll look into it.
> 2. The {n,m} notation should not accept spaces in ECMAScript mode (though it's practical, it's not conformant). I readily admit that it's not really important.
I regard that as a compatible extension :-) I do remember that allowing spaces was a deliberate policy, but whether for compatibility with Perl or POSIX I don't recall.

Thanks for the feedback,
John.
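As a point of comparison for item 2, here is a small sketch of how a JavaScript engine handles a space inside the braces. It assumes a current engine such as Node.js, which follows the web-compatibility grammar of later ECMA-262 editions rather than the 3rd edition discussed in this thread, so it illustrates today's behaviour and is not a statement about boost.regex. The variable names are illustrative.

```javascript
// Without a space, {1,2} is an ordinary bounded quantifier.
const strictOk = new RegExp("^a{1,2}$").test("aa");

// With a space, the braces are not treated as a quantifier at all:
// in the default (non-unicode) mode the text "{1, 2}" is matched literally.
const spaced = new RegExp("^a{1, 2}$");
const spacedMatchesRepeat = spaced.test("aa");
const spacedMatchesLiteral = spaced.test("a{1, 2}");

// With the "u" flag the same pattern is rejected as a syntax error.
let unicodeModeRejects = false;
try {
  new RegExp("a{1, 2}", "u");
} catch (e) {
  unicodeModeRejects = e instanceof SyntaxError;
}

console.log({ strictOk, spacedMatchesRepeat, spacedMatchesLiteral, unicodeModeRejects });
```

In neither mode does the spaced form act as a repeat count, which is the sense in which accepting it is an extension.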
John Maddock wrote:
> Armel Asselin wrote:
>> There are, though, a few inconsistencies with the ECMAScript spec.
>> 1. It seems that re-matching an atom does not first clear its sub-matches, as "step 4" on page 135 of ECMA 262 3rd edition requires.
>> That is, boost.regex does what is in the 'and not' part here:
>> /(z)((a+)?(b+)?(c))*/.exec("zaacbbbcac")
>> which returns the array ["zaacbbbcac", "z", "ac", "a", undefined, "c"]
>> and not ["zaacbbbcac", "z", "ac", "a", "bbb", "c"]
> I'll look into it.
You might give xpressive (new in 1.34) a shot. It gets this one right.

--
Eric Niebler
Boost Consulting
www.boost-consulting.com
participants (3)
- Armel Asselin
- Eric Niebler
- John Maddock