[regex] begin-/end-line need to be customizable via traits

I think there is a problem in the regex standardization proposal regarding the begin- and end-of-line assertions (^ and $). It seems that there is no way to customize their behavior via the traits. This is inconsistent with the word boundary assertion, which is implementable in terms of the word character class (\w). The author of a traits class should be able to specify which characters are line separators.

A simple fix would be to add a character class for line separator characters. Then, ^ and $ could be implemented in terms of lookup_classname and isctype, just as the word boundary assertion is.

This leaves out an important corner case, though: \r is a line separator only if it is not immediately followed by a \n. I haven't yet come up with a traits interface that is clean enough and general enough to satisfy me. I'm open to suggestions.

--
Eric Niebler
Boost Consulting
www.boost-consulting.com
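To make the suggested fix concrete, here is a minimal sketch of what it might look like. It assumes a hypothetical traits-like class, sketch_traits, and a hypothetical class name "linesep"; only the member names lookup_classname and isctype come from the proposal. It is an illustration of the idea, not the proposal's interface.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical traits-like class. Only the member names lookup_classname and
// isctype come from the proposal; everything else here is an assumption.
struct sketch_traits {
    typedef char char_type;
    typedef unsigned char_class_type;

    enum : char_class_type {
        word_class = 1u << 0,     // stands in for \w
        line_sep_class = 1u << 1  // assumed new class for line separators
    };

    template <class It>
    char_class_type lookup_classname(It first, It last) const {
        const std::string name(first, last);
        if (name == "w") return word_class;
        if (name == "linesep") return line_sep_class;  // hypothetical name
        return 0;
    }

    bool isctype(char_type c, char_class_type cls) const {
        const unsigned char uc = static_cast<unsigned char>(c);
        if ((cls & word_class) && (std::isalnum(uc) || c == '_')) return true;
        if ((cls & line_sep_class) && (c == '\n' || c == '\r')) return true;
        return false;
    }
};

template <class Traits>
typename Traits::char_class_type lookup(const Traits& t, const std::string& name) {
    return t.lookup_classname(name.begin(), name.end());
}

// Word boundary: exactly one of the two neighbouring characters is in \w.
template <class Traits>
bool at_word_boundary(const Traits& t, const std::string& s, std::size_t pos) {
    const typename Traits::char_class_type w = lookup(t, "w");
    const bool before = pos > 0 && t.isctype(s[pos - 1], w);
    const bool after = pos < s.size() && t.isctype(s[pos], w);
    return before != after;
}

// ^ : start of input, or the previous character is a line separator.
template <class Traits>
bool at_line_begin(const Traits& t, const std::string& s, std::size_t pos) {
    return pos == 0 || t.isctype(s[pos - 1], lookup(t, "linesep"));
}

// $ : end of input, or the next character is a line separator.
template <class Traits>
bool at_line_end(const Traits& t, const std::string& s, std::size_t pos) {
    return pos == s.size() || t.isctype(s[pos], lookup(t, "linesep"));
}

// Small self-check; call it from main() to exercise the sketch.
inline void check_sketch() {
    sketch_traits t;
    const std::string s = "a\r\nb";
    assert(at_line_begin(t, s, 0));
    assert(!at_line_begin(t, s, 1));
    assert(at_line_begin(t, s, 3));
    assert(at_line_end(t, s, 1));
    assert(at_word_boundary(t, "ab cd", 2));
    // The corner case: a single-character class cannot see that this
    // position sits inside a \r\n pair, so it is wrongly reported as ^.
    assert(at_line_begin(t, s, 2));
}
```

The last assertion shows why a per-character class alone is not enough: the position between \r and \n is reported as a line start, because isctype only ever sees one character at a time.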

> I think there is a problem in the regex standardization proposal regarding the begin- and end-of-line assertions (^ and $). It seems that there is no way to customize their behavior via the traits. This is inconsistent with the word boundary assertion, which is implementable in terms of the word character class (\w). The author of a traits class should be able to specify which characters are line separators.

Understood.

> A simple fix would be to add a character class for line separator characters. Then, ^ and $ could be implemented in terms of lookup_classname and isctype, just as the word boundary assertion is.
>
> This leaves out an important corner case, though: \r is a line separator only if it is not immediately followed by a \n. I haven't yet come up with a traits interface that is clean enough and general enough to satisfy me. I'm open to suggestions.

The corner case you mention is too important to leave out, IMO. In Boost-1.33, line boundaries will follow the Unicode recommendations:

http://www.unicode.org/reports/tr18/#Line_Boundaries

Are there any situations that this does not handle?

BTW, it's possible to go on adding "customisation points" to the traits class almost indefinitely, but you have to draw the line somewhere. At the moment I'm not sure which side of the line this one falls on :-)

John.
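For reference, a line-boundary check along the lines of the Unicode recommendation could be sketched roughly as below. The set of terminators (LF, VT, FF, CR, NEL, LS, PS, with CR LF treated as a single unit) is my reading of the linked report and should be checked against it. The function names are hypothetical and this is not the Boost implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumed terminator set: U+000A..U+000D (LF, VT, FF, CR), U+0085 (NEL),
// U+2028 (LINE SEPARATOR), U+2029 (PARAGRAPH SEPARATOR).
inline bool is_line_terminator(char32_t c) {
    return (c >= 0x0A && c <= 0x0D) || c == 0x85 || c == 0x2028 || c == 0x2029;
}

// ^ : start of input, or just after a terminator, but never between CR and LF.
inline bool at_line_begin(const std::u32string& s, std::size_t pos) {
    if (pos == 0) return true;
    if (!is_line_terminator(s[pos - 1])) return false;
    return !(s[pos - 1] == 0x0D && pos < s.size() && s[pos] == 0x0A);
}

// $ : end of input, or just before a terminator, but never between CR and LF.
inline bool at_line_end(const std::u32string& s, std::size_t pos) {
    if (pos == s.size()) return true;
    if (!is_line_terminator(s[pos])) return false;
    return !(s[pos] == 0x0A && pos > 0 && s[pos - 1] == 0x0D);
}

// Small self-check; call it from main() to exercise the sketch.
inline void check_line_boundaries() {
    const std::u32string crlf = U"a\r\nb";
    assert(at_line_end(crlf, 1));     // before the CR LF pair
    assert(!at_line_end(crlf, 2));    // inside the pair: no match
    assert(!at_line_begin(crlf, 2));  // inside the pair: no match
    assert(at_line_begin(crlf, 3));   // after the pair
    const std::u32string cr = U"a\rb";
    assert(at_line_begin(cr, 2));     // a lone CR still ends a line
}
```

The two-character lookaround on CR LF is the part that does not fit a single-character isctype test, which is the corner case raised in the original message.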
participants (2)
- Eric Niebler
- John Maddock