[Regex] Is it possible to match any "linebreak"?
Is there a way to have a regex like [[:digit:]]{3}([^\n]+)\n? in which \n matches any line-breaking character? (The processed files might have Unix or DOS line endings.)

I know I could change the regex to [[:digit:]]{3}([^\n\r]+)[\n\r]*, but I was hoping there might be an easier solution, such as a character class like [[:linebreak:]] or an escape like \R.

(I am using Boost 1.43 in the relevant project.)

Thank you and best regards,
Christoph
Use \R, see:
http://www.boost.org/doc/libs/1_49_0/libs/regex/doc/html/boost_regex/syntax/...
HTH, John.
Thank you John.
Somehow I must have overlooked \R in the docs.
However, I still have one issue with \R.
In the following code everything works fine if I use r2.
If I use r1, the captures of the matches still contain the newlines.
In short: ([^\\R+]) does not seem to capture all non-linebreak characters.
Should it? Or is this just a misunderstanding on my part?
#include
No, you can't do that: \R isn't a character class, and what it matches is much more complex than that. Look at the link above to see what it maps to. Something like [^\x0A-\x0D\x85\x{2028}\x{2029}] would be closer to the inverse.
HTH, John.
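The explicit-class approach suggested above can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard library's std::regex as a stand-in for Boost.Regex (std::regex has no \R), the helper name capture_after_digits is hypothetical, and the class [^\r\n] covers only CR and LF rather than the wider set of line-break characters listed above.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): collects the text captured after
// each three-digit prefix, with the line terminator left out of the capture.
std::vector<std::string> capture_after_digits(const std::string& text) {
    // [^\r\n]+ is an explicit negated class standing in for "anything but a
    // line break" (CR and LF only, an assumption of this sketch).
    // (?:\r\n|\r|\n)? consumes at most one DOS, bare-CR, or Unix line ending.
    static const std::regex re(R"([[:digit:]]{3}([^\r\n]+)(?:\r\n|\r|\n)?)");

    std::vector<std::string> captures;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it) {
        captures.push_back((*it)[1].str());  // sub-match 1 is the payload
    }
    return captures;
}
```

Calling capture_after_digits on input that mixes "\r\n" and "\n" endings should yield the payloads with no trailing terminator characters, because the line ending is matched outside the capture group.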
OK, thank you.
participants (2)
- Christoph Duelli
- John Maddock