How do I use regex to match two consecutive newlines?
I've tried matching like this:

    std::string input = filecontents;
    boost::regex re_paragraphs("\\n\\n");
    boost::sregex_token_iterator input_token_it(input.begin(), input.end(), re_paragraphs, -1);
    boost::sregex_token_iterator end_token_it;
    for ( ; input_token_it != end_token_it; ++input_token_it) {
        output += "TOKEN FOUND";
    }
    return output;

But this never works. It does match when I use part of a word as the token; the problem is that it just isn't matching the newlines. I've looked at the input in hex and there are indeed two consecutive newline characters.

I think I'm missing something obvious. Anyone have any idea?
Tommy Li wrote:
By the way, I can't even match one newline char with "\\n" though they are obviously there.
John Maddock wrote:
Tommy Li wrote:
I think I'm missing something obvious. Anyone have any idea? By the way, I can't even match one newline char with "\\n" though they are obviously there.
Well here's a sample program that obviously shows it does work :-)
    #include <boost/regex.hpp>
    #include <iostream>
    #include <string>

    int main(int, char**)
    {
        boost::regex e("\\n");
        std::string s("one\ntwo\nthree");
        // -1 selects the text between matches, i.e. the fields separated by newlines
        boost::sregex_token_iterator i(s.begin(), s.end(), e, -1), j;
        while (i != j)
        {
            std::cout << "<" << *i << ">" << std::endl;
            ++i;
        }
    }

which outputs:

    <one>
    <two>
    <three>
Just as expected.
John.
Great. Thanks a ton. The problem I was having was that I was using getline to read from stdin, which stripped the newlines (duh). Changing it to

    while(!cin.eof()) input.append(1, cin.get());

fixed it.

By the way, do you have a better way of reading the entire stream, including newlines, into a string?
How's this work for you:

    void f()
    {
        // ...
        int const buff_size = 4096;
        char buff[buff_size];
        string str;
        while( !cin.eof() )
        {
            cin.read(buff, buff_size);
            str.append(buff, cin.gcount());
        }
        // ...
    }

HTH,
Pablo
If you grep the Boost.Regex examples for "file" you'll find some possible examples. John.
Tommy Li wrote:
I've tried matching like this:
    std::string input = filecontents;
    boost::regex re_paragraphs("\\n\\n");
    boost::sregex_token_iterator input_token_it(input.begin(), input.end(), re_paragraphs, -1);
    boost::sregex_token_iterator end_token_it;
    for ( ; input_token_it != end_token_it; ++input_token_it) {
        output += "TOKEN FOUND";
    }
    return output;
But this never works. It matches when I use a part of a word as a token. The problem is that it's just not matching the newlines. I've looked into the input's hex and there are indeed two newline chars consecutively.
I think I'm missing something obvious. Anyone have any idea?
That certainly should work, let me have a test case if you're really convinced that it doesn't. John.
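A small test case along these lines might look like the sketch below. It uses std::regex from the C++11 standard library in place of Boost.Regex so that it builds without Boost; the token iterator is used the same way as in the code above. The function name split_paragraphs is hypothetical:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` wherever two newlines occur in a row.
// Submatch index -1 asks the token iterator for the text between matches.
std::vector<std::string> split_paragraphs(const std::string& text) {
    static const std::regex blank_line("\\n\\n");
    std::vector<std::string> paragraphs;
    std::sregex_token_iterator it(text.begin(), text.end(), blank_line, -1);
    std::sregex_token_iterator end;
    for (; it != end; ++it) {
        paragraphs.push_back(it->str());
    }
    return paragraphs;
}
```

If the input was assembled with getline and no newlines were put back, the pattern never matches and the whole text comes back as a single token.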
participants (3): John Maddock, Pablo Aguilar, Tommy Li