
Hi all,

I am trying to use cregex_iterator, but reading the docs hasn't helped me understand how to use the API. Given a regular expression, does the following code correctly get the starting position of each occurrence of the string "hello world here" in the buffer?

Sample code:

    int nextPos[100];
    char *pBuf = NULL; // pBuf will point to a NUL-terminated buffer
    ............................
    regex expr("hello world here", regex_constants::perl | regex_constants::icase);
    cregex_iterator h1(pBuf, pBuf + std::strlen(pBuf), expr, 1), h2;
    int i = 0;
    while (i < 100 && h1 != h2) {
        nextPos[i] = (*h1).position(0);
        h1++;
        i++;
    }

Winson Yung wrote:
Hi all, I am trying to use cregex_iterator, but reading the docs hasn't helped me understand how to use the API. Given a regular expression, does the following code correctly get the starting position of each occurrence of the string "hello world here" in the buffer?
Sample code:

    int nextPos[100];
    char *pBuf = NULL; // pBuf will point to a NUL-terminated buffer
    ............................
    regex expr("hello world here", regex_constants::perl | regex_constants::icase);
    cregex_iterator h1(pBuf, pBuf + std::strlen(pBuf), expr, 1), h2;

Error here, at the final argument: the last arg to the iterator constructor is optional, and should be one or more of the match_flag enums: http://www.boost.org/libs/regex/doc/match_flag_type.html

John.
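
A minimal sketch of the point above, under one loud assumption: it uses the standard library's std::regex in place of Boost.Regex so that it builds without Boost. std::cregex_iterator takes the same (begin, end, regex, flags) arguments. The helper name find_positions is hypothetical and not part of either library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <regex>
#include <vector>

// Hypothetical helper: start offsets of every match of `expr` in a
// NUL-terminated buffer. std::regex stands in for boost::regex here.
std::vector<std::ptrdiff_t> find_positions(const char* buf, const std::regex& expr)
{
    assert(buf != nullptr);
    std::vector<std::ptrdiff_t> positions;

    // The fourth argument is optional. When given, it must be a match flag
    // (or an OR-ed combination of match flags), not a bare integer.
    std::cregex_iterator it(buf, buf + std::strlen(buf), expr,
                            std::regex_constants::match_default);
    std::cregex_iterator end; // default-constructed: end-of-sequence

    for (; it != end; ++it)
        positions.push_back(it->position(0)); // offset of the whole match
    return positions;
}
```

Leaving the fourth argument out entirely should behave the same way, since match_default is the default value.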

Thanks John, I was going to ask about this magic '1' number. I think I found an example somewhere that used it, but apparently it was wrong. I have been having a lot of difficulty using cregex_iterator: even without the last match flag, the same code asserts while walking through the iterator to get position(0). Here is the call trace from when the assert happened:

    _assert(void * 0x0067a74c `string', void * 0x0067a758 `string', unsigned int 253) line 256
    boost::shared_ptr<boost::regex_iterator_implementation<char const *,char,boost::regex_traits<char,boost::w32_regex_traits<char> > > >::operator->() line 253 + 31 bytes
    boost::regex_iterator<char const *,char,boost::regex_traits<char,boost::w32_regex_traits<char> > >::operator*() line 129 + 37 bytes
    CDownloadProgressDlg::ExtractReportKeyStats(unsigned char * 0x02470040, unsigned __int64 1292879, unsigned char 205) line 1386 + 8 bytes
    CDownloadProgressDlg::ExtractReportKeyStats(const char * 0x010ed664, unsigned char 205) line 1333 + 36 bytes

My questions are:

1) Do I need to put parentheses around the expression when instantiating the regex object, like the following?

    expr("(hello world here)", regex_constants::perl | regex_constants::icase);

2) Say pBuf contains something like the following. Will my sample code get the positions of the matches on lines 1, 3, and 5?

    hello skjdfskldjf hello world here kjsfsdf w3423422 sfsdfsfs hello hello kjkjkjkjkl world world hello world here 23432423 3333333333333332334234244436435353 hello world here
    -----------------------------------

Thanks for the help!
/Winson

On 8/10/06, John Maddock <john@johnmaddock.co.uk> wrote:
The last arg to the iterator constructor is optional, and should be one or more of the match_flag enums: http://www.boost.org/libs/regex/doc/match_flag_type.html
John.
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users

Winson Yung wrote:
Thanks John, I was going to ask about this magic '1' number. I think I found an example somewhere that used it, but apparently it was wrong. I have been having a lot of difficulty using cregex_iterator: even without the last match flag, the same code asserts while walking through the iterator to get position(0). Here is the call trace from when the assert happened:

    _assert(void * 0x0067a74c `string', void * 0x0067a758 `string', unsigned int 253) line 256
    boost::shared_ptr<boost::regex_iterator_implementation<char const *,char,boost::regex_traits<char,boost::w32_regex_traits<char> > > >::operator->() line 253 + 31 bytes
    boost::regex_iterator<char const *,char,boost::regex_traits<char,boost::w32_regex_traits<char> > >::operator*() line 129 + 37 bytes
    CDownloadProgressDlg::ExtractReportKeyStats(unsigned char * 0x02470040, unsigned __int64 1292879, unsigned char 205) line 1386 + 8 bytes
    CDownloadProgressDlg::ExtractReportKeyStats(const char * 0x010ed664, unsigned char 205) line 1333 + 36 bytes

You are trying to dereference an iterator that doesn't point to anything. You must ensure it is not equal to the end-of-sequence-iterator before dereferencing it. Same as with any other iterator.
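
A minimal sketch of that guard, assuming std::regex as a stand-in for Boost.Regex (the iterator interface used here is the same) and a hypothetical helper named collect_positions that writes into a fixed-size array like the one in the question. It is an illustration of the check, not a diagnosis of the assert in the trace above.

```cpp
#include <cstddef>
#include <cstring>
#include <regex>

// Hypothetical helper: fills `out` with at most `capacity` match offsets and
// returns how many were written. std::regex stands in for boost::regex.
std::size_t collect_positions(const char* buf, const std::regex& expr,
                              std::ptrdiff_t* out, std::size_t capacity)
{
    if (buf == nullptr)
        return 0; // nothing to search

    std::cregex_iterator it(buf, buf + std::strlen(buf), expr);
    const std::cregex_iterator end; // end-of-sequence iterator

    std::size_t count = 0;
    // Compare against `end` before every dereference: an end-of-sequence
    // iterator does not refer to a match and must not be dereferenced.
    while (count < capacity && it != end) {
        out[count++] = it->position(0);
        ++it;
    }
    return count;
}
```

The return value tells the caller how many entries of `out` are valid, so no slot is read unless a match was actually stored in it.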
My questions are:
1) Do I need to put parentheses around the expression when instantiating the regex object, like the following?
expr("(hello world here)", regex_constants::perl | regex_constants::icase);
Not unless you want a marked sub-expression, no.
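
To illustrate what a marked sub-expression adds, here is a small sketch that again assumes std::regex in place of Boost.Regex. The helper first_capture is hypothetical. Index 0 always refers to the whole match; indices 1 and up exist only when the pattern contains parentheses.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: the text captured by sub-expression `group` in the
// first match, or "" if there is no match or the pattern has no such group.
std::string first_capture(const std::string& text, const std::regex& expr,
                          std::size_t group)
{
    std::smatch m;
    if (!std::regex_search(text, m, expr) || group >= m.size())
        return "";
    return m[group].str();
}
```

With the pattern "hello world here", only group 0 is available. With "hello (world) here", group 1 holds "world" while group 0 is still the full match, so position(0) is unaffected by the parentheses.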
2) Say pBuf contains something like the following. Will my sample code get the positions of the matches on lines 1, 3, and 5?
hello skjdfskldjf hello world here kjsfsdf w3423422 sfsdfsfs hello hello kjkjkjkjkl world world hello world here 23432423 3333333333333332334234244436435353 hello world here -----------------------------------
It will find three matches. The positions will be relative to the start of the buffer, *not* the start of the current line. Boost.Regex generally isn't interested in lines (other than for matching ^ and $); it considers \n to be just another character.
John.
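
If line numbers are wanted, they have to be derived from the buffer offsets by the caller. The sketch below shows one way to do that, assuming std::regex as a stand-in for Boost.Regex; the MatchLocation struct and the locate_matches helper are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <regex>
#include <vector>

// Hypothetical result type for illustration.
struct MatchLocation {
    std::ptrdiff_t offset; // from the start of the buffer
    std::ptrdiff_t line;   // 1-based line number
    std::ptrdiff_t column; // 0-based offset from the start of that line
};

// Hypothetical helper: converts buffer-relative match offsets into
// line/column pairs by counting '\n' characters up to each match.
std::vector<MatchLocation> locate_matches(const char* buf, const std::regex& expr)
{
    assert(buf != nullptr);
    std::vector<MatchLocation> found;

    const char* scanned = buf;    // everything before this has been counted
    const char* line_start = buf; // start of the line containing `scanned`
    std::ptrdiff_t line = 1;

    std::cregex_iterator it(buf, buf + std::strlen(buf), expr);
    const std::cregex_iterator end;
    for (; it != end; ++it) {
        const char* hit = buf + it->position(0);
        // Matches arrive in increasing offset order, so one forward scan
        // over the buffer is enough to keep the line count current.
        for (; scanned != hit; ++scanned) {
            if (*scanned == '\n') {
                ++line;
                line_start = scanned + 1;
            }
        }
        found.push_back({it->position(0), line, hit - line_start});
    }
    return found;
}
```

The `offset` field is what position(0) reports; `line` and `column` are computed on top of it and are not something the regex library tracks.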

Sure, that's why I have the check first in the while () condition. Was that not enough?

    while (i < 100 && h1 != h2) {
        nextPos[i] = (*h1).position(0);
        h1++;
        i++;
    }

On 8/10/06, John Maddock <john@johnmaddock.co.uk> wrote:
You are trying to dereference an iterator that doesn't point to anything.
You must ensure it is not equal to the end-of-sequence-iterator before dereferencing it. Same as with any other iterator.
Participants (2): John Maddock, Winson Yung