24 Mar 2005

In previous emails I complained that if one uses an extended-syntax regex
on an InputIterator (or a BidirectionalIterator), regex_search seeks to the end
of the sequence before returning the first match.

This has obvious disadvantages if one wants to implement lazy matching of regexes
on a potentially large input stream.

My hypothesis about what is happening is:

The seek to the end happens because the extended syntax wants to know which match is
the leftmost. It finds this out by asking for the distance between the matched substring
and the 'end' of the sequence.
If the iterator is an InputIterator, then the STL (at least the version I'm using, from gcc 3.3.2)
can only compute this by stepping to the end of the sequence to find how
far away it is. Thus, the problem.
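
To illustrate the hypothesis above, here is a minimal sketch (not taken from the regex sources) of what std::distance does when handed input iterators. The names DistanceResult and measure_with_input_iterator are made up for this example; only std::distance, std::istreambuf_iterator and std::istringstream are standard.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical result type for this illustration only.
struct DistanceResult {
    std::ptrdiff_t distance;  // value returned by std::distance
    bool stream_exhausted;    // true if measuring consumed the whole stream
};

// Hypothetical helper: measure the distance from the current position of a
// stream to its end using input iterators, then check what is left to read.
inline DistanceResult measure_with_input_iterator(const std::string& text) {
    std::istringstream in(text);
    std::istreambuf_iterator<char> first(in), last;

    // For an input iterator, std::distance can only count by incrementing
    // `first` until it equals `last`, which reads every remaining character.
    std::ptrdiff_t d = static_cast<std::ptrdiff_t>(std::distance(first, last));
    assert(d >= 0);

    // peek() returns EOF once nothing is left in the stream.
    bool exhausted = (in.peek() == std::char_traits<char>::eof());
    return {d, exhausted};
}
```

Calling measure_with_input_iterator("abcdef") should report a distance of 6 and an exhausted stream: merely asking "how far is the end?" has read all of the input, which is the behaviour I believe I am seeing.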

If I redefine the iterator as a RandomAccessIterator and define operator- and operator+=,
then the STL uses operator- to find the distance between the match and the end.
This is better, but still not ideal, since I'm assuming that I don't know at that point
where the end is. I'll know when I get there, but not before.

I suggest that a conceptually cleaner way of doing this would be to ask
how far we are from the start of the sequence.
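
As a rough sketch of the idea of measuring from the start, the following counts the offset while reading and stops at the first hit. A single-character search stands in for the regex match, and offset_of_first and unread_after_match are hypothetical names for this example, not part of any regex library.

```cpp
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical sketch: find the first occurrence of `needle` and report its
// offset from the START of the input. Returns -1 if it is not found.
// Only ++, * and != are used, so an InputIterator is sufficient.
template <typename InputIt>
std::ptrdiff_t offset_of_first(InputIt first, InputIt last, char needle) {
    std::ptrdiff_t offset = 0;
    for (; first != last; ++first, ++offset) {
        if (*first == needle) {
            return offset;  // stop here; nothing beyond the match is read
        }
    }
    return -1;
}

// Hypothetical helper: run the search on a stream and return whatever is
// still unread afterwards, to show that the input was not consumed to the end.
inline std::string unread_after_match(const std::string& text, char needle,
                                      std::ptrdiff_t& offset) {
    std::istringstream in(text);
    std::istreambuf_iterator<char> first(in), last;
    offset = offset_of_first(first, last, needle);

    std::string rest;
    std::getline(in, rest);  // reads from the current stream position
    return rest;
}
```

With the input "abc;def" and needle ';', the offset should come back as 3 and the unread remainder as ";def", so the position of the match is known without ever visiting the end of the sequence.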

My apologies for any inaccuracies in the above and my ignorance of
earlier debates on this or related issues.

   David

regex_search: Lazy evaluation of iterators

David McKelvie

John Maddock

