
Hi,

I was wondering what the rationale was for boost::filter_iterator not modeling anything more than a forward traversal iterator. Recently I was trying to use filter_iterator in a situation where I needed bidirectional traversal, but got stymied by this. Is there something about the notion of filtering that precludes bidirectionality? Or is this something that could be made to work?

Thanks,
Bob

Bob Bell wrote:
Hi-
I was wondering what the rationale was for boost::filter_iterator not modeling anything more than a forward traversal iterator.
Cost.
Recently I was trying to use filter_iterator in a situation where I needed bidirectional traversal, but got stymied by this. Is there something about the notion of filtering that precludes bidirectionality? Or is this something that could be made to work?
In order to be bidirectional, the filter iterator would need at least one more member (the beginning of the sequence) and one more constructor parameter (again, the beginning of the sequence). This is a non-negligible cost that users who are OK with forward iterators most likely don't want to pay.

Regards

Thomas

--
Thomas Witt
witt@acm.org
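The extra state Thomas describes can be sketched as follows (illustrative code only; `bidi_filter_pos` and its members are made-up names, not Boost's actual implementation):

```cpp
#include <cassert>
#include <vector>

// Illustrative sketch (not Boost's actual code): supporting decrement
// forces the iterator to carry the beginning of the underlying
// sequence -- the extra member and constructor parameter described
// above -- so it can stop instead of running off the front.
template <class Pred, class Iter>
struct bidi_filter_pos {
    Pred pred;
    Iter pos;
    Iter begin_;  // the extra member: start of the underlying sequence

    bidi_filter_pos(Pred p, Iter it, Iter b) : pred(p), pos(it), begin_(b) {}

    // Step back to the previous element that passes the filter,
    // never moving before begin_.
    void decrement() {
        while (pos != begin_) {
            --pos;
            if (pred(*pos)) return;
        }
    }
};
```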

Thomas Witt wrote:
Bob Bell wrote:
Hi-
I was wondering what the rationale was for boost::filter_iterator not modeling anything more than a forward traversal iterator.
Cost.
Recently I was trying to use filter_iterator in a situation where I needed bidirectional traversal, but got stymied by this. Is there something about the notion of filtering that precludes bidirectionality? Or is this something that could be made to work?
In order to be bidirectional, the filter iterator would need at least one more member (the beginning of the sequence) and one more constructor parameter (again, the beginning of the sequence).
Is this so that when iterating backwards you can stop before "running off the beginning"?

I'm in the unenviable position of not being able to use Boost at work, but I needed some of the functionality of the iterator library, so I ended up implementing a subset of it myself. When it came to filter iterators, I implemented them as bidirectional. Initially, I had the extra iterators you mention, but I ended up getting rid of them. My reasoning was like this. Consider this forward loop:

    while (b != e) {
        // use *b
        ++b;
    }

Incrementing b will, internally, dereference its base iterator and call its filter function to determine when it has encountered a new element in the filtered pseudo-sequence. However, it must be prevented from incrementing past e. So b also carries a second, "sentinel" base iterator (which must be initialized to the same sentinel carried by e); when incrementing, if b reaches the sentinel, it doesn't dereference and test, it just stops. This makes sure that b will become equal to e, and the loop will terminate.

Now consider a typical reverse loop, where we decrement backwards through [b, e):

    while (b != e) {
        --e;
        // use *e
    }

No "begin sentinel" is required to make this loop behave as expected. When b is created, it will point to the first element of the filtered pseudo-sequence; this means that eventually e will become equal to b, and the loop will terminate.

In fact, it seems that without a begin sentinel, uses of b and e are just as well-defined (or undefined) as those of equivalent ordinary iterators. For example, for an ordinary sequence [b, e), --b is well-defined only if there is something dereferenceable there. For a filter iterator lacking a begin sentinel, the same thing is true.

What do you think?
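The scheme described above can be sketched as a minimal, self-contained iterator (illustrative only; `filter_iter` and its members are made-up names, not the poster's or Boost's code). It carries only an end sentinel, and the reverse loop terminates at b with no begin sentinel:

```cpp
#include <cassert>
#include <vector>

// Illustrative sketch: a filter iterator carrying only an end
// sentinel. Increment skips rejected elements but never moves past
// end_; decrement has the same precondition as an ordinary iterator
// (some earlier element must pass the filter).
template <class Pred, class Iter>
struct filter_iter {
    Pred pred;
    Iter pos;
    Iter end_;  // the "sentinel" base iterator described above

    filter_iter(Pred p, Iter it, Iter end) : pred(p), pos(it), end_(end) {
        // advance to the first element of the filtered pseudo-sequence
        while (pos != end_ && !pred(*pos)) ++pos;
    }
    filter_iter& operator++() {
        do { ++pos; } while (pos != end_ && !pred(*pos));
        return *this;
    }
    filter_iter& operator--() {
        // no begin sentinel: the precondition is that an element
        // before pos passes the filter
        do { --pos; } while (!pred(*pos));
        return *this;
    }
    auto operator*() const { return *pos; }  // by value, for brevity
    bool operator!=(const filter_iter& o) const { return pos != o.pos; }
};
```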
This is a non-negligible cost that users who are OK with forward iterators most likely don't want to pay.
If there's still some requirement to have the extra cost for bidirectionality (i.e., if there's a flaw in my analysis above), I wonder if there isn't some metaprogramming trickery that could make the cost exist only for those who need bidirectionality, while those who need only forward iteration don't pay. Not that I'm volunteering to write it. ;-)

Bob
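One way such trickery could look, as a sketch (assuming C++17; `maybe_begin` and `no_begin` are hypothetical names, not Boost code): store the begin iterator as a member only when the underlying iterator is at least bidirectional, and an empty placeholder otherwise.

```cpp
#include <forward_list>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical sketch of the "metaprogramming trickery": choose the
// member type at compile time so forward-only users store an empty
// placeholder instead of a second iterator.
struct no_begin {};  // empty placeholder, costs (almost) nothing

template <class Iter>
using maybe_begin = std::conditional_t<
    std::is_base_of_v<std::bidirectional_iterator_tag,
                      typename std::iterator_traits<Iter>::iterator_category>,
    Iter,
    no_begin>;
```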

Bob Bell wrote:
Now consider a typical reverse loop, where we decrement backwards through [b, e):
while (b != e) { --e; // use *e; }
No "begin sentinel" is required to make this loop behave as expected. When b is created, it will point to the first element of the filtered pseudo-sequence; this means that eventually, e will become equal to b, and the loop will terminate.
Not if the filter predicate returns false for *b.
In fact, it seems that without a begin sentinel, uses of b and e are just as well-defined (or undefined) as equivalent ordinary iterators. For example, for an ordinary sequence [b, e), --b is well-defined only if there is something dereferencable there. For a filter iterator lacking a begin sentinel, the same thing is true.
What do you think?
I think you overlooked something important.

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com

David Abrahams wrote:
Bob Bell wrote:
Now consider a typical reverse loop, where we decrement backwards through [b, e):
while (b != e) { --e; // use *e; }
No "begin sentinel" is required to make this loop behave as expected. When b is created, it will point to the first element of the filtered pseudo-sequence; this means that eventually, e will become equal to b, and the loop will terminate.
Not if the filter predicate returns false for *b.
In fact, it seems that without a begin sentinel, uses of b and e are just as well-defined (or undefined) as equivalent ordinary iterators. For example, for an ordinary sequence [b, e), --b is well-defined only if there is something dereferencable there. For a filter iterator lacking a begin sentinel, the same thing is true.
What do you think?
I think you overlooked something important.
Well, let me think about this again. Maybe you're right. The issue would be that you can't predict where the beginning of the filtered sequence lies... but that's not true. Imagine you have a std::list<int> l and you build filter iterators b and e over [l.begin(), l.end()). Since b and e form a valid range, I see no reason you can't iterate backwards from e to b.

Okay, I buy it.

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com
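The valid-range observation can be checked directly (a minimal sketch; `backwards_filtered` is a made-up helper, not a Boost API): walking backwards from e to b while filtering visits exactly the forward-filtered elements in reverse, with no sentinel needed beyond b itself.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Walk backwards through [b, e) of a bidirectional sequence,
// collecting the elements that pass the predicate. Termination needs
// only b itself, no separate "begin sentinel".
template <class BidirIt, class Pred>
std::vector<int> backwards_filtered(BidirIt b, BidirIt e, Pred pred) {
    std::vector<int> out;
    while (e != b) {
        --e;
        if (pred(*e)) out.push_back(*e);
    }
    return out;
}
```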

On Sat, Dec 11, 2004 at 12:38:27AM -0500, David Abrahams wrote:
Okay, I buy it.
You forgot that you already bought it 18 months ago :)

See the thread starting at: http://lists.boost.org/MailArchives/boost/msg47144.php

Probably a note in the documentation concerning this "non-intuitive" fact would help.

Another note: filter_iterator's complexity may not match the requirements of iterators (increment in amortized constant time), depending on the predicate's behavior.

--
Sylvain

Sylvain Pion wrote:
On Sat, Dec 11, 2004 at 12:38:27AM -0500, David Abrahams wrote:
Okay, I buy it.
You forgot that you already bought it 18 months ago :)
See the thread starting at : http://lists.boost.org/MailArchives/boost/msg47144.php
Probably a note in the documentation concerning this "non-intuitive" fact would help.
Probably a simple fix to the library would help more.
Another note: filter_iterator's complexity may not match the requirements of iterators (increment in amortized constant time), depending on the predicate's behavior.
That's a tricky question, which is why it's been left unanswered. You might look at it this way: for any filter and sequence of length N, there is some constant number M of elements that will pass the filter. If the cost of traversing the unfiltered sequence is O(N), then so is the cost of traversing the filtered sequence. But N/M is a constant, so it's also (N/M)·O(M), and therefore also O(M).

Is that stretching the definition of an iterator? I'm not sure.

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com

On Sat, Dec 11, 2004 at 09:00:05AM -0500, David Abrahams wrote:
Another note: filter_iterator's complexity may not match the requirements of iterators (increment in amortized constant time), depending on the predicate's behavior.
That's a tricky question, which is why it's been left unanswered. You might look at it this way: for any filter and sequence of length N, there is some constant number M of elements that will pass the filter. If the cost of traversing the unfiltered sequence is O(N), then so is the cost of traversing the filtered sequence. But N/M is a constant, so it's also (N/M)·O(M), and therefore also O(M).
But N/M does not have to be bounded by a constant when N grows.
Is that stretching the definition of an iterator? I'm not sure.
That's too much for me. I tend to think that it's stretching what users might expect when they see "constant time amortized operation".

--
Sylvain
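The point that N/M need not be bounded can be made concrete (illustrative helper, not library code): if only the last of N elements passes the predicate, a single filtered increment examines all N of them, while the filtered length M is 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Count how many underlying elements one filtered "increment" must
// examine: advance until an element passes the predicate (or the
// range ends). If only the final element of N passes, this one
// increment costs N steps while M = 1, so N/M grows with N.
template <class Iter, class Pred>
std::size_t steps_for_one_increment(Iter first, Iter last, Pred pred) {
    std::size_t steps = 0;
    while (first != last && !pred(*first)) { ++first; ++steps; }
    return steps;
}
```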

Sylvain Pion wrote:
On Sat, Dec 11, 2004 at 09:00:05AM -0500, David Abrahams wrote:
Another note: filter_iterator's complexity may not match the requirements of iterators (increment in amortized constant time), depending on the predicate's behavior.
That's a tricky question, which is why it's been left unanswered. You might look at it this way: for any filter and sequence of length N, there is some constant number M of elements that will pass the filter. If the cost of traversing the unfiltered sequence is O(N), then so is the cost of traversing the filtered sequence. But N/M is a constant, so it's also (N/M)·O(M), and therefore also O(M).
But N/M does not have to be bounded by a constant when N grows.
Is that the right way to understand the complexity guarantee? I'm really asking. Consider std::set. As the size of the set grows, the average cost of incrementing an iterator grows without bound. In case that's not obvious, think of the middle of a balanced binary tree, where an iterator has to make 2 log N steps through the tree to move between elements.
Is that stretching the definition of an iterator? I'm not sure.
That's too much for me. I tend to think that it's stretching what users might expect when they see "constant time amortized operation".
This is tricky territory. It certainly warrants a note in the documentation, but I'm not exactly sure what to say.

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com
participants (4)

- Bob Bell
- David Abrahams
- Sylvain Pion
- Thomas Witt