Re: [boost] lifetime of ranges vs. iterators

29 Aug 2008

      On Fri, Aug 29, 2008 at 4:48 PM, David Abrahams <dave@boostpro.com> wrote:
...
on Fri Aug 29 2008, Arno Schödl <aschoedl-AT-think-cell.com> wrote:
...
Wouldn't it be more natural to represent a forward range as a single
iterator, with an extra .at_end() check? This would eliminate the
problem, because a difference_range would only need two such
iterators, not four.
As part of
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1873.html one
of the ideas we discussed extensively was allowing the end iterator of a
range to be a different type from the begin iterator.
<snip>
...
Effectively the data representation of the begin iterator would be the
same as what you're proposing, plus comparison operators against the
universal end type above, but the end iterator type could be empty (and
optimized away).  That allows a generic description of algorithms that
are optimally efficient on pairs of pointers, but could simplify many
classes of iterator (ifstream_iterators, for example) where every
iterator needs all the information required to detect whether it can
advance anyway.
Nice!
...
...
Stacking them naively would have 2^N storage
growth, rather than 4^N, and we still don't need to worry about range
lifetime.
I've been trying to say that generic algorithms *can't* worry about the
range lifetime; they have to assume that the iterators they're passed
will not be invalidated before they're destroyed.
I think that the OP is talking about a different problem: let say you
want to name the result of stacked applications of range adaptors:

template<typename Range>
void my_algo(Range& r) {
  auto stacked_range = r | filtered(filter) | mapped(mapper);
  // use stacked_range
}

This may lead iterator lifetime problems, not because r iterators may
be invalid, but because the iterators of the range returned by
filtered(...) may become invalid as that temporary has been
destructed. There are many solutions:

The easiest is splitting apart the expression, but is ugly.

Then you could use expression templates to make the whole pipe
expression return a range only at the end.

Finally, both filtered and mapped could guarantee that the iterators
are not invalidated when the range is destructed, but that may prevent
some space optimizations (like storing the predicate inside the range
instead of the iterator)).
...
...
For other classes of iterators, the situation is
similar. bidirectional ranges have at_end() and at_begin(), and
random_access_ranges a size() and an operator[].
Bleah (sorry).  Iterators in general shouldn't be burdened with the
requirement to know the limits of the sequence, and if you go back and
review the class of problems that iterators were "designed" to address,
I think you'll see why.  Just take a look at the classic quicksort
algorithm, for example, or any other divide-and-conquer algorithm.
I put "designed" in quotes because concepts are discovered, not
designed, but that's a whole different topic.
...
I know that overthrowing such an established concept as ranges is not
terribly popular, but I don't see any other way to combine easy
stacking with no worry about lifetimes.
I'm not sure this is as revolutionary as you think.  Every other
language I've looked at has a get_next()/at_end() iterator abstraction,
which is a lot like what you're proposing.  My advice: make sure you
keep in mind the whole problem domain.  You can't understand the reason
that C++ iterators are unlike those in other languages without looking
at the algorithms.
BTW, ranges are equivalent to GoF iterators: you just spell 'at_end'
as 'empty(r)';
'next()' as 'r = range(next(begin(r)), end(r))'   (equivalently: 'r =
rest(r))'; and '.get()' as
'*begin(r)';

The nice thing about ranges is that you can take them apart into their
iterator components when you need the extra flexibility.

-- 
gpd

Re: [boost] lifetime of ranges vs. iterators

Giovanni Piero Deretta