
On Fri, Aug 29, 2008 at 4:48 PM, David Abrahams <dave@boostpro.com> wrote:
on Fri Aug 29 2008, Arno Schödl <aschoedl-AT-think-cell.com> wrote:
Wouldn't it be more natural to represent a forward range as a single iterator, with an extra .at_end() check? This would eliminate the problem, because a difference_range would only need two such iterators, not four.
As part of http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1873.html one of the ideas we discussed extensively was allowing the end iterator of a range to be a different type from the begin iterator.
<snip>
Effectively the data representation of the begin iterator would be the same as what you're proposing, plus comparison operators against the universal end type above, but the end iterator type could be empty (and optimized away). That allows a generic description of algorithms that are optimally efficient on pairs of pointers, but could simplify many classes of iterator (ifstream_iterators, for example) where every iterator needs all the information required to detect whether it can advance anyway.
Nice!
Stacking them naively would have 2^N storage growth, rather than 4^N, and we still don't need to worry about range lifetime.
I've been trying to say that generic algorithms *can't* worry about the range lifetime; they have to assume that the iterators they're passed will not be invalidated before they're destroyed.
I think that the OP is talking about a different problem: let say you want to name the result of stacked applications of range adaptors: template<typename Range> void my_algo(Range& r) { auto stacked_range = r | filtered(filter) | mapped(mapper); // use stacked_range } This may lead iterator lifetime problems, not because r iterators may be invalid, but because the iterators of the range returned by filtered(...) may become invalid as that temporary has been destructed. There are many solutions: The easiest is splitting apart the expression, but is ugly. Then you could use expression templates to make the whole pipe expression return a range only at the end. Finally, both filtered and mapped could guarantee that the iterators are not invalidated when the range is destructed, but that may prevent some space optimizations (like storing the predicate inside the range instead of the iterator)).
For other classes of iterators, the situation is similar. bidirectional ranges have at_end() and at_begin(), and random_access_ranges a size() and an operator[].
Bleah (sorry). Iterators in general shouldn't be burdened with the requirement to know the limits of the sequence, and if you go back and review the class of problems that iterators were "designed" to address, I think you'll see why. Just take a look at the classic quicksort algorithm, for example, or any other divide-and-conquer algorithm.
I put "designed" in quotes because concepts are discovered, not designed, but that's a whole different topic.
I know that overthrowing such an established concept as ranges is not terribly popular, but I don't see any other way to combine easy stacking with no worry about lifetimes.
I'm not sure this is as revolutionary as you think. Every other language I've looked at has a get_next()/at_end() iterator abstraction, which is a lot like what you're proposing. My advice: make sure you keep in mind the whole problem domain. You can't understand the reason that C++ iterators are unlike those in other languages without looking at the algorithms.
BTW, ranges are equivalent to GoF iterators: you just spell 'at_end' as 'empty(r)'; 'next()' as 'r = range(next(begin(r)), end(r))' (equivalently: 'r = rest(r))'; and '.get()' as '*begin(r)'; The nice thing about ranges is that you can take them apart into their iterator components when you need the extra flexibility. -- gpd