[range] fat iterators?

newer
Is there any interest in a library...

older
Re: [boost] [Functional/Factory]...

Eric Niebler

7 Nov 2013 7 Nov '13

2:20 p.m.

I wrote an article describing the "fat" std::istream_iterator and how it can be slimmed down by giving it an owning istream_range[1]. I see there already is an istream_range in Boost.Range, but it suffers from the fat-iterator problem I describe there. Is there a compelling reason why the implementation of boost::istream_range shouldn't be changed to be more like the one I describe? Also, as a commented pointed out, the same problem exists for the filtered range and the transformed range, too. The trick, of course, will be keeping intermediate temporary ranges alive long enough to avoid lifetime issues when we chain adaptors and assign the result to a local variable. I think a range library that's sensitive to the value category of range objects and makes copies of rvalue ranges would solve this problem. Thoughts? Is it time for Boost.Range 3.0? [1]: http://ericniebler.com/2013/11/07/input-iterators-vs-input-ranges -- Eric Niebler Boost.org http://www.boost.org

Show replies by date

Neil Groves

7 Nov 7 Nov

2:31 p.m.

On Thu, Nov 7, 2013 at 2:20 PM, Eric Niebler <eniebler@boost.org> wrote:

...

I wrote an article describing the "fat" std::istream_iterator and how it can be slimmed down by giving it an owning istream_range[1]. I see there already is an istream_range in Boost.Range, but it suffers from the fat-iterator problem I describe there. Is there a compelling reason why the implementation of boost::istream_range shouldn't be changed to be more like the one I describe? Also, as a commented pointed out, the same problem exists for the filtered range and the transformed range, too.

I'm open to making similar changes once I've established there is a real benefit for each case we propose to make the change. I find it hard to imagine that performance-sensitive code would not be using custom stream implementations to improve performance. I therefore wonder how much real world benefit the work would provide.

...

The trick, of course, will be keeping intermediate temporary ranges alive long enough to avoid lifetime issues when we chain adaptors and assign the result to a local variable. I think a range library that's sensitive to the value category of range objects and makes copies of rvalue ranges would solve this problem. Thoughts?

This is definitely something that needs to be added. The challenge is merely finding sufficient time to do so.

...

Is it time for Boost.Range 3.0?

I guess that all depends on what Boost.Range 3.0 is. Is the current Boost.Range so horribly broken that it requires replacement? I think it probably needs some significant addition, but don't think that the current facilities are a barrier to progress. Therefore I'm not currently of the view that we need to throw away and start again. I'm open to argument on this matter. I think there is a need to fix a number of trac issues with Boost.Range and to extend the library to use C++11 to fix some of the lifetime issues. It would be optimal if there was sufficient time to start work on new range primitives. I am not, despite my best intentions, able to dedicate sufficient time to make much progress in the next few months. I'm open to accept help.

...

[1]: http://ericniebler.com/2013/11/07/input-iterators-vs-input-ranges

-- Eric Niebler Boost.org http://www.boost.org

Neil Groves

Eric Niebler

2:52 p.m.

On 11/7/2013 3:31 PM, Neil Groves wrote:

...

On Thu, Nov 7, 2013 at 2:20 PM, Eric Niebler <eniebler@boost.org> wrote:

...
I wrote an article describing the "fat" std::istream_iterator and how it can be slimmed down by giving it an owning istream_range[1]. I see there already is an istream_range in Boost.Range, but it suffers from the fat-iterator problem I describe there. Is there a compelling reason why the implementation of boost::istream_range shouldn't be changed to be more like the one I describe? Also, as a commented pointed out, the same problem exists for the filtered range and the transformed range, too.

I'm open to making similar changes once I've established there is a real benefit for each case we propose to make the change. I find it hard to imagine that performance-sensitive code would not be using custom stream implementations to improve performance. I therefore wonder how much real world benefit the work would provide.

Forget streams. Think of something like this: rng | filtered([=](xxx y){ I reference local vars; }) Now you have a pair of filter_iterators, each of which has a copy of the lambda, and each lambda has a copy of all the local variables the lambda captures. Copying those iterators makes a copy of all the locals captured by the lambda. Am I wrong? That's pretty bad.

...

...
The trick, of course, will be keeping intermediate temporary ranges alive long enough to avoid lifetime issues when we chain adaptors and assign the result to a local variable. I think a range library that's sensitive to the value category of range objects and makes copies of rvalue ranges would solve this problem. Thoughts?

This is definitely something that needs to be added. The challenge is merely finding sufficient time to do so.

<sigh> I can relate.

...

...
Is it time for Boost.Range 3.0?

I guess that all depends on what Boost.Range 3.0 is. Is the current Boost.Range so horribly broken that it requires replacement?

See the filtered range example above. It's not pretty.

...

I think it probably needs some significant addition

Addition won't cut it unless you want to drop support for C++98. The optimal design I'm suggesting requires the ability to detect rvalues with 100% fidelity. So the current design is as good as it gets for C++98 users, AFAICS. There needs to be a parallel rvalue-aware range library for C++11 where virtually all the range and iterator types are different.

...

, but don't think that the current facilities are a barrier to progress. Therefore I'm not currently of the view that we need to throw away and start again. I'm open to argument on this matter. I think there is a need to fix a number of trac issues with Boost.Range and to extend the library to use C++11 to fix some of the lifetime issues. It would be optimal if there was sufficient time to start work on new range primitives.

What new primitives do you have in mind?

...

I am not, despite my best intentions, able to dedicate sufficient time to make much progress in the next few months. I'm open to accept help.

This is important to me. I'm willing to help if you agree to the direction I've described.

...

...
[1]: http://ericniebler.com/2013/11/07/input-iterators-vs-input-ranges

Neil Groves

-- Eric Niebler Boost.org http://www.boost.org

Thorsten Ottosen

7:29 p.m.

On 07-11-2013 15:52, Eric Niebler wrote:

...

On 11/7/2013 3:31 PM, Neil Groves wrote:

...
On Thu, Nov 7, 2013 at 2:20 PM, Eric Niebler <eniebler@boost.org> wrote:

...
I wrote an article describing the "fat" std::istream_iterator and how it can be slimmed down by giving it an owning istream_range[1]. I see there

[snip] Historical note: Some years ago, one of the adaptors I wrote returned a range object where the iterators kept a reference to the range. I can't remember if Neil replaced it or if its still in use. It had obvious life-time issues if the iterator was extracted from a locally constructed range, but I thought 99.9% of all use would be in loops and with range adapters. So, as others have pointed out, do we want a design where iterators can't outlive the range, even when the range is constructed from, say, a vector that is kept alive?

...

Forget streams. Think of something like this:

rng | filtered([=](xxx y){ I reference local vars; })

Now you have a pair of filter_iterators, each of which has a copy of the lambda, and each lambda has a copy of all the local variables the lambda captures. Copying those iterators makes a copy of all the locals captured by the lambda. Am I wrong? That's pretty bad.

It seems bad. FWIW, I'd be happy with a range library that required the range to be kept alive as long as the iterator. Remark on your article: did you benchmark your new design in terms of speed and size? regards -Thorsten

Eric Niebler

9:33 p.m.

On 11/7/2013 8:29 PM, Thorsten Ottosen wrote:

...

Remark on your article: did you benchmark your new design in terms of speed and size?

I did the size benchmark. Boost's istream_range<string>, which is a pair of std::istream_iterators, is 24 bytes, and the iterators are 12. My istream_range is 8 bytes, and the iterator is 4. That's with libstdc++ 4.7. The differences can be larger on a platform that uses the small-string optimization. I didn't measure performance, I admit. Non-locality is a valid concern. I have no doubt that it's possible to construct cases where one implementation is faster than the other. Lies, damn lies, and benchmarks, you know. -- Eric Niebler Boost.org http://www.boost.org

Thorsten Ottosen

8 Nov 8 Nov

9:21 a.m.

On 07-11-2013 22:33, Eric Niebler wrote:

...

On 11/7/2013 8:29 PM, Thorsten Ottosen wrote:

...
Remark on your article: did you benchmark your new design in terms of speed and size?

...

I didn't measure performance, I admit. Non-locality is a valid concern. I have no doubt that it's possible to construct cases where one implementation is faster than the other. Lies, damn lies, and benchmarks, you know.

Sometimes. Genereated assembler would be another way to measure. Let's assume a range that generates the iterators in your fashion. Then in template< class Rng > void foo_algo( Rng& rng ) { std::foo_algo( begin(rng), end(end) ); } how much can we expect the optimizer? Both iterators will store a reference to the range, and then might have some additional information. Can the compiler see that the stored reference to the range is the same and avoid one of them? If not, may we should think about how to inprove the language to make this happen? -Thorsten

Nathan Ridge

7 Nov 7 Nov

5:27 p.m.

...

...
The trick, of course, will be keeping intermediate temporary ranges alive long enough to avoid lifetime issues when we chain adaptors and assign the result to a local variable. I think a range library that's sensitive to the value category of range objects and makes copies of rvalue ranges would solve this problem. Thoughts?

This is definitely something that needs to be added. The challenge is merely finding sufficient time to do so.

Jeffrey Yaskin posted a candidate design, implemented for filter_iterator, here: https://svn.boost.org/trac/boost/ticket/7630 What do you think about this design? Regards, Nate

Eric Niebler

5:37 p.m.

On 11/7/2013 6:27 PM, Nathan Ridge wrote:

...

...
...
The trick, of course, will be keeping intermediate temporary ranges alive long enough to avoid lifetime issues when we chain adaptors and assign the result to a local variable. I think a range library that's sensitive to the value category of range objects and makes copies of rvalue ranges would solve this problem. Thoughts?

This is definitely something that needs to be added. The challenge is merely finding sufficient time to do so.

Jeffrey Yaskin posted a candidate design, implemented for filter_iterator, here: https://svn.boost.org/trac/boost/ticket/7630

What do you think about this design?

It's a step in the right direction, but it still is essentially an iterator_range of filter_iterators. Each filter_iterator is fat; they both store a copy of the filter predicate. Rather, filter_range should store the predicate, and the iterators returned from filter_range should just hold a pointer to the range. That's what I'm suggesting. And to answer Neil's objection, it necessarily means the iterators can't outlive the range, but IMO it's wrong to assume otherwise. Nobody expects a vector iterator to outlive the vector, after all. It will most certainly be a breaking change, which is why I suggest we ship a C++11 range library in parallel with the existing C++98 one. -- Eric Niebler Boost.org http://www.boost.org

Neil Groves

6:29 p.m.

On Thu, Nov 7, 2013 at 5:37 PM, Eric Niebler <eniebler@boost.org> wrote:

...

On 11/7/2013 6:27 PM, Nathan Ridge wrote:

...
...
...
The trick, of course, will be keeping intermediate temporary ranges alive long enough to avoid lifetime issues when we chain adaptors and assign the result to a local variable. I think a range library that's sensitive to the value category of range objects and makes copies of rvalue ranges would solve this problem. Thoughts?

This is definitely something that needs to be added. The challenge is merely finding sufficient time to do so.

Jeffrey Yaskin posted a candidate design, implemented for filter_iterator, here: https://svn.boost.org/trac/boost/ticket/7630

What do you think about this design?

It's a step in the right direction, but it still is essentially an iterator_range of filter_iterators. Each filter_iterator is fat; they both store a copy of the filter predicate. Rather, filter_range should store the predicate, and the iterators returned from filter_range should just hold a pointer to the range. That's what I'm suggesting.

I definitely like the idea. I'd like to measure the impact in some use-cases to determine the impact of the worsened locality-of-data. I don't believe it is always a performance win. If I get some performance numbers I think we can combine the approaches optimally. I would have thought that we should, under some scenarios, be using EBO for the filter predicate since they are often stateless.

...

And to answer Neil's objection, it necessarily means the iterators can't outlive the range, but IMO it's wrong to assume otherwise. Nobody expects a vector iterator to outlive the vector, after all. It will most certainly be a breaking change, which is why I suggest we ship a C++11 range library in parallel with the existing C++98 one.

I'm not objecting. I'm concerned that using iterators from a range after a range has died is a significant design change. It isn't at all silly or an unusual use-case. It is commonly required when using ranges in your own code and interfacing to a library that only uses iterators. With the new requirements I would no longer be able to use ranges at all if I wanted to continue using the library. That's a big drawback. We can provide a new interface and define that as "wrong" but it most certainly isn't wrong at the moment. It's useful. However I'm not objecting. I'm pointing out that there are some significant drawbacks to consider. I think we can put a new interface together and perhaps this is the correct trade-off for a new interface. From my perspective the reduction in interoperability with iterator-based libraries is a significant drawback. I wonder if we can be clever and solve this with something cunning? A vector not outliving it's iterators has always been a requirement. This requirement had no practical disadvantage to interoperability on common use-cases. This is not true for range lifetime. The range has often had a shorter lifetime than the container in many practical use-cases. It has always been a disposable wrapper. Adding a dependency from iterator to range has significant implications. I think the idea is worthy of exploration. It's just not completely obvious to me that the initial proposal is obviously exactly the way forward. I think this proposal is a move toward making the ranges primitives in themselves and I wonder if we should be thinking slightly more broadly. In particular Andrei's work might be a source of inspiration.

...

-- Eric Niebler Boost.org http://www.boost.org

I'm attempting to stimulate better answers rather than stifle change. I hope that comes across. Your input has been really useful. Neil Groves

Andrey Semashev

7:19 p.m.

On Thursday 07 November 2013 18:29:58 Neil Groves wrote:

...

On Thu, Nov 7, 2013 at 5:37 PM, Eric Niebler <eniebler@boost.org> wrote:

...
On 11/7/2013 6:27 PM, Nathan Ridge wrote:

...
...
...
The trick, of course, will be keeping intermediate temporary ranges alive long enough to avoid lifetime issues when we chain adaptors and assign the result to a local variable. I think a range library that's sensitive to the value category of range objects and makes copies of rvalue ranges would solve this problem. Thoughts?

This is definitely something that needs to be added. The challenge is merely finding sufficient time to do so.

Jeffrey Yaskin posted a candidate design, implemented for filter_iterator, here: https://svn.boost.org/trac/boost/ticket/7630

What do you think about this design?

It's a step in the right direction, but it still is essentially an iterator_range of filter_iterators. Each filter_iterator is fat; they both store a copy of the filter predicate. Rather, filter_range should store the predicate, and the iterators returned from filter_range should just hold a pointer to the range. That's what I'm suggesting.

I definitely like the idea. I'd like to measure the impact in some use-cases to determine the impact of the worsened locality-of-data. I don't believe it is always a performance win. If I get some performance numbers I think we can combine the approaches optimally. I would have thought that we should, under some scenarios, be using EBO for the filter predicate since they are often stateless.

...
And to answer Neil's objection, it necessarily means the iterators can't outlive the range, but IMO it's wrong to assume otherwise. Nobody expects a vector iterator to outlive the vector, after all. It will most certainly be a breaking change, which is why I suggest we ship a C++11 range library in parallel with the existing C++98 one.

I'm not objecting. I'm concerned that using iterators from a range after a range has died is a significant design change. It isn't at all silly or an unusual use-case. It is commonly required when using ranges in your own code and interfacing to a library that only uses iterators. With the new requirements I would no longer be able to use ranges at all if I wanted to continue using the library. That's a big drawback. We can provide a new interface and define that as "wrong" but it most certainly isn't wrong at the moment. It's useful. However I'm not objecting. I'm pointing out that there are some significant drawbacks to consider. I think we can put a new interface together and perhaps this is the correct trade-off for a new interface. From my perspective the reduction in interoperability with iterator-based libraries is a significant drawback. I wonder if we can be clever and solve this with something cunning?

A vector not outliving it's iterators has always been a requirement. This requirement had no practical disadvantage to interoperability on common use-cases. This is not true for range lifetime. The range has often had a shorter lifetime than the container in many practical use-cases. It has always been a disposable wrapper. Adding a dependency from iterator to range has significant implications. I think the idea is worthy of exploration. It's just not completely obvious to me that the initial proposal is obviously exactly the way forward. I think this proposal is a move toward making the ranges primitives in themselves and I wonder if we should be thinking slightly more broadly. In particular Andrei's work might be a source of inspiration.

I agree with Neil that iterator ranges often have a shorter lifetime than the iterators, if it is known that the range refers to an external storage. That said, the benefit of Eric's suggestion is also obvious, at least in terms of object size and probably performance. So why don't we have both? We already have iterator_range that implements the first approach. Why don't we implement specialized range types for streams/transforms/filters that follow Eric's suggestion?

Jonathan Wakely

8:49 p.m.

On 7 November 2013 17:27, Nathan Ridge wrote:

...

Jeffrey Yaskin posted a candidate design, implemented for filter_iterator, here: https://svn.boost.org/trac/boost/ticket/7630

What do you think about this design?

That's beautiful! My solution to the temporary range problem is simpler, but as mentioned in Jeffrey's patch to the docs, for ranges that store an iterator pair it wastes the space of a reference member in the lvalue case, and doesn't vanish to nothing in C++03 mode: template<typename R> class adaptor { R r; // R is a range or an lvalue-reference-to-range public: explicit adaptor(R&& r) : r(std::forward<R>(r)) { } // ... }; template<typename R> adaptor<R> adapted(R&& r) { return adaptor<R>{std::forward<R>(r)}; }

Luc Danton

8 Nov 8 Nov

2:23 a.m.

On 2013-11-07 21:49, Jonathan Wakely wrote:

...

On 7 November 2013 17:27, Nathan Ridge wrote:

...
Jeffrey Yaskin posted a candidate design, implemented for filter_iterator, here: https://svn.boost.org/trac/boost/ticket/7630

What do you think about this design?

That's beautiful!

My solution to the temporary range problem is simpler, but as mentioned in Jeffrey's patch to the docs, for ranges that store an iterator pair it wastes the space of a reference member in the lvalue case, and doesn't vanish to nothing in C++03 mode:

template<typename R> class adaptor { R r; // R is a range or an lvalue-reference-to-range

public: explicit adaptor(R&& r) : r(std::forward<R>(r)) { }

// ... };

template<typename R> adaptor<R> adapted(R&& r) { return adaptor<R>{std::forward<R>(r)}; }

I generally make use of this solution (i.e. in general and for things other than ranges), but after trying it for ranges I balked and switched to storing decayed copies. The reason is that a range abstraction provides an 'indirection' at least in a moral sense, if not in actuality. So you get aliasing issues: some_range r; auto a = adapted(r); auto b = adapted(r); // consume a: for(auto&& item: a); // now r has been touched! // likely to do the wrong thing for(auto&& item: b); While there is the option of doing adapted(decay(r)) to express 'construct an adapted range from r that stores its own copy' (similarly to uses of ref and reference_wrapper in C++03 code, although inverted), my concern is that aliasing is not the most useful default for ranges. (Strictly from a semantics point of view, I've never checked the impact on the generated code.)

Neil Groves

7 Nov 7 Nov

2:49 p.m.

On Thu, Nov 7, 2013 at 2:20 PM, Eric Niebler <eniebler@boost.org> wrote:

...

I wrote an article describing the "fat" std::istream_iterator and how it can be slimmed down by giving it an owning istream_range[1]. I see there already is an istream_range in Boost.Range, but it suffers from the fat-iterator problem I describe there. Is there a compelling reason why the implementation of boost::istream_range shouldn't be changed to be more like the one I describe? Also, as a commented pointed out, the same problem exists for the filtered range and the transformed range, too.

While I like the advantages of pushing state into the range this adds a requirement for the range to outlive the iterators. This change I think inevitably leads to backward compatibility problems to poorer interoperation with existing iterator-based interfaces. For cases where we can deprecate iterators and add the lifetime guarantee I see the appeal. In the case of the filter adapter reusing the filter_iterator this was to allow interoperation with iterator-only algorithms. I didn't want to impose the constraint of having to have a range outlive the iterator. There is additionally advantage in locality of data for small predicates (which is my typical use case). I can see that if we could solve the lifetime issues that we could do better by having a mechanism to use the local solution where it is optimal and store the predicate somewhere else when it is sufficiently large or expensive to copy. This can be done now by using a wrapper around a reference counted pointer implementation of the predicate. This is close to optimal without breaking backward compatibility. Indeed if we solve the range lifetime issue we end up with similarly expensive solutions I think.

...

The trick, of course, will be keeping intermediate temporary ranges alive long enough to avoid lifetime issues when we chain adaptors and assign the result to a local variable. I think a range library that's sensitive to the value category of range objects and makes copies of rvalue ranges would solve this problem. Thoughts?

I like the idea. I can't see a solution that leads to better performance than is obtainable currently. This is likely a failure of my imagination!

...

Is it time for Boost.Range 3.0?

[1]: http://ericniebler.com/2013/11/07/input-iterators-vs-input-ranges

-- Eric Niebler Boost.org http://www.boost.org

Neil Groves

Evgeny Panasyuk

8:40 p.m.

07.11.2013 18:20, Eric Niebler:

...

I wrote an article describing the "fat" std::istream_iterator and how it can be slimmed down by giving it an owning istream_range[1].

Note, for some cases "fat" std::istream_iterator is actually good. For example when it is used as CountedRange [i, n) : std::copy_n(istream_iterator<int>(cin), 10, out); "slim" version would do superfluous check on each increment.

...

I see there already is an istream_range in Boost.Range, but it suffers from the fat-iterator problem I describe there. Is there a compelling reason why the implementation of boost::istream_range shouldn't be changed to be more like the one I describe?

I like your solution for istream_range - it is "slim" indeed. It can be generalized to wrapper around new special concept of range/iterator, which would be similar to InputRange from D language: http://dlang.org/phobos/std_range.html#isInputRange Here is proof-of-concept: http://coliru.stacked-crooked.com/a/f82b76fb6b13521b class NullTerminated { const char *position; public: explicit NullTerminated(const char *position) : position{position} {} NullTerminated &operator++() { ++position; return *this; } const char &operator*() const { return *position; } friend bool empty(const NullTerminated &x) { return *x == 0; } }; int main() { for(auto x : make_range(NullTerminated{"Sentinel iteration"})) cout << x; cout << "!" << endl; } Note, that we can have special algorithms for new kinds of iterators/ranges. Plus, I think we can make reverse adaptor: from STL iterators to new iterator/range.

...

Also, as a commented pointed out, the same problem exists for the filtered range and the transformed range, too.

boost::adaptors::filtered returns up to BidirectionalRange and boost::adaptors::transformed returns up to RandomAccessRange, which is useful. And "fat" is actually not a problem, because in most cases predicates/transformations are empty or rather small objects, with locality benefits as noted. Do you propose to add special case for InputRanges into present adaptors? Or do you propose to add new "slim" adaptors like boost::adaptors::transformed_slim, which would have shared functors between iterators for all Range categories? -- Evgeny Panasyuk

TONGARI J

8 Nov 8 Nov

3:59 a.m.

2013/11/7 Eric Niebler <eniebler@boost.org>

...

I wrote an article describing the "fat" std::istream_iterator and how it can be slimmed down by giving it an owning istream_range[1]. I see there already is an istream_range in Boost.Range, but it suffers from the fat-iterator problem I describe there. Is there a compelling reason why the implementation of boost::istream_range shouldn't be changed to be more like the one I describe? Also, as a commented pointed out, the same problem exists for the filtered range and the transformed range, too.

The trick, of course, will be keeping intermediate temporary ranges alive long enough to avoid lifetime issues when we chain adaptors and assign the result to a local variable. I think a range library that's sensitive to the value category of range objects and makes copies of rvalue ranges would solve this problem. Thoughts?

Is it time for Boost.Range 3.0?

[1]: http://ericniebler.com/2013/11/07/input-iterators-vs-input-ranges

That reminds me of the question I asked years ago (no one answered): http://lists.boost.org/boost-users/2011/08/70121.php Surely all approaches have tradeoffs... About the lifetime problem, could it be solved(?) with the approach Fusion used? (i.e. the distinction between 'view' & 'sequence'). I'm willing to have a reusable range, rather than a disposable one. The problem of a disposable one also described in my question. And I wonder, if we could have 2-phase construction. phase 1: A lazy range builder w/o begin/end phase 2: A range w/ begin/end and probably do some work at construction, as Eric suggested. And there have to be a trait like: template<class Range> struct range_type { typedef Range& type; }; And specialized for those have 2 phases. To implement the algorithms: template<class Range> void some_algo(Range const& rng, ...) { typename range_type<Range const>::type r(rng); some_algo(begin(r), end(r), ...); } For range adaptors, they also store phase 1 object, and make the underlying phase 2 object on the fly. What do you think?

Eric Niebler

9 Nov 9 Nov

9:39 a.m.

On 11/8/2013 4:59 AM, TONGARI J wrote:

...

2013/11/7 Eric Niebler <eniebler@boost.org>

...
I wrote an article describing the "fat" std::istream_iterator and how it can be slimmed down by giving it an owning istream_range[1]. I see there already is an istream_range in Boost.Range, but it suffers from the fat-iterator problem I describe there. Is there a compelling reason why the implementation of boost::istream_range shouldn't be changed to be more like the one I describe? Also, as a commented pointed out, the same problem exists for the filtered range and the transformed range, too.

The trick, of course, will be keeping intermediate temporary ranges alive long enough to avoid lifetime issues when we chain adaptors and assign the result to a local variable. I think a range library that's sensitive to the value category of range objects and makes copies of rvalue ranges would solve this problem. Thoughts?

Is it time for Boost.Range 3.0?

[1]: http://ericniebler.com/2013/11/07/input-iterators-vs-input-ranges

That reminds me of the question I asked years ago (no one answered):

http://lists.boost.org/boost-users/2011/08/70121.php

That's an interesting example, thanks. I hadn't considered the problem of iterator invalidation.

...

Surely all approaches have tradeoffs...

About the lifetime problem, could it be solved(?) with the approach Fusion used? (i.e. the distinction between 'view' & 'sequence').

I feel that the Fusion approach is a stop-gap, since when Fusion was designed there were no rvalue references. Fusion::is_view is an imperfect heuristic for determining when to store a copy. I think rval refs makes the distinction between views and sequences moot.

...

I'm willing to have a reusable range, rather than a disposable one. The problem of a disposable one also described in my question.

And I wonder, if we could have 2-phase construction. phase 1: A lazy range builder w/o begin/end phase 2: A range w/ begin/end and probably do some work at construction, as Eric suggested.

And there have to be a trait like:

template<class Range> struct range_type { typedef Range& type; };

And specialized for those have 2 phases.

To implement the algorithms:

template<class Range> void some_algo(Range const& rng, ...) { typename range_type<Range const>::type r(rng); some_algo(begin(r), end(r), ...); }

For range adaptors, they also store phase 1 object, and make the underlying phase 2 object on the fly.

What do you think?

I'm not sure what problem you're solving with 2-phase construction of ranges. Can you clarify? -- Eric Niebler Boost.org http://www.boost.org

TONGARI J

1:37 p.m.

2013/11/9 Eric Niebler <eniebler@boost.org>

...

I'm not sure what problem you're solving with 2-phase construction of ranges. Can you clarify?

The point is that your proposed approach would store state in the range object, say, "mutable T obj_;" in istream_range, and the setup is performed only at construction, which makes it infeasible for reentrance. If someone do this: auto&& rng = istream_range<int>(std::cin) | filtered(...); copy(rng, a); std::cin.clear(); copy(rng, b); // wrong result, internal 'next' not called This is a bit contrived, and I'm not sure if there would be more range/adaptor which have state... Maybe the idea of 2-phase construction is just an oversight, and it won't play with range-based for loop :/

Dave Abrahams

12 Nov 12 Nov

1:22 p.m.

Eric Niebler <eniebler@boost.org> writes:

...

I wrote an article describing the "fat" std::istream_iterator and how it can be slimmed down by giving it an owning istream_range[1]. I see there already is an istream_range in Boost.Range, but it suffers from the fat-iterator problem I describe there. Is there a compelling reason why the implementation of boost::istream_range shouldn't be changed to be more like the one I describe? Also, as a commented pointed out, the same problem exists for the filtered range and the transformed range, too.

The trick, of course, will be keeping intermediate temporary ranges alive long enough to avoid lifetime issues when we chain adaptors and assign the result to a local variable. I think a range library that's sensitive to the value category of range objects and makes copies of rvalue ranges would solve this problem. Thoughts?

Is it time for Boost.Range 3.0?

[1]: http://ericniebler.com/2013/11/07/input-iterators-vs-input-ranges

Seems like we've been in this territory before... http://boost.2283326.n4.nabble.com/lifetime-of-ranges-vs-iterators-td2650349...

Eric Niebler

3:35 p.m.

On 11/12/2013 5:22 AM, Dave Abrahams wrote:

...

Eric Niebler <eniebler@boost.org> writes:

...
I wrote an article describing the "fat" std::istream_iterator and how it can be slimmed down by giving it an owning istream_range[1]. I see there already is an istream_range in Boost.Range, but it suffers from the fat-iterator problem I describe there. Is there a compelling reason why the implementation of boost::istream_range shouldn't be changed to be more like the one I describe? Also, as a commented pointed out, the same problem exists for the filtered range and the transformed range, too.

The trick, of course, will be keeping intermediate temporary ranges alive long enough to avoid lifetime issues when we chain adaptors and assign the result to a local variable. I think a range library that's sensitive to the value category of range objects and makes copies of rvalue ranges would solve this problem. Thoughts?

Is it time for Boost.Range 3.0?

[1]: http://ericniebler.com/2013/11/07/input-iterators-vs-input-ranges

Seems like we've been in this territory before... http://boost.2283326.n4.nabble.com/lifetime-of-ranges-vs-iterators-td2650349...

That discussion most certainly would have been different had rvalue refs been around. We're having that discussion now... -- Eric Niebler Boost.org http://www.boost.org

Eric Niebler

3:41 p.m.

On 11/7/2013 6:20 AM, Eric Niebler wrote:

...

The trick, of course, will be keeping intermediate temporary ranges alive long enough to avoid lifetime issues when we chain adaptors and assign the result to a local variable. I think a range library that's sensitive to the value category of range objects and makes copies of rvalue ranges would solve this problem. Thoughts?

Is it time for Boost.Range 3.0?

I've now built a filter_range, transform_range, and an istream_range in this vein. Initial results are looking promising, but I haven't benchmarked performance, yet. A commenter on my blog claims to have benchmarked perf of istream_range, with good results, which is encouraging. If anybody wants to take a peek, you can find the code here: https://github.com/ericniebler/range-v3 I'm making no effort to maintain source compatibility, going instead with the design that seems right to me for C++11. -- Eric Niebler Boost.org http://www.boost.org

Evgeny Panasyuk

4:21 p.m.

12.11.2013 19:41, Eric Niebler:

...

I've now built a filter_range, transform_range, and an istream_range in this vein. Initial results are looking promising, but I haven't benchmarked performance, yet. A commenter on my blog claims to have benchmarked perf of istream_range, with good results, which is encouraging.

While new "slim" istream_range is faster than version with "fat" iterators, it is still not the fastest possible. In particular, it does two comparisons per cycle iteration: one is "!rng_->next()", and another is comparison of iterators, like "it != last". But it is possible to do only one comparison per iteration, like http://coliru.stacked-crooked.com/a/b0a989aa0b2e9741 : while(!empty(it)) { cout << *it << " "; ++it; } We can have new fast algorithms for such iterators. Such iterators are not replacement of STL iterators, but complementary. It is OK to have different iterators and different algorithms for different kinds of things - http://www.youtube.com/watch?v=COuHLky7E2Q&t=42m18s . Plus, when backward compatibility matters - we can use adaptors like you showed. -- Evgeny Panasyuk

Eric Niebler

5:13 p.m.

On 11/12/2013 8:21 AM, Evgeny Panasyuk wrote:

...

12.11.2013 19:41, Eric Niebler:

...
I've now built a filter_range, transform_range, and an istream_range in this vein. Initial results are looking promising, but I haven't benchmarked performance, yet. A commenter on my blog claims to have benchmarked perf of istream_range, with good results, which is encouraging.

While new "slim" istream_range is faster than version with "fat" iterators, it is still not the fastest possible. In particular, it does two comparisons per cycle iteration: one is "!rng_->next()", and another is comparison of iterators, like "it != last". <snip>

The commenter on my blog claims this extra check is optimized away by a smart compiler like gcc. He says[^1]:

...

Your istream_range solution is pretty awesome. I tested the generated assembly on gcc.godbolt.org and compared it to plain old while() looping and D-style iteration http://coliru.stacked-crooked.com/a/d22192e264e59715

Incredibly, the conditional nulling-out inside your iterator increment() appears to be optimized away altogether. It’s very surprising that the compiler can prove that a nullptr value will be compared to the other nullptr value lurking inside the end() iterator. Essentially the same assembly is generated for all 3 approaches.

It remains to be seen how well compilers can optimized several stacked range adapters, but at least in the simple cases we don't have to accept compromise. [^1]: http://ericniebler.com/2013/11/07/input-iterators-vs-input-ranges/#comment-1... -- Eric Niebler Boost.org http://www.boost.org

Daniel James

6:44 p.m.

On 12 November 2013 17:13, Eric Niebler <eniebler@boost.org> wrote:

...

On 11/12/2013 8:21 AM, Evgeny Panasyuk wrote:

...
12.11.2013 19:41, Eric Niebler:

...
I've now built a filter_range, transform_range, and an istream_range in this vein. Initial results are looking promising, but I haven't benchmarked performance, yet. A commenter on my blog claims to have benchmarked perf of istream_range, with good results, which is encouraging.

While new "slim" istream_range is faster than version with "fat" iterators, it is still not the fastest possible. In particular, it does two comparisons per cycle iteration: one is "!rng_->next()", and another is comparison of iterators, like "it != last". <snip>

The commenter on my blog claims this extra check is optimized away by a smart compiler like gcc. He says[^1]:

Regardless of efficiency, a range adaptor could make implementing input and forward ranges much easier.

Eric Niebler

6:59 p.m.

On 11/12/2013 10:44 AM, Daniel James wrote:

...

On 12 November 2013 17:13, Eric Niebler <eniebler@boost.org> wrote:

...
On 11/12/2013 8:21 AM, Evgeny Panasyuk wrote:

...
12.11.2013 19:41, Eric Niebler:

...
I've now built a filter_range, transform_range, and an istream_range in this vein. Initial results are looking promising, but I haven't benchmarked performance, yet. A commenter on my blog claims to have benchmarked perf of istream_range, with good results, which is encouraging.

While new "slim" istream_range is faster than version with "fat" iterators, it is still not the fastest possible. In particular, it does two comparisons per cycle iteration: one is "!rng_->next()", and another is comparison of iterators, like "it != last". <snip>

The commenter on my blog claims this extra check is optimized away by a smart compiler like gcc. He says[^1]:

Regardless of efficiency, a range adaptor could make implementing input and forward ranges much easier.

Agreed. I've implemented such an adaptor, but I'm not yet happy with it so I haven't committed it yet. I think we can take a page out of Boost.Iterator's book and have range_adaptor and range_facade. With some thought, we should have a good answer for the folks who just want to write something like the D range interface. -- Eric Niebler Boost.org http://www.boost.org

Evgeny Panasyuk

7:34 p.m.

12.11.2013 21:13, Eric Niebler:

...

The commenter on my blog claims this extra check is optimized away by a smart compiler like gcc. He says[^1]:

[...]

It remains to be seen how well compilers can optimized several stacked range adapters, but at least in the simple cases we don't have to accept compromise.

Thank you for note. For such simple case it is not that surprising (especially when "static single assignment form" is used). Another example where wrapper can add overhead is something like std::copy_n - where it is possible to do zero checks on iterator per loop iteration (though, there will be check on "external" integer). But nevertheless, D-style InputRanges are useful on their own. And still - I think it is the best form to represent InputRanges, it catches semantic of underlying concept very accurately (though, other categories of D ranges are controversial). Speculating a bit further on new concepts, it is possible to make something like: while(!empty(i)) { use(*i); ++i; } // ... --i; while(!empty(i)) { use(*i); --i; } (perhaps "empty" is not the right name) -- Evgeny Panasyuk

Thorsten Ottosen

4:38 p.m.

On 12-11-2013 16:41, Eric Niebler wrote:

...

I've now built a filter_range, transform_range, and an istream_range in this vein. Initial results are looking promising, but I haven't benchmarked performance, yet. A commenter on my blog claims to have benchmarked perf of istream_range, with good results, which is encouraging.

If anybody wants to take a peek, you can find the code here:

https://github.com/ericniebler/range-v3

I'm making no effort to maintain source compatibility, going instead with the design that seems right to me for C++11.

Good. I totally agree that r-value refs change things completely. Some small comments (applies for transform_range as well): A. For filter_range, you should inherit from the predicate if possible to enable EBO. I don't know if lambdas complicate this? B. For the constructor filter_range(Rng && rng, Pred pred) then maybe it would be beneficial that empty predicates are constructed via filter_range(Rng && rng) ? I guess it needs benchmarking though, but the compiler will have to elide a series of moves vs. default construct an empty class. C. I can't quite figure out your use of filterer::filterer1. Do you really have to store the predicate here also? -Thorsten

Eric Niebler

5:19 p.m.

On 11/12/2013 8:38 AM, Thorsten Ottosen wrote:

...

On 12-11-2013 16:41, Eric Niebler wrote:

...
I've now built a filter_range, transform_range, and an istream_range in this vein. Initial results are looking promising, but I haven't benchmarked performance, yet. A commenter on my blog claims to have benchmarked perf of istream_range, with good results, which is encouraging.

If anybody wants to take a peek, you can find the code here:

https://github.com/ericniebler/range-v3

I'm making no effort to maintain source compatibility, going instead with the design that seems right to me for C++11.

Good. I totally agree that r-value refs change things completely.

Some small comments (applies for transform_range as well):

A. For filter_range, you should inherit from the predicate if possible to enable EBO. I don't know if lambdas complicate this?

I should use EBO, but that's not the right place for it. It adds a bunch of associated namespaces to the filter_range type. Rather, the predicate and the adapted range should be in a compressed pair. I have this working locally already. The space savings are worth it: a filtered, filtered, transformed range goes from 24 bytes to 16.

...

B. For the constructor

filter_range(Rng && rng, Pred pred)

then maybe it would be beneficial that empty predicates are constructed via

filter_range(Rng && rng)

? I guess it needs benchmarking though, but the compiler will have to elide a series of moves vs. default construct an empty class.

I don't think this is worth it. I can't imaging any compiler having a hard time eliminating empty moves. There's nothing to move!

...

C. I can't quite figure out your use of filterer::filterer1. Do you really have to store the predicate here also?

filterer1 is the object returned by "filter" in an expression like this: vec | filter(pred) So yes, it needs to store the pred. One of my breaking changes is to allow the "filter" name to be reused such that these two are equivalent: vec | filter(pred) filter(vec, pred) I want this. I never liked "filtered" or the adaptors namespace. -- Eric Niebler Boost.org http://www.boost.org

Thorsten Ottosen

13 Nov 13 Nov

9:31 a.m.

On 12-11-2013 18:19, Eric Niebler wrote:

...

On 11/12/2013 8:38 AM, Thorsten Ottosen wrote:

...
On 12-11-2013 16:41, Eric Niebler wrote:

...

I should use EBO, but that's not the right place for it. It adds a bunch of associated namespaces to the filter_range type. Rather, the predicate and the adapted range should be in a compressed pair. I have this working locally already. The space savings are worth it: a filtered, filtered, transformed range goes from 24 bytes to 16.

Nice.

...

So yes, it needs to store the pred. One of my breaking changes is to allow the "filter" name to be reused such that these two are equivalent:

vec | filter(pred) filter(vec, pred)

I want this. I never liked "filtered" or the adaptors namespace.

Sounds reasonable. Your implementation looks good. regards -Thoorsten

4278

Age (days ago)

4284

Last active (days ago)

List overview

Download

27 comments

11 participants

participants (11)

Andrey Semashev
Daniel James
Dave Abrahams
Eric Niebler
Evgeny Panasyuk
Jonathan Wakely
Luc Danton
Nathan Ridge
Neil Groves
Thorsten Ottosen
TONGARI J