[iterators][range][rangeex] range concatenation
Is it possible with Boost (or RangeEx) to concatenate ranges? Sample usage: template <class Range> void process_range(Range const &); std::vector<foo> v = ...; std::list<foo> l = ...; process_range(concat(v, l));
Dmitry Vinogradov wrote:
Is it possible with Boost (or RangeEx) to concatenate ranges?
Sample usage:
template <class Range> void process_range(Range const &);
std::vector<foo> v = ...; std::list<foo> l = ...; process_range(concat(v, l));
I take it you have a good reason not to simply copy v and l into a new container. iterator_facade might help, and iterator_range to obtain a range.
er skrev:
Dmitry Vinogradov wrote:
Is it possible with Boost (or RangeEx) to concatenate ranges?
Sample usage:
template <class Range> void process_range(Range const &);
std::vector<foo> v = ...; std::list<foo> l = ...; process_range(concat(v, l));
I take it you have a good reason not to simply copy v and l into a new container. iterator_facade might help, and iterator_range to obtain a range.
It will be impossible to implement a generic iterator that allows for fast iteration over the two ranges. The problem is that the iterator would have to check when the new container starts, similar to segments in a std::deque<T>. So I would just call process_range() twice. If you really need to view the two ranges as one range, then you have to pay an abstraction penalty, albeit it might be more efficiant than copying the two containers into a new one (and it is probably not that trivial to write the iterator that depends on two or more other iterator types). If "foo" is a heavy object, then copying the references of the elements to a new container std::vector<boost::reference_wrapper<foo>> is probably the most efficient alternative. -Thorsten
Thorsten Ottosen wrote:
er skrev:
Dmitry Vinogradov wrote:
Is it possible with Boost (or RangeEx) to concatenate ranges?
Sample usage:
template <class Range> void process_range(Range const &);
std::vector<foo> v = ...; std::list<foo> l = ...; process_range(concat(v, l));
I take it you have a good reason not to simply copy v and l into a new container. iterator_facade might help, and iterator_range to obtain a range.
It will be impossible to implement a generic iterator that allows for fast iteration over the two ranges. The problem is that the iterator would have to check when the new container starts, similar to segments in a std::deque<T>.
So I would just call process_range() twice. If you really need to view the two ranges as one range, then you have to pay an abstraction penalty, albeit it might be more efficiant than copying the two containers into a new one (and it is probably not that trivial to write the iterator that depends on two or more other iterator types).
If "foo" is a heavy object, then copying the references of the elements to a new container std::vector<boost::reference_wrapper<foo>> is probably the most efficient alternative.
Calling process_range() twice is not a solution in my case. Copying references to a new container is a better way. Regarding efficiency, can you look thru my rough concat() implementation attached to discover any its disadvantages? #define BOOST_TEST_MODULE ConcatTest #include <boost/test/unit_test.hpp> #include <vector> #include <list> #include "concat.h" template <typename Range1, typename Range2, typename Range3> bool TestConcat(Range1 &r1, Range2 &r2, Range3 &r3) { std::vector<int> v1, v2; v1.insert(v1.end(), r1.begin(), r1.end()); v1.insert(v1.end(), r2.begin(), r2.end()); v2.insert(v2.end(), r3.begin(), r3.end()); return v1 == v2; } template <typename Range1, typename Range2> bool TestConcat(Range1 &r1, Range2 &r2) { return TestConcat(r1, r2, Concatenate(r1, r2)); } BOOST_AUTO_TEST_CASE(Concat) { int a1[] = { 1, 2, 3, 4, 5 }; int a2[] = { 10, 11, 12 }; std::vector<int> v(boost::begin(a1), boost::end(a1)); std::list<int> l(boost::begin(a2), boost::end(a2)); int value = 0; boost::iterator_range<int*> i(&value, &value+1), n((int*)NULL, (int*)NULL); BOOST_CHECK(TestConcat(l, v)); BOOST_CHECK(TestConcat(v, l)); BOOST_CHECK(TestConcat(v, v)); BOOST_CHECK(TestConcat(n, v)); BOOST_CHECK(TestConcat(v, n)); BOOST_CHECK(TestConcat(n, n)); BOOST_CHECK(TestConcat(n, i)); BOOST_CHECK(TestConcat(i, n)); BOOST_CHECK(TestConcat(i, v)); BOOST_CHECK(TestConcat(i, i)); } #pragma once #include <boost/iterator/iterator_facade.hpp> #include <boost/range/iterator_range.hpp> #include <boost/variant.hpp> #include <boost/variant/apply_visitor.hpp> #include <boost/variant/static_visitor.hpp> template <typename reference> struct DereferenceVisitor : public boost::static_visitor<reference> { template <typename T> reference operator()(T Iterator) const { return *Iterator; } }; struct IncrementVisitor : public boost::static_visitor<void> { template <typename T> void operator()(T &Value) const { ++Value; } }; struct DecrementVisitor : public boost::static_visitor<void> { template <typename T> void operator()(T &Value) const { --Value; } }; template <class TIterator1, class TIterator2, class TValue = typename boost::iterator_value<TIterator1>::type > class ConcatIterator : public boost::iterator_facade<ConcatIterator<TIterator1, TIterator2, TValue>, TValue, boost::bidirectional_traversal_tag> { public: template <class TIterator> ConcatIterator(TIterator It, size_t Range /* 0 for the first range, 1 for the second range */, TIterator1 End1, TIterator2 Begin2) : It(It), Range(Range), End1(End1), Begin2(Begin2) { Stabilize(); } private: void Stabilize() { if ((Range == 0) && (It == End1)) { It = Begin2; Range = 1; } } public: reference dereference() const { return boost::apply_visitor(DereferenceVisitor<reference>(), It); } void increment() { boost::apply_visitor(IncrementVisitor(), It); if ((Range == 0) && (It == End1)) { It = Begin2; Range = 1; } } void decrement() { if ((Range == 1) && (It == Begin2)) { It = End1; Range = 0; } boost::apply_visitor(DecrementVisitor(), It); } bool equal(const ConcatIterator& lhs) const { return (Range == lhs.Range) && (It == lhs.It); } private: size_t Range; boost::variant<TIterator1, TIterator2> It, End1, Begin2; }; template <class TRange1, class TRange2> boost::iterator_range<ConcatIterator<typename TRange1::iterator, typename TRange2::iterator> > Concatenate(TRange1 &Range1, TRange2 &Range2) { typedef ConcatIterator<typename TRange1::iterator, typename TRange2::iterator> TIterator; return boost::make_iterator_range( TIterator(Range1.begin(), 0, Range1.end(), Range2.begin()), TIterator(Range2.end (), 1, Range1.end(), Range2.begin())); } template <class TRange1, class TRange2> boost::iterator_range<ConcatIterator<typename TRange1::const_iterator, typename TRange2::const_iterator> > Concatenate(TRange1 const &Range1, TRange2 const &Range2) { typedef ConcatIterator<typename TRange1::const_iterator, typename TRange2::const_iterator> TIterator; return boost::make_iterator_range( TIterator(Range1.begin(), 0, Range1.end(), Range2.begin()), TIterator(Range2.end (), 1, Range1.end(), Range2.begin())); }
Dmitry Vinogradov skrev:
Calling process_range() twice is not a solution in my case. Copying references to a new container is a better way.
Regarding efficiency, can you look thru my rough concat() implementation attached to discover any its disadvantages?
Wow. Nice work! Some comments: 0. I prefer to use unsigned instead of std::size_t. std::size_t can be larger than a word IIRC. 1. You can call Stabilize() in increment(). 2. You should probably check It == End1 before Range == 0, because Range == 0 will be true most of the time during the iteration through the first range. 3. How can you avoid to store the second end iterator? How does the performace compare to creating a vector of references? (remember to call reserve()). If the performance is good,this baby should be included in the range library asap. -Thorsten
On Mon, Apr 6, 2009 at 10:46 PM, Thorsten Ottosen < thorsten.ottosen@dezide.com> wrote:
Dmitry Vinogradov skrev:
Calling process_range() twice is not a solution in my case. Copying
references to a new container is a better way.
Regarding efficiency, can you look thru my rough concat() implementation attached to discover any its disadvantages?
It is undeniably more elegant than calling algorithms n times as a general solution. It is possible to improve performance by reordering some of the boolean expressions and by reducing the use of boost variant.
If the performance is good,this baby should be included in the range library asap.
I would like to implement a new version of this functionality, since I have some code I had prototyped previously. I believe that the idea is good, but the functionality should efficiently support two random access iterators, and that the performance can be improved by reducing the use of boost::variant, although I would have to measure the variant approach to be certain. I can add this to the list of things to do in response to the Boost.RangeEx review.
-Thorsten
Neil
Neil Groves skrev:
On Mon, Apr 6, 2009 at 10:46 PM, Thorsten Ottosen <thorsten.ottosen@dezide.com <mailto:thorsten.ottosen@dezide.com>> wrote:
Dmitry Vinogradov skrev:
Calling process_range() twice is not a solution in my case. Copying references to a new container is a better way.
Regarding efficiency, can you look thru my rough concat() implementation attached to discover any its disadvantages?
It is undeniably more elegant than calling algorithms n times as a general solution. It is possible to improve performance by reordering some of the boolean expressions and by reducing the use of boost variant.
Right on.
If the performance is good,this baby should be included in the range library asap.
I would like to implement a new version of this functionality, since I have some code I had prototyped previously. I believe that the idea is good, but the functionality should efficiently support two random access iterators, and that the performance can be improved by reducing the use of boost::variant, although I would have to measure the variant approach to be certain. I can add this to the list of things to do in response to the Boost.RangeEx review.
Great. I'm *sure* that the variant thing adds overhead: 1. when the iterators have different size 2. because we check two times that we have a certain case: if( range == 0 ) also implies that we should increment a certain iterator, else another. I wonder if we could call this beast join_view(r1,r2) or something. Perhaps support for 3 and 4 ranges would be nice. -Thorsten
On Tue, Apr 7, 2009 at 7:49 AM, Thorsten Ottosen < thorsten.ottosen@dezide.com> wrote:
I wonder if we could call this beast
join_view(r1,r2)
or something. Perhaps support for 3 and 4 ranges would be nice.
How about r1 | join(r2) | join(r3) ? I was thinking that this looks very much like a job for a range adaptor. The support for multiple ranges is almost inherent because you can chain the iterator such that the second iterator type is a join_iterator. Is there any good reason that this should not be a range adaptor that anyone can think of?
-Thorsten ____________________________________________
Best Wishes, Neil Groves
Neil Groves skrev:
On Tue, Apr 7, 2009 at 7:49 AM, Thorsten Ottosen <thorsten.ottosen@dezide.com <mailto:thorsten.ottosen@dezide.com>> wrote:
I wonder if we could call this beast
join_view(r1,r2)
or something. Perhaps support for 3 and 4 ranges would be nice.
How about r1 | join(r2) | join(r3) ?
I was thinking that this looks very much like a job for a range adaptor. The support for multiple ranges is almost inherent because you can chain the iterator such that the second iterator type is a join_iterator.
I suspect that r1 | join(r2) | join(r3) will be less efficient than join(r1,r2,r3), because the former will lead to double checks of which range that is being iterated, unless there is some clever way to avoid that. But the | joined(r1,r2) syntax or whatever is fine with me. -Thorsten
AMDG Thorsten Ottosen wrote:
I suspect that r1 | join(r2) | join(r3) will be less efficient than join(r1,r2,r3), because the former will lead to double checks of which range that is being iterated, unless there is some clever way to avoid that. But the | joined(r1,r2) syntax or whatever is fine with me.
It would be pretty easy to make r1 | join(r2) recognize whether r1 and r1 are themselves joint_views. In Christ, Steven Watanabe
Steven Watanabe skrev:
AMDG
Thorsten Ottosen wrote:
I suspect that r1 | join(r2) | join(r3) will be less efficient than join(r1,r2,r3), because the former will lead to double checks of which range that is being iterated, unless there is some clever way to avoid that. But the | joined(r1,r2) syntax or whatever is fine with me.
It would be pretty easy to make r1 | join(r2) recognize whether r1 and r1 are themselves joint_views.
Right, but is it easy to "unroll" the necessary information inside e.g. increment()? -Thorsten
AMDG Thorsten Ottosen wrote:
Steven Watanabe skrev:
AMDG
Thorsten Ottosen wrote:
I suspect that r1 | join(r2) | join(r3) will be less efficient than join(r1,r2,r3), because the former will lead to double checks of which range that is being iterated, unless there is some clever way to avoid that. But the | joined(r1,r2) syntax or whatever is fine with me.
It would be pretty easy to make r1 | join(r2) recognize whether r1 and r1 are themselves joint_views.
Right, but is it easy to "unroll" the necessary information inside e.g. increment()?
You don't necessarily have to. template<class Sequences> class joint_view; template<class S, class R> joint_view<typename mpl::push_back<S, R>::type> operator|(const joint_view<S>& v, R& r); In Christ, Steven Watanabe
Steven Watanabe skrev:
AMDG
Thorsten Ottosen wrote:
Steven Watanabe skrev:
AMDG
Thorsten Ottosen wrote:
I suspect that r1 | join(r2) | join(r3) will be less efficient than join(r1,r2,r3), because the former will lead to double checks of which range that is being iterated, unless there is some clever way to avoid that. But the | joined(r1,r2) syntax or whatever is fine with me.
It would be pretty easy to make r1 | join(r2) recognize whether r1 and r1 are themselves joint_views.
Right, but is it easy to "unroll" the necessary information inside e.g. increment()?
You don't necessarily have to.
template<class Sequences> class joint_view;
template<class S, class R> joint_view<typename mpl::push_back<S, R>::type> operator|(const joint_view<S>& v, R& r);
The hard part is the implementation of increment()/decrement(). -Thorsten
AMDG Thorsten Ottosen wrote:
Thorsten Ottosen wrote:
I suspect that r1 | join(r2) | join(r3) will be less efficient than join(r1,r2,r3), because the former will lead to double checks of which range that is being iterated, unless there is some clever way to avoid that. But the | joined(r1,r2) syntax or whatever is fine with me.
The hard part is the implementation of increment()/decrement().
Right, but what I was trying to say was that join(r1, r2, r3) is not significantly different from r1 | join(r2) | join(r3). If we can implement one efficiently, we can implement the other efficiently. In Christ, Steven Watanabe
Thorsten Ottosen wrote:
Dmitry Vinogradov skrev:
Regarding efficiency, can you look thru my rough concat() implementation attached to discover any its disadvantages?
Wow. Nice work!
Some comments:
0. I prefer to use unsigned instead of std::size_t. std::size_t can be larger than a word IIRC.
1. You can call Stabilize() in increment().
2. You should probably check It == End1 before Range == 0, because Range == 0 will be true most of the time during the iteration through the first range.
Thanks, I'll use these hints.
3. How can you avoid to store the second end iterator?
I need to store three iterators: current state (It) - it's for sure, and the "gap" between ranges - Range1.end() and Range2.begin() for backward and forward jumping between Range1 and Range2. These iterators is all that I store. I can store Range1.end() and Range2.begin() as usual iterators, not as variants. Currently they are stored as variants just for simple comparision with `It' variant. It's possible to get rid of Range member and use It.which() instead, but I think it's hard to control `which' member of variant while having typeof(TIterator1)==typeof(TIterator2), because "It = SomeIterator;" statement will lead always to It.which()==0. Self-invented variant could help here.
How does the performace compare to creating a vector of references? (remember to call reserve()).
I think it's anyway very useful to have concat(). One can make his own decision between concat() and vector of references for every particular case. Creating vector will always lead to heap allocation and IFAIK it's not time-cheap. Add time for copying references. Summary time will depend a lot of particular algorithm is used for a range, so again I think everyone must have a choice.
If the performance is good,this baby should be included in the range library asap.
In additional for all, my implementation miss one big thing: it doesn't have an auto-detection of ConcatIterator traversal category. So, I bet for Neil's implementation he noticed in this thread ;)
If the performance is good,this baby should be included in the range library asap.
-Thorsten
Hello, I am wondering if perhaps someone responsible for range_ex has implemented concat and would care to kindly share it (apparently it's not included in the vault). Many thanks.
er wrote:
I am wondering if perhaps someone responsible for range_ex has implemented concat and would care to kindly share it (apparently it's not included in the vault).
Someone tells me I overlooked chain which works similarly, albeit for single pass only. Very useful, thanks! #include <iostream> #include <vector> #include <list> #include <iterator> #include <algorithm> #include <boost/range.hpp> #include <boost/range/chain.hpp> #include <boost/typeof/typeof.hpp> #include <boost/next_prior.hpp> int main (int argc, char * const argv[]) { typedef double val_; typedef std::vector<val_> vec_; typedef std::list<val_> list_; vec_ seq0; seq0.push_back(1); list_ seq1; seq1.push_back(2); vec_ seq2; seq2.push_back(3); seq2.push_back(4); vec_ seq3; seq2.push_back(5); BOOST_AUTO( chained, boost::chain(seq0, boost::chain(seq1, boost::chain(seq2,seq3) ) ) ); BOOST_ASSERT( *boost::begin(chained) == 1 ); // Warning : reference to temporary BOOST_ASSERT( *boost::next(boost::begin(chained),1) == 2); BOOST_ASSERT( *boost::next(boost::begin(chained),2) == 3); BOOST_ASSERT( *boost::next(boost::begin(chained),3) == 4); BOOST_ASSERT( *boost::next(boost::begin(chained),4) == 5); std::copy( boost::begin(chained), boost::end(chained), std::ostream_iterator<val_>(std::cout, " ") ); return 0; }
Someone tells me I overlooked chain which works similarly, albeit for single pass only. Very useful, thanks!
Here's a suggestion for generalizing range::chain https://svn.boost.org/svn/boost/sandbox/statistics/detail/range_ex/
participants (5)
-
Dmitry Vinogradov
-
er
-
Neil Groves
-
Steven Watanabe
-
Thorsten Ottosen