Interest in flat_iterator for nested containers?

Hi, is there any interest in an iterator that provides a flat view of nested containers? For example ---------- using namespace boost::assign; typedef std::list<int> inner_container_type; typedef std::vector<inner_container_type> outer_container_type; outer_container_type elements(6); elements[1] += 1,2,3; elements[2] += 4,5; elements[4] += 6; BOOST_CHECK(std::equal( make_flat_iterator(elements.begin(), elements.end()), make_flat_iterator(elements.end(), elements.end()), list_of(1)(2)(3)(4)(5)(6).begin()) ); ---------- If so, I'll be glad to share my code with the boost community. I attach my current state of development and a single test suite. The current state of development contains a class named flat_iterator, a make_flat_iterator function and a set of meta-functions to derive required types. flat_iterator derives from boost::iterator_adaptor. It derives an inner_iterator type from the passed iterator to be adapted (outer_iterator type). The value_type of outer_iterator is required to model STL container concept. The value_type passed to boost::iterator_adaptor is const-qualified if the reference type of the outer_iterator or inner_iterator type is const-qualified (please see question 1 below). The traversal category of flat_iterator is bidirectional if the the outer_iterator and inner_iterator type are at least bidirectional. It models forward traversal else (question 2). flat_iterators can be stacked over each other to recursively flatten a hierarchy of arbitrary levels. ---------- using namespace boost::assign; typedef std::list<int> level_0; typedef std::vector<level_0> level_1; typedef std::vector<level_1> level_2; level_2 elements(10); elements[2] += list_of(1)(2); elements[4] += list_of(3)(4); typedef flat_iterator<level_2::iterator> flat_iterator_outer; typedef flat_iterator<flat_iterator_outer> flat_iterator_inner; flat_iterator_outer outer_begin(elements.begin(), elements.end()); flat_iterator_outer outer_end(elements.end(), elements.end()); flat_iterator_inner inner_begin(outer_begin, outer_end); flat_iterator_inner inner_end(outer_end, outer_end); BOOST_CHECK(std::equal(inner_begin, inner_end, list_of(1)(2)(3)(4).begin())); ---------- Open questions: * Currently the the value_type passed to boost::iterator_adaptor is const qualified based on required reference type of the outer_iterator, which renders the code for input iterators unusable, right? What's the best way to achieve const correctness? * The traversal category of flat_iterator is supposed to be the 'greatest common divisor' of the inner and outer iterator type. Is there a meta-function in boost that provides that type? Currently bidirectional traversal is chosen if both, inner and outer iterator, are at least of bidirectional traversal. Else forward traversal is chosen (which is of course incorrect when incrementable is passed). * Stacking flat_iterators to flatten out a hierarchy of arbitrary levels is possible but currently requires a lot of typing. I think that applying some recursive meta-functions to generate a correct iterator types for such situations should be possible but is beyond my current knowledge of meta-programming. Kind regards, Christoph

Christoph Heindl wrote:
Hi,
is there any interest in an iterator that provides a flat view of nested containers? For example
I've found this pretty useful, and I've coded up and used something like this as well (mine's called flat_multirange_iterator). If interested, I can send you the source to give you some ideas, but it makes generous use of other (relatively small) reusable components I've developed, so it would take a little work to excise it out of my code collection. It allows you to flatten a multirange of arbitrary depth. I'll elaborate below.
Open questions:
* Currently the the value_type passed to boost::iterator_adaptor is const qualified based on required reference type of the outer_iterator, which renders the code for input iterators unusable, right? What's the best way to achieve const correctness?
I derive from iterator_facade, and explicitly provide the reference type (which is the reference type of the deepest-level range) to achieve const-correctness. Well, that's not quite the whole story. When traversing down the hieararchy of ranges, constness is propagated from the outer-most range to all inner ranges, which is an imperfect but for-now-workable solution.
* The traversal category of flat_iterator is supposed to be the 'greatest common divisor' of the inner and outer iterator type. Is there a meta-function in boost that provides that type? Currently bidirectional traversal is chosen if both, inner and outer iterator, are at least of bidirectional traversal. Else forward traversal is chosen (which is of course incorrect when incrementable is passed).
I would describe it as the "widest convertible" type, since the traversal concepts form a linear hiearchy, but that's not to say your description is incorrect ;) I actually have a facility to get by with single-pass traversal by using (the equivalent of) boost::optional's to work around the lack of default-constructibility of the underlying iterators. The user can also explicitly provide the traversal type as a template parameter.
* Stacking flat_iterators to flatten out a hierarchy of arbitrary levels is possible but currently requires a lot of typing. I think that applying some recursive meta-functions to generate a correct iterator types for such situations should be possible but is beyond my current knowledge of meta-programming.
Kind regards, Christoph
Right. I use a boost::fusion::vector of iterator_pack's, where an iterator_pack keeps track of the "current position" at the corresponding level's iteration: template< class It, class Traversal > struct iterator_pack; template< class It > struct iterator_pack< It, boost::bidirectional_traversal_tag > { It begin, current, end; /* ... */ } template< class It > struct iterator_pack< It, boost::forward_traversal_tag > { It current, end; /* ... */ } template< class SinglePassReadableMultirange, int Dimension, class CategoryOrTraversal
struct flat_multirange_iterator { /* ... */ private: // Note: multirange_multiiterator evalutes to a Boost.MPL sequence of the iterators at each level of the range hierarchy. typedef typename boost::fusion::result_of::as_vector< typename boost::mpl::transform< typename multirange_multiiterator< SinglePassReadableMultirange, Dimension >::type, detail_iterator::iterator_pack< boost::mpl::_1, traversal_type > >::type >::type iterator_pack_sequence_type; iterator_pack_sequence_type m_it_packs; }; There is also a single-pass traversal version of iterator_pack (which holds a boost::optional of a forward-traversal iterator_pack). I make gratuitous use of the Boost.Fusion transformation algorithms when possible, and just straight-up iteration over Boost.Fusion sequences otherwise, so (IMHO) the code is not so bad. Using these iterator_pack's to define the current iteration location puts some requirements on the multirange you're flattening, along the lines that iterators at deeper levels remain valid even after the reference to their parent range goes out of scope. This has worked fine for me so far, but there may be weaker requirements if one uses a different strategy...? I'll take a look at your implementation and see how it compares to what I'm currently using. - Jeff

Jeffrey,
I derive from iterator_facade, and explicitly provide the reference type (which is the reference type of the deepest-level range) to achieve const-correctness.
Well, that's not quite the whole story. When traversing down the hieararchy of ranges, constness is propagated from the outer-most range to all inner ranges, which is an imperfect but for-now-workable solution.
Do you have something better in mind?
* The traversal category of flat_iterator is supposed to be the 'greatest common divisor' of the inner and outer iterator type. Is there a meta-function in boost that provides that type? Currently bidirectional traversal is chosen if both, inner and outer iterator, are at least of bidirectional traversal. Else forward traversal is chosen (which is of course incorrect when incrementable is passed).
I would describe it as the "widest convertible" type, since the traversal concepts form a linear hiearchy, but that's not to say your description is incorrect ;) I actually have a facility to get by with single-pass traversal by using (the equivalent of) boost::optional's to work around the lack of default-constructibility of the underlying iterators. The user can also explicitly provide the traversal type as a template parameter.
"widest convertible" sounds good to me :) Do you already have random access traversal support?
* Stacking flat_iterators to flatten out a hierarchy of arbitrary levels is possible but currently requires a lot of typing. I think that applying some recursive meta-functions to generate a correct iterator types for such situations should be possible but is beyond my current knowledge of meta-programming.
Kind regards, Christoph
Right. I use a boost::fusion::vector of iterator_pack's, where an iterator_pack keeps track of the "current position" at the corresponding level's iteration:
Interesting. Didn't throught of such a solution.
Using these iterator_pack's to define the current iteration location puts some requirements on the multirange you're flattening, along the lines that iterators at deeper levels remain valid even after the reference to their parent range goes out of scope. This has worked fine for me so far, but there may be weaker requirements if one uses a different strategy...?
Again please?
I'll take a look at your implementation and see how it compares to what I'm currently using.
Keep in mind that it is an early implementation. Christoph

Christoph Heindl wrote:
I derive from iterator_facade, and explicitly provide the reference type (which is the reference type of the deepest-level range) to achieve const-correctness.
Well, that's not quite the whole story. When traversing down the hieararchy of ranges, constness is propagated from the outer-most range to all inner ranges, which is an imperfect but for-now-workable solution.
Do you have something better in mind?
Nope. I only said "imperfect" because I'm open to there being a more correct determination of the reference type. Like I said, what I have now has worked so far.
Do you already have random access traversal support?
After very little thought when first implementing this, I gave up on trying to smartly implement random-access traversal. For one, you can't make constant-time guarantees without additional knowledge of your multirange instance (e.g., all subranges are the same size) or additional bookkeeping. Do you have any ideas on how to implement this?
Using these iterator_pack's to define the current iteration location puts some requirements on the multirange you're flattening, along the lines that iterators at deeper levels remain valid even after the reference to their parent range goes out of scope. This has worked fine for me so far, but there may be weaker requirements if one uses a different strategy...?
Again please?
I assume you want clarification on the general requirements of a multirange to be able to present a flattened view of it. To reiterate, I require iterators of a subrange to remain valid even after the reference to the parent range has gone out of scope and been destroyed. This usually poses no problems if all reference types are real references, but, as a concrete example, you can imagine something like the following 2-level multirange: make_transform_range(outer_range, inner_range_functor()) make_transform_range returns a lazily-evaluated range (basically the range-equivalent to a boost::transform_iterator) and inner_range_functor is a function object that accepts a value_type of outer_range and returns a range itself. For example struct inner_range_functor { template< class T > boost::array<T,1> operator()(const T& x) const { boost::array<T,1> result = {{ x }}; return result; } }; Trying to flatten this using the iterator_pack implementation will fail miserably (as I found out the hard way) since the results of applying the inner_range_functor to the values of outer_range aren't saved anywhere, so iterators into those results are dangling. Of course, this is a simple example. Real examples where this causes problems are much more opaque, and from the user point-of-view it might not be so obvious what expressions should and should not be flattened. Like I said, I've been bitten by this problem. If there was some way to detect the rvalue-ness of the result of dereferencing an iterator, and subsequently storing the result in the flat_iterator, then the example above might be made to work. Or something else might work. I don't know :/ I haven't given it too much thought since *most of the time*, what I have is sufficient. - Jeff

Jeff,
* The traversal category of flat_iterator is supposed to be the 'greatest common divisor' of the inner and outer iterator type. Is there a meta-function in boost that provides that type? [...]
I would describe it as the "widest convertible" type, since the traversal concepts form a linear hiearchy, but that's not to say your description is incorrect ;)
out of curiosity I've implemented the 'widest convertible' meta-function as reformulated greatest common divisor problem (called it widest_convertible_traversal_cat as you suggested though :). It works by declaring a bidirectional mapping from category to id and vice versa. If ids are chosen to be the powers of two then the widest convertible traversal category is the greatest common divisor of both category ids. The code consists of a few lines of code (reusing boost::math::static_gcd) and should be quite fast as far as compile time is concerned (assuming euclids gcd algorithm is used it takes a single iteration). It might be the fastest solution, but a quite elegant one (I think) :) Please find my code with complete test suite attached. Do you think that piece of code might be of general interest? Best regards, Christoph

Christoph Heindl wrote:
out of curiosity I've implemented the 'widest convertible' meta-function as reformulated greatest common divisor problem (called it widest_convertible_traversal_cat as you suggested though :). It works by declaring a bidirectional mapping from category to id and vice versa. If ids are chosen to be the powers of two then the widest convertible traversal category is the greatest common divisor of both category ids.
Clever! Though in this instance, the minimum is identical to the gcd, at which point you could just assign the traversal tags to consecutive integers ;) But this is something I'll have to keep in mind for any fixed class hierarchy, as the gcd could provide a nice way to compute the "least common ancestor". The widest_convertible metafunction I have is slower than one specialized for a linear hiearchy (O(n^2) instead of O(n)), and based on boost::is_convertible. See below.
The code consists of a few lines of code (reusing boost::math::static_gcd) and should be quite fast as far as compile time is concerned (assuming euclids gcd algorithm is used it takes a single iteration). It might be the fastest solution, but a quite elegant one (I think) :)
Please find my code with complete test suite attached.
Do you think that piece of code might be of general interest?
It would be much more useful if it were variadic, either taking a variable number of template parameters or (my preference) accepting a Boost.MPL sequence of traversal tags. -------- #include <boost/mpl/begin_end.hpp> #include <boost/mpl/deref.hpp> #include <boost/mpl/find_if.hpp> #include <boost/mpl/not.hpp> #include <boost/mpl/placeholders.hpp> #include <boost/type_traits/is_convertible.hpp> #include <boost/type_traits/is_same.hpp> namespace detail_widest_convertible { template< class Sequence, class T > struct all_are_convertible : boost_ext::mpl::satisfies_all< Sequence, boost::is_convertible< boost::mpl::_1, T > > { }; } // namespace detail_widest_convertible template< class Sequence > struct has_widest_convertible : boost::mpl::not_< boost::is_same< typename boost::mpl::end< Sequence >::type, typename boost::mpl::find_if< Sequence, detail_widest_convertible::all_are_convertible< Sequence, boost::mpl::_1 > >::type > > { }; template< class Sequence > struct widest_convertible : boost::mpl::deref< typename boost::mpl::find_if< Sequence, detail_widest_convertible::all_are_convertible< Sequence, boost::mpl::_1 > >::type > { }; -------- where satisfies_all is a Boost.MPL metafunction defined as follows: -------- #include <boost/mpl/begin_end.hpp> #include <boost/mpl/bind.hpp> #include <boost/mpl/find_if.hpp> #include <boost/mpl/lambda.hpp> #include <boost/mpl/not.hpp> #include <boost/mpl/placeholders.hpp> #include <boost/mpl/quote.hpp> #include <boost/type_traits/is_same.hpp> namespace boost_ext { namespace mpl { template< class Sequence, class Pred > struct satisfies_all : boost::is_same< typename boost::mpl::find_if< Sequence, boost::mpl::bind< boost::mpl::quote1< boost::mpl::not_ >, boost::mpl::bind< typename boost::mpl::lambda< Pred >::type, boost::mpl::_1 > > >::type, typename boost::mpl::end< Sequence >::type > { }; } // namespace mpl } // namespace boost_ext -------- - Jeff

Jeff,
Clever! Though in this instance, the minimum is identical to the gcd, at which point you could just assign the traversal tags to consecutive integers ;)
I'm not sure if I understood you correctly: Assigning, for example, 1:readable, 2:single pass and 3:forward iterable would yield 1:readable for the gcd of 2:single pass and 3:foward which is not true (IMHO :).
But this is something I'll have to keep in mind for any fixed class hierarchy, as the gcd could provide a nice way to compute the "least common ancestor".
I'm glad I could help.
The widest_convertible metafunction I have is slower than one specialized for a linear hiearchy (O(n^2) instead of O(n)), and based on boost::is_convertible. See below.
Thanks for your input! It's gonna take a while for me to understand the code as I'm not to familiar with all the meta-functions in place.
Do you think that piece of code might be of general interest?
It would be much more useful if it were variadic, either taking a variable number of template parameters or (my preference) accepting a Boost.MPL sequence of traversal tags.
You are quite challenging my meta-programming knowledge :) Fortunately I found out about mpl::fold and applied it straight forward to my previously created widest_convertible_traversal_cat (although renamed to widest_convertible_binary): ------ template<class Sequence> struct widest_convertible_sequence { typedef typename boost::mpl::if_< boost::mpl::empty<Sequence>, boost::no_traversal_tag, typename boost::mpl::fold< Sequence, boost::random_access_traversal_tag, boost::widest_convertible_binary<boost::mpl::_1, boost::mpl::_2> >::type
::type type; };
I've attached my adopted version. I think your idea of re-using this pattern for "least common ancestor" for a fixed class hierarchy shouldn't require to many adoptions to the current code basis. If the user is able to provide the mapping the rest is already in place (I think). Best regards, Christoph

Christoph Heindl wrote:
Clever! Though in this instance, the minimum is identical to the gcd, at which point you could just assign the traversal tags to consecutive integers ;)
I'm not sure if I understood you correctly: Assigning, for example, 1:readable, 2:single pass and 3:forward iterable would yield 1:readable for the gcd of 2:single pass and 3:foward which is not true (IMHO :).
Oh, I meant that you could replace gcd with min (in a linear hierarchy). - Jeff

Jeff, you are right. I wonder if non-linear hierarchy can be mapped to a GCD problem anyway. I guess it is possible for a finite set of hierarchies, but figuring out is a rather hard job todo. Therefore I'll switch to a min (or max) problem using consecutive integers. Another advantage of this is that the hierarchy can be supplied as an mpl sequence of types and the forward/backward mapping of numbers can happen implicitely inside widest_convertible: ------ template<class Sequence, class LinearHierarchySequence> struct widest_convertible { typedef typename .... type; }; typedef mpl::vector< boost::incrementable_traversal_tag boost::single_pass_traversal_tag, boost::forward_traversal_tag, boost::bidirectional_traversal_tag, boost::random_access_traversal_tag> linear_hierarchy; widest_convertible< mpl::vector<boost::forward_traversal_tag, boost::bidirectional_traversal_tag>, linear_hierarchy>::type t; ----- Best regards, Christoph On Mon, Sep 28, 2009 at 9:48 PM, Jeffrey Hellrung <jhellrung@ucla.edu> wrote:
Christoph Heindl wrote:
Clever! Though in this instance, the minimum is identical to the gcd, at which point you could just assign the traversal tags to consecutive integers ;)
I'm not sure if I understood you correctly: Assigning, for example, 1:readable, 2:single pass and 3:forward iterable would yield 1:readable for the gcd of 2:single pass and 3:foward which is not true (IMHO :).
Oh, I meant that you could replace gcd with min (in a linear hierarchy).
- Jeff
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
participants (2)
-
Christoph Heindl
-
Jeffrey Hellrung