[boost] Re: boost::tree advice?

29 Mar 2005

      Dan Rosen wrote:
...
...
In my design nodes were made an implementation detail but I also
attempted to define some node concepts to allow advanced users to plug
their own node types into pre-defined trees.
I'm curious what one would do with that ability ... what kind of use
cases were you picturing? I've been attempting not to expose nodes in
the interface, but maybe there's a reason why I should?
One possibility would be to define a node that queries another data 
source for the data and children (e.g., a database). This means our tree 
would be like a (possibly mutating) view.

I also like the idea of interchangeable node types and from here to 
defining and exposing the concept it's only a short walk. For example, 
where space constraints are important a node could have two pointers:

     * next sibling
     * first child

Where one will be constantly going up and down the tree, back and forth 
between siblings, the gains from keeping the extra previous sibling and 
parent are important.

In the end, there are many possible node designs, and I can't say that 
one size fits all (or which, for that matter), so I planned to implement 
a few that I might want and let the advanced user write his own.

Having a defined interface for nodes also means that I can start by 
implementing simple, less-than-perfect nodes and later on (when I've got 
things working...) implement other variations.
...
...
Personally, I prefer a tree-iterator concept that allows traversal of
the tree without needing to carry the tree around. Having children as a
member function of the tree prevents this.
Oh, I think I was unclear, sorry. I wanted to provide children() as a
convenience, rather than as a necessary method for traversal and
mutation of the tree. In lieu of having a document prepared, let me
give you a little more detail on my basic idea for the concepts behind
the tree.
The basic building blocks are what I've been calling a "Multiply
Traversable Container" and a "Traversal", which go hand in hand. The
notion is that if there is no single, canonical traversal of a
container from beginning to end (unlike the present standard
containers which are all "flat"), I still want to be able to ask for a
begin() and an end() and use standard algorithms on an iterator range.
So iterators and member functions such as begin() and end() are
parametrized on a traversal policy. In the case of the Tree concept,
I've provided the obvious ones:
X::pre_order_traversal
  X::post_order_traversal
  X::depth_first_traversal
  X::breadth_first_traversal
  X::sibling_traversal
  X::first_descendant_traversal
  X::last_descendant_traversal
Here, I would leave the pre- and post-order traversals outside of the 
tree, but we can agree to disagree ;)

My rationale is that given the traversal categories of tree iterators, 
the pre- and post- order can be provided in a general external way. 
Keeping them out of the tree would keep the tree simpler for me. There 
are applications that don't require the "linearization" of the tree.

As for breadth- and depth-first and I was hoping to leave them to the 
BGL. I'd say they're really graph algorithms.
...
The first four traverse the whole tree, whereas the second three may traverse
only a subset of the tree.
One outcome of having multiple potential traversals is that, although
it still makes sense for a Multiply Traversable Container to be
EqualityComparable, it no longer makes sense to be LessThanComparable.
So I also thought, in the event that you actually want to compare two
trees (or other multiply-traversable containers) via operator <, I
could provide a little wrapper utility (I've called this
"single_traversal") modelling the standard Container concept, which
hangs onto a traversal policy and gives you only a single iterator
type.
I'll admit I didn't consider the possibility of having the tree be 
LessThanComparable. But I think we could agree on having the default 
behaviour to be a recursive node-wise comparison. This would be similar 
to in-order, but structure would matter for the comparison.

Anyway, anyone is free to define their own comparison function for 
specialized purposes, so I wouldn't bother too much with this.
...
Anyway, I'm omitting a lot of detail here (Multiply Traversable
Container is split up into Forward, Reversable and Random Access
variants, and I also have a separate Binary Tree concept, which
provides in-order iterators, etc.), but hopefully this gives you a
better basic idea of what I'm going for.
Yes.
...
...
That's one of the reasons I decided to define a TreeIterator concept
that overlaps with the standard Iterator concept when it comes to
sibling traversal.
[snip]
On the thread Justin Gottschlich started on this issue
(http://tinyurl.com/44e3u), I suggested pre-order and other
linearizations of a tree could be provided externally. I think putting
these on the tree unnecessarily complicates the design of the tree and
the iterators.
So, hopefully my comments above illustrate my opinions on this. I
think the minimum set of iterators necessary to build all other tree
iterators on (with reasonable efficiency) is
first_descendant_iterator, last_descendant_iterator, and
sibling_iterator. If I understand you correctly, I think that jives
with what you're saying, and seems to be in rough agreement with the
general consensus in Justin's thread. But I think it's a good thing to
have the full set of traversals in the tree as well, even if for no
other reason than for consistency with those three "fundamental"
traversals.
I'm not convinced those are "fundamental" traversals. I can imagine many 
cases where all you need is basic tree-traversal: parent to children.

Different traversals can be provided outside the tre based on the 
operations its TreeIterators can provide.
...
...
I solved parent/root issue by having the tree class promise only the
most basic TreeIterator: for traversal from parent to child. [snip]
From the initial root iterator one could get SiblingIterators for
iteration over its children.
So, from your description, I'm picturing two things:
- A TreeIterator is an extension of the standard iterator concept
that provides an additional method, something like children(), which
returns a SiblingIterator, to provide access to a node's children.
I had child_begin, child_end, ascend, descend (overloaded).
...
- The only way to get a SiblingIterator is via a TreeIterator's
children() method.
Something like that.
...
I'm not sure if this is what you were implying or not. But in my own
design, I was thinking two things: first, my iterators do not extend
the standard Iterator concept, and second, all iterator types should
be convertible to each other.
For the same reason that led me to split the iterators, a 
SiblingIterator might always be convertible to a TreeIterator, but the 
reverse wouldn't always be true.
...
For example, if I'm doing a pre-order
traversal of a tree -- maybe as a means of searching -- and I find an
interesting element, I might want to convert my iterator to a sibling
iterator to deal with the siblings of the element I've found. That
implies that even for the root of the tree, I should be able to get a
sibling iterator to point to it (though it would have no previous or
next siblings).
In general, I think this cannot be provided without some ugly hacks. 
Consider for example the following node for an n-ary tree:

     struct node
     {
         // node * parent;
         children_type children;
         value_type data;
     };

children_type is some kind of Sequence (or Range). Here, a sibling 
iterator would be an adaptor on top of the iterator provided by 
children_type. Because the root node is held directly by the tree, we 
have no way to provide a children_type::iterator pointing to it.
...
...
Depending on the underlying node, both iterators could actually be the
same. This could be inspected through separate traits classes.
Hmm, this gets back to the question about providing a node concept and
interface. I'm having trouble picturing what you're describing here.
Consider this node:

     struct node
     {
         // node * parent;
         node * previous_sibling, * next_sibling;
         node * begin_child, * end_child;
     };

Here, we can provide "sibling traversal" for the root node. So the 
tree_iterator exposed by the tree could be the same typedef as the 
sibling_iterator.
...
...
My idea was to write the basic single and double-linked lists
operations from scratch. We don't be need the whole std::list interface
for the nodes anyway.
I think I tend to agree now. I'd like to be able to plug in arbitrary
sequences (including std::list) to represent a node's children, but
that seems to come encumbered with too many penalties. On the other
hand, I'm still soliciting clever approaches to this problem, and I'm
still curious whether (despite having given up on it) you think it's a
good idea conceptually.
Given there are many possible node designs, providing different 
size/traversal tradeoffs, I chose to define the Node concept.

Anyway, I think the main part for a tree library would be the interface 
for the tree and its iterators. I can imagine, or rather hope, others 
would come up with trees conforming to the interface without imposing on 
them the node concepts.
...
I think at this point (lacking a decent solution for the pluggable
child sequence problem) if I were to provide a tree::children() method
to expose a child sequence, the sequence returned still should conform
to the standard Sequence concept, but needn't literally be a
std::vector, std::list, or whatever.
Right. In general you don't need to directly expose children. Remember 
this could be kept as a fixed size array. I inclined for child_begin, 
child_end.

Best,

João

[boost] Re: boost::tree advice?

Joao Abecasis