Boost library inclusion: generic tree containers?

Dear Boost library members - My name is Justin Gottschlich and have just recently joined the boost mailing list. Since this is my first post, I would first like to apologize in advance for any e-mail inequities I manage to include in this e-mail. I'm e-mailing is to ask if there may be interest in a generic tree container for addition to Boost. I have created four generic tree libraries which I am currently using in my online game (Nodeka). Additionally, my former company began using them in a rather small scope in 2002 and has since requested additional use of the libraries throughout their product (QuarkXPress) and my current company is now showing interest in using them and will begin using them within the next couple weeks (Raytheon Company). Due to the increased interest of the tree libraries, I am now bringing them to your attention in case you feel they may be of interest for Boost library inclusion. The four containers I have written are: core::tree - creates a generic tree that follows std::set functionality per tree level core::multitree - creates a generic tree that follows std::multiset functionality per tree level core::tree_pair - creates a key/value generic tree that follows std::map functionality per tree level core::multitree_pair - creates a key/value generic tree that follows std::multimap functionality per tree level In addition, I have four iterators for each tree container. The containers are written in an extremely general manner so while they have ordered behavior through "insert" functionality (like set and map), they can also achieve unordered element behavior through "push_back" and "push_front" behavior. The reason for this is so the generic tree containers can be used as a basic framework to build additional, algorithm or purpose driven behavior (nested sets, compiler based control trees). The containers are by no means up to "standards" quality - so much work would be needed there. =) I have written two articles on my tree containers (one published on gamedev.net) and the second article in the process for publication on gamedev.net that is currently being reviewed. I have posted both articles on my site for review (if you're interested in reading them): http://nodeka.com/TreePart1.pdf http://nodeka.com/TreePart2.pdf The second article really explains the design thoughts behind the tree containers. The first article simple shows how I've seen C++ programmers build trees using maps and why that's bad. =) If you are interested in the reasoning of my design for the trees, I would recommend reading the second article (http://nodeka.com/TreePart2.pdf). The current header source code can be seen here: http://nodeka.com/tree.h http://nodeka.com/multitree.h The core::tree_pair and core::multitree_pair are undergoing some changes so I haven't posted them just yet. If you feel generic tree containers are not suitable for Boost, I will completely understand and my feelings won't be hurt by your honesty. =) Thanks for your time and consideration, Justin Gottschlich ( <mailto:justin@nodeka.com> justin@nodeka.com, jgottschlich@runbox.com)

"Justin Gottschlich" <jgottschlich@runbox.com> wrote in message news:200502220833.j1M8XgX2032369@milliways.osl.iu.edu... | Dear Boost library members - | I'm e-mailing is to ask if there may be interest in a generic tree container | for addition to Boost. My guess is that there probably would be such interrest. | I have written two articles on my tree containers (one published on | gamedev.net) and the second article in the process for publication on | gamedev.net that is currently being reviewed. I have posted both articles on | my site for review (if you're interested in reading them): | | http://nodeka.com/TreePart1.pdf The ad hoc implementation of a tree is not exactly "industrial strength" :-) | http://nodeka.com/TreePart2.pdf One major issue with the design: I don't like that iterators are trees. It must be possible to break the two concepts apart like in tree<T>::level_iterator i = the_tree.level_begin(); tree<T>::recursive_iterator ii = the_tree.recursive_begin(); I do think it would be good to have several iteration strategies over the trees: 1. depth first 2. breadth first 3. on one level (ei, no recursion) For all iterators I would like to be able to say if( i.is_leaf() ). Maybe boost.graph can be used instead of starting from scratch, s.t. we can use a generic algorithm instead of member functions in the tree. br -Thorsten

"Thorsten Ottosen" <nesotto@cs.auc.dk> wrote in message news:cvff74$fi3$1@sea.gmane.org...
The ad hoc implementation of a tree is not exactly "industrial strength" :-)
Ahh yes, it definitely needs some beefing up to be of boost quality. =) If we start getting into serious design discussions and decide on a proper way to do things, I'd be more than happy to spend the required time to really raise the bar on the interfaces, internal implementation, template params (allocators), etc..
One major issue with the design: I don't like that iterators are trees. It must be possible to break the two concepts apart like in
Thorsten, you're absolutely right - that's really the big "question", the one I've struggled with for months really. I could definitely design the tree so iterators are not trees, but I'm not sure that's correct. The big problem by separating them out is that tree functionality is really required within an iterator - not because it can't be done outside of the iterator (as you've pointed out, it surely must be possible), but because nodes within trees are naturally trees themselves. This is not to say I can't be swayed on this, I am definitely up for a dialogue here =). In my second article I spend a good five pages discussing just this point, here are two highlights: 1) Iterator should be trees because Knuth says so =) (I'm only half serious here, but I think Knuth has a very valid point): The term "tree" and "subtree" will follow Knuth's definition from The Art of Computer Programming [2]: A finite set T of one or more nodes such that: a) there is one specially designated node called the root of the tree, root(T); and b) the remaining nodes (excluding the root) are partitioned into m >= 0 disjoint sets T1, ..., Tm, and each of these sets in turn is a tree. The trees T1, ..., Tm are called the subtrees of the root. In short, the above definition says "trees are defined by trees". Truthfully, there is no distinction between a node in a tree and the tree itself (as all nodes in a tree are trees themselves). Knuth notes that his definition of trees is a recursive one, as he defines trees in terms of trees. He further explains that trees can be defined in nonrecursive ways, yet it is most appropriate to define trees recursively as recursion is an innate characteristic of tree structures [3]. 2) This example shows the how inserts are done into subtrees from iterators: core::tree<char> t; *t = 'a'; t.insert('b'); t.insert('c'); core::tree<char>::iterator iter; // find the 'b' node iter = t.find('b'); // insert nodes d and e inside the b tree iter.insert('d'); iter.insert('e'); // find the 'c' node iter = t.find('c'); // insert nodes f and g inside the c tree iter.insert('f'); iter.insert('g'); I think this is an important point simply because it shows that the iterator iter is required to act as a container in order to satisfy "natural" characteristics of a tree. Otherwise, the base tree t would be required for all inserts and natural iteration through a tree (determined by the programmer) would become extremely difficult to "conceptually" understand. Not to say that would be wrong, Kasper Peeter's does this with his tree implementation. However, I personally, see the concept of the iterator functionally changing when dealing with trees. My second article (http://nodeka.com/TreePart2.pdf) explains these points much better with pictures. =)
tree<T>::level_iterator i = the_tree.level_begin(); tree<T>::recursive_iterator ii = the_tree.recursive_begin();
I do think it would be good to have several iteration strategies over the trees:
1. depth first 2. breadth first 3. on one level (ei, no recursion)
I completely agree with you on multiple types of iterators - and they wouldn't be hard to create. Additionally, I already have "finds" on all three points you pointed out above: 1) find_tree_depth() (with or without iterator starting point, with or without predicate) 2) find_tree_breadth() (with or without iterator starting point, with or without predicate) 3) find() (level only, with or without iterator starting point, with or without predicate)
For all iterators I would like to be able to say
if( i.is_leaf() ).
That's a great suggestion too - I don't have that as a direct function call, but can achieve it by this: if (0 == i.size()) // it's a leaf if its size is 0, since it's checking the number of the elements it contains Thanks for the great feedback, Justin

"Justin Gottschlich" <jgottschlich@runbox.com> wrote in message news:cvgkdm$k3u$1@sea.gmane.org... | "Thorsten Ottosen" <nesotto@cs.auc.dk> wrote in message First, I agree with Richard's comments about structure and structure manipulatig operations--this is the important part. | > One major issue with the design: I don't like that iterators are trees. It | > must be possible to break the two concepts apart like in | | Thorsten, you're absolutely right - that's really the big "question", the | one I've struggled with for months really. I could definitely design the | tree so iterators are not trees, but I'm not sure that's correct. [snip] | 2) This example shows the how inserts are done into subtrees from iterators: | | core::tree<char> t; | *t = 'a'; | t.insert('b'); | t.insert('c'); see, this seems just wierd that *t = ... is defined for a tree. I could accept the fact that you iterate over subtrees, but the design would have to be like this template< class T > class tree { T data; std::vector<tree*> children; ... }; then it would be natural that operator->() (not operator.()) of an iterator returns a tree*. So i->insert('c'); would seem natural. You might want to find out if that node-structure is not the best way to implement the tree. The design above does not require any node concept. | > For all iterators I would like to be able to say | > | > if( i.is_leaf() ). | | That's a great suggestion too - I don't have that as a direct function call, | but can achieve it by this: | | if (0 == i.size()) // it's a leaf if its size is 0, since it's checking | the number of the elements it contains yes, it should be fairly easy. I fell in my own trap an implemented is_leaf() as a member function of the iterator. It could make sense to have an is_leaf() member function as part of a tree-iterator concept, but it should actually be a member function is a tree, so i->is_leaf() would be to prefer IMO. br -Thorsten

| "Thorsten Ottosen" <nesotto@cs.auc.dk> wrote in message
First, I agree with Richard's comments about structure and structure manipulatig operations--this is the important part.
Hi Thorsten - I agree with Richard's comments as well (if you have time, please see my comments replying to his post, so we're on the same page).
| 2) This example shows the how inserts are done into subtrees from iterators: | | core::tree<char> t; | *t = 'a'; | t.insert('b'); | t.insert('c');
see, this seems just wierd that *t = ... is defined for a tree.
Yeah, I was playing around with having tree and iterator functionality behave the same, thus I created operator* for the tree to behave as it would the iterator. Pretty silly, although we will need some vehicle to set the data in that root node (whatever that ends up being). =)
I could accept the fact that you iterate over subtrees, but the design would have to be like this
template< class T > class tree { T data; std::vector<tree*> children; ... };
then it would be natural that operator->() (not operator.()) of an iterator returns a tree*. So
i->insert('c');
would seem natural.
You might want to find out if that node-structure is not the best way to implement the tree. The design above does not require any node concept.
Hmmm, yeah, I'll investigate that some.
yes, it should be fairly easy. I fell in my own trap an implemented is_leaf() as a member function of the iterator. It could make sense to have an is_leaf() member function as part of a tree-iterator concept, but it should actually be a member function is a tree, so
i->is_leaf()
would be to prefer IMO.
Roger that. =) At this point, I'm ready to step back and say, "ok, so ... let's figure out what we need to build." As I was discussing with Richard, I think we have two primary trees to develop: 1) trees as containers 2) trees as algorithms I think we need to figure out a design that is inherently inclusive to both groups (so everyone is happy). That being said, what do you think the best format to do that is? I envision a single base class that encapsulates all "basic" tree functionality and with that basic tree functionality, implementations that very clearly demonstrate both types of trees can be used seamlessly with it. Then, to further drive home that point, I think we should implement at least the first most obvious derivation of each type of tree from that basic tree definition. For example: class basic_tree {}; // proof of concept for containment and algorithms class container_tree : public basic_tree {}; class algorithmic_tree: public basic_tree {}; What do you think? And boy is this fun. =) Justin

"Justin Gottschlich" <jgottschlich@runbox.com> wrote in message news:cvjt7a$cn2$1@sea.gmane.org... |> | "Thorsten Ottosen" <nesotto@cs.auc.dk> wrote in message | > | At this point, I'm ready to step back and say, "ok, so ... let's figure out | what we need to build." As I was discussing with Richard, I think we have | two primary trees to develop: | | 1) trees as containers | 2) trees as algorithms | | I think we need to figure out a design that is inherently inclusive to both | groups (so everyone is happy). That being said, what do you think the best | format to do that is? | | What do you think? And boy is this fun. =) I'm not sure I like the expression "trees as algorithms". I would like to see different types of iterators which can then be used to implement algorithms. So I only see trees as containers. br -Thorsten

Thorsten Ottosen ha escrito:
"Justin Gottschlich" <jgottschlich@runbox.com> wrote in message news:cvjt7a$cn2$1@sea.gmane.org... |> | "Thorsten Ottosen" <nesotto@cs.auc.dk> wrote in message | >
| At this point, I'm ready to step back and say, "ok, so ... let's figure out | what we need to build." As I was discussing with Richard, I think we have | two primary trees to develop: | | 1) trees as containers | 2) trees as algorithms | | I think we need to figure out a design that is inherently inclusive to both | groups (so everyone is happy). That being said, what do you think the best | format to do that is?
| | What do you think? And boy is this fun. =)
I'm not sure I like the expression "trees as algorithms". I would like to see different types of iterators which can then be used to implement algorithms.
So I only see trees as containers.
I agree with Thorste here. Thinking about the relation between a tree structure and its various iterators, maybe we can adopt a similar conceptual approach as Boost.MultiIndex: a tree container is the underyling data structure, on top of which several "indices" are provided with different iteration policies: typedef tree<element> tree_t; tree_t tr; tree_t::iterator<inorder>::type it1=tr.get<inorder>().begin; // inorder iteration tree_t::iterator<preorder>::type it2=tr.get<preorder>().begin; // preorder iteration // convertibility between different types of iterators it2=tr.project<preorder>(it1); Apart from classical iteration schemes, the structure could provide a general-purpose interface from which to derive user-defined iterators: here we'd have "cursors" allowing for arbitrary movement across the structure. Maybe this makes little sense, but I felt like bringing it forward :) Joaquín M López Muñoz Telefónica, Investigación y Desarrollo

"Joaquín Mª López Muñoz" <joaquin@tid.es> wrote in message news:421DD6FC.A717DC5@tid.es... Thorsten Ottosen ha escrito:
"Justin Gottschlich" <jgottschlich@runbox.com> wrote in message
I'm not sure I like the expression "trees as algorithms". I would like to see different types of iterators which can then be used to implement algorithms.
So I only see trees as containers.
|I agree with Thorste here. Thinking about the relation between a tree |structure and its various iterators, maybe we can adopt a similar conceptual |approach as Boost.MultiIndex: a tree container is the underyling data structure, |on top of which several "indices" are provided with different iteration policies: | |typedef tree<element> tree_t; |tree_t tr; | |tree_t::iterator<inorder>::type it1=tr.get<inorder>().begin; // inorder iteration |tree_t::iterator<preorder>::type it2=tr.get<preorder>().begin; // preorder |iteration | |// convertibility between different types of iterators |it2=tr.project<preorder>(it1); I'm not to keen on the idea that iterators are parameterized with an iteration policy. It would be more natural just to have tree_t::inorder_iterator i = tr.inorder_begin(); tree_t::preorder_iterator i2 = tr.preorder_begin(); This shouldn't prohibit conversion between the iterators AFAICT, but it will remove the need for a lot of template stuff which is only there to make the stuff harder to use and to make error messages harder to read. br -Thorsten

Joaquín Mª López Muñoz wrote:
|I agree with Thorste here. Thinking about the relation between a tree |structure and its various iterators, maybe we can adopt a similar conceptual |approach as Boost.MultiIndex: a tree container is the underyling data structure, |on top of which several "indices" are provided with different iteration policies: | |typedef tree<element> tree_t; |tree_t tr; | |tree_t::iterator<inorder>::type it1=tr.get<inorder>().begin; // inorder iteration |tree_t::iterator<preorder>::type it2=tr.get<preorder>().begin; // preorder |iteration | |// convertibility between different types of iterators |it2=tr.project<preorder>(it1);
Thorsten Ottosen wrote:
I'm not to keen on the idea that iterators are parameterized with an iteration policy. It would be more natural just to have
tree_t::inorder_iterator i = tr.inorder_begin(); tree_t::preorder_iterator i2 = tr.preorder_begin();
I think a generic tree should only be required to provide basic tree traversal iterators like from parent to child and among children or siblings. From the basic iterators other types of traversal are possible. In my unfinished tree library (see my other post) the same effect would be accomplished with: inorder_iterator<tree_t::tree_iterator> i = tr.root(); preorder_iterator<tree_t::tree_iterator> i2 = tr.root(); This keeps the interface and the implementation of the tree cleaner. One could also use BGL algorithms here if the tree exposes a Graph interface. Best, João

"Joao Abecasis" <jpabecasis@zmail.pt> wrote in message news:421E01EA.6020105@zmail.pt... | From the basic iterators other types of traversal are possible. In my |unfinished tree library (see my other post) the same effect would be |accomplished with: | | inorder_iterator<tree_t::tree_iterator> i = tr.root(); | preorder_iterator<tree_t::tree_iterator> i2 = tr.root(); | | This keeps the interface and the implementation of the tree cleaner. One | could also use BGL algorithms here if the tree exposes a Graph interface. you can still have your clean separation, but I would prefer to expose it as typedefs in the tree. That is going to be much more pleasent in generic code. -Thorsten

Thorsten Ottosen wrote:
"Joao Abecasis" <jpabecasis@zmail.pt> wrote in message news:421E01EA.6020105@zmail.pt...
| From the basic iterators other types of traversal are possible. In my |unfinished tree library (see my other post) the same effect would be |accomplished with: | | inorder_iterator<tree_t::tree_iterator> i = tr.root(); | preorder_iterator<tree_t::tree_iterator> i2 = tr.root(); | | This keeps the interface and the implementation of the tree cleaner. One | could also use BGL algorithms here if the tree exposes a Graph interface.
you can still have your clean separation, but I would prefer to expose it as typedefs in the tree. That is going to be much more pleasent in generic code.
Ok. This seems to be a matter of taste, but I have to ask... Why is a typedef more pleasant in generic code then a separate template? You could have a single algorithm that takes different linearization schemes like preorder and postorder as a template parameter and needn't care what their names are. João

"Joao Abecasis" <jpabecasis@zmail.pt> wrote in message news:421E32DE.9020203@zmail.pt... Thorsten Ottosen wrote:
"Joao Abecasis" <jpabecasis@zmail.pt> wrote in message news:421E01EA.6020105@zmail.pt...
| From the basic iterators other types of traversal are possible. In my |unfinished tree library (see my other post) the same effect would be |accomplished with: | | inorder_iterator<tree_t::tree_iterator> i = tr.root(); | preorder_iterator<tree_t::tree_iterator> i2 = tr.root(); | | This keeps the interface and the implementation of the tree cleaner. One | could also use BGL algorithms here if the tree exposes a Graph interface.
you can still have your clean separation, but I would prefer to expose it as typedefs in the tree. That is going to be much more pleasent in generic code. | |Ok. This seems to be a matter of taste, but I have to ask... Why is a |typedef more pleasant in generic code then a separate template? |
well, if the tree is a dependent name, we will have to write inorder_iterator< typename tree_t::tree_iterator > i = tr.root(); instead of typename tree_t::inorder_iterator i = tr.root(); I guess this would be even better: inorder_iterator<tree_t> i = tr.root(); (So I like that the most) |You could have a single algorithm that takes different linearization |schemes like preorder and postorder as a template parameter and needn't | care what their names are. or you could have two different algorithms sharing a common implementation if that is possible. But given the choice between writing boost::algo<boost::inorder_tag>( the_tree ); boost::algo_inorder( the_tree ); I prefer the last; it will be more pleasant to look at and produce better error messages. -Thorsten

----- Original Message ----- From: "Joao Abecasis" <jpabecasis@zmail.pt>
Thorsten Ottosen wrote:
I'm not to keen on the idea that iterators are parameterized with an iteration policy. It would be more natural just to have
tree_t::inorder_iterator i = tr.inorder_begin(); tree_t::preorder_iterator i2 = tr.preorder_begin();
I think a generic tree should only be required to provide basic tree traversal iterators like from parent to child and among children or siblings.
From the basic iterators other types of traversal are possible. In my unfinished tree library (see my other post) the same effect would be accomplished with:
inorder_iterator<tree_t::tree_iterator> i = tr.root(); preorder_iterator<tree_t::tree_iterator> i2 = tr.root();
This keeps the interface and the implementation of the tree cleaner. One could also use BGL algorithms here if the tree exposes a Graph interface.
Why not have just one kind of iterator: template<typename T, typename Iter_T> struct tree { typedef typename Iter_T iterator; iterator preorder_begin(); iterator postorder_end(); iterator inorder_end(); iterator end(); iterator begin(); // calls inorder_begin() } ? Christopher Diggins Object Oriented Template Library (OOTL) http://www.ootl.org

christopher diggins wrote:
Why not have just one kind of iterator:
template<typename T, typename Iter_T> struct tree { typedef typename Iter_T iterator; iterator preorder_begin(); iterator postorder_end(); iterator inorder_end(); iterator end(); iterator begin(); // calls inorder_begin() }
?
Hm... That looks too much like a Sequence to me ;) I think this is not feasible in the general case. Consider a minimal node containing only a pointer to the next sibling and to the first child. Or a binary tree that keeps pointers to its left and right children. Inorder, preorder and postorder traversals are possible using a stack of ancestors. In the general case, though you shouldn't have to pay for this extra overhead at all times. IMO a tree container should be just that. Preorder, postorder and inorder are ways to turn a tree into a sequence, I see them as views of the tree. This is the main reason I'd prefer to have them as separate entities. Btw, here's my short view of a tree: template <typename T> struct Tree { typedef /**/ tree_iterator; typedef /**/ const_tree_iterator; typedef /**/ sibling_iterator; typedef /**/ const_sibling_iterator; tree_iterator root(); tree_const_iterator root() const; }; sibling_iterators are provided by the tree_iterator (they may be the same type). A set of tree_iterator_traits would describe their capabilities. Here's a taste of what I have in mind and partially in code... typedef tree<int> tree; typedef tree<int>::tree_iterator tree_iterator; typedef inorder_iterator<tree_iterator> inorder_iterator; tree t; ... insert data somehow ... tree_iterator it = t.root(); // traversal of level 1 for_each( it.children_begin(), it.children_end(), print_out("first generation")); it.descend(); assert(it); // implicit bool checks if we're a leaf // inorder traversal of subtree through iterators for_each( inorder_iterator(it), inorder_iterator(), print_out("subtree inorder")); // functional traversal inorder_traversal(it, print_out("subtree inorder (again)")); Best, João

----- Original Message ----- From: "Joao Abecasis" <jpabecasis@zmail.pt> To: <boost@lists.boost.org> Sent: Thursday, February 24, 2005 3:02 PM Subject: [boost] Re: Boost library inclusion: generic tree containers?
christopher diggins wrote:
Why not have just one kind of iterator:
template<typename T, typename Iter_T> struct tree { typedef typename Iter_T iterator; iterator preorder_begin(); iterator postorder_end(); iterator inorder_end(); iterator end(); iterator begin(); // calls inorder_begin() }
?
Hm... That looks too much like a Sequence to me ;)
I think this is not feasible in the general case. Consider a minimal node containing only a pointer to the next sibling and to the first child. Or a binary tree that keeps pointers to its left and right children.
You do bring up very good points, and I am not sure what my position is yet so I will play the Devil's advocate for now. A useful tree iterator, in my view should provide access to the parent node from a given position in the tree. Without that algorithms, etc. are harder to write, and the tree is much harder to manage. Being able to move to a parent node makes the various traversals trivial (and without overhead as far as I can tell). I also do not believe that implementation should dictate interface. Implementations that try to save a pointer by not providing a pointer to the parent, are really just making life complicated for the sake of a few bytes. Arguably if space is that much of a concern perhaps the tree is not the right data structure for the job?
Inorder, preorder and postorder traversals are possible using a stack of ancestors. In the general case, though you shouldn't have to pay for this extra overhead at all times.
I am not convinced that there is *significant* overhead.
IMO a tree container should be just that. Preorder, postorder and inorder are ways to turn a tree into a sequence, I see them as views of the tree.
Yes. However traversing a tree is a rather common and important task for a tree data structure.
This is the main reason I'd prefer to have them as separate entities.
That or the overhead? If there is no *significant* overhead then why not allow a tree to provide a direct treatment as a sequence. Clearly the question becomes, what overhead is significant?
Btw, here's my short view of a tree:
template <typename T> struct Tree { typedef /**/ tree_iterator; typedef /**/ const_tree_iterator; typedef /**/ sibling_iterator; typedef /**/ const_sibling_iterator;
tree_iterator root(); tree_const_iterator root() const; };
This appears to be good code (upon cursory examination) however I am really at only concerned with binary trees at this point. They are in my view very separate things from k-ary trees and mandate separate implementations. Do you agree? As I said before, you make excellent points but I am not *yet* convinced that a tree should not be inherently treatable as a sequence. best, Christopher

I think this is not feasible in the general case. Consider a minimal node containing only a pointer to the next sibling and to the first child. Or a binary tree that keeps pointers to its left and right children.
yet so I will play the Devil's advocate for now. A useful tree iterator, in my view should provide access to the parent node from a given position in the tree.
Some applications never need to go up the tree, and the overhead can be significant. E.g. if your application just needs a pointer to first child and to next sibling, and you store one int at each node then you need 12 bytes; if you store a parent pointer it becomes 16 bytes/node. I.e. a 33% increase in memory usage, which can be a lot if your application data is basically just that tree and it has a few million nodes.
Without that algorithms, etc. are harder to write, and the tree is much harder to manage.
I've been following this discussion but not the code: I hope a high-quality boost tree library would have ways to choose which pointers each node maintains; even if that means separate containers for each. My own attempt at a generic tree container ended up as a mess of compromises that failed to meet any of the design goals well. So I'd be very interested to see what Boost can come up with. Darren

----- Original Message ----- From: "Darren Cook" <darren@dcook.org>
From: Christopher DIggins
Without that algorithms, etc. are harder to write, and the tree is much harder to manage.
I've been following this discussion but not the code: I hope a high-quality boost tree library would have ways to choose which pointers each node maintains; even if that means separate containers for each.
IMO it would be simpler and cleaner to have two separate binary tree implementation. One which inherently supports traversal, and one which doesn't. As you point out the trade-off for inherent traversal would be extra memory usage. As Joao mentioned you can still traverse trees without built-in traversal using adapters. So if I understand you correctly then this is an acceptable compromise for a boost library. Would others be amenable to such a compromise I wonder? Also I can't help but wonder if preorder / postorder traversal is really important as part of a tree? I have only ever used inorder, and I have trouble imagining a scenario where other traversals are important. Anyone have some use cases for me? Thanks, Christopher Diggins Object Oriented Template Library (OOTL) http://www.ootl.org

christopher diggins wrote:
You do bring up very good points, and I am not sure what my position is yet so I will play the Devil's advocate for now. A useful tree iterator, in my view should provide access to the parent node from a given position in the tree. Without that algorithms, etc. are harder to write, and the tree is much harder to manage. Being able to move to a parent node makes the various traversals trivial (and without overhead as far as I can tell). I also do not believe that implementation should dictate interface. Implementations that try to save a pointer by not providing a pointer to the parent, are really just making life complicated for the sake of a few bytes. Arguably if space is that much of a concern perhaps the tree is not the right data structure for the job?
I agree with some of the points you make but perhaps I didn't make myself completely clear in the first place. I was not trying to make space or efficiency constraints limit the interface. My point was that the Concepts themselves should allow different implementations, the same way a you can have single- and double-linked lists implementations of a Sequence. Sure, for many applications you'll want every kind of traversal possible. But not in all important applications. For example, searching a binary search tree only requires parent-to-child traversal. My point is that you can have different trees, all exposing the same basic interface that can provide different traversal possibilities and different complexity tradeoffs. In my design a tree_iterator is categorized according to a VerticalTraversalCategory which can be Forward (only parent-to-child) or Bidirectional. So if a tree_iterator supports it you could: it.ascend(); it.descend(); ++it /* next sibling */; --it /* previous sibling */; A separate traits class tells us the particular traversal category of a tree_iterator. Based on this any algorithm can choose an implementation for the particular traversal categories or require tree_iterators of a specific category.
Inorder, preorder and postorder traversals are possible using a stack of ancestors. In the general case, though you shouldn't have to pay for this extra overhead at all times.
I am not convinced that there is *significant* overhead.
Well, IMO, *if and when* a tree_iterator does not provide access to the parent, keeping a stack of pointers to allow for traversals I may not need is significant overhead for a tree_iterator. If we also allow traversals to be reversible we would end up with an iterator that holds a pointer to every node in the tree. But my point is not that the overhead is significant in all cases. Only that it may be in some, so we cannot assume that every tree implementation will provide a single iterator for inorder/preorder/postorder and general tree traversal.
IMO a tree container should be just that. Preorder, postorder and inorder are ways to turn a tree into a sequence, I see them as views of the tree.
Yes. However traversing a tree is a rather common and important task for a tree data structure.
This is the main reason I'd prefer to have them as separate entities.
That or the overhead? If there is no *significant* overhead then why not allow a tree to provide a direct treatment as a sequence. Clearly the question becomes, what overhead is significant?
IMO, if this can be provided in a general way outside the tree it would keep the interface of a tree simpler. I see no reason why linear traversals cannot be provided in a general way outside trees without loss of efficiency or added overhead. Still, Thorsten makes the point that this will make for more cryptic error messages and less "pleasant" code. I'm not sure I agree with him on that, but only an implementation will tell ;)
Btw, here's my short view of a tree:
template <typename T> struct Tree { typedef /**/ tree_iterator; typedef /**/ const_tree_iterator; typedef /**/ sibling_iterator; typedef /**/ const_sibling_iterator;
tree_iterator root(); tree_const_iterator root() const; };
This appears to be good code (upon cursory examination) however I am really at only concerned with binary trees at this point. They are in my view very separate things from k-ary trees and mandate separate implementations. Do you agree?
I think we can single out binary trees and give them a specific interface (I did!) where it makes sense to do so, but without putting aside the general tree interface. I also think a Boost Tree library should not lose the opportunity to define general concepts that can work with different trees and not only k-ary trees.
As I said before, you make excellent points but I am not *yet* convinced that a tree should not be inherently treatable as a sequence.
If you can have a tree that is only a tree and outside means of linearizing the tree (as in a LinearView or something) can be provided only once in a general way, what keeps us from doing that? I view this as the member/non-member functions debate. Best, João

"Thorsten Ottosen" <nesotto@cs.auc.dk> wrote in message news:cvkf1m$unc$1@sea.gmane.org...
I'm not sure I like the expression "trees as algorithms". I would like to see different types of iterators which can then be used to implement algorithms.
So I only see trees as containers.
Right-o. At this point, I'm going to take the feedback we've gathered so far (thank you everyone =) and begin the redesign of the core::tree to the most simplistic design I can and try to incorporate as many ideas suggested here that make sense for that basic design. I'll also try to improve the code quality while removing any unneeded basic functionality that exists in the tree currently. The suggestions here have been amazing, please keep them coming. Perhaps as we move forward, we can begin to incorporate more and more ideas into the general tree framework. Additionally, I'm going to try to keep the number of template parameters down to the lowest number possible while still allowing all the functionality we need (template <typename T, typename tree_traits, typename allocator>). As far as the generic base tree, I strongly feel that the more simple it is, the better. As we all seem to be pointing out, we really need to nail down this generic base tree structure and its core functionality. Once that's done (and we agree on how it's done) we can move into the extremely challenging realm of how to use that structure to achieve all possible trees. Thanks to all of you for the great feedback, assistance and support, =) Justin

On Fri, Feb 25, 2005 at 02:14:44AM -0700, Justin Gottschlich <jgottschlich@runbox.com> wrote:
As we all seem to be pointing out, we really need to nail down this generic base tree structure and its core functionality. Once that's done (and we agree on how it's done) we can move into the extremely challenging realm of how to use that structure to achieve all possible trees. Yeah, but do not forget to reserve time for Jeff Mirwaisis tree library That library is far beyond everything that has been discussed here, and definitely deserves more attention.
Regards Andreas Pokorny

"Andreas Pokorny" <andreas.pokorny@gmx.de> wrote in message news:20050225185859.GA9589@localhost...
Yeah, but do not forget to reserve time for Jeff Mirwaisis tree library That library is far beyond everything that has been discussed here, and definitely deserves more attention.
Absolutely - I've already started reviewing Jeff's design and am having a side discussion with him regarding his trees (so I can understand what the heck is going on, heheh). If his solution works for the base generic tree solution, then by all means, let's use that. =) We just need to make sure that trees outside of purely algorithmic uses are addressed. With the increasing use of XML, trees simply as storage devices are become more and more popular. It is my believe (which could be completely off) that our basic tree should be generalized in its structure enough to support any kind of tree construction. Lastly, while the internal design of the tree can be as complex as all get out, it is my belief that the client interface should be "extremely" simplistic when using bare-bones functionality (much like a vector is: std::vector<int> v - couldn't be easier than that). This goal of simple use is something I've continually strived towards and part of the reason why I believe much of STL is used so much today - yes it's powerful, but it's also very easy to use. A great example of this is Thorsten's assignment library - it's purpose is solely for simplifying client code. That is a great goal. =) I think we want our generalized tree to be basic enough so novice coders can use it for its basic containment purposes, but powerful enough so advanced coders can stress the heck out of it. Anyways, just random thoughts, =) Justin

On Fri, Feb 25, 2005 at 01:11:29PM -0700, Justin Gottschlich <jgottschlich@runbox.com> wrote:
Absolutely - I've already started reviewing Jeff's design and am having a side discussion with him regarding his trees (so I can understand what the heck is going on, heheh). If his solution works for the base generic tree solution, then by all means, let's use that. =) No it solves all tree wishes, all as in terms related to the word 'god' :). There is no border, nothing is fixed, everything is a mutator, and mutators are minimal implementations of features.
We just need to make sure that trees outside of purely algorithmic uses are addressed. With the increasing use of XML, trees simply as storage devices are become more and more popular. It is my believe (which could be completely off) that our basic tree should be generalized in its structure enough to support any kind of tree construction. You will see that this difference is no problem in that library. In my opinion it is not a good idea to provide a basic tree, if you do that some wishes for tree types are impossbile to fulfill. If you definde a basic tree then there is a fixed contract that you can either reuse or not use. So take it for granted that there will be cases in which that basic type cannot be reused.
By defining the tree as a composition of feature, you will not fall in that pit. Jeff Miriawasis tree library is not policy driven. Policy driven base tree types will fall in that pit either.
I think we want our generalized tree to be basic enough so novice coders can use it for its basic containment purposes, but powerful enough so advanced coders can stress the heck out of it. I doubt that this is possible.
Regards Andreas Pokorny

On Fri, 25 Feb 2005 02:14:44 -0700, Justin Gottschlich <jgottschlich@runbox.com> wrote: [...]
Additionally, I'm going to try to keep the number of template parameters down to the lowest number possible while still allowing all the functionality we need (template <typename T, typename tree_traits, typename allocator>). As far as the generic base tree, I strongly feel that the more simple it is, the better.
I'm not sure the number of template parameters should be the bare minimum; IMO the policies should be orthogonal, a side effect of which is probably a lot of separate parameters. Another suggestion: it might be useful for trees to share subtrees (for example, in an implementation of a SGI-like "rope" class). This makes possible cheap copies and lower memory use. Subtrees could then have multiple parents, so that iterators do need a stack. Regards, Rogier

I haven't studied your proposal in detail yet, but - I think it would be valuable for further discussion if you could compare your tree implementation to Kasper Peeter's version: http://www.damtp.cam.ac.uk/user/kp229/tree/ Kasper did propose his tree to Boost in late 2002: http://lists.boost.org/MailArchives/boost/msg36876.php .. altough it didn't seem to gain momentum at the time. I've been using Kasper's tree in a couple of smaller personal projects and would very much like to see a good generic n-ary tree container in Boost. I also second Thorsten's comment about the need for different iterator categories, (which can be found in Kasper's implementation also). Regards // Fredrik Blomqvist

"Fredrik Blomqvist" <fredrik_blomqvist@home.se> wrote in message news:cvfhbf$n4i$1@sea.gmane.org... |I haven't studied your proposal in detail yet, but - I think it would be | valuable | for further discussion if you could compare your tree implementation to | Kasper Peeter's version: http://www.damtp.cam.ac.uk/user/kp229/tree/ looks nice. | Kasper did propose his tree to Boost in late 2002: | http://lists.boost.org/MailArchives/boost/msg36876.php | .. altough it didn't seem to gain momentum at the time. well, it does show there is great interest in tree containers. | I've been using Kasper's tree in a couple of smaller personal projects and | would | very much like to see a good generic n-ary tree container in Boost. yeah, I guess a reasonble tree container library would start of with namepace boost { template< class T, class Container = std::vector<T> > class tree; template< int N, class T, class Container = std::vector<T> > class n_ary_tree; } the possibility that we can specify the underlying container is important. Given that, we can easily make a tree of heap-allocated objects by typedef boost::tree<T*, boost::ptr_vector<T> > heap_tree; br Thorsten

"Fredrik Blomqvist" <fredrik_blomqvist@home.se> wrote in message news:cvfhbf$n4i$1@sea.gmane.org...
I haven't studied your proposal in detail yet, but - I think it would be valuable for further discussion if you could compare your tree implementation to Kasper Peeter's version: http://www.damtp.cam.ac.uk/user/kp229/tree/
From his webpage example, we can compare how the two of us build and output
Hi Fredrik - thanks for your feedback. =) After quickly looking through Kasper's design, I see the fundamental difference between my tree and Kasper's is that Kasper attempts to cover all possible tree iterations by definition of all possible types of iterators. My tree implements an iterator that functions as a tree and therefore naturally covers all possible types of iterations - given any node within the tree, you can iterate however you like (in, out, ++, --) from that node. As I was saying in response to Thorsten's post, I have defined my iterator as a tree itself (yes, this may end up being a topic of great discussion/contention =), so the natural semantics of trees are seen in the way it works. Kasper's design is very nice (and his code is more refined than mine, that's for sure), but is a bit "difficult" to manage relatively simplistic behavior. Again, this isn't because his design is flawed, rather it is because he is trying to follow the semantics of STL iterators, one which I intentionally deviated from due to my decision that tree nodes are trees and iterators take on different meanings when constructing trees (due to their recursive nature). the same tree: Kasper's tree and output: int main(int, char **) { tree<string> tr; tree<string>::iterator top, one, two, loc, banana; top=tr.begin(); one=tr.insert(top, "one"); two=tr.append_child(one, "two"); tr.append_child(two, "apple"); banana=tr.append_child(two, "banana"); tr.append_child(banana,"cherry"); tr.append_child(two, "peach"); tr.append_child(one,"three"); loc=find(tr.begin(), tr.end(), "two"); if(loc!=tr.end()) { tree<string>::sibling_iterator sib=tr.begin(loc); while(sib!=tr.end(loc)) { cout << (*sib) << endl; ++sib; } cout << endl; tree<string>::iterator sib2=tr.begin(loc); tree<string>::iterator end2=tr.end(loc); while(sib2!=end2) { for(int i=0; i<tr.depth(sib2)-2; ++i) cout << " "; cout << (*sib2) << endl; ++sib2; } } } Performing the same behavior, my tree and output: int main() { core::tree<string> tr; core::tree<string>::iterator iter, inner, two, banana; tr.data("one"); two = tr.push_back("two"); two.push_back("apple"); banana = two.push_back("banana"); banana.push_back("cherry"); two.push_back("peach"); tr.push_back("three"); for (iter = two.begin(); iter != two.end(); ++iter) cout << *iter << endl; cout << endl; for (iter = two.begin(); iter != two.end(); ++iter) { cout << *iter << endl; for (inner = iter.begin(); inner != iter.end(); ++inner) { cout << " " << *inner << endl; } } } The fundamental difference is that Kasper's tree uses iterators that strictly follow STL's understanding of iterators (flat-containers), whereas my tree uses iterators as trees, thus the natural semantics of trees are maintained so the code (at least to me) is more simplistic. Additionally as you suggested above, I can add a number of iterators to my tree to perform any type of iteration - and doing such will allow standard algorithms to be used as they would on flat containers (if that's the goal of the iterator being created). However, that can also be achieved by the programmer which is really what I was shooting for. The fundamental driving force of my tree design was to create a framework where any type of tree could be constructed from its base functionality. I was driving for a front-end that is inherently simplistic to use and conceptualize by the programmer allowing him or her to construct their own trees and that behavior as needed. Lastly, in my second article, I explain the reasoning behind my design much better than I do here. I'd be very curious to know yours and others thoughts on that (http://nodeka.com/TreePart2.pdf). I'm sorry the file is so big - ack! Thanks again for the delightful interchange and insight, =) Justin

Justin Gottschlich wrote:
After quickly looking through Kasper's design, I see the fundamental difference between my tree and Kasper's is that Kasper attempts to cover all possible tree iterations by definition of all possible types of iterators. My tree implements an iterator that functions as a tree and therefore naturally covers all possible types of iterations - given any node within the tree, you can iterate however you like (in, out, ++, --) from that node.
I haven't looked at the code, so please correct me if I am off base. But isn't having iterators which behave like stl iterators the only way to make the tree work with generic algorithms? I think that your multi-purpose iterator could be very convenient for the code that is hardwired to use trees, but I think that it would also be nice if the other iterators were provided, so that you could for_each over a single level in the tree, or over the whole tree, etc... -Jason

"Jason Hise" <chaos@ezequal.com> wrote in message news:421BF5C5.2050507@ezequal.com...
Justin Gottschlich wrote:
I haven't looked at the code, so please correct me if I am off base. But isn't having iterators which behave like stl iterators the only way to make the tree work with generic algorithms? I think that your multi-purpose iterator could be very convenient for the code that is hardwired to use trees, but I think that it would also be nice if the other iterators were provided, so that you could for_each over a single level in the tree, or over the whole tree, etc...
Hi Jason - You're right on target. With my current base tree iterator, I only supply increment and decrement over a single level and in order to get access to depths within deeper levels of the tree, I'd have to use the iterator to step "in()" / "begin()" - thus standard algorithms would only work for a single level based on the single-level iteration functionality of the current iterator (but, keep reading). You could still use for_each for the tree that way, but only a single level would be covered. So, if you had a tree/iterator with 10 direct children, each of which had 10 direct children, if you did this (which would be perfectly legal with my current implementation): ///////////////////////////////////////////////////////////////////////////// void outputBranch(int i) { std::cout << i << std::endl; } ///////////////////////////////////////////////////////////////////////////// int main() { core::tree<int> t; for (int i = 0; i < 10; ++i) { core::tree<int>::iterator iter = t.push_back(i); for (int j = 0; j < 10; ++j) iter.push_back(j); } std::for_each(t.begin(), t.end(), outputBranch); } it would simply output all the elements of the first level - not the entire tree. In order to iterate over the whole tree (your second point), I would simply have to create an additional iterator which iterated recursively through the tree using ++ and it would require a minor change to the above code: ///////////////////////////////////////////////////////////////////////////// int main() { core::tree<int> t; for (int i = 0; i < 10; ++i) { core::tree<int>::iterator iter = t.push_back(i); for (int j = 0; j < 10; ++j) iter.push_back(j); } core::tree<int>::recursive_depth_iterator riter = t.begin(); std::for_each(riter, t.end(), outputBranch); } I don't currently have a recursive_depth_iterator, because in the past I haven't really needed it. But as you pointed out, it'd really do a lot of good having a full tree iterator for standard algorithm usage. The point I was really trying to make in the previous post (which I can see I did a rather poor job of) is that based on the way I designed the tree and its iterator, any type of iterator can be constructed from the current implementation without changing the design of the tree or its iterator - thus, allowing full standard algorithm usage. Since the one basic iterator performs fully as a tree and an iterator, any kind of iterator can be constructed just from that one iterator (because the iterator has the ability to step into itself or out of itself, just like the tree can). My initial driving idea behind the tree was to create a framework generic enough that any kind of tree could be constructed from its base functionality (rather than trying to guess at every kind of tree people would need). Thus, any programmer using the tree could create an iterator for the tree to perform any kind of iteration they like (level iteration, tree-depth iteration, backwards tree-breadth iteration) and have standard algorithms work on that iterator exactly as they had coded. Thus, I wouldn't have to "pre-determine" all possible types of iterations (nor tree types) the end coder needs. Some basic types of tree iteration clearly need to be available that I currently don't have. After the great points made today, I most certainly will add them. =) Thanks for the discussion and point for clarification Jason, Justin

Justin Gottschlich wrote:
Thus, any programmer using the tree could create an iterator for the tree to perform any kind of iteration they like (level iteration, tree-depth iteration, backwards tree-breadth iteration) and have standard algorithms work on that iterator exactly as they had coded. Thus, I wouldn't have to "pre-determine" all possible types of iterations (nor tree types) the end coder needs.
Some basic types of tree iteration clearly need to be available that I currently don't have. After the great points made today, I most certainly will add them. =)
When I read that it immediately made me think "policy based design". Would there perhaps be some way to make the iterator policy based? Then the user could use a core::tree::iterator<core::recursive>, or a core::tree::iterator<core::level_based>, or even a core::tree::iterator<core::reverse_level_based>. You could provide the more common iterator policies with your library. I don't know mechanically how this would be implemented, its just a thought. -Jason

When I read that it immediately made me think "policy based design". Would there perhaps be some way to make the iterator policy based? Then the user could use a core::tree::iterator<core::recursive>, or a core::tree::iterator<core::level_based>, or even a core::tree::iterator<core::reverse_level_based>. You could provide the more common iterator policies with your library. I don't know mechanically how this would be implemented, its just a thought.
That's a very interesting idea Jason, one I hadn't thought of (due to my own perceptions of how the iterators would be designed). Using a policy based design for the different iterator types might just work and that actually would fit more inline with "enabling the tree creator" than anything I've thought of. I've done a couple policy designs in the past, allowing different encryption algorithms to be replaced as the policy for this cool licensing thingy and they've worked wonderfully. At the moment, I can imagine your idea of a policy based design working very well for the iterator implementation. I'm going to ponder over Rene's commentary some more and see if we can't figure out some high level plan of attack before going back to more low level thoughts. Great stuff. =) Justin

Justin Gottschlich wrote:
Additionally as you suggested above, I can add a number of iterators to my tree to perform any type of iteration - and doing such will allow standard algorithms to be used as they would on flat containers (if that's the goal of the iterator being created).
Oops, sorry, hadn't read the whole message. You can disregard my previous post. -Jason

"Justin Gottschlich" wrote: [snip]
As I was saying in response to Thorsten's post, I have defined my iterator as a tree itself (yes, this may end up being a topic of great discussion/contention =), so the natural semantics of trees are seen in the way it works. [snip]
Similar problems are being solved by Dave Handley in Composite library (in development, zip is in snadbox files). http://www.codeproject.com/vcpp/stl/typed_iteration.asp http://www.codeproject.com/vcpp/stl/functor_iteration.asp http://www.codeproject.com/gen/design/composite_visitor.asp Maybe some inspiration could be taken here or common functionality factored out or reused. /Pavel

"Pavel Vozenilek" <pavel_vozenilek@hotmail.com> wrote in message news:cvi5ci$p4s$1@sea.gmane.org...
Similar problems are being solved by Dave Handley in Composite library (in development, zip is in snadbox files).
http://www.codeproject.com/vcpp/stl/typed_iteration.asp http://www.codeproject.com/vcpp/stl/functor_iteration.asp http://www.codeproject.com/gen/design/composite_visitor.asp
Maybe some inspiration could be taken here or common functionality factored out or reused.
Thanks for the direction Pavel - I'm checking out Handley's work now. =) Justin

Justin Gottschlich wrote:
Dear Boost library members -
My name is Justin Gottschlich and have just recently joined the boost mailing list. Since this is my first post, I would first like to apologize in advance for any e-mail inequities I manage to include in this e-mail.
I'm e-mailing is to ask if there may be interest in a generic tree container for addition to Boost.
Obviously there's interest ;-) But.. Like Thorsten I would think it a requirement that some basic interfaces/concepts come ahead of an implementation. What I don't see from your proposal, nor Kaspers tree, is some consideration for what makes a container a tree. I think we desperately need some definitions of what operations should be part of a tree, n_ary_tree, binary_tree, rank_tree, b_tree, etc. Additionally, I'm also of the opinion that the "iterator" concept should be kept as it's currently used in STL. An object that does more than an iterator, because it happens to be an iterator in a tree should be called something else, "cursor" comes to mind. If it's really masquerading as a sub-tree, or a tree, then call it a tree. If it's a pointer to a node in the tree, then perhaps a "node_pointer". Mixing the terms will just cause more confusion than it's worth. Reading some of your later posts I can see the confusion already creeping in. My suggestion would be for you to formalize the different parts of a tree: * node; Holder for the value, structure pointers, and tree coloring information * cursor; pointer to a node with tree specific operations to move the cursor within the tree and to interrogate the tree * iterator; linear traversal of the nodes in the tree * tree; the container of nodes and producer of cursors and iterators, and the needed allocators With those terms you can hopefully see how one could implement each in terms of the others. For example iterators would be implemented in terms of cursors such that they would be orthogonal to the type of tree they are traversing. But to not talk in a complete vacuum here's my implementation of a rank_tree.. http://redshift-software.com/~grafik/RankTree.h It demonstrates how if you separate out the node you can write an iterator independent of the tree itself. I only have one type of iterator, really an in-order iteration, but it would be straight forward to reformulate it in terms of a cursor. I think the most important aspect we need to keep in mind is that having a tree container, or a set of tree containers, is not to have basic n-ary tree to build from. But to have generic algorithms, programs, iterators, mutators, etc. that can operate on different kinds of tree implementations so that programmers can make storage and algorithmic trade offs in programs. Having a n-ary tree in STL doesn't help me much because I would never use it. But having various tree implementations in STL such that I can write an algorithm that works regardless of which type of tree i use it on is the real win. -- -- Grafik - Don't Assume Anything -- Redshift Software, Inc. - http://redshift-software.com -- rrivera/acm.org - grafik/redshift-software.com - 102708583/icq

"Rene Rivera" <grafik.list@redshift-software.com> wrote in message news:421C0E3A.3080003@redshift-software.com...
Obviously there's interest ;-) But..
Like Thorsten I would think it a requirement that some basic interfaces/concepts come ahead of an implementation. What I don't see from your proposal, nor Kaspers tree, is some consideration for what makes a container a tree. I think we desperately need some definitions of what operations should be part of a tree, n_ary_tree, binary_tree, rank_tree, b_tree, etc.
Hi Rene - you make an excellent point here. The more I think about this, the more I realize, perhaps a step back to look at what we're really trying to achieve is the way to solve this.
Additionally, I'm also of the opinion that the "iterator" concept should be kept as it's currently used in STL. An object that does more than an iterator, because it happens to be an iterator in a tree should be called something else, "cursor" comes to mind. If it's really masquerading as a sub-tree, or a tree, then call it a tree.
Hmmm ... =) you might be right. Still, have you read my explanation behind this argument (http://nodeka.com/TreePart2.pdf)? I'm not saying I disagree with you, but rather I want to make sure we're on the same page on the reasoning behind this node/sub-tree/tree/iterator/branch definition. =) Damn that Knuth for shaping my thoughts on trees!
But to not talk in a complete vacuum here's my implementation of a rank_tree..
Wonderful code. =)
I think the most important aspect we need to keep in mind is that having a tree container, or a set of tree containers, is not to have basic n-ary tree to build from. But to have generic algorithms, programs, iterators, mutators, etc. that can operate on different kinds of tree implementations so that programmers can make storage and algorithmic trade offs in programs. Having a n-ary tree in STL doesn't help me much because I would never use it. But having various tree implementations in STL such that I can write an algorithm that works regardless of which type of tree i use it on is the real win.
Yes, I completely agree with you here. The reason behind my initial implementation of the core::tree was to do just that - create a framework that will allow multiple types of trees to be constructed from it. However, it's the details that we differ on - I see using the n-ary tree path as the perfect means to achieving our mutual goal, because it can be wrapped inside its end-resultant tree and then expose only the interfaces (and iterators) necessary for that tree. Additionally, I see usefulness in n-ary trees for generalized tree container purposes, rather than algorithmic constructs. Your rank tree is really making me think about this in more detail, perhaps this is the biggest roadblock of the generic tree path - there are such a wide ranging purposes for trees, coming to a common goal that achieves all generalized purposes may be challenging. =) At any rate, this is great intellectual stimulation ... I personally am going to take a step back and think about this whole tree thing for a bit (my brain is beginning to hurt). Maybe walking the dog will give some revelations. =) Justin

On 02/22/2005 10:46 PM, Justin Gottschlich wrote:
"Rene Rivera" <grafik.list@redshift-software.com> wrote in message news:421C0E3A.3080003@redshift-software.com... [snip]
Hmmm ... =) you might be right. Still, have you read my explanation behind this argument (http://nodeka.com/TreePart2.pdf)?
Howdy, Justin. I assume you mean the argument that the name should be "iterator" instead of "cursor". Could you give the page of the pdf file where you make this argument? [snip]
But to have generic algorithms, programs, iterators, mutators, etc. that can operate on different kinds of tree implementations so that programmers can make storage and algorithmic trade offs in programs.
Would flatten_value_type_iterator (see below) work for you? [snip]
However, it's the details that we differ on - I see using the n-ary tree path as the perfect means to achieving our mutual goal, because it can be wrapped inside its end-resultant tree and then expose only the interfaces (and iterators) necessary for that tree.
By "wrapped inside" do you mean something like the Bridge pattern in GOF where a virtual operator++ would actually be implemented by vector<T>::iterator++ or list<T>::iterator++ since the actual container type wouldn't be known for a "generic" (i.e. abstract) tree? If so, then I've got a bridge_iterator I created for this purpose; however, it does have the virtual function call overhead :( [snip] One example of a tree iterator which works on any sequence of "nested" (see below) stl container types, is at: http://boost-sandbox.sourceforge.net/vault in file: flatten_value_type_iterator.zip It determines the depth of the tree by specializing mpl::{next,deref} to make the outermost container appear as a mpl sequence and where mpl::next returns the type of the next innermost (or nested) container. IOW, in case of: vector<list<int> > the outermost container would be vector<list<int> > and the next (and, in this case leaf container) would be list<int>. The iterators are actually composed of two iterators: flatten_iterator_value_type_iterator<Parent,Children>::my_current flatten_iterator_value_type_iterator<Parent,Children>::child_iter_type The first, or "parent iterator" corresponds to the higher level in the tree, and the other, or "child iterator", corresponds to all lower levels in the tree down to the leaf nodes. Because of mpl::fold, this code doesn't have the virtual function call overhead of the bridge_iterator mentioned above. Test code is in flatten_value_type_iterator_test.cpp where, in function, test_print, the "tree" is of type: typedef std::vector<std::list<std::deque<int> > > tree_type;

Hmmm ... =) you might be right. Still, have you read my explanation behind this argument (http://nodeka.com/TreePart2.pdf)?
Howdy, Justin.
I assume you mean the argument that the name should be "iterator" instead of "cursor". Could you give the page of the pdf file where you make this argument?
Hi Larry - yeppers that's exactly what I mean, sorry for not being more clear. =)
From the article (http://nodeka.com/TreePart2.pdf), the sections I'm referring to about an iterator being a tree and vice versa are here:
Section 2 - Terminology (pp 1-3) Section 3.2 - Understanding the core::tree<> recursive characteristic (p 3) Section 3.3 - 3.4 - trees as iterators, iterators as trees (pp 4 - 9) I'd be very curious to hear your thoughts on the arguments made there.
By "wrapped inside" do you mean something like the Bridge pattern in GOF where a virtual operator++ would actually be implemented by vector<T>::iterator++ or list<T>::iterator++ since the actual container type wouldn't be known for a "generic" (i.e. abstract) tree? If so, then I've got a bridge_iterator I created for this purpose; however, it does have the virtual function call overhead :(
To be honest, Larry, I'm not really sure what means would be best to "wrap" the base iterator functionality - I like your suggestion though. =) Using the bridge pattern would be a very nice way to achieve this, both for definition of an abstract iterator or an abstruct tree. Inline with that thinking (after such great feedback and thought provoking discussion), I think it's apparent that there are two distinct types of trees we need: 1) algorithmic trees 2) container trees Each tree serves a completely different purposes than the other and based on the number of points brought up here, it seems we need to build solutions towards both ends. I am currently thinking of a very basic tree design, like this: class base_tree {}; class container_tree : public base_tree {}; class algorithmic_tree : public base_tree {};
From there, you can go nuts:
class multitree : public container_tree {}; class tree_pair : public container_tree {}; class binary_tree : public algorithmic_tree {}; class balanced_binary_tree : public binary_tree {}; Or maybe, instead of direct derivation, you use policies for more flexible "undecided" routes: template <typename T> class DefaultTreeIterator {}; template <typename T> class DefaultTreePolicy { public: typedef DefaultTreeIterator<T> iterator; typedef const DefaultTreeIterator<T> const_iterator; // policy specific implementation here? }; template <typename T, template <class> class TreePolicy = DefaultTreePolicy> class base_tree : public TreePolicy<T> { // basic tree implementation here? }; base_tree<int, binary_tree_policy> binaryTree; base_tree<std::string, xml_tree_policy> xmlTree; Of course the ending tree hierarchy and template parameters would be different, but I think basic tree functionality must be achieved at some base level (at least that's what I'm thinking right now). From there, the differing goals of the trees (algorithmic and containment) can be tended towards for the two different paths. This is just an extremely basic layout of what I'm thinking at the moment ... =) What's perhaps most interesting to me is that many people would be required to implement what is seeming to be an entire "collection" of trees - but from the number of stellar-coder people who have shown interest, it seems we might just be able to pull it off. =) What we really need at this point is an agreed upon design decision - in order to do that I think we need to figure out if the above goals (of a container_tree and algorithmic_tree) are really what we want the tree libraries to be build on, or if something's being missed. As Rene was pointing out, what are we really trying to achieve? Hmmm ... Thanks for the great insight Larry, =) Justin p.s.
One example of a tree iterator which works on any sequence of "nested" (see below) stl container types, is at:
I'll definitely take a closer look at this once we can work out the details for the overall tree design - it seems the flatten_value_type_iterators would be a good match for the algorithmic_tree. While the idea of the bridge_iterator is excellent for this tree design, I wonder if those people interested in the algorithmic_trees would be willing to take the overhead of a virtual table lookup (albeit minor, the use of a algorithmic_tree would be concretely founded in efficiency).

----- Original Message ----- From: "Rene Rivera" <grafik.list@redshift-software.com>
Additionally, I'm also of the opinion that the "iterator" concept should be kept as it's currently used in STL. An object that does more than an iterator, because it happens to be an iterator in a tree should be called something else, "cursor" comes to mind. If it's really masquerading as a sub-tree, or a tree, then call it a tree. If it's a pointer to a node in the tree, then perhaps a "node_pointer". Mixing the terms will just cause more confusion than it's worth. Reading some of your later posts I can see the confusion already creeping in. My suggestion would be for you to formalize the different parts of a tree:
* node; Holder for the value, structure pointers, and tree coloring information
* cursor; pointer to a node with tree specific operations to move the cursor within the tree and to interrogate the tree
* iterator; linear traversal of the nodes in the tree
* tree; the container of nodes and producer of cursors and iterators, and the needed allocators
I think these formalizations of different parts of the tree is a very good start, but I'd like to see the iterator eliminated for a while, or at least totally decoupled from the tree: The reason for this is Separation of Concerns. A tree class is a container quite similar like the standard containers in that it stores values, but the big difference is that the structure of the tree matters, instead of the sequence. Therefore, I think the tree class should provide functions to navigate through the tree, and functions to alter the structure of a tree, like insertion, deletion, and rotation. These functions should probably work on things like the above-defined cursor. Once this tree class is built, then it is time to start thinking how the structure of the tree should be translated to flat sequences. There are of course a lot of options: depth first, breadth first, only a single level, or from top to a single node. Navigating in this way is done via 'normal' iterators. These iterators work on cursors, and should not be a part of the tree class itself (the iterator type should not be a member of the tree class). These iterators are like 'cursor-to-iterator adaptors'. So I guess my point is this: The tree doesn't need a normal or default or any iterator type, because it doesn't have a proper meaning: a tree is about structure, not about sequence. Translating that structure into a sequence can be done outside the tree. best regards, Richard Peters

"Richard Peters" <r.a.peters@student.tue.nl> wrote in message news:076301c51983$c5319ba0$0a01a8c0@campus.tue.nl...
I think these formalizations of different parts of the tree is a very good start, but I'd like to see the iterator eliminated for a while, or at least totally decoupled from the tree: The reason for this is Separation of Concerns. A tree class is a container quite similar like the standard containers in that it stores values, but the big difference is that the structure of the tree matters, instead of the sequence. Therefore, I think the tree class should provide functions to navigate through the tree, and functions to alter the structure of a tree, like insertion, deletion, and rotation. These functions should probably work on things like the above-defined cursor.
Hi Richard - your above points are extremely well made, and I've been swayed by yours (and the others arguments) about the iterator / cursor points. You and Rene are right, we should eliminate the concept of an iterator for awhile, it's just getting in the way of us defining a generic tree.
Once this tree class is built, then it is time to start thinking how the structure of the tree should be translated to flat sequences. There are of course a lot of options: depth first, breadth first, only a single level, or from top to a single node. Navigating in this way is done via 'normal' iterators. These iterators work on cursors, and should not be a part of the tree class itself (the iterator type should not be a member of the tree class). These iterators are like 'cursor-to-iterator adaptors'.
So I guess my point is this: The tree doesn't need a normal or default or any iterator type, because it doesn't have a proper meaning: a tree is about structure, not about sequence. Translating that structure into a sequence can be done outside the tree.
These are all interesting points and I think you've hit the concepts we need to focus on square on the head. As Larry and I were discussing just above this, I currently think we have two main design concerns to tackle: 1) container trees 2) algorithmic trees As you pointed out, the structure of the tree is the important aspect, not the delivery system of that structure (that can be determined afterwards). However, the reason I'm narrowing my focus to the above two types of trees is because I'm seeing differing opinions on the goals of a "tree" and I think both are valid. I currently think that we should try to identify the overlapping characteristics of the two fundamental types of trees and then begin working on identifying how that commonality can be implemented in our base tree definition, to achieve both ends in an elegant and robust manner. I fear even this distinction may be getting too low into detail, but I worry that without putting both types of trees in the picture before a base tree is defined, either group will feel slighted and believe our interests are not going to achieve their goals. Thanks for your insight Richard, =) Justin

Hi! First of all I want to say I'm definitely interested in generic Tree Containers. I started developing a tree library of my own not that long ago but got buried in the details and had to put it on hold for a while. I did not take the time to read your code, but went through the discussion and wanted to share the approach and options I took in the design of my (still to be completed) Oak Leaf library. The first decision I made was to try to approach STL containers as much as possible, but not where it interferes with a "nice" design. Like STL::Containers, Trees are containers of objects. However, unlike them, Trees are not Sequences and IMO it is an error to try to make them behave like that "by design". The basic concepts I came up with are: 1) TreeContainer This can be further refined into: k-ary tree container, binary tree container, search tree container, ... In my design the binary and ternary tree containers are given a special status but all expose the basic TreeContainer interface. The TreeContainer owns its nodes meaning it is solely responsible for their lifetime. You use the container to insert, copy and erase nodes and sub-trees. 2) TreeIterator TreeIterators are categorized according to vertical (parent-child) and horizontal traversal (among _siblings_). The vertical traversal category is one of ForwardVerticalTraversal (parent to child only) or BidirectionalVerticalTraversal (both ways). While horiontal traversal among siblings of a node follows the STL Sequence and Iterator concepts. That is, children of a node are viewed like an STL Sequence. 3) TreeNode The TreeNode contains the values and color information, as well as information on how to access other nodes like siblings, parent and children. Not all Nodes will grant access to its siblings and parent but all are supposed to grant access to children. This is more an implementation detail but my trees took the Node as template parameter so could provide different traversal/complexity tradeoffs. As the TreeNode is not supposed to be part of the user interface, TreeContainers and TreeIterators could be implemented with other node types. The User interface is made up by TreeContainers and the TreeIterators they expose. Each TreeContainer exposes at least a ForwardVerticalTraversal TreeIterator and this iterator should also expose a SiblingIterator to traverse the children of the current node. To keep the design clean and simple, depth, level, and other types of iteration would be provided outside the containers by iterator adaptors and by reusing BGL's algorithms. IMO interaction with the BGL is an important point to consider for Boost Tree Library because Trees are also Graphs and any generic Boost Tree Library should harvest what's available in BGL. Best regards, João Abecasis

Joao Abecasis wrote:
Hi!
... I thoroughly agree with all you've stated. I think this is the best foundation for a generic tree library effort that I've seen. In particular I too would think it imperative that the interaction with existing STL and BGL algorithms be supported. I hope Justin would look at using your concept organization as a starting point. Could one assume that a tree library implemented as described would be usable with Spirit for AST's and parse trees? Jeff Flinn

Jeff Flinn wrote:
Could one assume that a tree library implemented as described would be usable with Spirit for AST's and parse trees?
For the design I only established goals in terms of the tree structures themselves. I am not very familiar with the current AST and parse tree code in Spirit, but I'd be in favor (meaning I might participate ;) ) of a rewrite to accomodate generic trees and easier customization of the tree generation process. I've given this a quick thought before and *from the distance* it seems feasible... Still related to Spirit, one of the things I wanted to use the Oak Leaf library for was an implementation of the Symbol Table parser (using a Ternary Search Tree) in terms of binary search trees (RB, AVL, splay, ...). Best regards, João

On 02/24/2005 07:26 AM, Jeff Flinn wrote: [snip]
I thoroughly agree with all you've stated. I think this is the best foundation for a generic tree library effort that I've seen. In particular I too would think it imperative that the interaction with existing STL and BGL algorithms be supported. I hope Justin would look at using your concept organization as a starting point.
Could one assume that a tree library implemented as described would be usable with Spirit for AST's and parse trees?
Could it also be used with or merged with multi_array. IOW, a multi_array, m, can be thought of as a tree, where tree depth is m.num_dimensions() and each level of the tree has the same depth and number of children. This is not all that far fetched, as I think it could be used to calculate the follow sets of a Spirit Grammar since the follow set is defined in matrix terms (with transpose and closure operations) by Thomas W. Christopher [ p. 26 of http://facweb.cs.depaul.edu/tchristopher/grmanl.pdf ] Since the grammar would be stored as some sort of tree, which can be thought of as a sparse matrix or sorts, then calculating the transpose and closure would be matrix (or multi-array) operations.

Larry Evans wrote:
Could it also be used with or merged with multi_array. IOW, a multi_array, m, can be thought of as a tree, where tree depth is m.num_dimensions() and each level of the tree has the same depth and number of children.
I think the interface for a multi_array is different from what one would expect from a tree. So I don't think they should be merged. From what you say, maybe multi_array could use a tree as a backend. Still, if you need indexed access to the nodes, I think you'd be better off with a wrapper around a vector and an index translation scheme than with a tree. Hmm... On second thought, you could probably have a tree view on top of a multi_array or the other way around (I'd go for the first, if I was to choose...) Best, João

"Jeff Flinn" <TriumphSprint2000@hotmail.com> wrote in message news:cvknuj$rb9$1@sea.gmane.org...
I thoroughly agree with all you've stated. I think this is the best foundation for a generic tree library effort that I've seen. In particular I too would think it imperative that the interaction with existing STL and BGL algorithms be supported. I hope Justin would look at using your concept organization as a starting point.
Hi Joao and Jeff - I'm definitely using parts of Joao's tree design / explanation in the core::tree redesign. =)

I see I'm a little late to the party. However, I'd like to summarize what I have noticed as the important concepts that have been exposed, and chip in my own 2c. I think the core concept that should be recognized is that a tree is not a container of objects, but a container of containers. In this way, it is somewhat like a deque. However, a deque has a single natural sequence over all contained elements, whereas a tree has many natural sequences. Now technically, a tree is not a container of *just* containers. It is a container of objects plus containers. But the only other core property of trees is that the contained containers are themselves of the same type as the tree, as Justin originally observed. Beyond that, I don't see a need to distinguish between a tree and a node, and it's my intuition that eliminating the distinction should result in a more elegant solution. Second, trees are complicated because there are two sets of iterators that need to be addressed. First is the iterator over objects in the root, and second is the iterator over children. However, they are two separate kinds of iterators. Dereferencing the object iterator should give you an iterator, while dereferencing a child iterator should give you a tree. Since whole-container algorithms expect an iterator that traverses the entire structure, it should be clear that a linear iterator must lay outside the tree itself and manage the fact that two different types of implementation iterators need to be managed. The reason this distinction is so easily overlooked is that most trees have a trivial key collection of size 1, so it isn't clear that an iterator is needed at all. But I think a generic tree interface should recognize that some trees have non-trivial key sets and provide both kinds of iterators. I haven't been able to look at Jeff's library because I can't seem to access his web site, but it sounds to me like a good design would consist of a tree type (which some might intuitively think of as a node type) which exposes fundamental operations for accessing keys and children, and a set of algorithms which allow one to transform one tree into another by moving around subtrees. Finally, flat iterators would have to be built on top of the two iterator types presented by the tree itself. It makes sense that those iterators would be parameterized on the tree type itself, since it needs to know about both kinds of embedded iterators. I'll offer this as fodder for thought: template <typename T, int K> class key_tuple { public: typedef unspecified key_iterator; public: key_iterator begin(); key_iterator end(); T const& get(int pos); // convenience function void set(int pos, T const& key); // convenience function private: T key_[K]; }; template <typename Tree, int N> class child_tuple { public: typedef unspecified child_iterator; public: child_iterator begin(); child_iterator end(); Tree* get(int pos); // convenience function void set(int pos, Tree const* child); // convenience function private: Tree* child_[N]; }; template <typename T, int N = 2, int K = 1> class tree { public: typedef key_tuple<T, K> key_type; typedef child_tuple<tree, N> child_type; public: key_type key(); child_type child(); private: key_tuple<T, K> key_; child_tuple<tree, N> child_; }; template <typename T, int N, int K> bool is_leaf(tree<T, N, K>* t) { return std::find(t.child().begin(), t.child().end(), 0) == t.child().end(); } template <typename T, int N, int K> void find_first(tree<T, N, K>* t) { tree<T, N, K>* left = t.child().get(0); return left ? find_first(left) : t; } etc., etc. Clearly, the tree invariants would need to be carefully preserved by the algorithms which operate on them. But since different trees have different invariants, it makes more sense to group the algorithms by the set of common invariants they enforce, rather than by a concrete tree type. Some algorithms will have very few invariant requirements, such as find_first() and find_last(), and will work on any ordered tree. Others may have more stringent requirements, but defining the tree as a pair of tuples and a set of algorithms on those tuples seems like the best factorization of behaviors possible. At some level, we could package a set of invariants and associated algorithms as a "soft" (less parameterized) type that exposes a clean member interface for tree operations. But given the diversity of tree types, I think this open structure, which might seem to break all the rules of encapsulation, is the right way to go at the lowest level. Dave

"David B. Held" <dheld@codelogicconsulting.com> wrote in message news:cvqlqp$fnk$1@sea.gmane.org...
I see I'm a little late to the party. However, I'd like to summarize what I have noticed as the important concepts that have been exposed, and chip in my own 2c. I think the core concept that should be recognized is that a tree is not a container of objects, but a container of containers.
Yes, I couldn't agree more. =)
Second, trees are complicated because there are two sets of iterators that need to be addressed. First is the iterator over objects in the root, and second is the iterator over children. However, they are two separate kinds of iterators.
Well, I think this depends on how the tree's structure is formed. I think it's possible to achieve the tree building with a single iterator type and then add additional iterators as needed. The reason I say this is that if you consider the initial tree object as the "root node" (owning one of the contained objects of the tree), you should only need a single iterator type since by definition a tree can only have one root node. Since you wouldn't need to iterator over the objects contained within the root level as by definition, they'd be children of the root node. At least, this is how I handled this problem in my tree container. Nonetheless, as you've pointed out Dave - this is a very important issue to tackle, regardless of the way you attack it. =)
I haven't been able to look at Jeff's library because I can't seem to access his web site
I've sent Jeff a couple e-mails and am requesting he present his tree container ideas to the Boost group before Joao and I continue to work on building the generic base tree type, as his solution may be the solution to use. The last thing I want to do is overlook his design since it seems quite nice - although I have some questions about some of its functionality and reasoning behind some implementation details. I've asked Jeff if he could write some sample code showing how to build some basic trees and present that to the group. I hope he can find time to do this as I don't want anyone to feel "slighted" regarding any of their tree designs. Whatever happens I think we need to maintain our forward momentuum (which we've been doing wonderfully), without it this tree discussion could easily fall prey to what happened to Kasper Peeter's tree discussion in 2002. =( I don't think any of us want to see that happen.
Finally, flat iterators would have to be built on top of the two iterator types presented by the tree itself.
Yes, I'm with you on that logic.
Clearly, the tree invariants would need to be carefully preserved by the algorithms which operate on them. But since different trees have different invariants, it makes more sense to group the algorithms by the set of common invariants they enforce, rather than by a concrete tree type. Some algorithms will have very few invariant requirements, such as find_first() and find_last(), and will work on any ordered tree. Others may have more stringent requirements, but defining the tree as a pair of tuples and a set of algorithms on those tuples seems like the best factorization of behaviors possible.
Ummm, hmmm. I need to think about this some more. =) Thanks for the great feedback Dave, Justin

Justin Gottschlich wrote:
"David B. Held" <dheld@codelogicconsulting.com> wrote in message news:cvqlqp$fnk$1@sea.gmane.org... [...]
Second, trees are complicated because there are two sets of iterators that need to be addressed. First is the iterator over objects in the root, and second is the iterator over children. However, they are two separate kinds of iterators.
Well, I think this depends on how the tree's structure is formed. I think it's possible to achieve the tree building with a single iterator type and then add additional iterators as needed. The reason I say this is that if you consider the initial tree object as the "root node" (owning one of the contained objects of the tree), you should only need a single iterator type since by definition a tree can only have one root node. Since you wouldn't need to iterator over the objects contained within the root level as by definition, they'd be children of the root node. At least, this is how I handled this problem in my tree container. [...]
It sounds to me like you are not considering the general case where a given node can have K keys and N children. Now typically, K = 1, which is why I made it the default in the sample code. That is what you are talking about. But for a Btree, you often have K = N - 1. That is, each node contains K "objects" or "keys", and N children. Yes, you could provide a single iterator type that iterates over keys both *within* a node and *among* nodes, but it would be a fat iterator (at least algorithmically, if not structurally). I think a better decomposition is to recognize that for trees with large aggregate nodes, the fundamental operations are to access the keys within the node, and access the children separately. To illustrate, let me offer an exaggeration. Suppose we build a Btree for use in a disk-based container. We could build it so that K = 64, N = 4. That is, each node contains 64 "keys", but only 4 children. There are numerous ways to distribute keys and children in such a configuration, and there is no a priori reason for an implementation to exclude the possibility of any given distribution. So such a tree would have a narrow branching factor, but fat nodes. That might better correspond to the size of disk sectors while not wasting too much space on a wide branching factor. While it may seem wasteful to treat the key set for a node as an array, I suspect that for the degenerate cases where K = 1, the array access key_[0] will get optimized at compile time (and we could certainly provide convenience functions that aid in this optimization). Dave

"David B. Held" <dheld@codelogicconsulting.com> wrote in message news:cvu3uj$mrk$1@sea.gmane.org...
It sounds to me like you are not considering the general case where a given node can have K keys and N children. Now typically, K = 1, which is why I made it the default in the sample code. That is what you are talking about. But for a Btree, you often have K = N - 1. That is, each node contains K "objects" or "keys", and N children. Yes, you could provide a single iterator type that iterates over keys both *within* a node and *among* nodes, but it would be a fat iterator (at least algorithmically, if not structurally). I think a better decomposition is to recognize that for trees with large aggregate nodes, the fundamental operations are to access the keys within the node, and access the children separately.
To illustrate, let me offer an exaggeration. Suppose we build a Btree for use in a disk-based container. We could build it so that K = 64, N = 4. That is, each node contains 64 "keys", but only 4 children. There are numerous ways to distribute keys and children in such a configuration, and there is no a priori reason for an implementation to exclude the possibility of any given distribution. So such a tree would have a narrow branching factor, but fat nodes. That might better correspond to the size of disk sectors while not wasting too much space on a wide branching factor.
Ahh yes, I see what you're saying now - I wasn't understanding correctly. If you mentioned the B-tree example in your original post, I must have completely missed it, but your point in regards to a B-tree cleared it right up for me. I don't believe has been touched upon until your post, so doubly-thank you for bringing it up (if somebody else did bring it up already and I just missed it, sorry for being so blind =). Thanks for the extra explanation Dave, =) Justin
participants (16)
-
Andreas Pokorny
-
christopher diggins
-
Darren Cook
-
David B. Held
-
Fredrik Blomqvist
-
Jason Hise
-
Jeff Flinn
-
Joao Abecasis
-
Joaquín Mª López Muñoz
-
Justin Gottschlich
-
Larry Evans
-
Pavel Vozenilek
-
Rene Rivera
-
Richard Peters
-
Rogier van Dalen
-
Thorsten Ottosen