
Ion: I have limited time to make a review, so I'm sorry that I had to concentrate on the documentation without reviewing the code, which I will do at a later date. However, what I saw from the documentation is by-and-large how I would have done a design, no surprises there but some nice twists and a very well thought-out interface. The devil being in the detail I'm glad you validated the design with some working code. This being said, this is a first-rate and very useful library, and I vote whole-heartedly to accept it to boost. I have a few spot-comments on the documentation, to follow at the end of this message. My main criticism is with one aspect of the design: in std::set and other associative container, the key (part of the value_type that determines the order) is constant, and cannot be changed (once constructed) since the std::set interface only grants constant access to it. This makes it possible to maintain the invariant of the std::set class. It is no longer possible to guarantee this with intrusive containers. You need to mention that upfront in the doc of iset with a big warning sign! But now I must mention that this is actually a plus!!! In geometric algorithms (Fortune's sweep for Voronoi diagram, to be specific), the key is an intersection with the sweep line, and varies over time. To simplify a long story, the algorithm has discrete events at which the order can change, but the invariant is maintained by swapping two elements whose keys "cross". In order to guarantee logarithmic time per event, some kind of balanced tree is needed. Naturally one would like to use std::set, but because the key changes and the user (no longer the class) is responsible for maintaining the set invariant, one must go through hoops, and use mutable data members, etc. It gets ugly! The worse part is that one must add two nodes which are equal (at the critical sweep event, they'll get ordered as soon as the sweep line starts to move), but std::set does not allow to add the two nodes with guaranteed relative position, without performing some comparisons (which have no geometric meaning: the comparison object must be tweaked). It gets *very* ugly! For a long time, I have advocated separating the rbtree algorithms from the actual data structure, for reuse. I wrote a paper with Jyrki Katajainen (now a CPH STL report) about this. One can imagine applying rbtree algorithms, or avl_algorithms, or splay_tree_algorithms, all of them without a class; each offers a different tradeoff. In particular, I'd expect treaps to perform very well. In essence, this is exactly what your library does: it separates storage from algorithms, and thus offers the potential for reuse. This is the promise of generic programming even beyond the C+ + STL. You might add my Fortune's sweep example to the others you provide (legacy objects, etc.) I imagine kinetic data structures and some graph algorithms would be fine consumers of this approach as well. I think it is very natural, and would very much like to (but won't have the time), extend your framework with (geometric) halfedge and (combinatorial) graph-like data structures. I think it will be a great winner there, even more than standard C++ containers (one dimensional). Ultimately, because geometric algorithms typically require building several data structures on top of geometric objects, the framework here has even more value than the implementations. I would really like to see a chapter in the documentation "Guidelines for extending the framework to domain-specific containers". of course, it is not a requirement for the library, merely a suggestion but given the potential for extension, you will have questions. A graph-like example compatible with the Boost Graph Library would superbly demonstrate the power of the approach! Who knows, it could even become part of the intrusive library one day. After reading your library, I feel that this is how my generic HDSTL (half-edge data structure library) should be implemented (it remains as a half- baked research project, having been published in WAE'2001). I will take a longer look but the end of the review approaching, I feel it's best to dump this now and promise later :) Thanks for a very promising library, and regards, -- Herve Bronnimann PS: Here are the more important comments I jotted down while reading the docs: I spent about two hours doing so, so forgive the sketchiness and incompleteness of the review, but one more review is always better than nothing :) * documentation: I wanted to see, while browsing the doc and clicking on the class name, the synopsis for header, in addition to the class interface for the class (following the C++ standard). But actually, by the end of the documentation, we get the header synopsis so the physical architecture of the library is clear. * documentation: iterator and const_iterator types seems to be missing consistently for each class interface (from islist, ilist, but not from iset/imultiset.) * documentation: bool ConstantTimeSize = (default not provided in ilist and islist). * design: const/mutable value type? While it is possible to violate the invariants of a std::set, that requires willfull wrongdoing from the programmer, whereas it seems much easier to inadvertently do so with intrusive set. Must be pointed out with a big warning sign. * documentation: iterator validity guarantees for islist are part of individual functions, an overview of the iterator validity rules in the description would be welcome. * indeed, the two foremost advantages of intrusive containers are: non-invalidation rules for iunordered_set, and possibility to store derived objects in the intrusive container (without extra indirection, that is - it's possible with storing pointers to object in std containers). * combination of smart pointers + intrusive containers: the intrusive container can't use simple pointers if the object lifetimes are managed by smart pointers. Thus it appears that having managed objects actually either forces the intrusive containers to use same managed pointers, or else to use the auto-unhook feature. This is not well discussed in the relevant page (about smart pointers and intrusive containers). BTW, I don't understand the requirement on smart pointer that "It must have the same ownership semantics as a raw pointer." I don't see why a shared_ptr couldn't be used, in fact, I would find some advantages to doing so (preferable imho to the auto-unhook feature). But I haven't thought too long about it. * rbtree_algorithms: here's a great opportunity to incorporate the order-statistic tree (augmented rbtree) with additional requirements on the node traits. If there is an int to spare in the data, then you can use it. You only have to make sure you use the correct algorithms (if rotating, etc., then the node's size field must be recomputed too for the order-statistic operations to be correct). The extra field can be computed at any given time (not necessary to maintain it through the insertions/manipulation) if only needed in one section of the code, and even reclaimed later if needed. This is similar to having a singly or doubly linked list sometimes or in other situations (see page "custom valueTraits"). Here is another use for that schema of using storage and reclaiming it when no longer needed: insertion in singly linked list (which is faster) and some operations on it (e.g. move-to-front heuristic which only requires islist), then making doubly-linked list (in linear time by setting the reverse pointer) to allow more operations when that is need (e.g. entering another section of the code). Or reusing the two same list pointers for an iset if the ilist structure is no longer needed.

Hi Hervé, Hervé Brönnimann wrote:
This being said, this is a first-rate and very useful library, and I vote whole-heartedly to accept it to boost.
Thanks.
My main criticism is with one aspect of the design: in std::set and other associative container, the key (part of the value_type that determines the order) is constant, and cannot be changed (once constructed) since the std::set interface only grants constant access to it.
I'm not sure about this. Maybe some STL expert can correct me but I think there are implementations where set::value_type can be modified. But I'm not sure. Anyway, I find this to be a downside, because a non-mutable value_type is quite limiting.
This makes it possible to maintain the invariant of the std::set class. It is no longer possible to guarantee this with intrusive containers. You need to mention that upfront in the doc of iset with a big warning sign!
I will mention this. I think this is essential, because otherwise, we couldn't implement other associative containers (like std::map, for example) above Intrusive.
For a long time, I have advocated separating the rbtree algorithms from the actual data structure, for reuse. I wrote a paper with Jyrki Katajainen (now a CPH STL report) about this. One can imagine applying rbtree algorithms, or avl_algorithms, or splay_tree_algorithms, all of them without a class; each offers a different tradeoff. In particular, I'd expect treaps to perform very well. In essence, this is exactly what your library does: it separates storage from algorithms, and thus offers the potential for reuse. This is the promise of generic programming even beyond the C+ + STL.
My goal is to include all the algorithms people find useful, so that we could use those algorithms directly or build Intrusive containers using avl_trees, etc... For example, null ended linked lists are also useful.
I think it is very natural, and would very much like to (but won't have the time), extend your framework with (geometric) halfedge and (combinatorial) graph-like data structures. I think it will be a great winner there, even more than standard C++ containers (one dimensional). Ultimately, because geometric algorithms typically require building several data structures on top of geometric objects, the framework here has even more value than the implementations.
I'm open to contributions, because my knowledge in this domain is null. Although I've optimized a bit red-black tree algorithms, I don't even know how a red-black tree works ;-) I think Intrusive framework can be extended with more data types from users.
A graph-like example compatible with the Boost Graph Library would superbly demonstrate the power of the approach!
It seems that you know a lot about this issue, would you mind sending me an small example?
* documentation: I wanted to see, while browsing the doc and clicking on the class name, the synopsis for header, in addition to the class interface for the class (following the C++ standard). But actually, by the end of the documentation, we get the header synopsis so the physical architecture of the library is clear.
Do you mean when the class names appears in the example code? I will need to investigate a bit, because I'm currently using Quickbook code importing facilities and I don't even know if this is possible. I'll try to investigate a bit.
* documentation: iterator and const_iterator types seems to be missing consistently for each class interface (from islist, ilist, but not from iset/imultiset.)
Ok. Thanks.
* documentation: bool ConstantTimeSize = (default not provided in ilist and islist).
Yes, this is a Boostbook but already solved in HEAD. The next documentation version will fix this.
* design: const/mutable value type? While it is possible to violate the invariants of a std::set, that requires willfull wrongdoing from the programmer, whereas it seems much easier to inadvertently do so with intrusive set. Must be pointed out with a big warning sign.
Ok.
* documentation: iterator validity guarantees for islist are part of individual functions, an overview of the iterator validity rules in the description would be welcome.
Ok.
* indeed, the two foremost advantages of intrusive containers are: non-invalidation rules for iunordered_set, and possibility to store derived objects in the intrusive container (without extra indirection, that is - it's possible with storing pointers to object in std containers).
Good information for the introduction.
* combination of smart pointers + intrusive containers: the intrusive container can't use simple pointers if the object lifetimes are managed by smart pointers. Thus it appears that having managed objects actually either forces the intrusive containers to use same managed pointers, or else to use the auto-unhook feature. This is not well discussed in the relevant page (about smart pointers and intrusive containers).
The documentation states that the smart pointer should have the same ownership semantics as raw pointers, so smart pointers can't managed lifetime.
BTW, I don't understand the requirement on smart pointer that "It must have the same ownership semantics as a raw pointer." I don't see why a shared_ptr couldn't be used, in fact, I would find some advantages to doing so (preferable imho to the auto-unhook feature). But I haven't thought too long about it.
Basically because this would break the code. The pointers shouldn't manage the lifetime. The lifetime must be managed externally.
* rbtree_algorithms: here's a great opportunity to incorporate the order-statistic tree (augmented rbtree) with additional requirements on the node traits. If there is an int to spare in the data, then you can use it. You only have to make sure you use the correct algorithms (if rotating, etc., then the node's size field must be recomputed too for the order-statistic operations to be correct). The extra field can be computed at any given time (not necessary to maintain it through the insertions/manipulation) if only needed in one section of the code, and even reclaimed later if needed. This is similar to having a singly or doubly linked list sometimes or in other situations (see page "custom valueTraits").
I'm not familiar with this augmented rbtree. The color is supposed to be a bit, so that some node implementations can embed it in the parent color. By default this color bit is embedded for Intrusive hooks if the alignment of the node is even. Maybe this new augmented rbtree must be added as a new node algorithm class with its own NodeTraits definition. Nodes compatible with this augmented rbtree will be compatible with the simple rbtree node algorithms class.
Thanks for a very promising library, and regards,
Thanks to you. Please if you feel the library can be improved better adding more algorithms, feel free to propose and add your own algorithms. But please, don't ask me to maintain them ;-)
Herve Bronnimann
Thanks for you review and suggestions, Ion
participants (2)
-
Hervé Brönnimann
-
Ion Gaztañaga