Hi Ion, Here are some comments I have from reading the online documentation. I haven't followed the review discussion (because I don't have the time), so I apologize if my comments have already been discussed. I was originally review manager for Olaf, and was almost ready to allow a review when he said he did not have time to finnish the library. From what I recall, the docs have been improved somewhat, so kudos for that work. 1. "Constant time size: The intrusive container holds a size member that it's updated with every insertion/erasure. This implies that the size() function has constant time complexity. On the other hand, increases the size of the container, and some operations, like splice() taking a range of iterators, have linear time complexity in linked lists." Howward Hinnant proposed a new splice() function that allows one to pass the size into splice() to keep splice() O(1). Would this not be an idea here? 2. My general view of the design: fairly complicated, though the extra speed can be worth the effort. I find the amount of typedef'ing fairly large and can't help feeling it could be simplified. 3. Is 'Foo' really needed twice: typedef boost::intrusive::ilist_base_hook<Foo>::value_traits<Foo> FooValueTraits; ??? 4. The cloning example (http://ice.prohosting.com/newfunk/boost/libs/intrusive/doc/html/intrusive/cl...) I don't quite get; I thought the whole point of intrusive container was to avoid memory allotions. Also, how does this affect exception-safety? In general I feel this is an important library with many uses for performance savy applications. I feel the quality in general is high. I had one major discussion with Olaf which relates to the general easy-of-use and also to space-tradeoffs: ---------------------------------------- When it is a design-time decision to allow the object to be inserted into 1 container, one object always carry the node-structure overhead (that is also when I don't insert the object into an intrusive container). Furthermore, If some objects must be in more than one container, all objects of that type pay the space penalty even though only a few objects are in several containers. Third, the compile-time dependency is fairly complicated, and would be cool to get rid of. Therefore I think there are compelling reasons for exploring a different design: an extrusive design where the node-structure is not part of the object. This immediately eliminates three problems: -1- there is no compile-time dependency -2- there is by default no any space overhead -3- there is by default no problem or contstraint when an object is added to more containers at runtime The approach has one serious drawback: -1- it might require extra memory This happens when all live objects are added to an intrusive container. An example: consider an ilist of size N. The node structure requires two T* objects and so the space required is N * 2 * sizeof(T*) An extrusive approach would look a bit like this: template< class T > class extrusive_list { struct Node { T* prev; T* next; T* me; }; std::vector<Node> nodes; }; Thus the space required would be N * 3 * sizeof(T*) if all N objects are put into the container. If less than 2/3's of the live objects are put into the container, the extrusive version takes up less space. As you add support for several containers, the extrusive version is even more likely to save space IMO. Now, I haven't actually implemented extrusive containers as out-lined above, but the simplity of use and the potential space savings tells me that it would be well worth persue. Let me know what you all think. best regards Thortsen
Hi Thorsten, Thorsten Ottosen wrote:
Hi Ion,
[snip]
I was originally review manager for Olaf, and was almost ready to allow a review when he said he did not have time to finnish the library.
I would like to thank Olaf for his great work. It's a shame we had no time to finish the library but I think the need of a good intrusive library (I badly needed one for Interprocess containers and memory algorithms) has pushed this to the review.
From what I recall, the docs have been improved somewhat, so kudos for that work.
Joaquin is the main responsible of the general improvement. I was scared every time I received comments from him ;-)
1. "Constant time size: The intrusive container holds a size member that it's updated with every insertion/erasure. This implies that the size() function has constant time complexity. On the other hand, increases the size of the container, and some operations, like splice() taking a range of iterators, have linear time complexity in linked lists."
Howward Hinnant proposed a new splice() function that allows one to pass the size into splice() to keep splice() O(1). Would this not be an idea here?
It's already there. "On list's size" (http://home.twcny.rr.com/hinnant/cpp_extensions/On_list_size.html) was the main motivation ;-)
2. My general view of the design: fairly complicated, though the extra speed can be worth the effort. I find the amount of typedef'ing fairly large and can't help feeling it could be simplified.
Ok.
3. Is 'Foo' really needed twice:
typedef boost::intrusive::ilist_base_hook<Foo>::value_traits<Foo> FooValueTraits;
???
The first Foo is a tag. Previously, each base hook needed a number to distinguish different base hooks from each other. Recently this was changed with a tag, and the example just uses Foo as a tag. I could have used any other type. Or as Daniel James suggests, a default tag, if only have one hook. Something like: typedef boost::intrusive::ilist_base_hook<>::value_traits<Foo>
4. The cloning example (http://ice.prohosting.com/newfunk/boost/libs/intrusive/doc/html/intrusive/cl...)
I don't quite get; I thought the whole point of intrusive container was to avoid memory allotions. Also, how does this affect exception-safety?
Yes the whole point is to avoid memory allocations, but also offer an easy way to build non-intrusive containers. For example, copying a red-black tree is not efficient if you don't do an structural copy (which also has better exception guarantees, because the potentially throwing comparison functor is not used). Some copies are recursive and need to create values as the cloning algorithm does the work. Obviously, the cloning function is allowed to throw but the function will only throw if the cloner throws whereas the usual clear()+insert() can throw with the predicate. This cloner can be used, for example, to implement an assignment that recycles the memory where the old values where placed. I agree that for list and slist the cloning algorithm is not complicated, but I wanted to establish a common cloning function for all the algorithms, so the can have access to the most efficient cloning algorithm (structural copy) without knowing the internals of the intrusive container.
In general I feel this is an important library with many uses for performance savy applications. I feel the quality in general is high.
Thanks.
I had one major discussion with Olaf which relates to the general easy-of-use and also to space-tradeoffs: ----------------------------------------
When it is a design-time decision to allow the object to be inserted into 1 container, one object always carry the node-structure overhead (that is also when I don't insert the object into an intrusive container).
Furthermore, If some objects must be in more than one container, all objects of that type pay the space penalty even though only a few objects are in several containers.
Right.
Third, the compile-time dependency is fairly complicated, and would be cool to get rid of.
Compile-time dependency is basically because of performance reasons. The hook can be placed everywhere in the type (any base or member at any offset) and having this information at compile time makes code faster, instead of querying this at runtime.
Therefore I think there are compelling reasons for exploring a different design: an extrusive design where the node-structure is not part of the object.
Interesting. "Extrusive" sounds cool ;-)
This happens when all live objects are added to an intrusive container.
An example: consider an ilist of size N. The node structure requires two T* objects and so the space required is
N * 2 * sizeof(T*)
Ok.
An extrusive approach would look a bit like this:
template< class T > class extrusive_list { struct Node { T* prev; T* next; T* me; }; std::vector<Node> nodes; };
Thus the space required would be
N * 3 * sizeof(T*)
When an a node is erased from the container, I suppose you are you going to erase it from the vector (otherwise, you would need to track the free and used Nodes), so depending when the Node was placed (in the beginning or end of the vector), the overhead of an erasure can be high. Wait! Another option would be to convert a Node in a union so that free Nodes can form a linked list when they are not used, and add to to extrusive_list a pointer to the first free Node. If there are no free Nodes to reuse, the vector can be safely expanded and the last nodes can form again a linked list of free nodes. But you lose exception safety in insertions, a unique property of intrusive containers, because the needed memory is already allocated.
Let me know what you all think.
Still not sure about the advantages (and I'm afraid they are not trivial to implement efficiently), but I'm interested in the idea. Are you volunteering to write the alter-ego of Intrusive, Extrusive? Even this Extrusive library would benefit from its evil twin Intrusive ;-)
best regards
Thortsen
Regards, Ion
[This should have been on the dev-list, but ended up here by mistake; anyway, let's continue it here] Ion Gaztañaga skrev:
Hi Thorsten,
Thorsten Ottosen wrote:
Therefore I think there are compelling reasons for exploring a different design: an extrusive design where the node-structure is not part of the object.
Interesting. "Extrusive" sounds cool ;-)
This happens when all live objects are added to an intrusive container.
An example: consider an ilist of size N. The node structure requires two T* objects and so the space required is
N * 2 * sizeof(T*)
Ok.
An extrusive approach would look a bit like this:
template< class T > class extrusive_list { struct Node { T* prev; T* next; T* me;
hm...correction: struct Node { Node* prev; Node* next; T* me; };
}; std::vector<Node> nodes; };
Thus the space required would be
N * 3 * sizeof(T*)
When an a node is erased from the container, I suppose you are you going to erase it from the vector (otherwise, you would need to track the free and used Nodes), so depending when the Node was placed (in the beginning or end of the vector), the overhead of an erasure can be high.
Wait! Another option would be to convert a Node in a union so that free Nodes can form a linked list when they are not used, and add to to extrusive_list a pointer to the first free Node. If there are no free Nodes to reuse, the vector can be safely expanded and the last nodes can form again a linked list of free nodes.
This is good thinking :-) Alternatively I think the index into the vector of the first node in the unused list could work.
But you lose exception safety in insertions, a unique property of intrusive containers, because the needed memory is already allocated.
right; however, the exception-safety could be guaranteed if the user call's set_capacity() to be large enough to hold the number of anticipated insertions. Typically you would do this up front.
Let me know what you all think.
Still not sure about the advantages (and I'm afraid they are not trivial to implement efficiently), but I'm interested in the idea. Are you volunteering to write the alter-ego of Intrusive, Extrusive? Even this Extrusive library would benefit from its evil twin Intrusive ;-)
I'm not volunteering, no. Too busy even to do a proper review of your library. I have zero experience with the normal use cases for intrusive containers. What I would like to know is if typically all objects is inside intrusive containers; also for the use of 2,3, or 4 simultaneous containers. -Thorsten
I have zero experience with the normal use cases for intrusive containers. FWIW very typical use cases are algorithms which need to compute the iterator to an element from the element itself. Whenever I've used intrusive containers so far, I used them because of their ability to do
What I would like to know is if typically all objects is inside intrusive containers; also for the use of 2,3, or 4 simultaneous containers. I wouldn't say so. In fact, at least one triangulation algorithm needs to maintain and iterate over a list of all concave vertices of a polygon again and again (the list becomes smaller during the triangulation
Hi,
Thorsten Ottosen wrote:
that computation in constant time and to detect if they are in a
particular container respectively. I think, that constant-time
detection, whether a given element is in a container and constant-time
computation of an iterator are outstanding abilites of intrusive
containers.
process). But usually in a typical polygon there are much more convex
vertices. Thus just marking the vertices with a flag was no option,
because you'd have to iterate over the whole set of vertices just to
(re)find the concave ones. A "std::list
Olaf Krzikalla skrev:
Hi,
I have zero experience with the normal use cases for intrusive containers. FWIW very typical use cases are algorithms which need to compute the iterator to an element from the element itself. Whenever I've used intrusive containers so far, I used them because of their ability to do
Thorsten Ottosen wrote: that computation in constant time and to detect if they are in a particular container respectively. I think, that constant-time detection, whether a given element is in a container and constant-time computation of an iterator are outstanding abilites of intrusive containers.
Ok, that seems hard to do with the extrusive approach; unless the the actual algorithm can be made to work on list::node instead of the original container.
What I would like to know is if typically all objects is inside intrusive containers; also for the use of 2,3, or 4 simultaneous containers. I wouldn't say so. In fact, at least one triangulation algorithm needs to maintain and iterate over a list of all concave vertices of a polygon again and again (the list becomes smaller during the triangulation process). But usually in a typical polygon there are much more convex vertices. Thus just marking the vertices with a flag was no option, because you'd have to iterate over the whole set of vertices just to (re)find the concave ones. A "std::list
concaves;" could help and in usual cases generate less memory overhead than the intrusive approach. However, the triangulation algorithm also needs to detect, whether a given vertex was concave. By using the linked() property of the intrusive node used by the "concave ilist" it was possible to have a very fast test for that. IMHO a typical example of the classic space vs. time tradeoff.
Ok. How badly would the algorithm perform if these two types of test where only O(lg N) ? -Thorsten
Hi, Thorsten Ottosen wrote:
What I would like to know is if typically all objects is inside intrusive containers; also for the use of 2,3, or 4 simultaneous containers.
I wouldn't say so. In fact, at least one triangulation algorithm needs to maintain and iterate over a list of all concave vertices of a polygon again and again (the list becomes smaller during the triangulation process). But usually in a typical polygon there are much more convex vertices. Thus just marking the vertices with a flag was no option, because you'd have to iterate over the whole set of vertices just to (re)find the concave ones. A "std::list
concaves;" could help and in usual cases generate less memory overhead than the intrusive approach. However, the triangulation algorithm also needs to detect, whether a given vertex was concave. By using the linked() property of the intrusive node used by the "concave ilist" it was possible to have a very fast test for that. IMHO a typical example of the classic space vs. time tradeoff. Ok. How badly would the algorithm perform if these two types of test where only O(lg N) ? Hmm, from a quick glance I can definitely say it raises from O(m*n) to at least O(m*n*lg(n)), maybe even worse (m being the number of all veritces in a polygon, n being the number of concave vertices in it).
Best Olaf
Olaf Krzikalla skrev:
Hi,
Thorsten Ottosen wrote:
What I would like to know is if typically all objects is inside intrusive containers; also for the use of 2,3, or 4 simultaneous containers. I wouldn't say so. In fact, at least one triangulation algorithm needs to maintain and iterate over a list of all concave vertices of a polygon again and again (the list becomes smaller during the triangulation process). But usually in a typical polygon there are much more convex vertices. Thus just marking the vertices with a flag was no option, because you'd have to iterate over the whole set of vertices just to (re)find the concave ones. A "std::list
concaves;" could help and in usual cases generate less memory overhead than the intrusive approach. However, the triangulation algorithm also needs to detect, whether a given vertex was concave. By using the linked() property of the intrusive node used by the "concave ilist" it was possible to have a very fast test for that. IMHO a typical example of the classic space vs. time tradeoff. Ok. How badly would the algorithm perform if these two types of test where only O(lg N) ? Hmm, from a quick glance I can definitely say it raises from O(m*n) to at least O(m*n*lg(n)), maybe even worse (m being the number of all veritces in a polygon, n being the number of concave vertices in it).
Ok. And what are typical values of m and n? -Thorsten
Hmm, from a quick glance I can definitely say it raises from O(m*n) to at least O(m*n*lg(n)), maybe even worse (m being the number of all veritces in a polygon, n being the number of concave vertices in it).
Ok. And what are typical values of m and n? Hard to say. Depends on the application domain. At http://www.cs.man.ac.uk/~toby/alan/software/
Hi, Thorsten Ottosen wrote: there is an example with m=2763, n~1000. I remember of practical problems with m=200 and m~n (where a strict O(m*lg(m)) algorithm would have done better, but we needed a 'nice' triangulation). Best Olaf
participants (3)
-
Ion Gaztañaga
-
Olaf Krzikalla
-
Thorsten Ottosen