Intrusive containers: second iteration

Hi, the second version of intrusive containers is available. Excerpt from the introduction: ---snip--- Welcome to Boost.Intrusive, a library presenting some intrusive containers to the world of C++. While intrusive containers were and are widely used in C, they became more and more forgotten in the C++-world due to the presence of the standard containers, which don't support intrusive techniques. Boost.Intrusive not only reintroduces this technique to C++, but also encapsulates the implementation in STL-like interfaces. Hence anyone familiar with standard containers can use intrusive containers with ease. ---snap--- You find the file intrusive.zip either in the yahoo file section or here: http://people.freenet.de/turtle++/intrusive.zip I consider this version being rather complete now and quite ready for a formal review (barring islist maybe). Best regards Olaf Krzikalla

Hi, I'm currently developing some container libraries for my graduation thesis, that has some similarity to this one. So some questions came up in my stomach. First of all, what is this intrusive approach all about? As far as I understood the stored elements are not owned by the containers itself, but allocated by some other object. But why not using a reference type and storing this in the usual non-instrusive containers? To depict that idea, I assembled a small example: #include <iostream> #include <set> using namespace std; template<class T> class Ref { public: Ref(T* obj): obj(obj) {} inline operator const T&() const { return *obj; } inline bool operator<(const Ref<T>& b) const { return *obj < *b.obj; } private: T* obj; }; class MyRec { public: MyRec(int x, int y, int z): x(x), y(y), z(z) {} int x,y,z; inline bool operator<(const MyRec& b) const { return z < b.z; } }; typedef Ref<MyRec> MyRef; int main() { MyRec a(0,1,2), b(3,4,5), c(6,7,8); typedef set<MyRef> MySet; MySet s1, s2; s1.insert(MyRef(&a)); s1.insert(MyRef(&b)); s1.insert(MyRef(&c)); s2.insert(MyRef(&a)); MySet::const_iterator i = s1.begin(); while (i != s1.end()) { const MyRec& r = *i; cout << r.z << endl; ++i; } return 0; } My second question goes into data integrity. I feel really bad, when I see the example.cpp where the test class is inherited from imultiset_node_d. So the value of a object of type test is defined by x and is accessibly from the user's side? I mean changing that value from outside the underlying RB-tree implementation would easily break it. So it's all about being more efficient and being compatible with existing algorithms that work on STL containers? A point, my stomach feels good about;) Best regards, Frank.

Hi, frank@cyblogic.de wrote:
As far as I understood the stored elements are not owned by the containers itself, but allocated by some other object. Right. Actually the term 'store' is little bit misleading. But even in our native language german I didn't found a better term.
But why not using a reference type and storing this in the usual non-instrusive containers? Performance reasons. Look at Overview, section 'Properties of intrusive containers' and then go to Usage, section 'When to use?'. It not only mentions your idea, but should answers your question too. In fact, the development of these containers was the result of falling in a deep performance hole while developing some mathematical algorithms (e.g. a fast triangulation algorithm, which needs 3 sequences over the vertices). Another example is the open source polygon clipping algorithm GPC. A C++ version of GPC could draw huge advantages from Boost.Intrusive.
My second question goes into data integrity. I feel really bad, when I see the example.cpp where the test class is inherited from imultiset_node_d. So the value of a object of type test is defined by x and is accessibly from the user's side? I mean changing that value from outside the underlying RB-tree implementation would easily break it. Hmm, big point. I have to change this. Indeed the public interface of imultiset_node_d should consist only of the members mentioned in the docs (same for the other nodes).
So it's all about being more efficient and being compatible with existing algorithms that work on STL containers? Right again.
Best regards, Olaf Krzikalla

On Friday 29 April 2005 23:27, Olaf Krzikalla wrote:
But why not using a reference type and storing this in the usual non-instrusive containers?
Performance reasons. Look at Overview, section 'Properties of intrusive containers' and then go to Usage, section 'When to use?'. It not only mentions your idea, but should answers your question too. In fact, the development of these containers was the result of falling in a deep performance hole while developing some mathematical algorithms (e.g. a fast triangulation algorithm, which needs 3 sequences over the vertices). Another example is the open source polygon clipping algorithm GPC. A C++ version of GPC could draw huge advantages from Boost.Intrusive.
Yep, now I understood. The whole thing is about laying out objects in memory in an efficient way. Sorry for always talking in critic, but I would love to suggest to think about memory allocators. Pool allocators can be utilised greatly to achieve memory layout for ideal cache heating. I missed that point in the efficiency discussion of intrusive containers versus non-intrusive containers. There's no need for any runtime allocation logic at all for non-intrusive containers, but intrusive containers may compete with non-intrusive one when applying pool allocators. Oh, and thank you answering my questions. These non-instrusive containers provide me some good ideas for my own work. Best regards, Frank.

Yep, now I understood. The whole thing is about laying out objects in memory in an efficient way. Sorry for always talking in critic, but I would love to suggest to think about memory allocators. Pool allocators can be utilised greatly to achieve memory layout for ideal cache heating. I missed that point in the efficiency discussion of intrusive containers versus non-intrusive containers. Well, I'm not sure if this discussion belongs to the doc. Anyway, let's think about the four points in 'Overviw: Properties of intrusive containers' wrt pool allocators. The first point: Depending on your RTL pool allocators may improve the
Hi, frank@cyblogic.de wrote: performance of the memory management. But it's a step in quantity, not in quality. That is, there is still some memory management, while for intrusive containers computing a node from a value and vice versa (to a certain degree this operations are the counterparts of memory management) is boiled down by the compiler to maybe three machine instructions (in some cases it vanishes completely). The second point: indeed it's possible to write a pool allocator in a way, that at least the algorithm, that would need an intrusive container otherwise, don't throw. Just preallocate enough elements beforehand (for intrusive containers you do that implicitly). The third point: I have no clue how to accomplish this task using non-intrusive containers without a lot of additionally effort. And in fact, this feature is very useful. The fourth point: Regarding this discussion this is a non-issue IMHO. I just wnat to point out, that clearing a std::list<T*> is as linear as implicitly removing all elements stored in an ilist. Best regards Olaf Krzikalla

Oh, it's much to late! Mixed "intrusive" with "non-intrusive" in the last few sentences. Hopefully nobody becomes to much confused. Frank.

| frank@cyblogic.de wrote: | > As far as I understood the stored elements are not owned by the | > containers itself, but allocated by some other object. | Right. Actually the term 'store' is little bit misleading. But even in | our native language german I didn't found a better term. | | > But why not | > using a reference type and storing this in the usual non-instrusive | > containers? | Performance reasons. so why not use "ordinary" views, eg. containers of pointers? it's not difficult to lay out std::vector<Foo*> s.t. no memory allocation occurs (except once). -Thorsten

Hi, Thorsten Ottosen wrote:
so why not use "ordinary" views, eg. containers of pointers?
it's not difficult to lay out std::vector<Foo*> s.t. no memory allocation occurs (except once). It strongly depends on your application. E.g. inserting and removing elements in the middle of a sequence rather forbids the using of std::vector due to performance reasons. And in an answer to Frank I discussed the using of std::list together with pool allocators. I don't argue, that intrusive containers are the swiss-army knife they used to be in the C world. In fact, the application area is very limited in the C++ world, since you can choose from a wide range of approaches regarding sequences. However, IMHO intrusive containers certainly belong to that range and sometimes they even aren't replaceable without a loss of efficiency.
Best regards, Olaf Krzikalla

"Olaf Krzikalla" <krzikalla@gmx.de> wrote in message news:d52d51$r2s$1@sea.gmane.org... | Hi, | | Thorsten Ottosen wrote: | > so why not use "ordinary" views, eg. containers of pointers? | > | > it's not difficult to lay out std::vector<Foo*> s.t. no memory allocation | > occurs (except once). | It strongly depends on your application. E.g. inserting and removing | elements in the middle of a sequence rather forbids the using of | std::vector due to performance reasons. but if you're creating a linked-list on the "stack", then you need to store the list somewhere; what else can be used that vector<Foo*> ? -Thorsten

Hi, I'm not sure, if I get all your points right. Thorsten Ottosen wrote:
but if you're creating a linked-list on the "stack", then you need to store the list somewhere; The list nodes aren't created on the 'stack', they are part of the objects.
what else can be used that vector<Foo*> ? I have no clue, what you are actually asking. Are you looking for a sensible usage of vector<Foo*>? vector<Foo*> perfectly fits e.g. for std::priority_queue. But maybe I misunderstood your question.
Best regards Olaf Krzikalla

"Olaf Krzikalla" <krzikalla@gmx.de> wrote in message news:d53hmo$dtq$1@sea.gmane.org... | Hi, | | I'm not sure, if I get all your points right. | | Thorsten Ottosen wrote: | > but if you're creating a linked-list on the "stack", then you need to store | > the list somewhere; | The list nodes aren't created on the 'stack', they are part of the objects. yes, like in struct Link { Link* prev, next; }; right? So if you say the library is about performance, then I assume Link's cannot be heap-allocated; so where are they allocated? I'm just trying to figure out what it is that makes your library faster in certain cituations :-) -Thorsten

struct Link { Link* prev, next; };
So if you say the library is about performance, then I assume Link's cannot be heap-allocated; so where are they allocated?
Nowhere. They are part of the object stored in the container. STL containers (apart from array-like ones such as vector) have structures that are external to the object. For example std::list has 1 node for every object stored in it. Thus, you have 2 allocations per each object in std::list - one for object itself and another one for its node. Putting prev and next pointers into the object makes the node no longer neccessary. Intrusive containers also have several other advantages: - adding object into container does not require making copies - one object can be in more than one intrusive container at once I've seen a lot of libraries written in C using this sort of approach, and also tried to write some in C++ but got nowhere close to Olaf. Where performance is critical, this is the right choice. It would be great if we had standarized intrusive containers in boost. cheers, M.

"Marcin Kaliciñski" <kalita@poczta.onet.pl> wrote in message news:d53ven$q88$1@sea.gmane.org... |> struct Link | > { | > Link* prev, next; | > }; | > | > So if you say the library is about performance, then I assume Link's | > cannot be | > heap-allocated; so | > where are they allocated? | | Nowhere. They are part of the object stored in the container. STL containers | (apart from array-like ones such as vector) have structures that are | external to the object. For example std::list has 1 node for every object | stored in it. Thus, you have 2 allocations per each object in std::list - | one for object itself and another one for its node. Putting prev and next | pointers into the object makes the node no longer neccessary. ok, but... | Intrusive containers also have several other advantages: | - adding object into container does not require making copies | - one object can be in more than one intrusive container at once | | I've seen a lot of libraries written in C using this sort of approach, and | also tried to write some in C++ but got nowhere close to Olaf. Where | performance is critical, this is the right choice. It would be great if we | had standarized intrusive containers in boost. I buy the performance goal; I'm just not too comfortable with the intrusive approach; It must be posibly to look at this from a more "view" like perspektive (no pun intended) and integrate it with those ideas. For example, unintrusve_list<T> could store strutures of the type struct Foo { Foo* next, prev; T* data; }; I don't see any extra overhead by making this non-intrusive. I assume the intrusive approach means that I can only put the elements into one list; the non-intrrusive would't have that requirement. Does it make sense? -Thorsten

Hi, Thorsten Ottosen wrote:
It must be posibly to look at this from a more "view" like perspektive Indeed.
[quote from the doc] Semantically spoken, an intrusive container in the sense of Boost.Intrusive is similiar to an STL-container holding non-owning pointers to objects. That is, if you have an boost::intrusive::ilist<accessor<T, ...> >, than std::list<T*> would allow you to do quite the same things - maintaining and navigating a set of objects of type T and types derived from it, but no implicit memory menagment with regard to T. [/quote] To be fair, you can do even more with std::list<T*>: it can hold more than one pointer to a certain object. But...
I buy the performance goal Note, that the performance improvement is not a step in quantity but in quality. But it seems, that I have to write a real-world example to convince you about that statement. I'm going to do that.
Best regards Olaf Krzikalla

"Marcin Kaliciñski" <kalita@poczta.onet.pl> wrote in message news:d53ven$q88$1@sea.gmane.org... > > struct Link > > { > > Link* prev, next; > > }; > Intrusive containers also have several other advantages: > - adding object into container does not require making copies > - one object can be in more than one intrusive container at once I would like to add that it's also possible to remove an object without knowning the container it's in, sometimes that's an advantage. Also when deleting an object it's not required to first explicitly remove it from the list, ~Link() can handle this.

"Olaf Krzikalla" <krzikalla@gmx.de> wrote in message news:d53hmo$dtq$1@sea.gmane.org... | Hi, | | I'm not sure, if I get all your points right. | | Thorsten Ottosen wrote: | > but if you're creating a linked-list on the "stack", then you need to store | > the list somewhere; | The list nodes aren't created on the 'stack', they are part of the objects. | | > what else can be used that vector<Foo*> ? | I have no clue, what you are actually asking. Are you looking for | a sensible usage of vector<Foo*>? vector<Foo*> perfectly fits e.g. for | std::priority_queue. But maybe I misunderstood your question. I see now that you always store the elements in a vector before you create a list of pointers; so that answers my question about where the stuff is stored. Then I need to ask, what the big difference would be between your classes and using boost.multi-index container? Coulnd't the same functionality be found there already? If not, wouldn't it be easy to add the functionality there with a new style of indices. bests -Thorsten

Thorsten Ottosen wrote:
I see now that you always store the elements in a vector before you create a list of pointers; so that answers my question about where the stuff is stored.
In normal use they wouldn't have to be stored in a vector.
Then I need to ask, what the big difference would be between your classes and using boost.multi-index container?
Coulnd't the same functionality be found there already? If not, wouldn't it be easy to add the functionality there with a new style of indices.
You might find this article useful: http://www.codefarms.com/publications/intrusiv/intr.htm Or if you've got Dr. Dobbs access: http://www.ddj.com/documents/s=898/ddj9910a/ might be better, although it's more than a little hyberbolic. Anyway, for a motivating example, consider a class that wishes to keep track of the smart pointers pointing at it. The pointers could act as nodes on an intrusive list. When a pointer's value is changed, it just removes itself from its current pointee's list and adds itself to its new pointee. Using standard containers might be too expensive here because or the required memory allocations. Then, at any time, all the pointers that point to the class can be found. Daniel

Hi, Thorsten Ottosen wrote:
I see now that you always store the elements in a vector before you create a list of pointers; so that answers my question about where the stuff is stored. Note, that the vector is used only as an example. Actually the elements can be stored anywhere.
Then I need to ask, what the big difference would be between your classes and using boost.multi-index container?
Coulnd't the same functionality be found there already? If not, wouldn't it be easy to add the functionality there with a new style of indices. Well, at a first glance it comes very close. You still need memory management and copy-ctors, but besides this you can think of a
struct vertex { double x, y, z; }; typedef multi_index_container<vertex, /*sequences*/, pool_alloc> a_lot_of_sequences; being near the same as an vector of vetices including the nodes for intrusive containers. But what really strikes me: it seems impossible to have views with only subranges of the stored elements. That is, if you look at the first figure in the Boost.Multiindex Tutorial, I couldn't figure out how to get a sequenced view to e.g. the elements (1) and (2) only. IMHO it's possible to add such functionality but it's by no way easy (you'll need some sort of ref_counting internally). Hence, like Boost.Intrusive Boost.Multiindex is a sequence approach, but for another application area. Best regards Olaf Krzikalla

Hi, Olaf has now finished his containers, but nobody seems to care very much. What does boost procedure say at this point? What next needs to be done to push them further towards reviewing/accepting? cheers, M. Uzytkownik "Olaf Krzikalla" <krzikalla@gmx.de> napisal w wiadomosci news:d4talb$e1e$1@sea.gmane.org...
Hi,
the second version of intrusive containers is available. Excerpt from the introduction:
---snip--- Welcome to Boost.Intrusive, a library presenting some intrusive containers to the world of C++. While intrusive containers were and are widely used in C, they became more and more forgotten in the C++-world due to the presence of the standard containers, which don't support intrusive techniques. Boost.Intrusive not only reintroduces this technique to C++, but also encapsulates the implementation in STL-like interfaces. Hence anyone familiar with standard containers can use intrusive containers with ease. ---snap---
You find the file intrusive.zip either in the yahoo file section or here:
http://people.freenet.de/turtle++/intrusive.zip
I consider this version being rather complete now and quite ready for a formal review (barring islist maybe).
Best regards Olaf Krzikalla
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Marcin Kalicinski wrote:
Hi,
Olaf has now finished his containers, but nobody seems to care very much. What does boost procedure say at this point? What next needs to be done to push them further towards reviewing/accepting?
cheers, M.
The procedure is described here: http://www.boost.org/more/submission_process.htm Daniel

Hi, Marcin Kalicinski wrote:
Olaf has now finished his containers, I wouldn't go so far and say: it's finished. It is in reasonable state, but there is still a lot of work to do and as the discussion moves on, it becomes more and more work (nevertheless I'm glad, that there is some discussion).
but nobody seems to care very much. I have some private eMail correspondence regarding Boost.Intrusive too. And Tom already informed me, that I will be added to the review queue soon.
What next needs to be done to push them further towards reviewing/accepting? There have to be a third iteration. Even if I don't work all the suggestions into Boost.Intrusive 1.0, some things still need to be considered for the first version (e.g. I want to push the 'privateness' as far as possible). I hope to finish this iteration today. I will post a message in this thread.
Best regards Olaf Krzikalla

Olaf Krzikalla <krzikalla@gmx.de> writes:
Hi,
but nobody seems to care very much.
I have some private eMail correspondence regarding Boost.Intrusive too. And Tom already informed me, that I will be added to the review queue soon.
It is of course the review wizard's perogative to do whatever he feels is appropriate, but if it's true that "nobody seems to care very much," adding the library to the review queue is probably premature at best. -- Dave Abrahams Boost Consulting www.boost-consulting.com

but if it's true that "nobody seems to care very much," adding the library to the review queue is probably premature at best. I'm already aware of this problem too. Unfortunately I don't have a good measurement of 'how much interest is enough interest?'. I've browsed all
Hi, David Abrahams wrote: the (so called) searchable archives of boost.devel, but didn't find anything useful. Either the search function was too limited or the search results couldn't be ordered in threads or there was no 'view complete thread'. So I have no clue, how many replies indicate enough interest. And of course you have to keep boost as small as possible. If it becomes too big, nobody will use it anymore, because nobody have a complete overview and is able say, what's actually available (and IMHO boost is already too big, the fundamentals are coupled too tight with high-level code). OTOH intrusive containers are a very fundamental piece of code. You can build all sort of containers on top of it (including STL container, boost::multi_index_container), while you can't do the opposite. But I admit, that the application area is limited - they serve either as a base or are used in perf-critical parts. And as you wrote in another post, the application programmer rarely needs to write really fast code. In addition, programmers coming to C++ from 'higher' languages (Java, script lang.) will certainly have problems to understand the differences between intrusive and non-intrusive containers and hence don't consider intrusive containers even at the right places. Nevertheless IMHO intrusive containers belong to boost. They are part of an development process opposite to the popular one (that is, they reintroduce one of the roots of the language instead of trying to emulate yet another JSDK/C# library feature). I wouldn't go as far as to say they belong to std (as someone emailed me), but once in a while a question in a newsgroup pops up, where intrusive containers either fits perfectly or at least have to be considered as a possibility. Best regards Olaf Krzikalla

Olaf Krzikalla <krzikalla@gmx.de> writes:
Hi,
but if it's true that "nobody seems to care very much," adding the library to the review queue is probably premature at best. I'm already aware of this problem too. Unfortunately I don't have a good measurement of 'how much interest is enough interest?'. I've browsed all
David Abrahams wrote: the (so called) searchable archives of boost.devel, but didn't find anything useful. Either the search function was too limited or the search results couldn't be ordered in threads or there was no 'view complete thread'.
If you use gmane search, when you click on a message's subject you go to the complete thread.
So I have no clue, how many replies indicate enough interest.
And of course you have to keep boost as small as possible.
I disagree.
If it becomes too big, nobody will use it anymore, because nobody have a complete overview
Too late for that.
and is able say, what's actually available (and IMHO boost is already too big, the fundamentals are coupled too tight with high-level code).
We just need better infrastructure for managing Boost's size. I'm working on that.
OTOH intrusive containers are a very fundamental piece of code. You can build all sort of containers on top of it (including STL container, boost::multi_index_container), while you can't do the opposite. But I admit, that the application area is limited - they serve either as a base or are used in perf-critical parts. And as you wrote in another post, the application programmer rarely needs to write really fast code. In addition, programmers coming to C++ from 'higher' languages (Java, script lang.) will certainly have problems to understand the differences between intrusive and non-intrusive containers and hence don't consider intrusive containers even at the right places. Nevertheless IMHO intrusive containers belong to boost. They are part of an development process opposite to the popular one (that is, they reintroduce one of the roots of the language instead of trying to emulate yet another JSDK/C# library feature). I wouldn't go as far as to say they belong to std (as someone emailed me), but once in a while a question in a newsgroup pops up, where intrusive containers either fits perfectly or at least have to be considered as a possibility.
It's not yet clear to me how important endogenous (intrusive) containers are once you have the exogenous kind. The main question is: what do we need them for? -- Dave Abrahams Boost Consulting www.boost-consulting.com

Olaf Krzikalla wrote:
the second version of intrusive containers is available.
Some thoughts based on a very quick look: I think you could allow ilist_node_d or ilist_node_m to be private, by using a mechanism along the lines of how boost::iterator_access is used with iterator facade and adaptor. I still feel a little uncomfortable with the: ilist<member<Vertex, &Vertex::prev, &Vertex::next> > form. Although allowing Vertex to be a POD is a very good reason for it. As before these members could be private. Also it would be nice to able to use a policy to define how the links are used - eg. if they're null when a node is not a member of a list, if the list is circular. I'm not sure how this could be used with a legacy structure otherwise (as it might make assumptions about its members). I don't believe an assertion that an inserted node isn't already in a list would cost much - it would be reduced to nothing in release mode. Alternatively, this is possibly another use for a policy, you could have no checks, or if it is in a list you could remove it, trigger an assertion, or throw an exception. Daniel
participants (8)
-
Daniel James
-
David Abrahams
-
Eric Zijlstra
-
frank@cyblogic.de
-
Marcin Kalicinski
-
Marcin Kalici�ski
-
Olaf Krzikalla
-
Thorsten Ottosen