Ion Gaztañaga's Intrusive Library

I use STL now like it's second nature. As the Intrusive library proposes to be an alternative to STL for some situations, I am intrigued. I have no background with intrusive containers, so I'm not sure why I would use one. I scanned the documentation, and still do not feel like I would know when to use Intrusive and when to use STL. Learning STL is a huge commitment of time and energy, so I think that people might be reluctant to give Intrusive a look without a strong and clear reason why. Here is the introduction from the documentation: *"Boost.Intrusive* is a library presenting some intrusive containers to the world of C++. While intrusive containers were and are widely used in C, they became more and more forgotten in C++ due to the presence of the standard containers, which don't support intrusive techniques. *Boost.Intrusive* not only reintroduces this technique to C++, but also encapsulates the implementation in STL-like interfaces. Hence anyone familiar with standard containers can easily use *Boost.Intrusive*. Like their STL-counterparts, intrusive containers use template parameters to control the stored data types and some special aspects of intrusive containers." This is not adequate for a motivating introductory discussion about why and where I should use boost::intrusive. Give a couple of hello-world motivating examples of where STL falls short and why boost::intrusive is better. Get the reader excited about why boost::intrusive is "cool".

Hi Tom, Tom Brinkman wrote:
Here is the introduction from the documentation:
<snip>
This is not adequate for a motivating introductory discussion about why and where I should use boost::intrusive. Give a couple of hello-world motivating examples of where STL falls short and why boost::intrusive is better. Get the reader excited about why boost::intrusive is "cool".
Ok. Thanks for your comment. I will change the introduction to something more exciting, based on the performance benefits of intrusive containers. Regards, Ion

On Fri, 16 Mar 2007 16:20:51 -0000, Ion Gaztañaga <igaztanaga@gmail.com> wrote:
Ok. Thanks for your comment. I will change the introduction to something more exciting, based on the performance benefits of intrusive containers.
I think you could emphasize the improved exception guarantees, they're a big advantage over STL containers. Also, you only mention that non-copyable/assignable objects can be stored in intrusive containers under the downsides.

Intrusive data structures can also be useful when defining the relations between classes, although I think that's out of scope for your library. But at the very least, an object should know when it's the member of an intrusive container (hopefully?). A couple of old Dr. Dobbs articles were about this:

http://www.drdobbs.com/dept/architect/184411070 (this one's a little hyperbolic as they're trying to sell their own library)
http://www.drdobbs.com/dept/cpp/184401437?pgno=5 (this one seems to be back to front, so I've linked to the last page)

I think your library could supply a good base for this sort of thing. Daniel

P.S. I'll review the library in the next few days.

Daniel James wrote:
On Fri, 16 Mar 2007 16:20:51 -0000, Ion Gaztañaga <igaztanaga@gmail.com> wrote:
Ok. Thanks for your comment. I will change the introduction to something more exciting, based on the performance benefits of intrusive containers.
I think you could emphasize the improved exception guarantees, they're a big advantage over STL containers. Also, you only mention that non-copyable/assignable objects can be stored in intrusive containers under the downsides.
Yes. I think these 3 reasons can be a good summary of the advantages of intrusive containers.
Intrusive data structures can also be useful when defining the relations between classes, although I think that's out of scope for your library.
For the moment, they are out of the library. But that can be a good idea for a new library built above Intrusive.
But at the very least, an object should know when it's the member of an intrusive container (hopefully?).
If you use what Intrusive defines as "safe hooks" an object always knows if it's inserted in a container. Boost.Intrusive containers put the hook in a safe-state after erasing the object from the container.
A couple of old Dr. Dobbs articles were about this:
Interesting.
I think your library could supply a good base for this sort of thing.
Yep!
P.S. I'll review the library in the next few days.
Waiting for your comments. I would like to know if you consider Boost.Intrusive complete enough to be a valuable tool to build non-intrusive containers (like STL containers or TR1 unordered containers). Regards, Ion

On Sat, 17 Mar 2007 15:18:59 -0000, Ion Gaztañaga <igaztanaga@gmail.com> wrote:
Waiting for your comments. I would like to know if you consider Boost.Intrusive complete enough to be a valuable tool to build non-intrusive containers (like STL containers or TR1 unordered containers).
Well, I think they could be useful for building non-intrusive containers in general, but I'm not sure about STL containers, as they tend to have very demanding requirements.

If you want to fully support STL allocators then you'll need to use the allocator's address function to get a pointer object from a reference or pointer. I think you could add an extra function to do this to your node traits class, but it'd need some way to access the allocator - so the traits class would need to also be able to store an allocator. And address can only be called with references to objects created with the allocator itself; in ilist you use node_ptr to point to the root node, which wouldn't be created with the allocator. Of course, most STL implementations don't fully support allocators to this level, so maybe it doesn't matter. But I'd imagine making the traits flexible enough to account for this kind of thing would be too complex.

Taking a quick look at iunordered_set, there would also be problems implementing an unordered_set with that. One problem is the exception requirement for insert:
For unordered associative containers, if an exception is thrown by any operation other than the container’s hash function from within an insert() function inserting a single element, the insert() function has no effect.
This needs careful implementation. All calls to the equality predicate and object construction need to be done before a rehash but, in your implementation, a rehash will invalidate insert_commit_data (actually, you should mention that in the documentation, or have I missed something?). You could change the implementation to support this - insert_commit_data would need to store the hash value, and then insert_commit would recalculate the bucket.

I suspect there are probably other similar problems and I'm not sure if it's worth the complication of fully supporting the STL. Although, if this is intended to be standardized, maybe it is? An unfortunate problem is that in C++ today the implementation of template code is not hidden from the user - error messages will be made worse by the extra layer of abstraction. Although, concepts might help considerably and even negate that problem.

Implementors often use unusual data structures. I believe the Dinkumware unordered containers use a skip list for the buckets. The data structure I use for unordered_multiset/unordered_multimap isn't directly supported (I'll document it soon - it's pretty unusual). A combination of intrusive containers and pointer manipulation could be used - but I think it'd get quite clumsy.

Having said all that, I do think they could be useful for creating containers. Even seemingly trivial structures, such as a singly linked list, have their complications. Getting a ready-made iterator is a definite plus. Daniel

Daniel James wrote:
If you want to fully support STL allocators then you'll need to use the allocator's address function to get a pointer object from a reference or pointer. I think you could add an extra function to do this to your node traits class, but it'd need some way to access the allocator - so the traits class would need to also be able to store an allocator.
The allocator stuff is really a problem. Although I haven't seen any useful uses of address(), I agree that I could offer the possibility to store an object with address() and other related functions. But internally I need more than that, because I also need a way to implement static_cast and const_cast for non-raw pointers. Currently I'm doing this by converting the smart pointer to a raw pointer, doing the conversion, and going back. In Boost, even with the Boost 1.34 cast functions (boost/pointer_cast.hpp) I contributed, there is no "official" generic way to do pointer casting. Another possibility is that the object I allow to be contained inside the intrusive container could have both allocator and casting functions. In general, the current allocator approach is lacking in both pointer conversion and placement construction issues.
And address can only be called with references to objects created with the allocator itself; in ilist you use node_ptr to point to the root node, which wouldn't be created with the allocator.
That's another problem. And bear in mind that if we want to obtain no-throw move constructors and assignments and keep the moved-from container usable, storing the "end" node inside the container is a must. If we want:

//Imagine list allocates the "end" node using the allocator
std::list<MyObject> list;

//If we want list to be usable after being moved
//we must COPY the allocator and ALLOCATE a new
//end node. Both operations might throw!
std::list<MyObject> list2(std::move(list));

Howard can correct me if I'm wrong, but I think the Metrowerks/Freescale implementation of std containers using move semantics stores the end node inside the container to achieve the no-throw guarantee in default construction and move operations in node containers. If the standard requires that containers should have non-throwing move operations (which would be really nice), then we have a big problem with allocators.
Of course, most STL implementations don't fully support allocators to this level so maybe it doesn't matter. But I'd imagine making the traits flexible enough to account for this kind of thing would be too complex.
In Interprocess I've reworked all the containers to use Intrusive. I've decided to support just pointers that allow conversions to raw pointers. A limitation, but still more flexible than usual STL implementations.
Taking a quick look at iunordered_set, there would also be problems implementing an unordered_set with that. One problem is the exception requirement for insert:
For unordered associative containers, if an exception is thrown by any operation other than the container’s hash function from within an insert() function inserting a single element, the insert() function has no effect.
This needs careful implementation. All calls to the equality predicate and object construction need to be done before a rehash but, in your implementation, a rehash will invalidate insert_commit_data (actually, you should mention that in the documentation, or have I missed something?). You could change the implementation to support this - insert_commit_data would need to store the hash value, and then insert_commit would recalculate the bucket.
Nice comment. The bucket number is already calculated in the insert_check function, so this wouldn't cost more than one extra word in insert_commit_data. I want to keep insert_commit a no-throw operation, since I think that's a big advantage.
I suspect there are probably other similar problems and I'm not sure if it's worth the complication of fully supporting the STL. Although, if this is intended to be standardized, maybe it is?
Not for the moment. Another question: would you make the irbtree and ihashtable implementation classes public? Normally vendors implementing set/multiset/map/multimap classes implement those using a common tree class. This tree class could be based on irbtree, so maybe this class could be offered as a new associative container: one that allows both unique and multiple insertion functions. I really think that if I can make Intrusive a good tool to implement standard containers, that would mean that the library is complete and good enough for any task.
Implementors often use unusual data structures. I believe the Dinkumware unordered containers use a skip list for the buckets. The data structure I use for unordered_multiset/unordered_multimap isn't directly supported (I'll document it soon - it's pretty unusual). A combination of intrusive containers and pointer manipulation could be used - but I think it'd get quite clumsy.
I think Dinkumware used a doubly linked list in their old hash implementation. I don't know about TR1, though. This doubly-linked-list-based hash container would also be interesting to offer, since it has some advantages (a lightweight iterator, for example, and faster erasure when we insert a lot of objects with the same key) compared to the classic singly-linked-list-array approach.
Having said all that, I do think they could be useful for creating containers. Even seemingly trivial structures, such as a singly linked list, have their complications. Getting a ready-made iterator is a definite plus.
Especially non-trivial functions like sorting, merging... An slist/list sort that achieves the same performance as list::sort is not trivial to write. Regards, Ion

On Sun, 18 Mar 2007 12:10:19 -0000, Ion Gaztañaga <igaztanaga@gmail.com> wrote:
Although I haven't seen any useful uses of address()
IIRC the original use case was for segmented memory - the allocator would store the segment, which would be stored with the pointer in the pointer object. Although, I don't really know what I'm talking about here, so hopefully someone will correct me if I'm wrong. And if I'm right, I don't think anyone ever actually did this, or ever will. I'm actually surprised that you're not using it in Boost.Interprocess - I assumed you would store the location of the mapped memory in the allocator and then in address you'd calculate the offset. I didn't realise that offset_ptr was the offset from the actual pointer.
I agree that I could offer the possibility to store an object with address() and other related functions. But internally I need more than that, because I also need a way to implement static_cast and const_cast for non-raw pointers. Currently I'm doing this by converting the smart pointer to a raw pointer, doing the conversion, and going back.
I'm doing the same in the unordered containers. I think the current version in the vault has been written to have no casts, but I'm not very happy with the technique I used to do this, so I'll probably switch back.
Another possibility is that the object I allow to be contained inside the intrusive container could have both allocator and casting functions. In general, current allocator approach is lacking in both pointer conversion and placement construction issues.
Yeah, my main motivation for supporting a pretty strict interpretation of the allocator interface was Boost.Interprocess, and since that doesn't seem to require it, maybe it isn't that important.
If the standard requires that containers should have non-throwing move operations (which would be really nice), then we have a big problem with allocators.
I'm not really familiar with the move proposal, but having to allocate the end node would probably be unacceptable to most - so I think it's fair to ignore that requirement. [snip]
Another question: would you make irbtree and ihashtable implementation classes public?
I was actually going to comment on this kind of thing in my review proper. But my answer is: not yet, maybe not at all. I'm not sure how useful ihashtable would be - it's tied closely to the STL requirements, so I can't see any advantage to using it instead of iunordered_set/iunordered_multiset. I'll look more closely at irbtree later. I was also thinking that it might be nice to have simple, lower level, structures like binary trees (or n-ary trees for that matter) and circular lists, which wouldn't require a container object, just a pointer to a node. Again, I'm not sure about this.
I really think that if I can make Intrusive a good tool to implement standard containers, that would mean that the library is complete and good enough for any task.
I think I'll try creating a simple unordered_set implementation and see how it plays with my unit tests (probably not in time for the end of review I'm afraid). Daniel

On Mar 18, 2007, at 8:10 AM, Ion Gaztañaga wrote:
//Imagine list allocates the "end" node using the allocator
std::list<MyObject> list;
//If we want list to be usable after being moved
//we must COPY the allocator and ALLOCATE a new
//end node. Both operations might throw!
std::list<MyObject> list2(std::move(list));
Howard can correct me if I'm wrong, but I think the Metrowerks/Freescale implementation of std containers using move semantics stores the end node inside the container to achieve the no-throw guarantee in default construction and move operations in node containers.
It always amuses me when Ion *knows* I'm lurking. :-) Caught red-handed again! Yes, the CodeWarrior Freescale list, map, multimap, set, multiset use the embedded end-node technique. And it is possible this issue is about to come to a head, not sure. Fwiw, gcc 4.x also uses the same technique (but I claim CodeWarrior was there first :-)). And yes, I think this design for node-based STL containers is far superior to the heap-allocated end node.

And I haven't been lurking quite closely enough to know if this matters, but in both CodeWarrior and gcc the value_type is elided from the embedded end node. I.e. a node_base is actually embedded which contains only the links, and node derives from node_base and adds the value_type.

<aside>
I once did an intrusive std::list, using the CodeWarrior std::list implementation by using a custom allocator. I told the allocator where to allocate the next node and it just did so. The allocator had to have knowledge of the node layout for the std::list, making it not really officially portable. But how many node layouts could there be in practice?

struct my_node
{
    my_node* prev_;
    my_node* next_;
    my_data_ data_;
};
...
std::list<my_data, my_allocator<my_data>> l;
char buffer[sizeof(my_node)] a_node; // alignment magic...
l.get_allocator().set_allocate(&a_node);
l.push_back(a_node.data_);

or something along those lines (I'm going from memory). It was very low level and my_data_ was a pod, so I didn't have to worry about double constructing and such. It worked pretty well performance-wise, but was admittedly pretty ugly and I couldn't claim it was portable. Fwiw, the application was malloc/free (never actually shipped). I managed the memory blocks in the malloc heap using std::list - intrusively, one list node per memory block.
</aside>

-Howard

Howard Hinnant wrote:
It always amuses me when Ion *knows* I'm lurking. :-) Caught red-handed again!
Something related to containers/move semantics/performance is just too attractive to ignore ;-)
And I haven't been lurking quite closely enough to know if this matters, but in both CodeWarrior and gcc the value_type is elided from the embedded end node. I.e. a node_base is actually embedded which contains only the links, and node derives from node_base and adds the value_type.
The same happens with Intrusive. The user derives from or stores a hook to make the class intrusive-friendly. The hook contains a minimal generic node that just stores pointers. All the low-level (link, unlink, rebalance...) algorithms operate on these minimal nodes to avoid instantiating the algorithms for all user types. The end node is a minimal node. When the user accesses the real value (for example, dereferencing an iterator), a downcast or similar transformation is done to obtain the whole value from the minimal node.
std::list<my_data, my_allocator<my_data>> l;
char buffer[sizeof(my_node)] a_node; // alignment magic...
l.get_allocator().set_allocate(&a_node);
l.push_back(a_node.data_);
or something along those lines (I'm going from memory). It was very low level and my_data_ was a pod, so I didn't have to worry about double constructing and such. It worked pretty well performance wise, but was admittedly pretty ugly and I couldn't claim it was portable.
<advertisement> The philosophy behind Intrusive was to build something that would make building STL-like containers trivial. It's also a way to extract all the common code unrelated to object construction so that we could easily construct new value_types using copy construction, move construction or even in-place construction, reusing the same binary code to insert them in the container. As an experiment, I've implemented all Interprocess node containers using Intrusive, so I think Intrusive offers enough functions to do so. </advertisement> Regards, Ion

Tom Brinkman wrote:
I use STL now like it's second nature. As the Intrusive library proposes to be an alternative to STL for some situations, I am intrigued. I have no background with intrusive containers, so I'm not sure why I would use one.
Here is what I understand: They're a more time and space efficient alternative for node-based containers of pointers when you have the possibility of modifying the type to make it into a node. So basically, if you have std::list<T*>, and you can modify T, then it is more efficient to use boost::intrusive::ilist<T>. Note, however, that the container doesn't "own" its contents and is therefore not responsible for destroying them. An intrusive container allocates and frees nothing; it just alters the nodes you provide it with to build the data structure. It's up to the user to allocate and free the nodes, which are actually the objects themselves.

"Mathias Gaunard" <mathias.gaunard@etu.u-bordeaux1.fr> wrote in message news:eteg9r$hks$1@sea.gmane.org...
Tom Brinkman wrote:
I use STL now like it's second nature. As the Intrusive library proposes to be an alternative to STL for some situations, I am intrigued. I have no background with intrusive containers, so I'm not sure why I would use one.
I would like to second that. A clearly stated problem domain is a primary goal of an introduction, especially for the purposes of a review.
Here is what I understand:
They're a more time and space efficient alternative for node-based containers of pointers when you have the possibility of modifying the type to make it into a node.
Why? And how much? I need specific numbers.
So basically, if you have std::list<T*>, and you can modify T, then it is more efficient to use boost::intrusive::ilist<T>.
Why? How does std::list<T*> prevent you from modifying the pointed-to value, and/or how does it affect performance?
Note, however, that the container doesn't "own" its contents and is therefore not responsible for destroying them. An intrusive container allocates and frees nothing; it just alters the nodes you provide it with to build the data structure. It's up to the user to allocate and free the nodes, which are actually the objects themselves.
IOW this library promotes unsafe programming right? Gennadiy

Gennadiy Rozental wrote:
"Mathias Gaunard" <mathias.gaunard@etu.u-bordeaux1.fr> wrote in message
They're a more time and space efficient alternative for node-based containers of pointers when you have the possibility of modifying the type to make it into a node.
Why? And how much? I need specific numbers.
So basically, if you have std::list<T*>, and you can modify T, then it is more efficient to use boost::intrusive::ilist<T>.
Why? How does std::list<T*> prevent you from modifying the pointed-to value, and/or how does it affect performance?
Because std::list<T*> generates a node type that should look like this:

struct list_node
{
    T* value;
    list_node* next;
    list_node* prev;
};

This adds some indirection. On the other hand, the intrusive technique doesn't allow storing NULL pointers, while std::list<T*> can.
IOW this library promotes unsafe programming right?
Maybe it's not appropriately named, since they don't own anything. Logically, containers should own what they contain. The technique could be adapted to something more like the pointer containers (which are more similar to the original containers in how they own their contents), though.

"Mathias Gaunard" <mathias.gaunard@etu.u-bordeaux1.fr> wrote in message news:etei0b$o33$1@sea.gmane.org...
Gennadiy Rozental wrote:
"Mathias Gaunard" <mathias.gaunard@etu.u-bordeaux1.fr> wrote in message
They're a more time and space efficient alternative for node-based containers of pointers when you have the possibility of modifying the type to make it into a node.
Why? And how much? I need specific numbers.
So basically, if you have std::list<T*>, and you can modify T, then it is more efficient to use boost::intrusive::ilist<T>.
Why? How does std::list<T*> prevent you from modifying the pointed-to value, and/or how does it affect performance?
Because std::list<T*> generates a node type that should look like this:

struct list_node
{
    T* value;
    list_node* next;
    list_node* prev;
};
This adds some indirection.
So? Could you give an example of an algorithm? And/or another specific performance advantage example? Gennadiy

On 3/16/07, Gennadiy Rozental <gennadiy.rozental@thomson.com> wrote:
So? Could you give an example of an algorithm? And/or another specific performance advantage example?
I just constructed a quick test and used VTune and std::clock to sample. Here are my results:

sorting across 1,000,000 random integers (averaged across 5 runs)
std::list::sort - 6.87s
boost::ilist::sort - 4.45s

number of L2 cache requests (L1 cache misses)
std::list - 1,790
boost::ilist - 862

I've never really exported VTune results before, but I think it's possible if anyone wants detailed reports and/or the source code for the benchmark. --Michael Fawcett

"Michael Fawcett" <michael.fawcett@gmail.com> wrote in message news:bc5bffe80703161407ybbc7c71ja06b0c2230c5602c@mail.gmail.com...
On 3/16/07, Gennadiy Rozental <gennadiy.rozental@thomson.com> wrote:
So? Could you give an example of an algorithm? And/or another specific performance advantage example?
Would you care to share with us a source code for your test?
I just constructed a quick test and used VTune and std::clock to sample. Here are my results:
sorting across 1,000,000 random integers (averaged across 5 runs)
std::list::sort - 6.87s
boost::ilist::sort - 4.45s

number of L2 cache requests (L1 cache misses)
std::list - 1,790
boost::ilist - 862
How about std::vector? Gennadiy

On 3/17/07, Gennadiy Rozental <gennadiy.rozental@thomson.com> wrote:
"Michael Fawcett" <michael.fawcett@gmail.com> wrote in message news:bc5bffe80703161407ybbc7c71ja06b0c2230c5602c@mail.gmail.com...
On 3/16/07, Gennadiy Rozental <gennadiy.rozental@thomson.com> wrote:
So? Could you give an example of an algorithm? And/or another specific performance advantage example?
Would you care to share with us a source code for your test?
I just constructed a quick test and used VTune and std::clock to sample. Here are my results:
sorting across 1,000,000 random integers (averaged across 5 runs)
std::list::sort - 6.87s
boost::ilist::sort - 4.45s

number of L2 cache requests (L1 cache misses)
std::list - 1,790
boost::ilist - 862
How about std::vector?
Sure, I'm at home now, but I can post it from work on Monday. I didn't test vector, but I assume it would beat intrusive list in speed and have fewer cache misses as well. --Michael Fawcett

On 3/17/07, Michael Fawcett <michael.fawcett@gmail.com> wrote:
On 3/17/07, Gennadiy Rozental <gennadiy.rozental@thomson.com> wrote:
"Michael Fawcett" <michael.fawcett@gmail.com> wrote in message news:bc5bffe80703161407ybbc7c71ja06b0c2230c5602c@mail.gmail.com...
On 3/16/07, Gennadiy Rozental <gennadiy.rozental@thomson.com> wrote:
So? Could you give an example of an algorithm? And/or another specific performance advantage example?
Would you care to share with us a source code for your test?
I just constructed a quick test and used VTune and std::clock to sample. Here are my results:
sorting across 1,000,000 random integers (averaged across 5 runs)
std::list::sort - 6.87s
boost::ilist::sort - 4.45s

number of L2 cache requests (L1 cache misses)
std::list - 1,790
boost::ilist - 862
How about std::vector?
Sure, I'm at home now, but I can post it from work on Monday. I didn't test vector, but I assume it would beat intrusive list in speed and have fewer cache misses as well.
I've attached the source file and pasted the code below my signature. It does make use of the VTune API, but if you undef USE_VTUNE at the top of the file, it will compile without VTune.

A note about the block_prefetch code: I simply grabbed that from a paper written by AMD. My hope was that it would clear the cache before running the sort so that the previously generated nodes were not already present before running the sort, but I did not analyze the generated code to see if that actually happened. There was a note in AMD's text that optimizers commonly remove the code in block_prefetch entirely, so it may be useless.

--Michael Fawcett

#define USE_VTUNE
#define USE_BOOST
//#define USE_STDLIB

#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>
#ifdef USE_STDLIB
#include <list>
#endif
#include <ostream>
#include <vector>

#ifdef USE_VTUNE
#include "vtuneapi.h"
#endif
#ifdef USE_BOOST
#include "boost/intrusive/ilist.hpp"
#endif
#include "boost/shared_array.hpp"

#ifndef USE_VTUNE
#define VTResume()
#define VTPause()
#endif

#ifdef USE_BOOST
struct my_tag;

struct MyNode : public boost::intrusive::ilist_base_hook<my_tag>
{
   MyNode(int n) : num(n) { }
   int num;
   bool operator<(const MyNode &rhs) const { return num < rhs.num; }
};

typedef boost::intrusive::ilist<
   boost::intrusive::ilist_base_hook<my_tag>::value_traits<MyNode> > list;
#elif defined(USE_STDLIB)
typedef int MyNode;
typedef std::list<int> list;
#endif

#define CACHEBLOCK 128 // prefetch block size
#define PREFETCH_BLOCK_SIZE CACHEBLOCK * 64

#pragma optimize("", off)
int fetch;

void block_prefetch(const void *addr)
{
   // Grab every 64th address to hit each cache line once.
   const int *a = (int *)addr;
   fetch += a[0] + a[16] + a[32] + a[48] + a[64] + a[80] + a[96] + a[112]
          + a[128] + a[144] + a[160] + a[176] + a[192] + a[208] + a[224] + a[240];
   a += 256;
   fetch += a[0] + a[16] + a[32] + a[48] + a[64] + a[80] + a[96] + a[112]
          + a[128] + a[144] + a[160] + a[176] + a[192] + a[208] + a[224] + a[240];
   a += 256;
   fetch += a[0] + a[16] + a[32] + a[48] + a[64] + a[80] + a[96] + a[112]
          + a[128] + a[144] + a[160] + a[176] + a[192] + a[208] + a[224] + a[240];
   a += 256;
   fetch += a[0] + a[16] + a[32] + a[48] + a[64] + a[80] + a[96] + a[112]
          + a[128] + a[144] + a[160] + a[176] + a[192] + a[208] + a[224] + a[240];
}
#pragma optimize("", on)

int main(void)
{
   const int NumTestNodes = 1000000;
   std::srand(static_cast<unsigned int>(std::time(0)));

   std::vector<MyNode> my_nodes;
   std::generate_n(std::back_inserter(my_nodes), NumTestNodes, std::rand);

   std::vector<boost::shared_array<int> > fragmented_memory;
   list my_list;
   for (int i = 0; i < NumTestNodes; ++i)
   {
      // Simulate fragmented memory
      boost::shared_array<int> arr(new int[rand() % 500]);
      fragmented_memory.push_back(arr);
      my_list.push_back(my_nodes[i]);
   }

   // Clear the cache
   boost::shared_array<char> dummy(new char[4096]);
   block_prefetch(dummy.get());

   std::clock_t start_time = std::clock();
   VTResume();
   my_list.sort();
   VTPause();
   std::cout << "time for sorting: "
             << static_cast<double>(std::clock() - start_time) / CLOCKS_PER_SEC
             << std::endl;

   my_list.clear();
}

On 3/19/07, Michael Fawcett <michael.fawcett@gmail.com> wrote:
I've attached the source file and pasted the code below my signature.
Well, I forgot to attach the source file, sorry, Gennadiy. Here it is this time. --Michael Fawcett

Gennadiy Rozental wrote:
IOW this library promotes unsafe programming right?
Don't be that hard ;-) Let's say that intrusive containers are interesting for the following purposes:

-> Embedded systems.
-> High performance algorithms.
-> Store non-copyable and non-movable objects.
-> Low-level memory algorithms, node pools, and similar stuff.
-> To build more complex containers, avoiding memory allocations.

I agree that Intrusive is not a high-level library compared to other Boost libraries, but I really think that it will be very useful to build other Boost libraries. I would say that Intrusive wants to make some currently unsafe/low-level practices a bit safer without losing performance. But obviously, this should be said in the introduction, as you have already commented. Best, Ion

"Ion Gaztañaga" <igaztanaga@gmail.com> wrote in message news:45FAD381.1090600@gmail.com...
Gennadiy Rozental wrote:
IOW this library promotes unsafe programming right?
Don't be that hard ;-) Let's say that intrusive containers are interesting for the following purposes:
-> Embedded systems.
It's too generic. So are you saying std::list is not useful for embedded systems?
-> High performance algorithms.
How do intrusive containers enhance performance?
-> Store non-copyable and non-movable objects.
std::list<T*> does not require copyability either.
-> Low-level memory algorithms, node pools, and similar stuff. -> To build more complex containers, avoiding memory allocations.
What are the alternative existing solutions, and how does your library compare with them?
I agree that Intrusive is not a high-level library compared to other Boost libraries, but I really think it will be very useful for building other Boost libraries. I would say that Intrusive wants to make some currently unsafe/low-level practices a bit safer without losing performance.
How much safer? How much does it add over a handwritten 10-line intrusive container? Gennadiy

Gennadiy Rozental wrote:
"Ion Gaztañaga" <igaztanaga@gmail.com> wrote in message news:45FAD381.1090600@gmail.com...
Gennadiy Rozental wrote:
IOW this library promotes unsafe programming right? Don't be that hard ;-) Let's say that intrusive containers are interesting for the following purposes:
-> Embedded systems.
It's too generic. So are you saying std::list is not useful for embedded systems?
I've used std::list in embedded systems. I'm only saying that intrusive containers are more embedded-friendly because they are more space efficient than non-intrusive containers. A class whose sizeof is 12 bytes that must be inserted in two lists at the same time would be implemented using two std::list<Object*> plus dynamic allocation, which means (on a typical 32-bit system):
12 bytes + bookkeeping memory from the heap for each object
12 bytes per std::list node (two pointers to form the list plus sizeof(Object*)) + bookkeeping memory from the heap
This means that you have 36 bytes per object + 3*bookkeeping memory from the heap. With an intrusive list your object needs 12 bytes + 16 bytes for the hooks: 28 bytes + bookkeeping from the heap. If bookkeeping is 8 bytes (a typical figure), you have 60 bytes vs. 36 bytes in size. If you have a million objects, that's a big difference. An intrusive container will always be more space efficient than the non-intrusive approach, especially for small objects. I know that you can use memory pools to minimize the size overhead, but you will never beat the intrusive approach.
-> High performance algorithms.
How do intrusive containers enhance performance?
Intrusive containers save memory allocations. That's a big advantage. For example:

   std::vector<MyObject> objects(NUM_OBJECTS);
   std::list<MyObject*> list;
   for(int i = 0; i < NUM_OBJECTS; ++i){
      list.push_back(&objects[i]);
   }

needs NUM_OBJECTS + 1 calls to the heap. The same example with intrusive containers:

   std::vector<MyIntrusiveObject> objects(NUM_OBJECTS);
   boost::intrusive::ilist<MyIntrusiveObject> list(objects.begin(), objects.end());

needs just 1 call to the heap. That's a big difference. Apart from that, iterating over std::list<Object*> needs two memory accesses to reach the object; ilist<Object> needs just one. And you can insert an object in several containers at the same time, without any dynamic memory allocation.
-> Low-level memory algorithms, node pools, and similar stuff. -> To build more complex containers, avoiding memory allocations.
What are the alternative existing solutions, and how does your library compare with them?
Insertions in some intrusive containers (like lists) have a no-throw guarantee, so inserting into an intrusive list can never fail, and that's an important feature when writing low-level algorithms. std::list<Object*> can throw because the allocation can throw.
I agree that Intrusive is not a high-level library compared to other Boost libraries, but I really think it will be very useful for building other Boost libraries. I would say that Intrusive wants to make some currently unsafe/low-level practices a bit safer without losing performance.
How much safer? How much does it add over a handwritten 10-line intrusive container?
Write an intrusive red-black tree. I'm pretty sure you will need more than 10 lines. I agree that I should add numbers to the library documentation, but I think that the advantages of intrusive containers are well known. For example, operating system kernels use intrusive containers internally (Linux is an example) because they are more efficient than their non-intrusive counterparts. Just like intrusive_ptr is more efficient than shared_ptr: it avoids an extra allocation for the reference count, and the count is stored inside the object, avoiding extra memory accesses and cache misses. Best, Ion

Ion Gaztañaga wrote:
I agree that I should add numbers to the library documentation, but I think that the advantages of intrusive containers are well known. For example, operating system kernels use intrusive containers internally (Linux is an example) because they are more efficient than their non-intrusive counterparts. Just like intrusive_ptr is more efficient than shared_ptr: it avoids an extra allocation for the reference count, and the count is stored inside the object, avoiding extra memory accesses and cache misses.
While I would generally agree with the performance and size advantages of intrusive containers you mentioned, efficient cache usage will not always be a strength but sometimes a weakness of that approach. Just like with intrusive reference counting, you have the disadvantage of cache pollution and potential false sharing across threads, with the maintenance data being part of the object. It depends heavily on the use case, but in your example of one set of objects being stored in two lists, the result may be almost three times the necessary memory bandwidth/cache load when iterating over just one list looking for a certain object by address (the intrusive approach's worst case). It seems to be difficult to describe the advantages of intrusive containers in a few words without raising objections, which is probably because non-intrusive containers (together with parameterizable allocators) are already sufficiently fast in most cases. I'd still like to see your library become part of Boost. Regards Timmo Stange

Ion, these concrete examples you brought up are very compelling. I think you should include them in the documentation. It is helpful to me, at least, to have an example analysis in front of me, so I can review my own situation against it. -- Tack On Mar 16, 2007, at 4:54 PM, Ion Gaztañaga wrote:
<snip>

Andres J. Tack wrote:
Ion, these concrete examples you brought up are very compelling. I think you should include them in the documentation. It is helpful to me, at least, to have an example analysis in front of me, so I can review my own situation against it.
Thanks for your comment, I will add them.
-- Tack
Regards, Ion

Ion Gaztañaga wrote:
Andres J. Tack wrote:
Ion, these concrete examples you brought up are very compelling. I think you should include them in the documentation. It is helpful to me, at least, to have an example analysis in front of me, so I can review my own situation against it.
Thanks for your comment, I will add them.
And please add them as source code (projects), not only in the document. Copying source code from an HTML document to try it is not a good exercise. While reading the documentation of a new library, I always try to compile and run the samples.

gchen wrote:
And please add them as source code (projects), not only in the document. Copying source code from an HTML document to try it is not a good exercise.
While reading the documentation of a new library, I always try to compile and run the samples.
Most of the code that appears in the documentation is in the boost/libs/intrusive/doc/code directory. These examples will be there, but I'll put a link in the doc pointing to the source code. Regards, Ion

Ion Gaztañaga wrote:
gchen wrote:
And please add them as source code (projects), not only in the document. Copying source code from an HTML document to try it is not a good exercise.
While reading the documentation of a new library, I always try to compile and run the samples.
Most of the code that appears in the documentation is in the boost/libs/intrusive/doc/code directory. These examples will be there,
Oh my, sorry, I didn't notice that directory. Maybe it would be better to put them in boost/libs/intrusive/example, like other Boost libraries?
but I'll put a link in the doc pointing to the source code.
Good.

Just a comment: intrusive containers are in the C++0x wishlist, revision 6: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2034.htm Search for "intrusive containers". The request was based on Olaf Krzikalla's old code. I don't know how many C++ programmers are really interested in intrusive containers. Regards, Ion

Hi, Tom Brinkman wrote: <comments about the introduction> I felt exactly the same after reading the introduction. Then I figured it might be about "one-shot allocation", and reading the rest of this thread it seems that assumption was correct. Anyway, I still don't understand why we can't have intrusive containers without intrusive syntax. E.g.:

struct slist_node // <-- part of container implementation
{
   T element; // note: held by value
   slist_node * next;
};

Then there's only one allocation per 'slist_node'. With compiler options that enable structure padding (up to 2^n, for alignment) it will be bad, but usually it can be disabled for individual structures (in this case one would have to disable alignment/padding for whatever T is). So, why does the user have to add custom data members? Regards, Tobias

Tobias Schwinger wrote:
Anyway, I still don't understand why we can't have intrusive containers without intrusive syntax. E.g:
struct slist_node // <-- part of container implementation
{
   T element; // note: held by value
   slist_node * next;
};
Then there's only one allocation per 'slist_node'.
What's the difference between that and the standard (non-intrusive) containers? Regards Timmo Stange

Timmo Stange wrote:
Tobias Schwinger wrote:
Anyway, I still don't understand why we can't have intrusive containers without intrusive syntax. E.g:
struct slist_node // <-- part of container implementation
{
   T element; // note: held by value
   slist_node * next;
};
Then there's only one allocation per 'slist_node'.
What's the difference between that and the standard (non-intrusive) containers?
The same as intrusive containers: Allocation in one shot and having both the actual data and the accounting information in one block of memory. I might be missing something. Am I? Granted, allocators will need some special care this way, but a better interface would be worth it. Regards, Tobias

Tobias Schwinger wrote:
What's the difference between that and the standard (non-intrusive) containers?
The same as intrusive containers: Allocation in one shot and having both the actual data and the accounting information in one block of memory.
I might be missing something. Am I?
I am not sure. Perhaps I'm the one missing something in your idea, but what you suggested is how list nodes may already look in a standard library implementation. They use "single-shot allocation" when T is copy-constructible. If it is not, one needs to use list<T*> and allocate Ts separately. I do not see how that can be avoided when the node type is hidden in the implementation.
Granted, allocators will need some special care this way, but a better interface would be worth it.
Hm, do you mean allocator<T>::rebind<slist_node<T>>? It is already there for exactly that purpose (rebinding from T to a container node type). Regards Timmo Stange

Timmo Stange wrote:
Tobias Schwinger wrote:
What's the difference between that and the standard (non-intrusive) containers?
The same as intrusive containers: Allocation in one shot and having both the actual data and the accounting information in one block of memory.
I might be missing something. Am I?
Yes.
I am not sure. Perhaps I'm the one missing something in your idea, but what you suggested is how list nodes may already look in a standard library implementation. They use "single-shot allocation"
You're right - I should've taken a look into the source before making noise :).
when T is copy-constructible. If it is not, one needs to use list<T*> and allocate Ts separately. I do not see how that can be avoided when the node type is hidden in the implementation.
Cool, now I understand the purpose of this submission. I vote for a less confusing intro. Thanks, Tobias

Hi, Tom Brinkman wrote:
Get the reader excited about why boost::intrusive is "cool".
You know Alan Murta's GPC? It is fast. Really fast. It's well-written C. But with the boost::intrusive toolkit you can easily remake it in C++. Perhaps even faster (with fewer memory allocations). And with a lot fewer lines of code. Very well then: readable C++ becomes faster than C. Cool ;-)
Seriously spoken, there are a lot of complex (geometric) algorithms operating over multiple sequences of the same set of objects. And of course all these algorithms need to be fast. If you have 10,000, 100,000 or even more objects, the memory allocation done by std::list<T*> is not suitable any more. And if there are a lot of insertions and erasures amidst the sequence, std::vector<T*> is not suitable either. In this case you should be able to go one step down and use the fundamental container algorithms more directly. boost::intrusive gives you that option.
There is another container (boost::multi_index) which can manage multiple sequences over one set of objects. It is safer to use, but IMHO it's less flexible than boost::intrusive (AFAIK each object in a multi_index_container needs to be stored in every sequence of that container). Actually I think that boost::multi_index easily could (and maybe should) be built on top of boost::intrusive.
Regarding the unsafe programming thingy: I'm unsure about the term. What does "unsafe programming" mean? You have to keep in mind the lifetime of the objects stored in an intrusive container, just like you have to keep in mind the lifetime of the objects stored in a std::list<T*>. However, I admit that 'intrusive containers' is perhaps a wrong and thus misleading name; in turn this would explain the "unsafe programming" feeling. Maybe they would be better labeled as something like 'intrusive sequences'. But I wouldn't change it now. Best regards Olaf Krzikalla
participants (13)
- Andres J. Tack
- Daniel James
- gchen
- Gennadiy Rozental
- Howard Hinnant
- Ion Gaztañaga
- Mathias Gaunard
- Michael Fawcett
- Olaf Krzikalla
- Paul Rose
- Timmo Stange
- Tobias Schwinger
- Tom Brinkman