[intrusive] Intrusive containers library strikes back

Hi to all, Olaf Krzikalla's Intrusive container library was once in the review queue, but sadly the he has no time to develop it further. The library is, IMHO, very useful for many applications and it's well designed. I was interested in continuing Olaf's work because intrusive containers would be very helpful for Boost.Interprocess users so I've contacted Olaf to ask permission to continue his library. He agrees and he has offered help so I'm planning to start some work on the library in a few weeks. Olaf has collected some user requests from the past and the library was developed a bit further than the last snapshot posted in his web-page (http://people.freenet.de/turtle++/intrusive.html). I don't know if someone else was developing the library and I wouldn't like to duplicate any effort. So if someone is actively working on the library, please reply this post so that we can collaborate . I'm currently planning to port the documentation to Quickbook and to post Olaf's last version to start requesting comments and new features. Once the current code is posted, I have some features that I would like add: -> Custom pointer types to be able to use intrusive containers in memory mapped regions. -> Adding a new iset class (currently there is only an intrusive multiset, but not a set). -> Adding configurable constant-time size() to containers (currently ilist/islist have linear time size()) -> Adding a pseudo-intrusive hash container. If someone needs/wants more features, now's the time to request them! Regards, Ion

On Wed, 13 Sep 2006 04:12:03 -0300, Ion Gaztañaga <igaztanaga@gmail.com> wrote:
If someone needs/wants more features, now's the time to request them!
I think intrusive containers would be useful. There was a separate initiative that added intrusiveness to the multi index library. I think it would be better to fold the many STL remakes that are popping up (pointer containers, intrusive containers/shared mem containers) into multi index than repeat the effort everywhere. Case in point: what do I do if I need a pointer-sharedmem-bimap? Bruno

On 9/17/06, Bruno Martínez <br1@internet.com.uy> wrote:
I think intrusive containers would be useful. There was a separate initiative that added intrusiveness to the multi index library. I think it would be better to fold the many STL remakes that are popping up (pointer containers, intrusive containers/shared mem containers) into multi index than repeat the effort everywhere. Case in point: what do I do if I need a pointer-sharedmem-bimap?
Hi, I'm working on an intrusive dynamic multi-index container. I've got some code working which allows me to add sequential views at runtime. I wondered as well if it was possible to share code for common operations in these containers. AFAIK pointer containers are just adaptors over standard containers. On the subject of the intrusive containers by Olaf. I had some email conversation with him in january and gave him my opinion and offered to help, however as it turned out I did not have enough time. I suggest bringing the containers into the intrusive namespace and drop the i prefix. Kevin

----- Mensaje original ----- De: Kevin Sopp <baraclese@googlemail.com> Fecha: Domingo, Septiembre 17, 2006 10:50 pm Asunto: Re: [boost] [intrusive] Intrusive containers library strikes back Para: boost@lists.boost.org
Hi, I'm working on an intrusive dynamic multi-index container. I've got some code working which allows me to add sequential views at runtime.
Hello Kevin, I am *very* interested in seeing your work. Is that possible? Are you building on Boost.MultiIndex, or is your material written from scratch? [...]
Kevin
Joaquín M López Muñoz Telefónica, Investigación y Desarrollo

On 9/17/06, "JOAQUIN LOPEZ MU?Z" <joaquin@tid.es> wrote:
Hello Kevin, I am *very* interested in seeing your work. Is that possible? Are you building on Boost.MultiIndex, or is your material written from scratch?
I knew you'd answer ;) It is written from scratch. Basically the code isn't anywhere ready to be publically consumed. I still have to figure out some tough things. But the gist is this: - Derive your type from super_node. // can be done as a member too, but is not implemented - super_node manages access to a vector of sub_nodes. - sub_nodes are linked list nodes/red-black tree nodes, etc. - super_node holds these as a vector<void*>, it has no type information - sub_nodes are looked up via a minimal perfect hash function (MPHF) - minimal perfect hash function is generated in linear time! (BMZ algorithm, also found in cmph on sourceforge) this is a C++ implementation of the BMZ paper - default key type is typeid(index_type), user can provide other key type though Complexity is added because we can create subviews on data, unlike your multi_index container the number of sub_nodes per super_node does not need to be the same. For example one can have a sequenced<> index with a unary predicate to test the value to be inserted for some condition. - I have to manage the MPHFs for all used index combinations. That means super_node needs a pointer to the index_combination it belongs to. sequenced_index derives from a base class with some pure virtual functions that provide the bare necessities for a container. Mainly insert(super_node*) and erase(super_node*). Then insertion becomes: dynamic_multi_index::insert(value_type& v) { for each index index->insert(v.get_super_node()); } and sequenced_index::insert(value_type& v) { combination = v.get_super_node()->get_index_combination(); for each index in combination index->insert(v.get_super_node()); } I have decided against the use of a predicate template parameter in sequenced_index because it is not sufficient enough for more complex uses. For example: if (predicate(value)) insert into sequenced_index A else insert into sequenced_index B I was thinking about using some kind of inserter object that the user can compose for this purpose. I am interested in putting this up on sourceforge as soon as I feel that the code has advanced sufficiently. Kevin

Kevin Sopp ha escrito:
On 9/17/06, "JOAQUIN LOPEZ MU?Z" <joaquin@tid.es> wrote:
Hello Kevin, I am *very* interested in seeing your work. Is that possible? Are you building on Boost.MultiIndex, or is your material written from scratch?
I knew you'd answer ;)
:)
It is written from scratch. Basically the code isn't anywhere ready to be publically consumed. I still have to figure out some tough things. But the gist is this: - Derive your type from super_node. // can be done as a member too, but is not implemented - super_node manages access to a vector of sub_nodes. - sub_nodes are linked list nodes/red-black tree nodes, etc. - super_node holds these as a vector<void*>, it has no type information - sub_nodes are looked up via a minimal perfect hash function (MPHF)
As it happens, I did in the past a little sketch of how a dynamic multi_index_container (no intrusiveness) could be implemented, and basically all of the above coincides with my stuff, except one thing: what is this MPHF thing and what do you use it for?
- minimal perfect hash function is generated in linear time! (BMZ algorithm, also found in cmph on sourceforge) this is a C++ implementation of the BMZ paper - default key type is typeid(index_type), user can provide other key type though
Complexity is added because we can create subviews on data, unlike your multi_index container the number of sub_nodes per super_node does not need to be the same.
Very interesting possibility. [...]
I am interested in putting this up on sourceforge as soon as I feel that the code has advanced sufficiently.
All of this looks sexy. Looking forward to seeing your stuff soon.
Kevin
Joaquín M López Muñoz Telefónica, Investigación y Desarrollo

On 9/18/06, Joaquín Mª López Muñoz <joaquin@tid.es> wrote:
As it happens, I did in the past a little sketch of how a dynamic multi_index_container (no intrusiveness) could be implemented, and basically all of the above coincides with my stuff, except one thing: what is this MPHF thing and what do you use it for?
It is used to lookup the appropriate sub_node in super_node in constant time since the offset isn't always the same. I use a minimal perfect hash function so that the sub_node vector doesn't waste any space. Its size always equals the number of contained sub_nodes. Kevin

Kevin Sopp wrote:
Hi, I'm working on an intrusive dynamic multi-index container. I've got some code working which allows me to add sequential views at runtime. I wondered as well if it was possible to share code for common operations in these containers. AFAIK pointer containers are just adaptors over standard containers.
Sharing core operations is definitely a good idea. But that will also set dependencies that you might want to avoid. For example one could imagine these dependencies(-> means depends on): multi_index -> intrusive_multi_index -> intrusive ^ | interprocess_containers ---- I don't know if Joaquín would want to add more dependencies to multi_index. Maybe this also limits the Joaquin's development rate. Anyway, the basic features I would like to have in this core operations are: -> list (doubly linked), rb tree, slist(singly linked) algorithms. -> Support for non-raw pointers. Based on these algorithms, it would nice to have these basic intrusive classes: -> list (with optional constant-time size) -> slist (with optional constant-time size) -> set/rbtree (with optional constant-time size) -> multiset/rbtree -> unordered_set -> unordered_multiset Olaf's code has also options implemented as policies: -> Node destructor might automatic unlink the node from the container -> Container destructor might automatic unlink all the nodes Olaf's offers also 3 type of accessors to make a class intrusive-capable: -> Derivation from accessor -> Member accessor -> Legacy accessor (for ilist) Member accesor has a performance penalty and I don't know if people would like to use it. Perhaps we should outline a layered approach to see which features each library needs a have common code library. Regards, Ion

Ion Gaztañaga ha escrito:
Kevin Sopp wrote:
Hi, I'm working on an intrusive dynamic multi-index container. I've got some code working which allows me to add sequential views at runtime. I wondered as well if it was possible to share code for common operations in these containers. AFAIK pointer containers are just adaptors over standard containers.
Sharing core operations is definitely a good idea. But that will also set dependencies that you might want to avoid. For example one could imagine these dependencies(-> means depends on):
multi_index -> intrusive_multi_index -> intrusive ^ | interprocess_containers ----
I don't know if Joaquín would want to add more dependencies to multi_index. Maybe this also limits the Joaquin's development rate.
Maxim Yegorushkin's proposal for an intrusive multi_index_container (see at http://tinyurl.com/e5uac ) does not induce a new dependency on regular multi_index; instead, it is more of a side-by-side add-on: multi_index ------------| ---> B.MI core intrusive multi_index --| Another surprising feature of Maxim's proposal is its very small size, he manages to inject intrusiveness into B.MI by merely changing a couple of files and adding a handful of new ones! My intention is to wait for Olafs's (or now Ion's) intrusive library to start being discussed and then perhaps convince Maxim to adopt analogous design decisions for his proposal. I hope an intrusive multi_index_container can eventually make it into B.MI some day. Joaquín M López Muñoz Telefónica, Investigación y Desarrollo

Joaquín Mª López Muñoz wrote:
Ion Gaztañaga ha escrito:
Kevin Sopp wrote:
Hi, I'm working on an intrusive dynamic multi-index container. I've got some code working which allows me to add sequential views at runtime. I wondered as well if it was possible to share code for common operations in these containers. AFAIK pointer containers are just adaptors over standard containers. Sharing core operations is definitely a good idea. But that will also set dependencies that you might want to avoid. For example one could imagine these dependencies(-> means depends on):
multi_index -> intrusive_multi_index -> intrusive ^ | interprocess_containers ----
I don't know if Joaquín would want to add more dependencies to multi_index. Maybe this also limits the Joaquin's development rate.
Maxim Yegorushkin's proposal for an intrusive multi_index_container (see at http://tinyurl.com/e5uac ) does not induce a new dependency on regular multi_index; instead, it is more of a side-by-side add-on:
multi_index ------------| ---> B.MI core intrusive multi_index --|
Another surprising feature of Maxim's proposal is its very small size, he manages to inject intrusiveness into B.MI by merely changing a couple of files and adding a handful of new ones! My intention is to wait for Olafs's (or now Ion's) intrusive library to start being discussed and then perhaps convince Maxim to adopt analogous design decisions for his proposal. I hope an intrusive multi_index_container can eventually make it into B.MI some day.
Sorry for being a bit too late. Now, having had a little more practice with intrusive containers, the following design might me more appropriate: intrusive -> multi_index -> B.MI core It appears that for most if not all node-based containers the non-intrusive version can be built on top of the intrusive one. The non-intrusive version just takes care of node memory management, what users of the intrusive version prefer do themselves. insert() allocates the node and pass it down to intrusive::insert, erase() invokes intrusive::erase() then deallocates the erased nodes. Does it make sense?

Hi Maxim, ----- Mensaje original ----- De: Maxim Yegorushkin <maxim.yegorushkin@gmail.com> Fecha: Sábado, Septiembre 23, 2006 0:44 am Asunto: Re: [boost] [intrusive] Intrusive containers library strikes back Para: boost@lists.boost.org
Joaquín Mª López Muñoz wrote: [...]
Maxim Yegorushkin's proposal for an intrusive multi_index_container (see at http://tinyurl.com/e5uac ) does not induce a new dependency on regular multi_index; instead, it is more of a side-by-side add-on:
multi_index ------------| ---> B.MI core intrusive multi_index --|
Another surprising feature of Maxim's proposal is its very small size, he manages to inject intrusiveness into B.MI by merely changing a couple of files and adding a handful of new ones! My intention is to wait for Olafs's (or now Ion's) intrusive library to start being discussed and then perhaps convince Maxim to adopt analogous design decisions for his proposal. I hope an intrusive multi_index_container can eventually make it into B.MI some day.
Sorry for being a bit too late. Now, having had a little more practice with intrusive containers, the following design might me more appropriate:
intrusive -> multi_index -> B.MI core
Umm? From your comments below, I guess you mean multi_index -> intrusive -> B.MI core ?
It appears that for most if not all node-based containers the non- intrusive version can be built on top of the intrusive one. The non-intrusive version just takes care of node memory management, what users of the intrusive version prefer do themselves. insert() allocates the node and pass it down to intrusive::insert, erase() invokes intrusive::erase() then deallocates the erased nodes. Does it make sense?
IMHO it does. But the approach you took in your intrusive_multi_index sketch seems to me equally satisfactory. Basically, node creation and destroying is very localized within B.MI, and, more importantly, is completely dettached from the implementation of the indices themseleves, so it is easy to provide different implementations for his aspect resulting in traditional and intrusive versions of the container. Maybe you're seeing a defficiency in your later approach that I don't see? Joaquín M López Muñoz Telefónica, Investigación y Desarrollo

JOAQUIN LOPEZ MU?Z wrote:
Maxim Yegorushkin's proposal for an intrusive multi_index_container (see at http://tinyurl.com/e5uac ) does not induce a new dependency on regular multi_index; instead, it is more of a side-by-side add-on: multi_index ------------| ---> B.MI core intrusive multi_index --|
Another surprising feature of Maxim's proposal is its very small size, he manages to inject intrusiveness into B.MI by merely changing a couple of files and adding a handful of new ones! My intention is to wait for Olafs's (or now Ion's) intrusive library to start being discussed and then perhaps convince Maxim to adopt analogous design decisions for his proposal. I hope an intrusive multi_index_container can eventually make it into B.MI some day. Sorry for being a bit too late. Now, having had a little more
Joaquín Mª López Muñoz wrote: [...] practice with intrusive containers, the following design might me more appropriate:
intrusive -> multi_index -> B.MI core
Umm? From your comments below, I guess you mean
multi_index -> intrusive -> B.MI core
I probably misnamed the concepts in the scheme.
It appears that for most if not all node-based containers the non- intrusive version can be built on top of the intrusive one. The non-intrusive version just takes care of node memory management, what users of the intrusive version prefer do themselves. insert() allocates the node and pass it down to intrusive::insert, erase() invokes intrusive::erase() then deallocates the erased nodes. Does it make sense?
IMHO it does. But the approach you took in your intrusive_multi_index sketch seems to me equally satisfactory. Basically, node creation and destroying is very localized within B.MI, and, more importantly, is completely dettached from the implementation of the indices themseleves, so it is easy to provide different implementations for his aspect resulting in traditional and intrusive versions of the container. Maybe you're seeing a defficiency in your later approach that I don't see?
I agree that it was satisfactory, just trying to get rid of false-perfectionism habit ;) Anyway, it's nice to see that the new intrusive container initiative is gaining momentum.

Bruno Martínez wrote:
On Wed, 13 Sep 2006 04:12:03 -0300, Ion Gaztañaga <igaztanaga@gmail.com> wrote:
If someone needs/wants more features, now's the time to request them!
I think intrusive containers would be useful. There was a separate initiative that added intrusiveness to the multi index library. I think it would be better to fold the many STL remakes that are popping up (pointer containers, intrusive containers/shared mem containers) into multi index than repeat the effort everywhere. Case in point: what do I do if I need a pointer-sharedmem-bimap?
Good point. But the problem is that users also want simple STL-like classes and they don't want to start using a class that offers a lot of additional features. That multi_index class might be also more expensive so using a multi_index as a simple single-index class can be a bad idea. If a user just wants an intrusive linked list class, I suppose that he/she expects sizeof(ilist) == 2*sizeof(void *) (plus an optional size member if he/she wants constant-time size() member). A list user might also want a non-constant size() member but a constant time splice(), etc... Another problem is the coordination between libraries and authors. Having a single library with a lot functionalities can be very difficult to maintain and it would require coordination between several authors. If you want to have shared memory multi_index you need to change multi_index implementation to accept a templatized pointer type. That can be really a big task. It's not something a wrapper can solve. Maybe ptr container can also offer wrapper over user-provided classes apart from using STL containers, so that we can have a ptr_vector based on a shared memory vector. Again, it would require support for non-raw pointers. But it would be nice to have some *core* algorithms somewhere (surely intrusive, multi_index, and intrusive_multi_index have their own version). Other Boost have also those algorithms as internal files. For example, boost/detail has rb tree algorithm implementation. Regards, Ion

Ion Gaztañaga ha escrito: [...]
If you want to have shared memory multi_index you need to change multi_index implementation to accept a templatized pointer type. That can be really a big task. It's not something a wrapper can solve.
This is something I've really wanted for someone bold to appear and do the job. As you say, making B.MI shmem-compatible is not trivial, though it shoudln't be immensely difficult. If noone finally shows I guess I'll have to do it myself sometime in early 2007. Joaquín M López Muñoz Telefónica, Investigación y Desarrollo
participants (6)
-
"JOAQUIN LOPEZ MU?Z"
-
Bruno Martínez
-
Ion Gaztañaga
-
Joaquín Mª López Muñoz
-
Kevin Sopp
-
Maxim Yegorushkin