[multi_index] container for groups?

Janek Kozicki

17 Jul 2007 17 Jul '07

1:45 p.m.

Hello, I just wanted to check grounds before I get to implementing this (don't know when ;). I want a container that allows to assign several groups (sth. like std::set<int> ) to id numbers (int). Let's say that I have following id numbers: 1,3,5,7. And I assign that 1 belongs to groups 1,2,3. Then I say that 2 belongs to groups 2,3,4. And 5 belongs to groups 2,4 and finaly 7 belongs to groups 1,4. So it would look like: id: groups 1 : 1,2,3 3 : 2,3,4 5 : 2,4 7 : 1,4 Using this container I want to iterate in all possible ways: - iterate over all ids from start to end - answer: 1,3,5,7 - iterate over all ids which belong to group 3 - answer: 1,3 - iterate over all groups from start to end - answer: 1,2,3,4 - iterate over all groups in which id 5 appears - answer: 2,4 I guess that this should be easy with multi_index, since that's what for they were designed :-) Also it would be nice to be able to give names to *some* groups. I expect to have about 10000 groups in a program (iterating over all id numbers from group 787 must be really quick!), and the user might want to name few groups. This can be slow, since it's only for GUI and user interaction. But I don't want to store a string with each group even if it's empty. With 10000 groups that would waste too much memory. I prefer to only store strings for groups which indeed have names, like that: group 1 : Walls group 2 : Grains other groups unnamed. can you give me some hints about how to approach that? -- Janek Kozicki |

Show replies by date

Matias Capeletto

17 Jul 17 Jul

2:03 p.m.

On 7/17/07, Janek Kozicki <janek_listy@wp.pl> wrote:

...

Hello,

I just wanted to check grounds before I get to implementing this (don't know when ;). I want a container that allows to assign several groups (sth. like std::set<int> ) to id numbers (int). Let's say that I have following id numbers: 1,3,5,7. And I assign that 1 belongs to groups 1,2,3. Then I say that 2 belongs to groups 2,3,4. And 5 belongs to groups 2,4 and finaly 7 belongs to groups 1,4. So it would look like:

id: groups 1 : 1,2,3 3 : 2,3,4 5 : 2,4 7 : 1,4

Using this container I want to iterate in all possible ways:

- iterate over all ids from start to end - answer: 1,3,5,7 - iterate over all ids which belong to group 3 - answer: 1,3 - iterate over all groups from start to end - answer: 1,2,3,4 - iterate over all groups in which id 5 appears - answer: 2,4

I guess that this should be easy with multi_index, since that's what for they were designed :-)

Yep, it is easy with MultiIndex. But this is exactly the kind of things Boost.Bimap was designed on top of Boost.MultiIndex. http://www.drivehq.com/web/matias.capeletto/bimap/doc/html/index.html In your case: struct id {}; struct group {}; typedef bimap< tagged< int, id >, multiset_of< tagged< int, group > >

...

bm_type;

... bm_type bm; assing::insert( bm ) ( 1, 1 ) ( 2, 1 ) ( 3, 1 ) ( 4, 2 ) ( 5, 2 ) ( 6, 3 ) ; // Ordered by id... BOOST_FOREACH( bm_type::map_by<id>::reference i, bm.by<id>() ) { std::cout << i->get<id>() << " belongs to " << i->get<group>() << endl; } // Ordered by group... BOOST_FOREACH( bm_type::map_by<group>::reference i, bm.by<group>() ) { std::cout << i->get<id>() << " belongs to " << i->get<group>() << endl; }

...

Also it would be nice to be able to give names to *some* groups. I expect to have about 10000 groups in a program (iterating over all id numbers from group 787 must be really quick!), and the user might want to name few groups. This can be slow, since it's only for GUI and user interaction. But I don't want to store a string with each group even if it's empty. With 10000 groups that would waste too much memory. I prefer to only store strings for groups which indeed have names, like that:

group 1 : Walls group 2 : Grains

other groups unnamed.

This seems like a good place for another bimap.

...

can you give me some hints about how to approach that?

Hope it helps. King regards Matias

Janek Kozicki

2:20 p.m.

Matias Capeletto said: (by the date of Tue, 17 Jul 2007 11:03:23 -0300)

...

On 7/17/07, Janek Kozicki <janek_listy@wp.pl> wrote:

...

Yep, it is easy with MultiIndex. But this is exactly the kind of things Boost.Bimap was designed on top of Boost.MultiIndex.

http://www.drivehq.com/web/matias.capeletto/bimap/doc/html/index.html

Great, thanks! I knew it was better to ask first, than start reading wrong docs (for multi_index) ;-)) -- Janek Kozicki |

Michael Goldshteyn

3:15 p.m.

"Matias Capeletto" <matias.capeletto@gmail.com> wrote in message news:e9b043a10707170703o394fee14s97cdecac86c46395@mail.gmail.com...

...

Yep, it is easy with MultiIndex. But this is exactly the kind of things Boost.Bimap was designed on top of Boost.MultiIndex.

http://www.drivehq.com/web/matias.capeletto/bimap/doc/html/index.html

Is Boost.Bitmap slated to become part of Boost any time soon? I don't see it as part of: http://www.boost.org/libs/libraries.htm Thanks, Mike

Matias Capeletto

3:32 p.m.

On 7/17/07, Michael Goldshteyn <mgoldshteyn@comcast.net> wrote:

...

"Matias Capeletto" <matias.capeletto@gmail.com> wrote in message news:e9b043a10707170703o394fee14s97cdecac86c46395@mail.gmail.com...

...
Yep, it is easy with MultiIndex. But this is exactly the kind of things Boost.Bimap was designed on top of Boost.MultiIndex.

http://www.drivehq.com/web/matias.capeletto/bimap/doc/html/index.html

Is Boost.Bitmap slated to become part of Boost any time soon? I don't see it as part of: http://www.boost.org/libs/libraries.htm

Hopefully in Boost 1.35. Thanks for the interest. Best regards Matias

Janek Kozicki

5:19 p.m.

Matias Capeletto said: (by the date of Tue, 17 Jul 2007 12:32:27 -0300)

...

On 7/17/07, Michael Goldshteyn <mgoldshteyn@comcast.net> wrote:

...
"Matias Capeletto" <matias.capeletto@gmail.com> wrote in message news:e9b043a10707170703o394fee14s97cdecac86c46395@mail.gmail.com...

...
Yep, it is easy with MultiIndex. But this is exactly the kind of things Boost.Bimap was designed on top of Boost.MultiIndex.

http://www.drivehq.com/web/matias.capeletto/bimap/doc/html/index.html

Is Boost.Bitmap slated to become part of Boost any time soon? I don't see it as part of: http://www.boost.org/libs/libraries.htm

Hopefully in Boost 1.35. Thanks for the interest.

was it already reviewed, or am I going to write a review? :-) -- Janek Kozicki |

Matias Capeletto

5:28 p.m.

On 7/17/07, Janek Kozicki <janek_listy@wp.pl> wrote:

...

Matias Capeletto said: (by the date of Tue, 17 Jul 2007 12:32:27 -0300)

...
On 7/17/07, Michael Goldshteyn <mgoldshteyn@comcast.net> wrote:

...
"Matias Capeletto" <matias.capeletto@gmail.com> wrote in message news:e9b043a10707170703o394fee14s97cdecac86c46395@mail.gmail.com...

...
Yep, it is easy with MultiIndex. But this is exactly the kind of things Boost.Bimap was designed on top of Boost.MultiIndex.

http://www.drivehq.com/web/matias.capeletto/bimap/doc/html/index.html

Is Boost.Bitmap slated to become part of Boost any time soon? I don't see it as part of: http://www.boost.org/libs/libraries.htm

Hopefully in Boost 1.35. Thanks for the interest.

was it already reviewed, or am I going to write a review? :-)

It was accepted in march :) But your feedback will be great! Really... Best regards Matias

Janek Kozicki

18 Jul 18 Jul

4 p.m.

New subject: [multi_index] a more complicated container

Matias Capeletto said: (by the date of Tue, 17 Jul 2007 14:28:28 -0300)

...

It was accepted in march :) But your feedback will be great! Really...

Hi Matias, thanks for your faith in... how good my comments might be :) Currently I'm not going to go straight to the C++ coding (aka. testing Bimap "live"). I am preparing grounds for huge refactoring. And I'm looking at how to do it the best way. My codebase of 15k lines already works quite good. Now in fact I want to reorganize everything inside so it will be clean again. Because currently more and more hacks appear to get something done. I plan to stop using my own plugin-loading and serialization library and switch to boost::extension and ::serialization. Propose my own Multimethods lirbary for boost inclusion (after boostifying it ;). And stop using my own containers, but use boost instead. So... in fact I have another question about containers. But I'm afraid that bimap is not an answer here... So I keep [multi_index] in the subject line :-) Imagine that in my scientific simulation software I have bodies that can interact with each other. Any number of bodies can interact. As you already know, bodies are idenfied by id numbers. But interactions are identified by std::set<int> - a group of bodies that are interacting. So I have a class Interaction and it's indexed by sets of numbers like: identifier interaction 1,2,3 class Interaction between boedies 1,2,3 2,3 Interaction 3,4 Interaction 2,4,6 some other Interaction This is 'groups' in fact (the one with which I started this thread). To make stuff easier to manage, I'll use negative numbers for group_id and positive for body_id. So basically I could rewrite above with: identifier interaction -1 class Interaction between boedies 1,2,3 -2 Interaction -3 Interaction -4 some other Interaction What I need here is a *cooperation* between containers (because information about group components is stored in other container). So with the Interaction container above I need to iterate FAST on following things: - all interactions from first to last (easy) - all bodies that take part in interaction -3, answer is: 3,4 - all interactions in which body 3 is involved, answer is: -1,-2,-3 This is convoluted :) And even worse... becasue I need interactions to have groups too! We can call it a group of groups, like this: identifier group interaction -1 -5 class Interaction between boedies 1,2,3 -2 -5 Interaction -3 -6 Interaction -4 -6 some other Interaction You can see that group -5 has two components: -1,-2, and group -6 has -3 and -4. now I need to - iterate over all interactions from group -5, answer is: -1 -2 - iterate over all bodies from group -5, answer is: 1,2,3 - all interactions involving body 3 that are in group -5, answer: -1 -2 - all interactions in which group of bodies 2 and 3 interact, answer: -1 -2 (may be slower) I'm stil not sure if that's all i need. Still working on it ;-) The hardest(??) part is that I will need separate containers (cannot make one huge container for everything). So imagine, I have a class Model, that holds the context of my computations. To simplify a little: I have two classes that describe a single body. Those are BodyState (position, velocity) and BodyShape (vertex positions). They reside in a std::map. So, I say, that body_states[5] and body_shapes[5] describe the same body with body_id=5. And If I need to know to which groups it belongs I look in groups[5]. To see its name (a body could have name too) I see into names[5]. Like this: typedef int id; typedef id body_id; typedef id group_id; class Model { bimap< tagged< int, body_id >, multiset_of< tagged< int, group_id > > > groups; // a bimap between positive body_id <-> negative group_id bimap<id,std::string> names; std::map<body_id , BodyState*> body_states; std::map<body_id , BodyShape*> body_shapes; /* Now I want to throw interactions into the basket. They can (I'm still not sure) use the 'groups' container to index the content. Easiest is to say: */ std::map<group_id, Interaction*> interactions; }; But here, group_id used in 'interactions' is a "group of groups". And (of course) I want to iterate over all interactions involved (or bodies involved) as described above. So I guess that I need multi_index here, since bimap won't be enough? :) PS: actually I have more classes and containers here, this is simplified a bit. This is not the end.... ;) But for the next part I'm almost sure that I'll need a custom container. I'll not describe it now, just to keep you excited a bit. -- Janek Kozicki |

Janek Kozicki

5:53 p.m.

New subject: [multi_index] a more complicated container - good example

Ok, I managed to find a good example, so it will be easier to get the picture. (because previous email doesn't seem too clear ;) But still when I write it I assume that you looked at "class Model" from parent post. Imagine that we want to simulate 4 magnets (eg. from loudspeakers) falling down on a ground. 1. To speed up calculation we decide that magnetic interaction does not exist above some defined distance (so we avoid calculating magnetic forces between all magnets). 2. magnets can collide with each other and with the ground 3. we say that ground can collide with the mangnets, and also emits gravity to all of them with unlimited distance. 5. ground doesn't move. Ok, so let's crank it out. First we name the bodies and the interactions using "bimap<id,std::string> names;". So it will be easier for humans to grasp. Remember that negative number is a group. names: id name 1 magnet 1 2 magnet 2 3 magnet 3 4 magnet 4 5 ground -1 all the magnets -2 groups of interactions happening through collision -3 groups of interactions happening through magnetic forces -4 groups of interactions happening through gravity Now we examine the simulation at some particular point in time. Let's see what interactions are happenning here. We have some bodies colliding with each other (also a three body collision). We have some magnets close enough interacting through magnetic force. Let's say that bodies 2 and 4 are currently too far to interact megnetically. bodies_involved group group_of_groups Interaction* instance 1,2 -5 -2 ... (collision) 2,3,4 -6 -2 ... (collision) 4,5 -7 -2 ... (collision with ground) 1,2 -8 -3 ... (magnetic interaction) 2,3 -9 -3 ... (magnetic interaction) 3,4 -10 -3 ... (magnetic interaction) 1,2,3,4,5 -11(*) -4 ... (gravity) (*) -10 is in fact not necessary, because it duplicates information in -4. During my calculations I must iterate separetely over all "collision" interactions, all "magnetic" interactions, and that (single) gravity interaction. Becuase different formulas govern collisions (Hooke's law), other magnetism (Maxwell's equations), others gravity (Newton's law). (in fact gravity is so special that currently I do it differently, but let's now try it that way) Also, my employer told me that "magnet 4" is really special, and I must answer if it will be damaged in this simulation, while others are not important. I must iterate over every interaction where "magnet 4" takes part to calculate something to see if it breaks. But it is simpler to iterate *first* over all "collision" interactions involving "magnet 4". because it's a different mathematical formula that will say if it will break or not. Then iterate over *other* categories with other formula. Also let's say that "magnet 2" is really dangerous for "magnet 4" if they touch each other I must instantly know this. Now, the point is, that I think (not sure!) is that I should compress this information, to make it simpler (for the computer, not me :-). So I will use that 'groups' container about which I started this thread. It contains information already written above. "bimap<id,group_id> groups;": id groups it belongs to 1 -1 -5 -8 -11 2 -1 -5 -6 -8 -9 -11 3 -1 -6 -9 -10 -11 4 -1 -6 -7 -10 -11 5 -7 -11 -5 -2 -6 -2 -7 -2 -8 -3 -9 -3 -10 -3 -11 -4 And finally the interaction container will use just one index to identify things (the last column "group_of_groups"): group_of_groups Interaction* -2 ... (collision) -2 ... (collision) -2 ... (collision with ground) -3 ... (magnetic interaction) -3 ... (magnetic interaction) -3 ... (magnetic interaction) -4 ... (gravity) I need to: - iterate on all interactions no. -2, answer: -5 -6 -7 - iterate on all bodies in interaction -2, answer: 1 2 3 4 5 (accidentally all bodies) - iterate on all bodies in interaction -6. answer: 2,3,4 - iterate on all interactions from group -2 that involve body 4, answer: -6 -7 - iterate on all interactions that involve body 1, answer: -5 -8 -11, -2 -3 -4 - check if there is an interaction from group -2, that involves bodies 2 and 4, answer: yes it is -6 that's it. The interaction container will heavily rely on information stored in 'groups' container to answer all the queries. Most of all, I really would like to have a constant answer time (regardless of container size), because I'm going to simulate 10'000 of bodies. If some other solution would be faster for 500 bodies I don't care, I need constant (possibly longer) time. PS: gravity in different way: You noticed that in fact only ground is an "emitter" of gravity, why the magnets are only receivers. I would need to distinguish the emitter/receiver somehow. But that's even a more different story. First I want to crunch the problem above :) -- Janek Kozicki

Janek Kozicki

6:06 p.m.

New subject: [multi_index] a more complicated container - good example

Janek Kozicki said: (by the date of Wed, 18 Jul 2007 19:53:27 +0200)

...

that's it. The interaction container will heavily rely on information stored in 'groups' container to answer all the queries.

another reason to keep 'groups' container outside of interaction container is that other containers stored in class Model need to use it also. This 'groups' container in fact helps keeping all the other containers to be ''organized'' :-) -- Janek Kozicki |

Joaquín Mª López Muñoz

19 Jul 19 Jul

10:13 a.m.

New subject: [multi_index] a more complicated container - good example

Hello, Janek, I've studied a bit the scenario you propose and I've got some questions and proposals related to a possible implementation with Boost.MultiIndex. Janek Kozicki ha escrito: [...]

...

Imagine that we want to simulate 4 magnets (eg. from loudspeakers) falling down on a ground.

1. To speed up calculation we decide that magnetic interaction does not exist above some defined distance (so we avoid calculating magnetic forces between all magnets).

2. magnets can collide with each other and with the ground

3. we say that ground can collide with the mangnets, and also emits gravity to all of them with unlimited distance.

5. ground doesn't move.

Ok, so let's crank it out. First we name the bodies and the interactions using "bimap<id,std::string> names;". So it will be easier for humans to grasp. Remember that negative number is a group.

names:

id name 1 magnet 1 2 magnet 2 3 magnet 3 4 magnet 4 5 ground -1 all the magnets -2 groups of interactions happening through collision -3 groups of interactions happening through magnetic forces -4 groups of interactions happening through gravity

Why do you assign positive IDs to bodies and negative IDs to groups and groups of interactions? As far as I can see there's no need to use this kind of loose type checking, since IDs for different classes of entities are handled separately.

...

Now we examine the simulation at some particular point in time. Let's see what interactions are happenning here. We have some bodies colliding with each other (also a three body collision). We have some magnets close enough interacting through magnetic force. Let's say that bodies 2 and 4 are currently too far to interact megnetically.

bodies_involved group group_of_groups Interaction* instance 1,2 -5 -2 ... (collision) 2,3,4 -6 -2 ... (collision) 4,5 -7 -2 ... (collision with ground) 1,2 -8 -3 ... (magnetic interaction) 2,3 -9 -3 ... (magnetic interaction) 3,4 -10 -3 ... (magnetic interaction) 1,2,3,4,5 -11(*) -4 ... (gravity)

(*) -10 is in fact not necessary, because it duplicates information in -4.

During my calculations I must iterate separetely over all "collision" interactions, all "magnetic" interactions, and that (single) gravity interaction. Becuase different formulas govern collisions (Hooke's law), other magnetism (Maxwell's equations), others gravity (Newton's law).

(in fact gravity is so special that currently I do it differently, but let's now try it that way)

Also, my employer told me that "magnet 4" is really special, and I must answer if it will be damaged in this simulation, while others are not important. I must iterate over every interaction where "magnet 4" takes part to calculate something to see if it breaks. But it is simpler to iterate *first* over all "collision" interactions involving "magnet 4". because it's a different mathematical formula that will say if it will break or not. Then iterate over *other* categories with other formula. Also let's say that "magnet 2" is really dangerous for "magnet 4" if they touch each other I must instantly know this.

Now, the point is, that I think (not sure!) is that I should compress this information, to make it simpler (for the computer, not me :-).

So I will use that 'groups' container about which I started this

...

thread. It contains information already written above.

"bimap<id,group_id> groups;":

id groups it belongs to 1 -1 -5 -8 -11 2 -1 -5 -6 -8 -9 -11 3 -1 -6 -9 -10 -11 4 -1 -6 -7 -10 -11 5 -7 -11 -5 -2 -6 -2 -7 -2 -8 -3 -9 -3 -10 -3 -11 -4

And finally the interaction container will use just one index to identify things (the last column "group_of_groups"):

group_of_groups Interaction* -2 ... (collision) -2 ... (collision) -2 ... (collision with ground) -3 ... (magnetic interaction) -3 ... (magnetic interaction) -3 ... (magnetic interaction) -4 ... (gravity)

OK, here's one possible realization. Note that there are some divergences with respect to some initial assumptions you made, please verify whether they're acceptable: Say bodies are implemented like this: struct body { body(int id):id(id){} ... int id; }; and what you call groups of interactions I'll simply call interactions and model them like this: struct interaction { interaction(int id,int type):id(id),type(type){} ... int id; int type; }; where id is self-explanatory and type codes what you call "groups of groups", using for instance some mnemonic enum: enum{collision=0,magnetic=1,gravitational=2}; interaction i5(5,collision); Now, all the relationships between interactions and bodies can be represented as a multi-index container over pairs of (body*,interaction*): typedef std::pair< body*, interaction*

...

body_interaction_value_type;

typedef multi_index_container< body_interaction_value_type, indexed_by< hashed_non_unique<body_id_extractor>, hashed_non_unique<interaction_id_extractor>, ordered_non_unique<interaction_type_extractor> >

...

body_interaction;

where the custom key extractors (which are defined in the complete program attached) do what their names suggest. So, body_interaction has indices on body ID, interaction ID and type of interaction. Some remarks: * Unlike your approach, mine does not create any kind of entities for "groups of groups": instead, the type of each interaction is just an attribute of that entity. I don't see any need for having groups of groups assigned IDs as if they were interactions. * Indices are hashed so as to provide constant time lookup complexities. This works best under the following assumptions: - Each interaction does not involve many bodies (with the possible exception of gravity, which nonetheless can be dealt with with special methods, namely via index #2.) - Each body is not involved in many interactions. As always, only real measurements can tell whether this model can really scale up. In the following examples, assume we've got a container of type body_interaction called bi.

...

I need to: - iterate on all interactions no. -2, answer: -5 -6 -7

accumulate( acc,bi.get<2>().equal_range(collision),interaction_id_extractor()); This uses index #2 (grouping by type of interaction); note that each interaction will we visited as many times as bodies it involves: to eliminate duplicates a O(1)-complecity consists in accumulating the entries into a unique hashed set, which is what accumulate does (see the code attached for more details.)

...

- iterate on all bodies in interaction -2, answer: 1 2 3 4 5 (accidentally all bodies)

accumulate( acc,bi.get<2>().equal_range(collision),body_id_extractor()); As the previous, but we extract body IDs instead of interaction IDs.

...

- iterate on all bodies in interaction -6. answer: 2,3,4

print(bi.get<1>().equal_range(6),body_id_extractor()); We look for all entries with interaction IDs 6 (in O(1) time) and lists their associated body IDs.

...

- iterate on all interactions from group -2 that involve body 4, answer: -6 -7

print( bi.equal_range(4),interaction_id_extractor(), is_interaction_type(collision)); We (efficiently) list the interactions in whicj voy 4 is involved and filter according to the type of collision. This works on the assumption that body 4 is not involved in many interactions, as explained above.

...

- iterate on all interactions that involve body 1, answer: -5 -8 -11, -2 -3 -4

print(bi.equal_range(1),interaction_id_extractor());

...

- check if there is an interaction from group -2, that involves bodies 2 and 4, answer: yes it is -6

print( bi.equal_range(2),interaction_id_extractor(), bll::bind(is_interaction_type(collision),bll::_1)&& bll::bind(does_interaction_affect_body(4,bi),bll::_1)); We look up the interactions in which body 2 is involved and filter as appropriate. I don't really know if this approaches what you've got in mind. Depending on the type of queries you're going to make, it might be benefficial to include more indices, but looks to me like your basic needs could be covered with the model I propose. Complete example attached, compiled under GCC 3.2 HTH, Joaquín M López Muñoz Telefónica, Investigación y Desarrollo

Janek Kozicki

11:23 a.m.

New subject: [multi_index] a more complicated container - good example

Joaquín Ma López Munoz said: (by the date of Thu, 19 Jul 2007 12:13:43 +0200)

...

Complete example attached, compiled under GCC 3.2

Great! Many thanks for your effort. I'll study this carefully, and answer in several days. -- Janek Kozicki |

6591

Age (days ago)

6593

Last active (days ago)

List overview

Download

11 comments

4 participants

participants (4)

Janek Kozicki
Joaquín Mª López Muñoz
Matias Capeletto
Michael Goldshteyn