multi_index_container

older
Re: [Boost-users] linker errors on...

Aashit Soni

16 Feb 2006 16 Feb '06

10:16 a.m.

Hi all, I am using multi_index_container on MSVC6, WinXP. I ported my existing Hash and order list to multi_index_container. which is working quite expected? a pure vanila hash_index with ordered_index I am facing some problem in fine tuning the hash_index which is slower compared to my previous implementation. I am looking for different ways how can i fine tune the hash_index in multi_index_container? specifically i am yet to discover, How can i set the bucket size for the hash container? How can configure the container for maximum store it can reserve for me in one shot? any other optimization parameter. If any one has some idea. It will be of great help to me. regards, aashit -- A a s h i t

Show replies by date

Joaquín Mª López Muñoz

16 Feb 16 Feb

11:54 a.m.

Hi Aashit, Aashit Soni ha escrito:

...

Hi all,

I am using multi_index_container on MSVC6, WinXP. I ported my existing Hash and order list to multi_index_container. which is working quite expected? a pure vanila hash_index with ordered_index

I am facing some problem in fine tuning the hash_index which is slower compared to my previous implementation.

Slower in which aspect? Insertion, deletion, lookup?

...

I am looking for different ways how can i fine tune the hash_index in multi_index_container?

specifically i am yet to discover, How can i set the bucket size for the hash container?

You cannot control bucket size explicitly. Hashed indices occupancy level is controlled thru a parameter called load factor (lf) defined as: lf=size()/bucket_count() In general, the lower lf, the faster lookup operations are. You can set the maximum value of lf to be allowed with max_load_factor(): float mlf=c.max_load_factor(); // query max lf c.max_load_factor(0.8f); // set max lf When the load factor reaches the specified maximum, the number of buckets is automatically increased. By default max_load_factor()=1.0. Lower this for faster lookup, at the expense of more memory usage (more buckets.)

...

How can configure the container for maximum store it can reserve for me in one shot?

Use rehash(), documented at http://boost.org/libs/multi_index/doc/reference/hash_indices.html#hash_polic... prior to mass insertion of elements. For instance: // insert 1000000 elements. With lf=1.0, we will need 1000000 buckets // so as to be under the specified max load factor c.rehash(1000000); c.insert(first,last); Doing a prior rehash is not necessary, but will save you some intermediate automatic rehashings, pretty much the same way as it happens with std::vector and reserve.

...

any other optimization parameter.

Other than this, your best option is to choose a good hashing function. The default is expected to work well except for exceptional situations. You might want to use the hashing function used in your previous Hash container so as to check out whether it does a better job than Boost.Hash, although I'd say this is unlikely.

...

If any one has some idea. It will be of great help to me.

regards, aashit

Hope this helps a little, Joaquín M López Muñoz Telefónica, Investigación y Desarrollo

Aashit Soni

17 Feb 17 Feb

5:56 a.m.

Thank you Mr Lopez,

...

Slower in which aspect? Insertion, deletion, lookup? I have observed it is slower in insertion. I am roughly creation some 2 entries into it. One entry is arround 10 Bytes of information. Each creation also creates some external objects to that are being cached in this container.

- I am having some problem in look up also. My hash index is non_unique. I am not sure how the collisions are handled? When i call find and if the collisions are there how shall i iterate so that i get another element store with in same line. In some case, i need to reserve the space for the elements, So when i create my multi_index container, i want to tell it saying that i need x amount of space so please do the allocation and keep it ready. Is such facility available. I will just incorporate- your sugguesion and check how it is working. Thank you once again, I really appreciate your time. regards, aashit On 2/16/06, Joaquín Mª López Muñoz <joaquin@tid.es> wrote:

...

Hi Aashit,

Aashit Soni ha escrito:

...
Hi all,

I am using multi_index_container on MSVC6, WinXP. I ported my existing Hash and order list to multi_index_container. which is working quite expected? a pure vanila hash_index with ordered_index

I am facing some problem in fine tuning the hash_index which is slower compared to my previous implementation.

Slower in which aspect? Insertion, deletion, lookup?

...
I am looking for different ways how can i fine tune the hash_index in multi_index_container?

specifically i am yet to discover, How can i set the bucket size for the hash container?

You cannot control bucket size explicitly. Hashed indices occupancy level is controlled thru a parameter called load factor (lf) defined as:

lf=size()/bucket_count()

In general, the lower lf, the faster lookup operations are. You can set the maximum value of lf to be allowed with max_load_factor():

float mlf=c.max_load_factor(); // query max lf c.max_load_factor(0.8f); // set max lf

When the load factor reaches the specified maximum, the number of buckets is automatically increased. By default max_load_factor()=1.0. Lower this for faster lookup, at the expense of more memory usage (more buckets.)

...
How can configure the container for maximum store it can reserve for me in one shot?

Use rehash(), documented at

http://boost.org/libs/multi_index/doc/reference/hash_indices.html#hash_polic...

prior to mass insertion of elements. For instance:

// insert 1000000 elements. With lf=1.0, we will need 1000000 buckets // so as to be under the specified max load factor c.rehash(1000000); c.insert(first,last);

Doing a prior rehash is not necessary, but will save you some intermediate automatic rehashings, pretty much the same way as it happens with std::vector and reserve.

...
any other optimization parameter.

Other than this, your best option is to choose a good hashing function. The default is expected to work well except for exceptional situations. You might want to use the hashing function used in your previous Hash container so as to check out whether it does a better job than Boost.Hash, although I'd say this is unlikely.

...
If any one has some idea. It will be of great help to me.

regards, aashit

Hope this helps a little,

Joaquín M López Muñoz Telefónica, Investigación y Desarrollo

_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users

-- A a s h i t

Aashit Soni

5:59 a.m.

Some correction!! in my previous mail. Thank you Mr Lopez,

...

Slower in which aspect? Insertion, deletion, lookup? I have observed it is slower in insertion. I am roughly creating some 200,000 entries into it. One entry is arround 10 Bytes of information. Each creation also creates some external objects to that are being cached in this container.

- I am having iteration problem in hash -look up also. My hash index is non_unique. I am not sure how the collisions are handled? When i call find and if the collisions are there i get the last inserted node, how shall i get previously inserted nodes. In some case, i need to reserve the space for the elements, So when i create my multi_index container, i want to tell it saying that i need x amount of space so please do the allocation and keep it ready. Is such facility available. I will just incorporate- your sugguesion and check how it is working. Thank you once again, I really appreciate your time. regards, aashit

...

On 2/16/06, Joaquín Mª López Muñoz <joaquin@tid.es> wrote:

...
Hi Aashit,

Aashit Soni ha escrito:

...
Hi all,

I am using multi_index_container on MSVC6, WinXP. I ported my existing Hash and order list to multi_index_container. which is working quite expected? a pure vanila hash_index with ordered_index

I am facing some problem in fine tuning the hash_index which is slower compared to my previous implementation.

Slower in which aspect? Insertion, deletion, lookup?

...
I am looking for different ways how can i fine tune the hash_index in multi_index_container?

specifically i am yet to discover, How can i set the bucket size for the hash container?

You cannot control bucket size explicitly. Hashed indices occupancy level is controlled thru a parameter called load factor (lf) defined as:

lf=size()/bucket_count()

In general, the lower lf, the faster lookup operations are. You can set the maximum value of lf to be allowed with max_load_factor():

float mlf=c.max_load_factor(); // query max lf c.max_load_factor(0.8f); // set max lf

When the load factor reaches the specified maximum, the number of buckets is automatically increased. By default max_load_factor()=1.0. Lower this for faster lookup, at the expense of more memory usage (more buckets.)

...
How can configure the container for maximum store it can reserve for me in one shot?

Use rehash(), documented at

http://boost.org/libs/multi_index/doc/reference/hash_indices.html#hash_polic...

prior to mass insertion of elements. For instance:

// insert 1000000 elements. With lf=1.0, we will need 1000000 buckets // so as to be under the specified max load factor c.rehash(1000000); c.insert(first,last);

Doing a prior rehash is not necessary, but will save you some intermediate automatic rehashings, pretty much the same way as it happens with std::vector and reserve.

...
any other optimization parameter.

Other than this, your best option is to choose a good hashing function. The default is expected to work well except for exceptional situations. You might want to use the hashing function used in your previous Hash container so as to check out whether it does a better job than Boost.Hash, although I'd say this is unlikely.

...
If any one has some idea. It will be of great help to me.

regards, aashit

Hope this helps a little,

Joaquín M López Muñoz Telefónica, Investigación y Desarrollo

_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users

-- A a s h i t

-- A a s h i t

Joaquín Mª López Muñoz

7:47 a.m.

Aashit Soni ha escrito:

...

Some correction!! in my previous mail.

Thank you Mr Lopez,

...
Slower in which aspect? Insertion, deletion, lookup? I have observed it is slower in insertion. I am roughly creating some 200,000 entries into it. One entry is arround 10 Bytes of information. Each creation also creates some external objects to that are being cached in this container.

Well, it's hard to say. Note that if you're comparing a preexistent hash table with a multi_index_container composed of a hashed plus an ordered index, the comparison is not fair --the multi-index container is doing more work. On the other hand, it is perfectly possible that your previous hash container is just faster, but I don't think it can be *much* faster. If you can provide me (through the list or privately) with some test code I can take a deeper look.

...

- I am having iteration problem in hash -look up also. My hash index is non_unique. I am not sure how the collisions are handled?

non_unique means equivalent elements (elements with the same key) are allowed into the container. In the particular case of hashed indices, equivalent elements are guaranteed to be adjacent.

...

When i call find and if the collisions are there i get the last inserted node, how shall i get previously inserted nodes.

Use equal_range() to get all of them. As explained above, equivalent elements are kept adacent.

...

In some case, i need to reserve the space for the elements, So when i create my multi_index container, i want to tell it saying that i need x amount of space so please do the allocation and keep it ready. Is such facility available.

No such facility, sorry. This preallocation stuff is not in the spirit of node-based STL containers. You might gain some performance by using custom allocators specialized for small node requests: take a look for instance at boost::fast_pool_allocator: http://boost.org/libs/pool/doc/interfaces/pool_alloc.html

...

I will just incorporate- your sugguesion and check how it is working.

Thank you once again, I really appreciate your time. regards, aashit

Good luck, Joaquín M López Muñoz Telefónica, Investigación y Desarrollo

Aashit Soni

8:28 a.m.

Thank you for your quick response, My intension about the comparision was to get some more functions with which, i can squeeze performance from multi_index_container. You are absolutely right that my comparision is not fair. I was expecting lot of things at one go. Still, I need to add lot of functionality on top of my multi_index_container to provide robust retrival and recycling. I think The best thing is to test it is with full prototype implementation. I will verfify and definately share you my experience and test results. Because my existing implementation reclying is very weak, and for my peformance testing i have disabled it. So the adverse results are shown. I will just write the code to set the load factor and equal_range() retrieval. It may be possible i will finish my prototype and then will swith to quantify for performance analysis. I must say it was to early for me to say these result of multi_index_container as i am adding extra overhead for one additional ordered list. This list is yet to utilized. I appreciate your support and time. Thanking you, aashit On 2/17/06, Joaquín Mª López Muñoz <joaquin@tid.es> wrote:

...

Aashit Soni ha escrito:

...
Some correction!! in my previous mail.

Thank you Mr Lopez,

...
Slower in which aspect? Insertion, deletion, lookup? I have observed it is slower in insertion. I am roughly creating some 200,000 entries into it. One entry is arround 10 Bytes of information. Each creation also creates some external objects to that are being cached in this container.

Well, it's hard to say. Note that if you're comparing a preexistent hash table with a multi_index_container composed of a hashed plus an ordered index, the comparison is not fair --the multi-index container is doing more work. On the other hand, it is perfectly possible that your previous hash container is just faster, but I don't think it can be *much* faster. If you can provide me (through the list or privately) with some test code I can take a deeper look.

...
- I am having iteration problem in hash -look up also. My hash index is non_unique. I am not sure how the collisions are handled?

non_unique means equivalent elements (elements with the same key) are allowed into the container. In the particular case of hashed indices, equivalent elements are guaranteed to be adjacent.

...
When i call find and if the collisions are there i get the last inserted node, how shall i get previously inserted nodes.

Use equal_range() to get all of them. As explained above, equivalent elements are kept adacent.

...
In some case, i need to reserve the space for the elements, So when i create my multi_index container, i want to tell it saying that i need x amount of space so please do the allocation and keep it ready. Is such facility available.

No such facility, sorry. This preallocation stuff is not in the spirit of node-based STL containers. You might gain some performance by using custom allocators specialized for small node requests: take a look for instance at boost::fast_pool_allocator:

http://boost.org/libs/pool/doc/interfaces/pool_alloc.html

...
I will just incorporate- your sugguesion and check how it is working.

Thank you once again, I really appreciate your time. regards, aashit

Good luck,

Joaquín M López Muñoz Telefónica, Investigación y Desarrollo

_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users

-- A a s h i t

7083

Age (days ago)

7084

Last active (days ago)

List overview

Download

5 comments

2 participants

participants (2)

Aashit Soni
Joaquín Mª López Muñoz