
brad higgins <bhiggins <at> arbor.net> writes:
I am using boost multi-index, with a hashed-non-unique lookup. In my data, I am expecting many hash collisions. When I erase the multi- index, I find very poor performance. On investigation, I found that the hash collision data structure is a single linked list, and that removing an element involves traversing the entire list. Removing all of the elements involves traversing the list many times. Since my data has many collisions, I am traversing a very long list very many times.
Hi Brad, Yep, that's a shortcoming of the singly linked approach in the non_unique case. I decided to keep it like that for simplicity (code is shared with the unique variant) and because the case of having many duplicates is definitely not a normal scenario for hashed containers. [...]
I tried editing boost/multi_index/detail/hash_index_node.hpp, to make it a double-linked list, but I am seeing one of the included tests fail (test_safe_mode).
Do you mean you're passing the rest of the tests? Wow, that's a feat taking into account that the internal structure of the code is not documented.
I can't spot any bugs in my hash_index_node.hpp changes. What, apart from hash_index_node.hpp, needs to be updated, to support a double- linked list? boost/multi_index/detail/bucket_array.hpp? Others?
Off the top of my head I couldn't say whether you need to change bucket_array, I'd say it depends on the way you're implementing the doubly linked stuff. If you send me the modified files (eiher privately or through the list) I can take a look and report back. Joaquín M López Muñoz Telefónica, Investigación y Desarrollo