New subject: [multi_index] hashed_index erase(iterator) performance issue

16 Mar 2010

      Hello,
   I have encountered a performance issue with multi_index's  
hashed_index erase(iterator) routine.  The erase(iterator) call  
returns an iterator to the next element in the data structure.  When  
using this routine in a sparsely populated, large sized hash, it takes  
a very long time to complete.  For example, in my program, I have  
rehashed the hash table to 5 million buckets, and added 600,000  
elements.  Adding the elements takes a matter of seconds.  Then, I  
delete the elements.  Deleting the elements takes hours.
   I have profiled the code, and it looks to me like the performance  
hit is rooted in the increment of the iterator by the erase(iterator)  
call.  Since the hash is sparsely populated, the increment must  
advance over many empty buckets for each erase(iterator) call, in  
order to advance to the next valid element.
   My question is, why does the erase(iterator) function return an  
iterator at all?  (In comparison, sgi's stl::hash_map's erase()  
returns void.)  The hash is unordered, so the returned iterator is a  
random value, which I think decreases its usefulness (I expect most  
clients will discard the returned iterator.)  Additionally, the same  
behavior can be achieved by clients who desire this behavior if they  
post-increment their iterator argument themselves.  That is:

// After this line, cur_itr's initial element will be erased,
// AND cur_itr will be updated to the next element in the hash table
hash_table.erase(cur_itr++);

   Is there some way to remove elements from the hash without taking  
this iterator increment hit?  Would it be possible to add an erase  
routine (of some other name) which accepts an iterator argument, but  
does not automatically increment the argument?

Thanks,
Brad

[multi_index] hashed_index erase(iterator) performance issue

Brad Higgins

OvermindDL1

Brad Higgins

Joaquin M Lopez Munoz

tags

participants (3)