
On Nov 23, 2009, at 6:23 PM, Stefan Strasser wrote:
I think there is something very wrong with boost's implementation of unordered_set/map.
1 unordered_set<int> s; 2 for(int c=0;c<nr;++c) s.insert(c); 3 for(int c=nr-1;c>=0;--c) s.erase(s.find(c));
results:
nr: time 20000: 0m0.244s 40000: 0m0.940s 60000: 0m5.584s 80000: 0m4.232s 100000: 0m34.814s 120000: 0m33.486s
the function that's causing this is erase(), which is specified to be O(1) average, O(n) worst. I can't think of a reason why it should be O(n) in this case (perfect hash, erase doesn't rehash). but even if it was, the results should be at worst quadratic.
load_factor() is below 1, bucket_count(x) is 1.
if line 3 is changed to:
3 for(int c=0;c<nr;++c) s.erase(s.find(c));
20000: 0m0.008s 40000: 0m0.016s 60000: 0m0.028s 80000: 0m0.032s 100000: 0m0.040s 120000: 0m0.044s
as you would expect.
even if I do the very same thing as above using Boost.Intrusive (using a load factor of 1), I get expected results:
struct element : intrusive::unordered_set_base_hook<>{ ...
for(int c=0;c<nr;++c) s.insert(*new element(c)); for(int c=nr-1;c>=0;--c) s.erase(s.find(element(c)));
20000 0m0.008s 40000 0m0.016s 60000 0m0.024s 80000 0m0.024s 100000 0m0.032s 120000 0m0.040s
but surprisingly enough, the TR1 implementation shipped with libstdc++ also has a problem like that:
20000: 0m0.368s 40000: 0m1.448s 60000: 0m1.668s 80000: 0m7.160s 100000: 0m7.808s 120000: 0m13.349s
am I missing something or is there really the same bug in 2 major implementations of unordered_set? it effectively caused a deadlock in some of my code.
I'm reviving an old thread with new information: The LWG looked at this issue last week: http://home.roadrunner.com/~hinnant/issue_review/lwg-active.html#579 We were all over the map on it. We finally left it in Open status in hopes of learning more in the next few months. One solution I've been thinking of is a variant of what is currently proposed in the issue: void quick_erase(const_iterator); The solution I've thought about was actually invented to solve another problem. It is described here: http://home.roadrunner.com/~hinnant/issue_review/lwg-closed.html#839 (see the comment dated 2009-09-19). Here is a brief synopsis: node_ptr remove(const_iterator p); pair<iterator, bool> insert(node_ptr&& nd); where node_ptr is a nested typedef in the container to unique_ptr<__node, __node_deleter<__node_allocator>>. With this tool one can remove a node from a container, change its value, and insert it back into the container (or another which has an equal allocator). E.g.: auto p = m.remove(i); // no node deallocation p->first = new_key; // modify node external to the container m.insert(std::move(p)); // no node allocation It solves the quadratic behavior by: for(int c=nr-1;c>=0;--c) s.remove(s.find(element(c))); At each remove() the node_ptr is ignored/destructed, thus deallocating the node. There is no iterator increment. This is a performance-tested solution with your test program: 20000: 0m0.012s // your time is 0m0.244s 40000: 0m0.017s // your time is 0m0.940s 60000: 0m0.024s // your time is 0m5.584s 80000: 0m0.028s // your time is 0m4.232s 100000: 0m0.037s // your time is 0m34.814s 120000: 0m0.041s // your time is 0m33.486s 140000: 0m0.052s 160000: 0m0.057s 180000: 0m0.064s 200000: 0m0.069s 220000: 0m0.074s 240000: 0m0.080s 260000: 0m0.086s 280000: 0m0.091s 300000: 0m0.097s The bad news is that the LWG will consider this much too inventive at this late date. That being said, I believe it solves several real world performance problems (this one and others listed in LWG 839). I offer it here in case someone wants to implement it for boost containers. In case you don't like the "too inventive" response from the C++ committee, I'm the one to yell at. I will happily forward your response to the committee. -Howard