
on Wed Nov 26 2008, "Daniel James" <daniel_james-AT-fmail.co.uk> wrote:
2008/11/26 David Abrahams <dave@boostpro.com>:
on Wed Nov 26 2008, Ion GaztaƱaga <igaztanaga-AT-gmail.com> wrote:
For example, unordered_multisets have an additional pointer to group equal values in a singly linked list so that bucket traversing is linear to the number of different values instead of the number of values in the bucket. It's a nice feature and useful in several scenarios (I even added that option to Intrusive unordered containers), but you have to pay for it ;-)
Hmm, that doesn't make a lot of sense to me. Lots of equal values in a hash table can quickly destroy its O(1) lookup characteristics.
That's what it avoids. Because equal values are grouped together, you only need to examine a single key for all the equal values.
I understand.
Here's a quick diagram of a bucket in an unordered_multiset (assuming that the hash function is terrible and 5,3 and 2 all end up in the same bucket):
Yes, I figured it was something like that. It still doesn't make sense to me. If the number of repeated values in a bucket is small, it won't hurt to traverse all of them. If it is really large enough that traversing all equal values to get to the next value is expensive, it will destroy O(1) lookup in any buckets with collisions. Did anyone profile this in a real application and find it to be a win?
The black links represent the normal single linked list. The red links are a circular list running backwards through equal values. This results in a red link from the first node in a group of equals values to the last node - this lets you quickly skip over equal values. This structure is easy to maintain and has some nice results, such as a stable order when inserting.
Stable order when inserting is nice, but ought to be achievable without the overhead of an extra pointer per node. -- Dave Abrahams BoostPro Computing http://www.boostpro.com