
on Thu Nov 27 2008, "Daniel James" <daniel_james-AT-fmail.co.uk> wrote:
2008/11/27 David Abrahams <dave@boostpro.com>:
Yes, I figured it was something like that. It still doesn't make sense to me. If the number of repeated values in a bucket is small, it won't hurt to traverse all of them. If it is so large that traversing all the equal values to get to the next value is expensive, it will destroy O(1) lookup in any bucket with collisions.
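(For concreteness, this is roughly the kind of scheme under discussion: an extra per-node pointer that lets a lookup hop over a whole run of equivalent keys in one step. Sketch only; the names and the exact linking details are illustrative, not Boost.Unordered's actual internals.)

#include <utility>

// Illustrative node layout -- not Boost.Unordered's real one.
template <class Key, class T>
struct node
{
    node* next;        // next node in this bucket's chain
    node* group_next;  // the extra pointer: first node of the next run of
                       // equivalent keys, so a whole run is skipped in one hop
    std::pair<Key const, T> value;
};

// With the extra pointer, a lookup compares one element per distinct key in
// the bucket, however many duplicates follow each of them.
template <class Key, class T, class Pred>
node<Key, T>* find_in_bucket(node<Key, T>* head, Key const& k, Pred eq)
{
    for (node<Key, T>* p = head; p; p = p->group_next)
        if (eq(p->value.first, k))
            return p;
    return 0;
}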
That's always an issue with hash tables. But you can improve matters by using a better hash function or lowering the maximum load factor.
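(Concretely, the knobs being referred to look like this; the container, key type, and numbers are just an arbitrary example.)

#include <boost/unordered_map.hpp>
#include <string>

int main()
{
    // A better hash function would be supplied as the third template argument.
    boost::unordered_multimap<std::string, int> m;

    // Rehash sooner, so bucket chains stay short even with a mediocre hash:
    m.max_load_factor(0.5f);

    // Or size the bucket array up front if the element count is roughly known:
    m.rehash(1024);
}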
Yes, but if you improve it enough, you don't end up with multiple key values in a single bucket, so your extra pointer is wasted.
But if your container is slowing down because you have elements with equal keys, there's nothing you can do.
Please define "container is slowing down." I can't picture what you mean.
Maybe you could use unordered_map<key, vector<value> >?
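(I.e. something along these lines, grouping the duplicates yourself instead of leaving it to the container; the key and value types are just placeholders.)

#include <boost/unordered_map.hpp>
#include <string>
#include <vector>

// One map entry per distinct key; all values for that key live in the vector.
typedef boost::unordered_map<std::string, std::vector<int> > grouped_map;

void add(grouped_map& m, std::string const& k, int v)
{
    m[k].push_back(v);  // operator[] creates an empty vector on first use of k
}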
Did anyone profile this in a real application and find it to be a win?
No, and nobody has found it to be a loss either.
It's a manifest loss; I don't need to profile it to see that the overhead per stored element can be significant!
It isn't about optimization, it's about meeting the complexity requirements.
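(Presumably the requirements in question are the ones on operations that deal with a whole run of equivalent keys at once, e.g. count, equal_range and erase(key); a trivial illustration, with made-up data:)

#include <boost/unordered_map.hpp>
#include <iostream>
#include <string>
#include <utility>

int main()
{
    boost::unordered_multimap<std::string, int> m;
    m.insert(std::make_pair(std::string("x"), 1));
    m.insert(std::make_pair(std::string("x"), 2));
    m.insert(std::make_pair(std::string("y"), 3));

    // Operations that act on the whole run of equivalent keys:
    std::cout << m.count("x") << '\n';   // prints 2
    m.erase("x");                        // removes both "x" elements
    std::cout << m.size() << '\n';       // prints 1
}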
Where did these requirements come from? In generic programming, complexity requirements should never be so strict that they impose a significant efficiency cost. If that requirement is in the standard, it's a bug.
Stable order when inserting is nice, but ought to be achievable without the overhead of an extra pointer per node.
Stable order is not just nice, it's a requirement.
Again, whence the requirement?
But it was never the main purpose of this design; I implemented it before the standard changed,
Changed?
but it is a nice by-product.
Sorry, I'm confused.

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com