
On Tuesday, January 31, 2012 18:55:21 Daniel James wrote:
The requirements for the hash function are:
For two different values t1 and t2, the probability that h(t1) and h(t2) compare equal should be very small, approaching 1.0 / numeric_limits<size_t>::max().
There's no requirement about how h(t1) and h(t2) are distributed
Right.
so it must be the container's responsibility, since the container must work well with any hash function that meets these requirements.
Wrong, there's no such requirement. It must work, not necessarily well.
How else can you interpret it?
Simple: it is possible to design a hash function that will cause suboptimal performance with the container. No matter how you implement the container, with bit mixing or not, this statement still holds.

I understand your reasoning. You're trying to reduce the probability of an unordered container being inefficient because of an inefficient hash function. This approach also has the benefit of simplifying the definition of hash functions. What I'm saying is that this pessimization is the wrong thing to do, because you make users pay for what they don't need.

Most of the time, when you deal with hash containers, you have a pretty good idea of the nature of the keys and the better ways to hash the key values. There are times when the key values already have a good distribution (e.g. the keys are GUIDs) and there is no need for additional bit mixing. For other cases there are known good hashing algorithms, so you often just use the one that best fits your needs. And for standard types I would expect std::hash to do the right thing. I know, the standard doesn't spell that out explicitly (which is a bad thing), but this is how it works in practice (from my experience).
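To make the GUID case concrete, here is a minimal sketch (not from the thread; the Guid type and GuidHash functor are hypothetical) of a key whose bits are assumed to be uniformly distributed already, so the hash just folds the value into size_t with no extra avalanche/mixing step:

    #include <cstddef>
    #include <cstdint>
    #include <unordered_set>

    // Hypothetical 128-bit GUID-like key.
    struct Guid {
        std::uint64_t hi = 0;
        std::uint64_t lo = 0;
        bool operator==(const Guid& other) const {
            return hi == other.hi && lo == other.lo;
        }
    };

    // Assumes the GUID bits are already well distributed, so XOR-folding
    // the two halves into size_t is enough; no additional bit mixing.
    struct GuidHash {
        std::size_t operator()(const Guid& g) const noexcept {
            return static_cast<std::size_t>(g.hi ^ g.lo);
        }
    };

    int main() {
        std::unordered_set<Guid, GuidHash> seen;
        Guid g{0x0123456789abcdefULL, 0xfedcba9876543210ULL};
        seen.insert(g);
        return seen.count(g) ? 0 : 1;
    }

If the container layered its own bit mixing on top of this, users with keys like these would pay for work they don't need; that is the cost being argued about above.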