
On Mar 13, 2005, at 11:13 AM, Peter Dimov wrote:
implementation to which you port your program. Your original tests are still valid.
The other side of predictability - that the hash function is not affected by types or the particular representation of a sequence of values - means that you don't necessarily have to re-run the performance benchmark of your unordered_maps when you refactor the key to use int[3] instead of struct { short, short, short }, or a wstring instead of a string.
Fundamentally, what makes for a good hash function is that things that are equal (however you are defining equality) always hash to the same value, while any small difference between two things should cause them to hash to completely different values. In an arbitrary program would I expect an int[3] to be equal to a struct { short, short, short } when the values happened to be equal? No. What if the change in representation you spoke of was to rotate the coordinate axes for my vector, would I expect values with the same coordinates in the two different coordinate-systems, represented by different structs to be equal? No. If I have two structs representing complex numbers, one using a Cartesian representation and the other polar would I expect them to be treated as equal because their coordinates are equal? No -- I would be appalled if they were. A general purpose hash function that behaves like this is a weak one at best (and in some situations it can be a horrendously bad one). Your argument is that one should allow those developers who are making one specific change in representation out of the hundreds of possible reasonable ones should be allowed to be able to get away with being slipshod (and yes, not re-running the performance benchmark even with this deliberate weakening of the benchmark would be slipshod, though justifiably under many conditions). I don't buy it. Now if you made an argument that the benefits of strengthening it were, in practice, too minor to consider and the costs (to, e.g., seed the hash combination sequence with a class dependent value) was too high, then I might agree. Doing something that is theoretically "wrong" because it is pragmatically the "right" thing to do, is good engineering. Arguing that something that is theoretically "wrong" is in some real sense the right thing to do is bad engineering, bad math and bad science. Topher