
In-Reply-To: <003501c52a54$a28c4b90$6601a8c0@pdimov> pdimov@mmltd.net (Peter Dimov) wrote (abridged):
size_t hash_value( int v ) { return size_t(v) + default_seed; }
I wonder what do we gain from this.
As you pointed out when I first suggested adding the constant in hash_combine: This would mean N fixes for a sequence of length N, when one simple fix is enough. The initial value of the seed is arbitrary so we can vary it at will. hash_combine, on the other hand, is based on research and has apparently been selected as one of the best amongst the shift+xor family. This doesn't mean that it's sacred, just that if we change it we'll no longer be able to just point at a PDF as a rationale but will need to do our own research (which might be a good idea anyway, though.) :-)
From the point of view of hash_combine the effect is the same
The difference shows when you have something like: struct A { int x, y; }; struct B { A a, b, c; }; size_t hash_value( const A &a ) { size_t hash = 0xbead1; hash_combine( hash, a.x ); hash_combine( hash, a.y ); return hash; } size_t hash_value( const B &b ) { size_t hash = 0xbead2; hash_combine( hash, b.a ); hash_combine( hash, b.b ); hash_combine( hash, b.c ); return hash; } Here hash_combine is called 9 times, but hash_value(int) is only called 6 times, so we get different results depending on where the constant is added.
and we now rely on the user overloads of hash_value to not produce a zero.
That's surely reasonable. The hash_value of any user-defined class should be defined in terms of the hash_values of primitives, and boost provides all of those.
This reflects their intended use. The two argument overload is used when one has a whole range and wants its hash value, as in the hash_value overload for std::vector, for example.
I actually think this is going to be fairly rare. This is mainly because there will usually be a constant thrown in to represent the type of the object. (I appreciate you won't do that for std containers, but I maintain its a good idea for user-defined types.) The 2-argument version is strictly redundant as it must be defined in terms of the 3-argument version. It's just a convenience function, used mainly by the std containers. Admittedly hash_combine is also (nearly) redundant, being definable as: void hash_combine( size_t &hash, const T &t ) { hash_range( hash, &t, &t+1 ); } if T does not overload address-of. Incidently, do you agree we will sometimes want to pass a hash value as the second argument to hash_combine? Like: hash_combine( hash, hash_range( first, last ) ); hash_combine( hash, obj.get_hash( some_arg ) ); If not, then maybe we should have a hash_combine that just combines hashes. At the moment the combining is mixed in with the getting; it's not a very orthogonal API. -- Dave Harris, Nottingham, UK