Re: [boost] Re: [review] hash functions

In-Reply-To: <01c301c52989$d50a2f40$6801a8c0@pdimov>
pdimov@mmltd.net (Peter Dimov) wrote (abridged):
I think that it is better to fix the zero trap in hash_combine, because this does not depend on users remembering to initialize the seed to default_seed, instead of zero.
Did you see my suggestion of changing the primitive hash_value functions instead? That also does not depend on users remembering to initialise the seed.
size_t hash_value( int v ) { return size_t(v) + default_seed; }
And similarly for long, etc. Is there a reason to reject this?
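To make the suggestion concrete, here is a minimal sketch. The value of default_seed, the form of the combiner, and the helper hash_ints are all assumptions made for illustration; none of them is taken from the proposal under review.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch_primitive {

// Assumed value; the thread does not say what default_seed should be.
const std::size_t default_seed = 0x9e3779b9u;

// The suggested primitive: offset the value so that 0 no longer hashes to 0.
inline std::size_t hash_value(int v) {
    return static_cast<std::size_t>(v) + default_seed;
}

// Assumed combiner with no constant of its own.
inline void hash_combine(std::size_t& seed, int v) {
    seed ^= hash_value(v) + (seed << 6) + (seed >> 2);
}

// Hypothetical helper: hash the first n ints, starting from a zero seed.
inline std::size_t hash_ints(const int* p, std::size_t n) {
    std::size_t seed = 0;
    for (std::size_t i = 0; i < n; ++i) hash_combine(seed, p[i]);
    return seed;
}

// Runs of zeros with different lengths no longer collapse to the same hash.
inline void check() {
    const int zeros[2] = {0, 0};
    assert(hash_ints(zeros, 1) != 0);
    assert(hash_ints(zeros, 1) != hash_ints(zeros, 2));
}

}  // namespace sketch_primitive
```

With this arrangement the caller can start from a plain zero seed, since the offset is applied inside each primitive hash_value rather than in the seed.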
2. Add a hash_range overload that takes a seed.
2a. void hash_range( size_t & seed, It first, It last );
2b. size_t hash_range( size_t seed, It first, It last );
2c. size_t hash_range( It first, It last, size_t seed );
which are listed in order of my preference.
So we're talking about a family of 3 functions:
2a:
void hash_combine( size_t &seed, T value );
void hash_range( size_t &seed, It first, It last );
size_t hash_range( It first, It last );
2b:
void hash_combine( size_t &seed, T value );
size_t hash_range( size_t seed, It first, It last );
size_t hash_range( It first, It last );
2c:
void hash_combine( size_t &seed, T value );
size_t hash_range( It first, It last, size_t seed=0 );
I think 2b is very error-prone, because of the lack of consistency between how hash_combine and the seeded version of hash_range return their results. 2a is less error-prone, but it seems needlessly confusing to have the two versions of hash_range return their results in different ways.
In general there's a lot of similarity between hash_combine and hash_range which those versions fail to bring out. I'd rather be more consistent, eg:
(a)
size_t hash_combine( size_t seed, T value );
size_t hash_range( size_t seed, It first, It last );
or:
(b)
void hash_combine( size_t &seed, T value );
void hash_range( size_t &seed, It first, It last );
(Perhaps with overloads to catch passing a non-size_t first argument.)
I don't see much value in adding a hash_range which doesn't take a seed. Especially with (a), because we can easily pass 0 explicitly.
-- Dave Harris, Nottingham, UK
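As an illustration of alternative (a), the following sketch has both functions take the seed by value and return the new seed. The combiner formula, its constant, and the plain hash_value for int are assumptions chosen only to make the example self-contained; the message specifies the signatures, not the bodies.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch_a {

// Assumed primitive hash for int.
inline std::size_t hash_value(int v) { return static_cast<std::size_t>(v); }

// (a): take the seed by value, return the updated seed.
template <class T>
std::size_t hash_combine(std::size_t seed, const T& value) {
    return seed ^ (hash_value(value) + 0x9e3779b9u + (seed << 6) + (seed >> 2));
}

// (a): same convention for ranges, so the two calls compose directly.
template <class It>
std::size_t hash_range(std::size_t seed, It first, It last) {
    for (; first != last; ++first) seed = hash_combine(seed, *first);
    return seed;
}

inline void check() {
    const int xs[3] = {1, 2, 3};
    const std::size_t whole = hash_range(0, xs, xs + 3);
    // Splitting the range and threading the seed through gives the same hash.
    assert(hash_range(hash_range(0, xs, xs + 1), xs + 1, xs + 3) == whole);
    // An empty range returns the seed unchanged.
    assert(hash_range(5, xs, xs) == 5u);
}

}  // namespace sketch_a
```

Because every call returns its result the same way, an unseeded hash_range is just hash_range(0, first, last), which is the point made about passing 0 explicitly. Alternative (b) would look the same with the seed passed by reference and a void return.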

Dave Harris wrote:
In-Reply-To: <01c301c52989$d50a2f40$6801a8c0@pdimov>
pdimov@mmltd.net (Peter Dimov) wrote (abridged):
I think that it is better to fix the zero trap in hash_combine, because this does not depend on users remembering to initialize the seed to default_seed, instead of zero.
Did you see my suggestion of changing the primitive hash_value functions instead? That also does not depend on users remembering to initialise the seed.
size_t hash_value( int v ) { return size_t(v) + default_seed; }
And similarly for long, etc. Is there a reason to reject this?
I wonder what we gain from this. From the point of view of hash_combine the effect is the same, and we now rely on the user overloads of hash_value not to produce a zero.
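The concern about user overloads can be sketched as follows. The type Empty, the two combiner names, and the constant 0x9e3779b9 are hypothetical choices for illustration; the thread does not fix the formula used by hash_combine.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch_trap {

struct Empty {};  // hypothetical user type

// A user overload that returns zero, which nothing forbids.
inline std::size_t hash_value(const Empty&) { return 0; }

// Combiner without a constant: a zero seed and a zero hash stay at zero.
template <class T>
void combine_plain(std::size_t& seed, const T& v) {
    seed ^= hash_value(v) + (seed << 6) + (seed >> 2);
}

// Combiner with a constant: the zero trap is closed inside the combiner.
template <class T>
void combine_fixed(std::size_t& seed, const T& v) {
    seed ^= hash_value(v) + 0x9e3779b9u + (seed << 6) + (seed >> 2);
}

inline void check() {
    std::size_t plain = 0;
    combine_plain(plain, Empty());
    combine_plain(plain, Empty());
    assert(plain == 0);  // stuck at zero however many elements are combined

    std::size_t one = 0;
    combine_fixed(one, Empty());
    std::size_t two = one;
    combine_fixed(two, Empty());
    assert(one != 0 && one != two);  // one element and two elements now differ
}

}  // namespace sketch_trap
```

Changing only the primitive hash_value functions would leave combine_plain exposed to overloads such as the one for Empty, whereas the constant in combine_fixed does not depend on what the overloads return.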
2. Add a hash_range overload that takes a seed.
2a. void hash_range( size_t & seed, It first, It last );
2b. size_t hash_range( size_t seed, It first, It last );
2c. size_t hash_range( It first, It last, size_t seed );
which are listed in order of my preference.
So we're talking about a family of 3 functions:
2a:
void hash_combine( size_t &seed, T value );
void hash_range( size_t &seed, It first, It last );
size_t hash_range( It first, It last );
2b:
void hash_combine( size_t &seed, T value );
size_t hash_range( size_t seed, It first, It last );
size_t hash_range( It first, It last );
2c:
void hash_combine( size_t &seed, T value );
size_t hash_range( It first, It last, size_t seed=0 );
I think 2b is very error-prone, because of the lack of consistency between how hash_combine and the seeded version of hash_range return their results. 2a is less error-prone, but it seems needlessly confusing to have the two versions of hash_range return their results in different ways.
This reflects their intended use. The two argument overload is used when one has a whole range and wants its hash value, as in the hash_value overload for std::vector, for example. The three argument overload is used when one has an intermediate seed which should accumulate the next part of the input.
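The two uses described above might look like this under option 2a. This is only a sketch: the combiner body, its constant, and the Record type are assumptions introduced for the example, and only the signatures come from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch_2a {

inline std::size_t hash_value(int v) { return static_cast<std::size_t>(v); }

// Assumed combiner body; only the signature comes from the thread.
template <class T>
void hash_combine(std::size_t& seed, const T& v) {
    seed ^= hash_value(v) + 0x9e3779b9u + (seed << 6) + (seed >> 2);
}

// Three-argument form: fold a range into an existing seed.
template <class It>
void hash_range(std::size_t& seed, It first, It last) {
    for (; first != last; ++first) hash_combine(seed, *first);
}

// Two-argument form: hash a whole range from scratch.
template <class It>
std::size_t hash_range(It first, It last) {
    std::size_t seed = 0;
    hash_range(seed, first, last);
    return seed;
}

// Whole-range use, as in a hash_value overload for std::vector.
inline std::size_t hash_value(const std::vector<int>& v) {
    return hash_range(v.begin(), v.end());
}

// Intermediate-seed use: a hypothetical record with an id and a payload.
struct Record {
    int id;
    std::vector<int> payload;
};

inline std::size_t hash_value(const Record& r) {
    std::size_t seed = 0;
    hash_combine(seed, r.id);
    hash_range(seed, r.payload.begin(), r.payload.end());
    return seed;
}

inline void check() {
    const int xs[3] = {1, 2, 3};
    const std::vector<int> v(xs, xs + 3);
    assert(hash_value(v) == hash_range(xs, xs + 3));

    Record r;
    r.id = 7;
    r.payload = v;
    std::size_t expect = 0;
    hash_combine(expect, 7);
    hash_range(expect, xs, xs + 3);
    assert(hash_value(r) == expect);
}

}  // namespace sketch_2a
```

The vector overload never sees a seed, while the Record overload accumulates its id first and then lets the range continue from that intermediate seed.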
participants (2)
- brangdon@cix.compulink.co.uk
- Peter Dimov