
Andrey Semashev wrote:
2. Somewhat related, on the use case of extending the size of the hash value by repeatedly calling result(). Is this something that should be encouraged? I'm not a specialist, but my understanding is that extending the hash value this way does not improve its quality, as you get a series of bits in the value that are predetermined based on the lower bits. Isn't that effectively equivalent to just e.g. zero-extending the hash value?
No, it isn't. Hashes often have more internal state than they output as the final result, so extension can actually produce non-redundant bits. This is true, for example, for xxhash and siphash, and there's an example in the docs of extending xxhash.
Even if the effective number of bits doesn't change, it's still not the same as zero-extending, because a zero-extended hash is of very poor quality. (The constant zero bits, being constant and zero, are completely unaffected by the input.)
Somewhat related, if you ask get_integral_result for more bits than there are in the input value, it takes care to not just give you something zero- extended, for the same reason.
Ideally, each bit in the hash value should have about 50% probability of being 1. This is called "avalanche property".
So, for example, if you have a 32 bit hash function and you need 64 bits, this
is how you would do it, ordered from best to worst:
// as good as possible
uint64_t f1( Hash& hash )
{
uint32_t r1 = hash.result();
uint32_t r2 = hash.result();
return (uint64_t(r1) << 32) | r2;
}
// not that bad, but worse than the above
uint64_t f2( Hash& hash )
{
return get_integral_result