Re: [boost] [hash2] Formal Review Begins

12 Dec 2024

Samuel Neves wrote:
...
> A minor question I have regards the constants used in get_result_multiplier().
> The documentation states that this is used to produce a uniform output on the
> target range, but it's unclear from the documentation or source code the
> rationale or criteria for the method or constants used here. This seems to get
> into integer hash territory, but for example the 4 -> 1 case consists of x ->
> (x*0x7f7f7f7f) >> 24, for which easy differentials exist, e.g., (x,
> x^0xc0800000) collide with probability ~1/4.
It's pretty hard to come up with "good" multipliers here because it's not
possible to quantify the "good" part.

There are basically two cases: one where the 32-bit input is uniformly
distributed, and one where it isn't.

And when it isn't, that is, it comes from a "bad" hash, it's not really possible
to optimize the multiplier without a known input distribution, and we don't
have that.

So I came up with some ad hoc criteria here:

https://github.com/pdimov/hash2/blob/develop/test/get_integral_result_4.cpp

and tried to make the multipliers work for them.
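
For anyone who wants to reproduce the differential from the quoted question, here is a minimal sketch. It assumes the 4 -> 1 case is exactly x -> (x*0x7f7f7f7f) >> 24 as described above; it is not copied from the library source, and the names reduce_4_to_1 and collision_rate, the generator, and the seed are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// The 4 -> 1 reduction as described in the quoted question (an assumption,
// not the library's code): multiply modulo 2^32, keep the top byte.
inline std::uint8_t reduce_4_to_1( std::uint32_t x )
{
    return static_cast<std::uint8_t>( ( x * 0x7f7f7f7fu ) >> 24 );
}

// Fraction of pseudo-random inputs x for which x and x ^ diff map to the
// same byte. The generator and the fixed seed are arbitrary choices.
inline double collision_rate( std::uint32_t diff, unsigned samples )
{
    assert( samples > 0 );

    std::mt19937 rng( 12345 );
    unsigned hits = 0;

    for( unsigned i = 0; i < samples; ++i )
    {
        std::uint32_t x = static_cast<std::uint32_t>( rng() );

        if( reduce_4_to_1( x ) == reduce_4_to_1( x ^ diff ) )
        {
            ++hits;
        }
    }

    return static_cast<double>( hits ) / samples;
}
```

For instance, reduce_4_to_1( 2 ) and reduce_4_to_1( 0xc0800002 ) both give 0xfe, and collision_rate( 0xc0800000, 1 << 20 ) should come out close to the ~1/4 quoted above, whereas about 1/256 would be expected for two unrelated inputs.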

Peter Dimov