
Ivan Matek wrote:
Hi everyone,
The implementation uses a uint64_t into which it sums the hashes of the individual elements (each computed with a "fresh" hasher, so ordering does not matter), and then it finally hashes that sum and the range size into the originally passed hasher.
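
For concreteness, here is a compilable sketch of that scheme as I understand it. The hasher type, the update/result members and the hash_append free functions are stand-ins of mine, not the library's actual API, and the per-element hasher starting as a copy of the current one is my assumption:

    #include <cstdint>
    #include <cstddef>
    #include <string>
    #include <vector>
    #include <iostream>

    // Stand-in 64-bit hasher (FNV-1a). The update/result names are
    // placeholders, not the actual library interface.
    struct fnv1a_64
    {
        std::uint64_t state = 0xCBF29CE484222325ull;

        void update( void const* p, std::size_t n )
        {
            unsigned char const* q = static_cast<unsigned char const*>( p );
            for( std::size_t i = 0; i < n; ++i )
            {
                state ^= q[ i ];
                state *= 0x100000001B3ull;
            }
        }

        std::uint64_t result() const { return state; }
    };

    template<class Hasher> void hash_append( Hasher& h, std::uint64_t v )
    {
        h.update( &v, sizeof( v ) ); // endianness ignored in this sketch
    }

    template<class Hasher> void hash_append( Hasher& h, std::string const& s )
    {
        h.update( s.data(), s.size() );
    }

    // The scheme as described: one hasher per element, a commutative
    // 64-bit sum of the per-element results, then the sum and the size
    // go into the original hasher.
    template<class Hasher, class It>
    void hash_append_unordered_range( Hasher& h, It first, It last )
    {
        std::uint64_t w = 0;
        std::uint64_t n = 0;

        for( ; first != last; ++first, ++n )
        {
            Hasher h2( h ); // assumption: a copy of the current state, so
                            // preceding message content seeds it
            hash_append( h2, *first );
            w += h2.result(); // unsigned overflow is defined; addition commutes
        }

        hash_append( h, w );
        hash_append( h, n );
    }

    int main()
    {
        std::vector<std::string> a{ "x", "y", "z" };
        std::vector<std::string> b{ "z", "x", "y" }; // same elements, different order

        fnv1a_64 h1, h2;
        hash_append_unordered_range( h1, a.begin(), a.end() );
        hash_append_unordered_range( h2, b.begin(), b.end() );

        std::cout << std::hex << h1.result() << ' ' << h2.result() << '\n'; // equal
    }
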
I see no problem with this logic, since it cleverly exploits the fact that uint64_t overflow is defined and that the sum is commutative.
But I wonder whether this preserves the cryptographic quality of the original hash. For example, if we use SHA-2, it seems to me that finding collisions becomes easier, since the sum is "only" 64 bits (by the birthday bound, a collision in a 64-bit value is expected after roughly 2^32 attempts) and the length is "only" 64 bits (or 32 on some systems), and maybe even less if the attacker knows the unordered range was not very long, e.g. that the number of items was under 10000.
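
To put a rough number on it: whatever the quality of the per-element hash, an attacker who can evaluate the per-element hashes only needs two ranges whose 64-bit sums coincide, and a birthday search finds such a pair after roughly 2^32 candidate ranges. Here is a toy version of that search (my own illustration, with the accumulator truncated to 16 bits so the collision shows up immediately):

    #include <cstdint>
    #include <string>
    #include <unordered_map>
    #include <iostream>

    // Stand-in per-element hash (FNV-1a, 64 bits).
    std::uint64_t elem_hash( std::string const& s )
    {
        std::uint64_t h = 0xCBF29CE484222325ull;
        for( unsigned char c: s ) { h ^= c; h *= 0x100000001B3ull; }
        return h;
    }

    int main()
    {
        // Map from truncated "sum" to the (one-element) range that produced
        // it. 16 bits stand in for the real 64-bit accumulator; the expected
        // number of trials before a collision scales as 2^(width/2).
        std::unordered_map<std::uint16_t, std::string> seen;

        for( int i = 0; ; ++i )
        {
            std::string elem = "item-" + std::to_string( i );
            std::uint16_t sum = static_cast<std::uint16_t>( elem_hash( elem ) );

            auto r = seen.emplace( sum, elem );
            if( !r.second )
            {
                std::cout << "{\"" << r.first->second << "\"} and {\"" << elem
                          << "\"} end up with the same truncated sum after "
                          << i + 1 << " trials\n";
                break;
            }
        }
    }
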
Ideally I'd use a 128- or 256-bit integer there, but using __uint128_t portably is still a pain, and hashing unordered containers is a very niche use case. You'd probably never encounter it in practice... well, except for boost::json::object, I suppose.

Note that if you have things in the message preceding the unordered range, this effectively serves as a seed for the per-element hash function, which makes engineering collisions harder. Still, 64 bits are 64 bits.

128 would have been a good compromise between speed and security, but std::uint128_t is not here yet: https://github.com/cplusplus/papers/issues/1793 (Please don't leave comments on the above issue; it's not for discussion, but for tracking paper progress.)

Matt Borland is considering proposing boost::uint128_t to Core, but that would mean depending on Core.
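
For what it's worth, a portable 128-bit accumulator does not strictly need __uint128_t; two uint64_t halves with a manual carry are enough. A minimal sketch of that option, not something the library currently does:

    #include <cstdint>
    #include <iostream>

    // 128-bit accumulator made of two 64-bit halves; addition stays
    // commutative, so element order still does not matter.
    struct acc128
    {
        std::uint64_t lo = 0;
        std::uint64_t hi = 0;

        void add( std::uint64_t v )
        {
            lo += v;          // wraps modulo 2^64
            hi += ( lo < v ); // carry iff the wrapped result is smaller than v
        }
    };

    int main()
    {
        acc128 a;
        a.add( 0xFFFFFFFFFFFFFFFFull );
        a.add( 2 ); // wraps, carry propagates into hi
        std::cout << a.hi << ' ' << a.lo << '\n'; // prints: 1 1
    }

Both halves would then be fed into the original hasher at the end, the same way the single 64-bit sum and the size are now.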