
Antony Polukhin wrote:
CONDITIONAL ACCEPT. The behavior of the default constructor must be secure.
Hi Antony, thanks for the review. I of course agree that HashDOS is a problem. The library documentation points this out in several places, that's why SipHash is included and almost no other non-crypto hashes are, and its use is recommended. However, I don't agree that the default constructor of the hash algorithm _concept_ is the correct place to attempt to make things secure (or in practice, less insecure than before.) First, this would place the responsibility on hash algorithm authors, which doesn't seem correct, or reliable. Second, hash algorithms aren't only used in hash tables, and injecting nondeterminism in their default constructors seems user-hostile. The hash algorithm adaptor, once it's supplied by the library, would be a better place to do that, if we decide to do it. Third, using a single process-wide seed is not good practice (as you yourself observe.) It makes things more secure compared to unseeded use, but not really secure. The correct approach is to use a random seed (preferably of size 192 bits or more) that varies per connection, or per container. One might in principle make the case that a default constructor shouldn't be included, but I think that this, combined with the other requirements that are already causing pushback, is way too much innovation for 2025.
`tag_invoke` seems to be an overkill.
Consider the second example in https://pdimov.github.io/hash2/doc/html/hash2.html#hashing_objects_user_defi... class Str { private: static constexpr std::size_t N = 32; std::uint8_t size_ = 0; char data_[ N ] = {}; // ... }; and let N be something bigger, say 256. Using your proposed hash_visit approach, there's no way to include the range [data_, data_ + size_) efficiently in the message; you'd have to use a per-character loop. This isn't good for performance. To fix this, you'd need to extend the visitor interface so that it allows the equivalent of hash_append_range, and then you'll arrive on something isomorphic to the current approach, e.g. visitor( hash_append_range_tag{}, data_, data_ + size_ ); In addition, this has the usual drawback of reserving the name "hash_visit" in all namespaces, which is what the tag_invoke pattern is meant to solve. ...
With some math knowledge the attack becomes even more simple see https://orlp.net/blog/breaking-hash-functions/
All non-crypto hash functions except SipHash should be assumed broken.