[hash2] Review of Hash2

13 Dec 2024

      ...
- What is your evaluation of the design?
The design is good, but there is a major issue with security/safety in
default constructors of the hash algorithms.

## HashDOS issue

The issue is related to HashDOS attacks. The default seed must not be 0.
Because with 0 the attacker could easily find
strings of the same size that result in the same trailing bits of the hash
result. As a result on those strings all
the operations with unordered containers degrade to O(N) for each string,
giving O(N*N) complexity for a set of strings.

The issue affects any parsing where the result is stored into an unordered
container: HTTP headers, HTTP cookies, JSON keys, XML...

Our in-house experiments show that a single laptop can easily put down a
few hundreds of servers just by sending the
same precomputed HTTP request. The servers start to use 100% CPU and do not
respond to other requests.

The problem is so critical, that for some time we write HasDOS tests to
make sure that our data structures are not
affected:
https://github.com/userver-framework/userver/blob/develop/universal/src/http...

With some math knowledge the attack becomes even more simple see
https://orlp.net/blog/breaking-hash-functions/

Quick fix for the HashDOS is to generate a random number at the program
start use it to initialize all the hashes on construction.

Is still leaves a theoretical possibility of attack if there's only one
instance of the application or the attacker
generates strings set for each instance and sends them in one go. We were
not patient enough to generate the exploit,
which does not actually mean that it is not possible.

A more mature fix is to generate a random seed for each request or
container. We were using that approach for quite some
time
https://github.com/userver-framework/userver/blob/d09b427b42fc80dcd66b8161db...
.

We also checked some other popular frameworks for the HasDOS issue. Too
many of them were affected (we reported to the authors,
but not all of them reacted to the problem).

All of the above leads to the following conclusion: developers are not
aware of that critical issue or do not take the issue seriously.
To not provoke vulnerabilities the hashing must be safe by default.

## tag_invoke is dreadful

tag_invoke is over complicated. It may make sense if there is a need to
specialize the behavior not just for the user
defined type, but also to specialize behavior of the user defined type
depending on the hashing algorithm and/or flavor.
Such functionality does not seem to be required here.

So instead of
```
    template<class Hash, class Flavor>
    friend void tag_invoke( boost::hash2::hash_append_tag const&,
        Hash& h, Flavor const& f, X const& v )
    {
        boost::hash2::hash_append(h, f, v.a);
        boost::hash2::hash_append(h, f, v.b);
    }
```

how about using a simpler approach:
```
    template<class Visitor>
    friend constexpr void hash_visit(Visitor visitor, X const& v)
    {
        visitor(v.a);
        visitor(v.b);
    }
```

where Visitor could be
```
[&h, f](const auto& value) {
    boost::hash2::hash_append(h, f, value);
}
```

## std::hash and hash_value

I'd appreciate functionality to reuse boost::hash2 functionality with
std::hash and hash_value. It would be ideal to
have something like:

```
template <class T>
  requires boost::hash2::supports_v<T>
struct std::hash {
    std::size_t operator()(const T& x) const {
        boost::hash2::default_hasher h{boost::hash2::on_program_start_seed};
        boost::hash2::hash_append(h, {}, x);
        return boost::hash2::get_integral_result<std::size_t>( h.result() );
    }
}

template <class T>
  requires boost::hash2::supports_v<T>
std::size_t hash_value(const T& x) { // put into Boost.Hash
    return std::hash<T>{}(x);
}
```

So that the user just needs to provide boost::hash2 customization for its
type, and after that it works with the
standard library hashing and old boost hash.
...
- What is your evaluation of the implementation?
Very good.
...
- What is your evaluation of the documentation?
Good.
...
- What is your evaluation of the potential usefulness of the library? Do
you already use it in industry?
Not used it. The library is very useful because it researches the constexpr
hashing, proposes hash combining and
attempts to solve the other std::hash issues.
...
- Did you try to use the library? With which compiler(s)? Did you have
any problems?
Not used it.
...
- How much effort did you put into your evaluation?
A quick reading of sources and docs.
...
- Are you knowledgeable about the problem domain?
Yes, but not an expert.
...
Ensure to explicitly include with your review: ACCEPT, REJECT, or
CONDITIONAL ACCEPT (with acceptance conditions).
CONDITIONAL ACCEPT. The behavior of the default constructor must be secure.
`tag_invoke` seems to be an overkill.

Antony Polukhin

Matt Borland

Peter Dimov

Vinnie Falco

Antony Polukhin

Peter Dimov

tags

participants (4)