[hash2] Difference from paper: HashAlgorithm interface vs. function object

Hi,

The HashAlgorithm interface defined in Boost.Hash2 is notably different from the one described in N3980 (https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3980.html). Specifically, the paper defines HashAlgorithm as a function object, whose operator() is equivalent to Boost.Hash2's HashAlgorithm::update method. The function object also provides a conversion operator to result_type, which is equivalent to Boost.Hash2's HashAlgorithm::result method.

Is this discrepancy intentional, and are there plans (on the paper authors' or the library authors' side) to eliminate it?

Personally, I don't think a function object is the right interface for a hash algorithm. Update and finalize are two distinct operations provided by hash algorithms, and both should be performed explicitly by the user. I don't think a conversion operator is the right name for finalize: conversion operators are normally used when one type can meaningfully "impersonate" another, and that is not the case here. The function object also misuses result_type, as its operator() doesn't return it.

The paper discusses that the fact that a hash algorithm is a function object can be used to wrap it in std::function, which can be useful with the pimpl idiom and for type erasure. I don't think this will actually work, at least not as simply as the paper describes, because std::function won't allow extracting the computed hash digest (i.e. you won't be able to call the conversion operator on the wrapped function object). I think users will have to write some wrapper classes anyway to retain access to the wrapped hash algorithm.

However, this is still an interesting use case. Does Boost.Hash2 intend to provide utilities to address it? For example, Boost.Hash2 could provide a hash-algorithm-agnostic interface for feeding an algorithm with input data, and a wrapper for an external algorithm object that would implement this interface. Something along these lines:

class polymorphic_hash_algorithm
{
public:
    polymorphic_hash_algorithm( polymorphic_hash_algorithm const& ) = delete;
    polymorphic_hash_algorithm& operator=( polymorphic_hash_algorithm const& ) = delete;

    virtual void update( const void* data, size_t size ) = 0;

protected:
    polymorphic_hash_algorithm() = default;
    ~polymorphic_hash_algorithm() = default;
};

template< typename Hash >
class polymorphic_hash_algorithm_wrapper :
    public polymorphic_hash_algorithm
{
public:
    using hash_type = Hash;
    using result_type = typename hash_type::result_type;

    static constexpr size_t block_size = hash_type::block_size;

private:
    Hash& h;

public:
    explicit polymorphic_hash_algorithm_wrapper( hash_type& h ) noexcept :
        h( h )
    {
    }

    void update( const void* data, size_t size ) override
    {
        h.update( data, size );
    }

    result_type result()
    {
        return h.result();
    }
};

This way, polymorphic_hash_algorithm could be used to abstract away the hash algorithm type while remaining compatible with hash_append and friends.

As a side note, this would be another use case where the hash algorithm passed to hash_append is non-copyable. And while copyability could be added, most likely it wouldn't be free.
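To make the difference between the two designs concrete, here is a minimal sketch of both interface styles. The FNV-1a-like mixing is a toy stand-in, not any real library code:

#include <cstddef>
#include <cstdint>

// N3980 style: a function object. operator() consumes bytes; an explicit
// conversion to result_type finalizes and returns the digest.
class n3980_style_hasher
{
    std::uint64_t state_ = 14695981039346656037ull;

public:
    using result_type = std::uint64_t;

    void operator()( void const* data, std::size_t size ) noexcept
    {
        auto p = static_cast<unsigned char const*>( data );
        for( std::size_t i = 0; i < size; ++i )
            state_ = ( state_ ^ p[ i ] ) * 1099511628211ull;
    }

    explicit operator result_type() noexcept // plays the role of finalize
    {
        return state_;
    }
};

// Boost.Hash2 style: update and result are distinct, named operations.
class hash2_style_hasher
{
    std::uint64_t state_ = 14695981039346656037ull;

public:
    using result_type = std::uint64_t;

    void update( void const* data, std::size_t size ) noexcept
    {
        auto p = static_cast<unsigned char const*>( data );
        for( std::size_t i = 0; i < size; ++i )
            state_ = ( state_ ^ p[ i ] ) * 1099511628211ull;
    }

    result_type result() noexcept
    {
        return state_;
    }
};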

Andrey Semashev wrote:
Hi,
The HashAlgorithm interface defined in Boost.Hash2 is notably different from the one described in N3980 (https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3980.html). Specifically, the paper defines HashAlgorithm as a function object, whose operator() is equivalent to Boost.Hash2's HashAlgorithm::update method. The function object also provides a conversion operator to result_type, which is equivalent to Boost.Hash2's HashAlgorithm::result method.
Is this discrepancy intentional and are there plans (on the paper authors' or the library authors' side) to eliminate it?
It is, yes.
The paper discusses that the fact that a hash algorithm is a function object can be used to wrap it in std::function, which can be useful with the pimpl idiom and for type erasure. I don't think this will actually work, at least not as simply as the paper describes, because std::function won't allow extracting the computed hash digest (i.e. you won't be able to call the conversion operator on the wrapped function object). I think users will have to write some wrapper classes anyway to retain access to the wrapped hash algorithm.
That was more or less my line of thought too.
However, this is still an interesting use case. Does Boost.Hash2 intend to provide utilities to address it?
As I already said, I had a type-erased wrapper on the to-do list, but decided against providing one, for two reasons:

1. A type-erased wrapper can't provide a constructor, not even a default one.
2. The copy constructor of the type-erased wrapper would need to allocate and be potentially throwing. While this is not prohibited by the requirements, it's unlikely to be desirable.

You've already stated that you don't think copy construction should be required. But copy construction is required by the current implementation of hash_append_unordered_range.

What if we just drop support for unordered ranges?

While I appreciate the ingenuity of this Stalin-inspired design approach, I don't think that the library should only require from the hash algorithms what the library uses today. hash_append_unordered_range should be seen as representative of what users may need to do with hash algorithms, and it clearly demonstrates that sometimes, taking a copy of the hash algorithm is needed.

What about solving both (1) and (2) by making construction and copy construction optional requirements (along with result extension)?

That, too, is not what I would consider particularly user-friendly. Note that we now have five properties (four constructors and one result retrieval function) that can vary independently. This translates to 32 possible hash algorithm interfaces. No, what I prefer is for users to be able to rely on each of these five properties, and for hash algorithm authors to be required to implement them, instead of picking what to support and what not.
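To illustrate the two objections, a minimal sketch of the kind of type-erased wrapper in question. All names here are hypothetical, not library API:

#include <cstddef>
#include <memory>
#include <type_traits>

class any_hash
{
    struct base
    {
        virtual ~base() = default;
        virtual void update( void const* data, std::size_t n ) = 0;
        virtual std::unique_ptr<base> clone() const = 0;
    };

    template<class H> struct impl: base
    {
        H h_;
        explicit impl( H const& h ): h_( h ) {}

        void update( void const* data, std::size_t n ) override { h_.update( data, n ); }
        std::unique_ptr<base> clone() const override { return std::make_unique<impl>( h_ ); }
    };

    std::unique_ptr<base> p_;

public:
    // (1): the wrapper cannot offer a default (or seeded) constructor,
    // because it doesn't know which concrete algorithm to construct
    template<class H, class = std::enable_if_t<!std::is_same_v<std::decay_t<H>, any_hash>>>
    explicit any_hash( H const& h ): p_( std::make_unique<impl<H>>( h ) ) {}

    // (2): copying must allocate, and can therefore throw
    any_hash( any_hash const& r ): p_( r.p_->clone() ) {}

    void update( void const* data, std::size_t n ) { p_->update( data, n ); }
};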

On Wed, Dec 11, 2024 at 6:32 PM Peter Dimov via Boost
hash_append_unordered_range should be seen as representative of what users may need to do with hash algorithms, and it clearly demonstrates that sometimes, taking a copy of the hash algorithm is needed.
Can you please elaborate why this is necessary (not speaking generally, just in this particular case)? I originally thought we do Hash h2( h ); so that the seed of h matters, but then I noticed that at the end of the function we do hash2::hash_append( h, f, w ); hash2::hash_append_size( h, f, m ); anyway, so I presume the seed of h matters even if we default-constructed the hashers in the loop.

Ivan Matek wrote:
Can you please elaborate why this is necessary (not speaking generally, just in this particular case)? I originally thought we do Hash h2( h ); so that the seed of h matters, but then I noticed that at the end of the function we do hash2::hash_append( h, f, w ); hash2::hash_append_size( h, f, m ); anyway, so I presume the seed of h matters even if we default-constructed the hashers in the loop.
It matters for the final hash value, but it won't matter for creating collisions by crafting unordered maps. That is, any collision will be seed-independent because w will not depend on the seed. And seed-independent collisions are bad.
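To make the attack scenario concrete, a toy sketch. The linear "digest" below stands in for what a default-constructed per-element hasher would compute; it is deliberately weak so that a colliding pair is easy to exhibit:

#include <cstdint>

// Seed-independent per-element digest (what a default-constructed h2
// would give an attacker: a publicly computable constant per element).
std::uint64_t digest( int x )
{
    return 0x9E3779B97F4A7C15ull * static_cast<std::uint64_t>( x );
}

int main()
{
    // Two different element sets engineered to have equal sums:
    // digest(1) + digest(4) == digest(2) + digest(3), since this toy
    // digest is linear. Both sets then produce the same w, hence the
    // same final hash, no matter what seed the outer hasher h carries.
    std::uint64_t w1 = digest( 1 ) + digest( 4 );
    std::uint64_t w2 = digest( 2 ) + digest( 3 );
    return w1 == w2 ? 0 : 1; // returns 0: a seed-independent collision
}

A real hash would require actual search effort to find such sets, but the point is that the search can be done once, offline, and the result works for every seed.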

On Wednesday, 11 December 2024 at 19:31 +0200, Peter Dimov via Boost wrote:
You've already stated that you don't think copy construction should be required. But copy construction is required by the current implementation of hash_append_unordered_range.
The way you phrase it makes it sound like it could be implemented without requiring copy construction. Is that a poor understanding on my side? Or would it incur some other penalty (e.g. performance) if that requirement were dropped and hash_append_unordered_range were implemented without copy construction? In the latter case, I think it would be worth explaining in the documentation.

Regards,

Julien

Julien Blanc wrote:
On Wednesday, 11 December 2024 at 19:31 +0200, Peter Dimov via Boost wrote:
You've already stated that you don't think copy construction should be required. But copy construction is required by the current implementation of hash_append_unordered_range.
The way you phrase it makes it sound like it could be implemented without requiring copy construction. Is that a poor understanding on my side?
I feel like I already answered this question.

The alternative mechanism given in N3980, constructing a temporary std::set from the unordered set elements and then hashing that, is wildly impractical for obvious reasons, performance and otherwise.

Using default-constructed instances of the hash algorithm, instead of copies, makes the hashing seed-independent, which means that a way to engineer collisions will work regardless of the seed used.

I say "by the current implementation" because I obviously can't say anything about hypothetical future implementations unknown to me.
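For concreteness, a sketch of that N3980 alternative, with hypothetical names; the costs are visible directly in the code:

#include <set>

// N3980's suggestion: copy the unordered elements into an ordered
// container, then hash that. Sketch only, not library code.
template<class Hash, class Flavor, class UnorderedSet>
void hash_unordered_via_set( Hash& h, Flavor const& f, UnorderedSet const& s )
{
    // O(n log n), allocates, and requires operator< on the elements
    std::set<typename UnorderedSet::value_type> t( s.begin(), s.end() );

    for( auto const& x: t )
    {
        hash2::hash_append( h, f, x );
    }

    hash2::hash_append_size( h, f, t.size() );
}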

On 12 December 2024 at 14:22, "Peter Dimov" wrote:
Julien Blanc wrote:
On Wednesday, 11 December 2024 at 19:31 +0200, Peter Dimov via Boost wrote:
You've already stated that you don't think copy construction should be required. But copy construction is required by the current implementation of hash_append_unordered_range.
The way you phrase it makes it sound like it could be implemented without requiring copy construction. Is that a poor understanding on my side?
I feel like I already answered this question.
Thanks for clarifying this again, then.
The alternative mechanism given in N3980, constructing a temporary std::set from the unordered set elements, then hashing that, is wildly impractical for obvious reasons, performance and otherwise.
Yes, this one is pretty obvious.
Using default-constructed instances of the hash algorithm, instead of copies, makes the hashing seed-independent, which means that a way to engineer collisions will work regardless of the seed used.
This one is less obvious. That's a design motivation that should at least be documented. The rationale in N3980 for copying the hash algorithm was performance (by the way, I don't really buy the performance argument; it seems like a pretty unusual case), completely unrelated to unordered containers.

Regards,

Julien

On 12 December 2024, at 14:22, Peter Dimov via Boost wrote:
Using default-constructed instances of the hash algorithm, instead of copies, makes the hashing seed-independent, which means that a way to engineer collisions will work regardless of the seed used.
Can’t you construct h2 by seeding it with a result from h (possibly after updating h with the range size)?

Joaquin M Lopez Munoz

Joaquín M López Muñoz wrote:
On 12 December 2024, at 14:22, Peter Dimov via Boost wrote:
Using default-constructed instances of the hash algorithm, instead of copies, makes the hashing seed-independent, which means that a way to engineer collisions will work regardless of the seed used.
Can’t you construct h2 by seeding it with a result from h (possibly after updating h with the range size)?
That's (a) much slower and (b) requires the ability to obtain a result without perturbing the state, which is basically equivalent to making a copy.
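Expressed in code, the suggestion amounts to something like the sketch below. derive_element_hasher is a hypothetical helper, and an integral result_type is assumed, which digest-producing algorithms don't have:

#include <cstdint>

template<class Hash>
Hash derive_element_hasher( Hash const& h )
{
    // result() finalizes and perturbs the state, so it cannot be called
    // on h directly; a snapshot copy is needed first, which is the very
    // operation the suggestion was trying to avoid.
    Hash h2( h );
    return Hash( static_cast<std::uint64_t>( h2.result() ) );
}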

On 12/12/24 16:22, Peter Dimov via Boost wrote:
Julien Blanc wrote:
On Wednesday, 11 December 2024 at 19:31 +0200, Peter Dimov via Boost wrote:
You've already stated that you don't think copy construction should be required. But copy construction is required by the current implementation of hash_append_unordered_range.
The way you phrase it makes it sound like it could be implemented without requiring copy construction. Is that a poor understanding on my side?
I feel like I already answered this question.
The alternative mechanism given in N3980, constructing a temporary std::set from the unordered set elements, then hashing that, is wildly impractical for obvious reasons, performance and otherwise.
Using default-constructed instances of the hash algorithm, instead of copies, makes the hashing seed-independent, which means that a way to engineer collisions will work regardless of the seed used.
But the original hash algorithm h does depend on the seed, and so does its final result. Conceptually, hash_append_unordered_range has the effect of applying h to the input sequence ordered in a specific (though unspecified) way. That is, regardless of the seed used to initialize h, the input data of h does not depend on the seed. Again, conceptually, this looks equivalent to the case when each h2 that is used to hash each element of the sequence does not depend on the seed (i.e. produces a value that is solely dependent on the hashed element).

Andrey Semashev wrote:
On 12/12/24 16:22, Peter Dimov via Boost wrote:
Using default-constructed instances of the hash algorithm, instead of copies, makes the hashing seed-independent, which means that a way to engineer collisions will work regardless of the seed used.
But the original hash algorithm h does depend on the seed, and so does its final result.
void hash_append_unordered_range( Hash& h, Flavor const& f, It first, It last )
{
    std::uint64_t w = 0;

    // compute w from [first, last)

    hash2::hash_append( h, f, w );
    hash2::hash_append_size( h, f, m );
}

If two different ranges produce the same w, the final result will be the same, even though it does depend on the seed.
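One plausible shape of the elided loop, as a sketch only (the actual implementation differs in detail, e.g. in how a digest is folded into w; m is the element count, and an integral result_type is assumed):

std::uint64_t w = 0;
std::size_t m = 0;

for( It it = first; it != last; ++it, ++m )
{
    Hash h2( h ); // per-element hasher copies the seeded state
    hash2::hash_append( h2, f, *it );

    // addition is commutative, so w doesn't depend on iteration order
    w += static_cast<std::uint64_t>( h2.result() );
}

If h2 were default-constructed instead, every per-element digest, and therefore w itself, would be a seed-independent constant.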
participants (6)
- Andrey Semashev
- Ivan Matek
- Joaquín M López Muñoz
- Julien Blanc
- Peter Dimov
- Vinnie Falco