Re: [boost] [Hash2] Result extension

9 Dec 2024

      On 12/9/24 19:43, Peter Dimov via Boost wrote:
...
What's important here is that it's not possible to provide
an extended result of better quality from the outside; the
hash algorithm is in the best place to provide it because it
has access to more bits of internal state than it lets out.
This requirement effectively mandates that all _hash
algorithms_ be _extendable-output hash functions_:
https://en.wikipedia.org/wiki/Extendable-output_function
Only some hash functions are specified as extendable-output functions
(XOF). By "specified" I mean "defined as such in the hash algorithm's
specification". The link you posted says XOF is an extension and even
lists a few examples of functions that support it.

The fact that you can implement some hash functions such that the
implementation allows multiple finalization calls, or even interleaved
updates and finalization steps, does not make that hash function an XOF.
That is just a property of your particular implementation. A useful
property, but still beyond the specification. A different implementation
may rightfully not support this property and still be compliant with the
spec.

In my opinion, HashAlgorithm must support the latter implementation that
is compliant with the hash function specification and does not support
the digest extension. If you want to expose the XOF capability then
please create a separate concept, say HashAlgorithmXOF, and add a way to
detect whether a given algorithm supports result extension.
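
A minimal sketch of what such detection might look like, assuming the
extended output is exposed through a member function. The names
is_xof_hash_algorithm, finalize_xof, fixed_hash and toy_xof are
hypothetical and are not part of Hash2; the toy algorithms are not
cryptographic and only exist to exercise the trait.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical trait: true when H has a member callable as
//   h.finalize_xof(unsigned char* out, std::size_t n)
// The member name is an assumption made for this sketch only.
template<class H, class = void>
struct is_xof_hash_algorithm : std::false_type {};

template<class H>
struct is_xof_hash_algorithm<H, std::void_t<decltype(
    std::declval<H&>().finalize_xof(
        std::declval<unsigned char*>(),
        std::declval<std::size_t>()))>> : std::true_type {};

// Toy fixed-output algorithm (64-bit FNV-1a); it has no result extension.
struct fixed_hash
{
    using result_type = std::uint64_t;

    std::uint64_t state_ = 0xcbf29ce484222325ull; // FNV offset basis

    void update(void const* p, std::size_t n)
    {
        auto const* b = static_cast<unsigned char const*>(p);
        for (std::size_t i = 0; i < n; ++i)
        {
            state_ ^= b[i];
            state_ *= 0x100000001b3ull; // FNV prime
        }
    }

    result_type result() { return state_; }
};

// Toy algorithm that opts in to extended output by providing finalize_xof.
// The output stream is an LCG over the FNV state: illustrative only.
struct toy_xof
{
    fixed_hash inner_;

    void update(void const* p, std::size_t n) { inner_.update(p, n); }

    void finalize_xof(unsigned char* out, std::size_t n)
    {
        std::uint64_t s = inner_.result();
        for (std::size_t i = 0; i < n; ++i)
        {
            s = s * 6364136223846793005ull + 1442695040888963407ull;
            out[i] = static_cast<unsigned char>(s >> 56);
        }
    }
};

static_assert(!is_xof_hash_algorithm<fixed_hash>::value,
              "fixed_hash does not provide extended output");
static_assert(is_xof_hash_algorithm<toy_xof>::value,
              "toy_xof provides extended output");
```

Generic code could then branch on is_xof_hash_algorithm<H>::value (for
example with if constexpr) and fall back to some other strategy when the
algorithm does not advertise result extension.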

I'll add that XOF is supported by some implementations, but they are
also incompatible with the current HashAlgorithm concept. For example,
OpenSSL provides EVP_DigestFinalXOF, but it must also be called only
once. The difference from EVP_DigestFinal_ex is that EVP_DigestFinalXOF
accepts the size of the buffer to fill with the digest. If you are going
to define HashAlgorithmXOF, please take existing implementations of this
feature into account.
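
To make the shape of that contract concrete, here is a rough sketch of an
interface where finalization takes the requested output size and may be
performed only once. It does not call OpenSSL; one_shot_xof and its
members are hypothetical names, throwing on misuse is a choice made for
this sketch, and the mixing function is a non-cryptographic toy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical single-finalization XOF-style interface.
class one_shot_xof
{
    std::uint64_t state_ = 0xcbf29ce484222325ull; // FNV-1a offset basis
    bool finalized_ = false;

public:
    void update(unsigned char const* p, std::size_t n)
    {
        // No interleaving: updates are rejected after finalization.
        if (finalized_) throw std::logic_error("update after finalize");
        for (std::size_t i = 0; i < n; ++i)
        {
            state_ ^= p[i];
            state_ *= 0x100000001b3ull; // FNV-1a prime
        }
    }

    // The caller chooses the digest length; this may be called only once.
    std::vector<unsigned char> finalize(std::size_t n)
    {
        if (finalized_) throw std::logic_error("finalize called twice");
        finalized_ = true;

        std::vector<unsigned char> out(n);
        std::uint64_t s = state_;
        for (std::size_t i = 0; i < n; ++i)
        {
            // Toy output stream (LCG step), not a real XOF squeeze.
            s = s * 6364136223846793005ull + 1442695040888963407ull;
            out[i] = static_cast<unsigned char>(s >> 56);
        }
        return out;
    }
};
```

An algorithm with this interface cannot satisfy a requirement that
`update` and `result` be freely interleaved, which is the incompatibility
described above.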
...
Note that this is not the only innovation that the proposed
hash algorithm concept involves. All hash algorithms are
required to support seeding from uint64_t and from an
arbitrary sequence of bytes, which makes them effectively
_keyed hash functions_ (or _message authentication codes_).
Also note that the requirement that one can interleave calls
to `update` and `result` arbitrarily makes it possible to
implement byte sequence seeding (for algorithms that don't
already support it) in the following manner:
Hash::Hash( unsigned char const* p, size_t n ): Hash()
{
    if( n != 0 )
    {
        update( p, n );
        result();
    }
}
Subsequent `update` calls now start from an initial internal
state that has incorporated the contents of [p, p+n), and that
has been "finalized" (scrambled thoroughly) such that the
result is not equivalent to just prepending the seed to the
message (as would have happened if the result() call had been
omitted.)
The exact behavior of the hash algorithm's constructor is an
implementation detail. It doesn't need to be specified in terms of the
public update and result methods. And certainly, the fact that one hash
algorithm supports this sort of operation ordering doesn't mean that all
of them should.

Andrey Semashev