Re: [boost] [Hash2] Result extension

9 Dec 2024

      ...
Vinnie Falco asked me the following on Slack:

...
...
I would ask, what is the motivating use-case for calling result
twice? This is not explained in the docs and no examples are
given. In fact, the one example given says "not to do this"

...

...
Calling result() twice (or more times) provides result extension:
the ability to extract a variable number of bits from a hash
algorithm, instead of a fixed-size value (e.g. 64 bits).

...
This is in fact stated in the docs here:

...
https://pdimov.github.io/hash2/doc/html/hash2.html#hashing_bytes_result

...
...
Note that result is non-const, because it changes the internal
state. It’s allowed for result to be called more than once;
subsequent calls perform the state finalization again and as a
result produce a pseudorandom sequence of result_type values.
This can be used to effectively extend the output of the hash
function. For example, a 256 bit result can be obtained from a
hash algorithm whose result_type is 64 bit, by calling result four
times.

...

...
and there is an example of doing that here:

...
https://pdimov.github.io/hash2/doc/html/hash2.html#example_result_extension

...
All hash algorithms are required to support result extension,
because (in my opinion) this is extremely useful functionality
that is easy - even trivial - to provide, but is often withheld
either by accident or in some cases, even deliberately.

...
Hash algorithms typically have a "finalization" phase that
pads the message, mixes the length, scrambles the internal
state in a more thorough manner than in `update`, and then
derives a hash value from that state. (The hash value is often
shorter than the total amount of state.)

...
If this "finalization" phase is performed more than once, one
naturally gets the mandated `result()` behavior.
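
To make the mechanism concrete, here is a minimal sketch using a made-up toy algorithm. `toy_hash` and `extended_result` are hypothetical names and are not part of Hash2; the constants are borrowed from FNV-1a and the splitmix64 finalizer purely for illustration, and nothing here is meant to be a good hash. The point is only the shape: `update` does a cheap per-byte step, `result()` mixes in the length and scrambles the state in place, so calling it again yields the next value in a sequence.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy algorithm for illustration only (hypothetical, not a Hash2 algorithm).
class toy_hash
{
    // 128 bits of internal state, arbitrary initial values
    std::uint64_t state_[2] = { 0x9E3779B97F4A7C15ull, 0xC2B2AE3D27D4EB4Full };
    std::uint64_t length_ = 0;

    // splitmix64 finalizer, used here as the "thorough" scrambling step
    static std::uint64_t mix( std::uint64_t v )
    {
        v ^= v >> 30; v *= 0xBF58476D1CE4E5B9ull;
        v ^= v >> 27; v *= 0x94D049BB133111EBull;
        v ^= v >> 31;
        return v;
    }

public:
    using result_type = std::uint64_t;

    void update( void const* p, std::size_t n )
    {
        auto const* b = static_cast<unsigned char const*>( p );

        for( std::size_t i = 0; i < n; ++i )
        {
            // cheap per-byte step (FNV-1a style)
            state_[0] = ( state_[0] ^ b[i] ) * 0x100000001B3ull;
            state_[1] += state_[0];
        }

        length_ += n;
    }

    // Non-const: every call runs the finalization again on the
    // already-finalized state, so repeated calls give a sequence.
    result_type result()
    {
        state_[0] = mix( state_[0] + length_ );   // mix in the length
        state_[1] = mix( state_[1] ^ state_[0] );
        return state_[0] ^ state_[1];             // narrower than the state
    }
};

// Collect N consecutive results, e.g. N = 4 gives 256 bits from a
// 64-bit result_type. Works with any type providing result().
template<std::size_t N, class Hash>
std::array<typename Hash::result_type, N> extended_result( Hash& h )
{
    std::array<typename Hash::result_type, N> r{};
    for( auto& v : r ) v = h.result();
    return r;
}
```

With this, `toy_hash h; h.update( "abc", 3 ); auto r = extended_result<4>( h );` fills `r` with four 64-bit words. The first word is the same value a single `result()` call would have returned, so extending the output does not change the ordinary digest.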

...
Falco continues:

...
...
I pointed out in the post I already made that the quality of
digest from calling result twice is dependent on the hash
algorithm, and there is no way the library can provide
assurances on the quality

...

...
That's of course correct, but it also applies to the quality of
calling `result()` only once; it's naturally dependent on the
implementation of the hash algorithm.

...
What's important here is that it's not possible to provide
an extended result of better quality from the outside; the
hash algorithm is in the best place to provide it because it
has access to more bits of internal state than it lets out.
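
A contrived sketch of why the inside is the better place. Everything here is hypothetical (`wide_state_hash`, `extend_from_outside`, and the way the state is set directly instead of through `update`); it is a toy chosen so the effect is easy to see, not a model of any real algorithm. The hash keeps 128 bits of state but emits 64. An extension computed from outside can only be a function of the emitted 64 bits, so two inputs that collide on the first digest stay collided forever. An extension computed inside can draw on the hidden half of the state.

```cpp
#include <cassert>
#include <cstdint>

// splitmix64 finalizer; a bijection on 64-bit values
inline std::uint64_t mix( std::uint64_t v )
{
    v ^= v >> 30; v *= 0xBF58476D1CE4E5B9ull;
    v ^= v >> 27; v *= 0x94D049BB133111EBull;
    v ^= v >> 31;
    return v;
}

// Toy algorithm (hypothetical): 128 bits of state, 64 bits of output.
// The state is public so that the example can set it directly.
struct wide_state_hash
{
    using result_type = std::uint64_t;

    std::uint64_t s0, s1;

    result_type result()
    {
        s0 = mix( s0 + s1 );  // the output depends only on s0 + s1 ...
        s1 = mix( s1 );       // ... but s1 itself stays hidden
        return s0;
    }
};

// What a caller could do without result extension: derive more
// bits from the digest alone (hypothetical helper).
inline std::uint64_t extend_from_outside( std::uint64_t digest )
{
    return mix( digest ^ 0x9E3779B97F4A7C15ull );
}
```

Take two states with the same sum, `wide_state_hash a{ 10, 20 }` and `wide_state_hash b{ 11, 19 }`. Their first `result()` values are equal by construction, so `extend_from_outside` necessarily gives equal values for both. Their second `result()` values differ, because the hidden `s1` words differ and `mix` is a bijection.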

...
This requirement effectively mandates that all hash
algorithms be extendable-output hash functions:

...
https://en.wikipedia.org/wiki/Extendable-output_function

For those unfamiliar with extendable-output functions (XOFs), FIPS 202 [1]
and the reference implementation [2] provide good detail, since the
Wikipedia article seems a bit short.

Matt

[1] https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.202.pdf
[2] https://github.com/XKCP/XKCP/blob/master/usage-example.md

Matt Borland