
On Mon, Dec 9, 2024 at 8:43 AM Peter Dimov via Boost wrote:
What's important here is that it's not possible to provide an extended result of better quality from the outside; the hash algorithm is in the best place to provide it because it has access to more bits of internal state than it lets out.
This requirement effectively mandates that all _hash algorithms_ be _extendable-output hash functions_:
It seems to me that Hash2 is primarily aimed at users who wish to opt in to a superior framework for use in unordered-style containers. For this use case, hash algorithms only need the digest finalized once. All of the work in the paper is focused on this use case. What you have done, in your own words, is innovative. From the documentation:
If we’re using one of these algorithms to produce file or content checksums, do not tolerate collisions, and operate on a large number of files or items (many millions), it might be better to use a 128 bit digest instead.
This is obviously out of scope for unordered containers. I think it would be better to change HashAlgorithm to allow only one call to finalize() (renamed from "result") and no calls to update() after finalize(), using the language I have already posted or something similar.

There is nothing inherently wrong with innovating, so I would propose that you simply add a new concept:

ExtendableOutputHashAlgorithm

and impose your additional requirements in this new concept. Most users won't care about extendable output and can just stick with the elements of the library needed for working with unordered containers. Those users will find it easy to adapt external implementations of hash algorithms, since they do not also need the expertise to ensure that these external implementations meet the more stringent requirements of ExtendableOutputHashAlgorithm.

There is an added benefit to relaxing the requirements for HashAlgorithm: users can call into existing algorithms without modifying them, which is often a requirement for security certifications.

Thanks