
On Tue, Oct 23, 2018 at 10:19 AM degski <degski@gmail.com> wrote:
On Tue, 23 Oct 2018 at 08:45, Miguel Ojeda via Boost-users <boost-users@lists.boost.org> wrote:
On Mon, Oct 22, 2018 at 7:57 PM Shailja Prasad via Boost-users <boost-users@lists.boost.org> wrote:
I was trying to upgrade from Boost 1.53.0 to Boost 1.68.0, but it looks like the hash code generation has changed, since the following line gives two different hash codes for the same string input.
Hm... why would you expect the hash to be always the same between releases, compilers, etc.?
Well, uhm, because that seems to be quite handy. All NIST implementations do exactly this.
No, sorry, that is a completely different use case. Crypto hashes are used, among other things, in network communications, persistent storage, etc. They need to be "fixed" functions, and their standards provide the exact definition. That is not the case at all with std::hash or Boost.Hash.
With a quick look at Boost.Hash's docs, I cannot find anything regarding a guarantee of that. If it is like std::hash, then the value is only guaranteed to remain equal for the duration of the program.
Sort of: "Hash functions are only required to produce the same result for the same input within a single execution of a program". The standard states a minimum requirement [with an intended [narrow] use case in mind, std::unordered_map's].
Not sure what you mean. That is what I said.
In other words, you cannot rely on saving a hash value, nor on comparing it to hashes produced by other vendors, platforms, architectures, compiler releases, etc.
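If hash values do have to be stored or compared across releases, one option is to use a function whose output is fixed by its specification instead of by the library. Below is a minimal sketch using 64-bit FNV-1a, a publicly specified non-cryptographic hash. The name stable_hash is hypothetical and is not part of Boost or the standard library, and FNV-1a is only one possible choice.

```cpp
#include <cstdint>
#include <string>

// 64-bit FNV-1a. The result depends only on the input bytes and on the two
// constants below, not on the compiler, the platform or the Boost release.
// "stable_hash" is a hypothetical name chosen for this sketch.
inline std::uint64_t stable_hash(const std::string& s) noexcept {
    std::uint64_t h = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
    for (unsigned char c : s) {
        h ^= c;                   // fold in the next byte
        h *= 1099511628211ULL;    // multiply by the FNV 64-bit prime (mod 2^64)
    }
    return h;
}
```

With this sketch, stable_hash("a") is 0xaf63dc4c8601ec8c on any conforming implementation, so the value can be written to disk or sent over the network. It is not a cryptographic hash and gives no protection against deliberately constructed collisions.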
In my view this is an omission; the option to have exactly that should [have been] available.
Not really. You could argue, for instance, that precisely because std::hash (and Boost.Hash) is meant to be used in maps/hash tables/..., you should not be able to guess the values of the hash in advance, in order to prevent collision attacks. In other words, the implementation has even the freedom to provide a different hash function every run of your program.

Not only that, but stating that the hash should remain constant across C++/Boost releases is basically stating the hash function should be fixed forever. That removes all the freedom for improvements when future hash functions are discovered or implemented, with better properties (which is what happened in the commits I linked).

In summary: the hashes provided by Boost or the standard are not intended to be fixed functions; i.e. you shouldn't rely on the actual values returned, only on the properties of the function. Namely, this one: "For two different values t1 and t2, the probability that h(t1) and h(t2) compare equal should be very small, approaching 1.0 / numeric_limits<size_t>::max()."

Cheers,
Miguel
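To illustrate the point about relying only on the properties of the function, here is a rough sketch of a hasher whose values change on every run of the program, yet which still works as the hash of a std::unordered_map. PerRunHash and Table are hypothetical names, and the mixing step follows the classic hash_combine formula purely for illustration. Since equal std::hash values still map to equal results, this is not by itself a defense against collision attacks; a keyed hash would be needed for that.

```cpp
#include <cstddef>
#include <functional>
#include <random>
#include <string>
#include <unordered_map>

// Hypothetical hasher: consistent within one execution, different across runs.
struct PerRunHash {
    // The seed is drawn once per process, on first use.
    static std::size_t seed() {
        static const std::size_t s = std::random_device{}();
        return s;
    }

    std::size_t operator()(const std::string& key) const {
        std::size_t h = std::hash<std::string>{}(key);
        std::size_t s = seed();
        // Mix the per-run seed with the library hash (hash_combine-style).
        s ^= h + 0x9e3779b9 + (s << 6) + (s >> 2);
        return s;
    }
};

// The container only needs equal keys to hash equally during this run.
using Table = std::unordered_map<std::string, int, PerRunHash>;
```

A Table behaves like any other std::unordered_map for insertion and lookup. What breaks is code that stores the hash values or expects them to match between runs, which is the kind of dependence discussed in this thread.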