
Dave Harris wrote:
In-Reply-To: <d0kr8r$gl6$1@sea.gmane.org> nesotto@cs.auc.dk (Thorsten Ottosen) wrote (abridged):
It's a pleasure to announce that the review of Daniel James' hash functions library starts today (8th of March) and ends the 12th of March.
I have some questions about the design.
First, why does hash_combine have a void result, and have side-effects on its arguments? Wouldn't a pure, side-effect free function be better?
This is done to emphasize the fact that the hash value of the second argument is accumulated into the first (the seed). A pure function would entice you to define hash_value for (x, y) as hash_combine(x, y). The current design forces you to think in terms of an initial seed that is modified by each value.
Second, why doesn't hash_range depend on the length of the sequence? For example, the hash_range of { 12 } and { 0, 0, 0, 12 } are the same. I'd rather they gave different results.
Thirdly, the hash values love producing 0. For example, hash_value( (char *) NULL ) and hash_value( false ) are both 0. hash_combine on a range of 0s will produce another 0. Am I the only person who sees this as a problem?
We can fix this by initializing the seed in hash_range with something other than zero. It isn't a problem for strings, where \0 is rare, but may be a problem for arbitrary sequences.