
Christopher D. Russell wrote:
The requirement, that the input keys should be strictly positive isn't a big limitation, because it is always possible to accordingly transform key sets containing possibly negative numbers, ...
? By addition of a constant positive offset?
Sure.
or to derive the keys from strings to match.
? You lost me here. Would you explain this point in a little more detail please.
You're certainly right, that there is no generic and simple way to map strings to integers besides reinterpreting the bit pattern of the string as a (probably long) integer. What I've meant is, that sometimes you may use some simple mapping, for instance adding all the character values of the string given that these give no collisions for your input key set.
The memory requirement for the MPH generator is raising linearily with the number of keys: roughly N/3 * sizeof(int), where N is the number of keys.
? Is it possible to generalize to keys outside of unsigned int's range w/linear impact on generator performance? Specifically, I'm looking for an efficient hash function for an associative hash table keyed by 16-byte UUID's where I know the set of UUID keys a-priori, hash lookup time is constant (primary consideration), hash generation time is linear (secondary consideration).
This problem is very similar to the string mapping problem you've mentioned above. The alogorithm in principle could be applied to integers of arbitrary precision (given you have a plugin implementation for such a thing, which could be used in place of the builtin data type), but I'm not able to make any timing predictions for this. The only thing which seems to be obvious is that the general dependency from the key set size is still valid. Regards Hartmut