
On Jan 17, 2005, at 8:34 AM, Hartmut Kaiser wrote:
recently I've written a minimal perfect hash function generator, which generates very efficient hashing functions for arbitrary (reasonably large) key sets of strictly positive integers. It achieves a minimal and perfect mapping of these keys to hash values in the range [0..N), where N is the size of the key set: every key gets a unique hash value (perfection requirement) and the values exactly fill the range (minimality requirement). [...]
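To make the two requirements concrete, here is a small sketch of one standard way to build such a function, in the "hash and displace" style (a hypothetical illustration only, not Hartmut's generator; the helper names `build_mph` and `mph_lookup` are made up for this example):

```python
# Toy minimal-perfect-hash construction, "hash and displace" style.
# Keys are strictly positive integers; the result maps the N keys
# bijectively onto [0..N). This is an illustrative sketch only.

def build_mph(keys):
    """Return a per-bucket displacement table for the key set."""
    n = len(keys)
    # First-level hash partitions the keys into n buckets.
    buckets = [[] for _ in range(n)]
    for k in keys:
        buckets[hash((k, 0)) % n].append(k)
    # Handle the fullest buckets first: for each bucket, search for a
    # displacement d so that hash((k, d)) % n sends every key in the
    # bucket to a distinct, still-unused slot.
    disp = [0] * n
    used = [False] * n
    for b in sorted(range(n), key=lambda i: -len(buckets[i])):
        bucket = buckets[b]
        if not bucket:
            continue
        d = 1
        while True:
            slots = [hash((k, d)) % n for k in bucket]
            if len(set(slots)) == len(bucket) and not any(used[s] for s in slots):
                break
            d += 1
        disp[b] = d
        for s in slots:
            used[s] = True
    return disp

def mph_lookup(disp, key):
    """Evaluate the hash: two hash calls and one table read."""
    n = len(disp)
    return hash((key, disp[hash((key, 0)) % n])) % n
```

Checking `sorted(mph_lookup(disp, k) for k in keys) == list(range(len(keys)))` verifies both requirements at once: uniqueness (perfection) and that the values fill [0..N) exactly (minimality). Real generators use far more careful hash families and search strategies than this greedy loop.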
The initial overhead is tolerable, though. Here are some timings on my machine (Pentium M, 1.6 GHz, 1 GByte RAM):
20 keys: 0.005 [s], 200 keys: 0.03 [s], 2000 keys: 0.15 [s], 20000 keys: 4.45 [s]
where the time grows slightly worse than linearly in the number of keys (actually it has a linear and an exponential component, but the exponential component kicks in only for larger key sets). The memory requirement of the MPH generator grows linearly with the number of keys: roughly N/3 * sizeof(int), where N is the number of keys.
I'm very interested in whether this scales well to a larger number of keys, on the order of 10^6 - 10^8. I guess that at those sizes the scaling will already be exponential. Can you tell us what the maximum number of keys is that can be treated efficiently?

Matthias