
Dean Michael Berris wrote:
Hi Milosz,
On Mon, Jun 15, 2009 at 2:49 PM, Milosz Marian HULBOJ<mhulboj@hulboj.org> wrote:
- It would be nice to have the option of using `double-hashing' schema, where all the hashes are generated according to: hash(i) = h1(x) + i*h2(x) or the squared/cubed variant: hash(i) = h1(x) + i*h2(x) + i*i (*i)
This is interesting. I'd think this would have to be implemented by the user of the bloom_filter in the HashFunctions template parameter.
- one hash value can be potentially used to generate many hash values. For example, if you need 16-bit hashes, then from the hash function producing 32-bit hash-value you could get 2 values. This can be in turn combined with the previous point - double hashing.
Hi Dean, The bottom line of the two points above is that in order to hash for the bloom filter (in the double-hashing schema) one only needs to: - invoke hash function once - split the bits and do few multiplications and additions Still I didn't have time to look into yours implementation to see whether the above can be used seamlessly. There are also variants of double hashing to be more cache-line friendly, but personally the idea presented above was sufficient for my needs (for the hashing functions I've been using the TR1 hash and murmur hash). Regards, Milosz