
Dean Michael Berris wrote:
Hi Everyone,
During the weekend, I got acquainted with and excited about Bloom Filters and the utility of such a data structure. I then proceeded to look at generic implementations of a bloom filter and I thought about implementing one myself instead as a simple exercise.
Attached is what I came up with which I'm submitting as a library for review (and uploading to the vault). Also attached is a sample program which uses the bloom_filter.
Hello, Few comments: - It would be nice to have the option of using `double-hashing' schema, where all the hashes are generated according to: hash(i) = h1(x) + i*h2(x) or the squared/cubed variant: hash(i) = h1(x) + i*h2(x) + i*i (*i) - one hash value can be potentially used to generate many hash values. For example, if you need 16-bit hashes, then from the hash function producing 32-bit hash-value you could get 2 values. This can be in turn combined with the previous point - double hashing. - From my point of view bloom filters are used in performance-critical applications, and each tiny optimisation counts. - intersection/union/difference operations Regards, Milosz