
vicente.botet wrote:
I need for my application bloom filters. Do you know of a good boostified generic implementation? Is there any interest in such a generic bloom filters library in Boost?
Hi Vicente, I have a simple generic Bloom filter here: http://svn.chezphil.org/libpbe/trunk/include/bloom_filter.hh This is GPL licensed but I'd be happy to change that to the Boost license if someone wanted to make a Boost library using it. After writing this code I eventually decided to use a different method, so I'm not currently using this code anywhere. The code is very straightforward. I spent much longer learning the theory of Bloom filters than I did implementing it. Unfortunately I've now forgotten much of the theory and all that is left is the code... What's not included is hash functions; you have to supply these as a template parameter. I used Boost.CRC and the hash function from std:: unordered containers. I seem to have lost the example code that did this. One general problem with this approach is to find enough independent hash functions. Say I'm storing strings and I need four hash functions; the temptation is to have one function and to compute e.g. f(x), f("123"+x), f("456"+x), f("789"+x). These will give different answers and may look like they're independent functions, but it's not really certain how independent they are. Cheers, Phil.