
About the implementation: If I understand your implementation, you do something like that: given a value V, you compute a number X (say the digest of V), then you compute K different values applying the transformation (X + offset) mod(M) I don't know if this produces k hash functions suitable for bloom filters, but, why don't you use Boost.hash to compute X? Also, it would be easier to define the requirements on the Input type. About the documentation: I think that the following paragraph in the documentation must be reformulated <snip> To find out whether we've encountered a certain URL before, we simply pass the URL to the same three hash functions. If we see that all the positions returned by the three hash functions are all 1, then we might have seen the URL before -- this is called a false positive. </snip> indeed, you cannot say that the URL is a false positive, since it could be equal to an URL stored previously. IMHO, you should describe the mathematical principles in more detail and, moreover, you should discuss in more detail the false positives and how to control their rate. Even if it isn't mandatory, it would be nice if you write your documentation in QuickBook format, since it will uniform both the documentation toolchain and the documentation look. Also, don't forgive to synchronize the documentation with the current interface. Manuel Fiorelli