
Phil Endecott wrote:
Sorry Arash but I don't understand your reply; what exactly is correct?
My point was that you are correct, the "and" does not provide anything of value. Only "or" and "xor" are practical for non-counting bloom filters.
I believe I have convinced myself that a bitwise AND on bloom filters gives something that is a superset of the intersection of the contents, i.e. the false positive rate is increased. A proof would be good. Regarding xor, it superficially looks like it might give the symetric set difference i.e. those elements in A or B but not both, but when you consider overlapping hashes I think it results in false negatives.
You're correct there as well, I should have qualified the difference type, as it is indeed a symmetric difference. That said, I'm not sure what you mean by overlapping hashes, the only rule for such operations is that the BFs be constructed similarly - that is: 1. same types of hash functions 2. same number of hash functions 3. same filter sizes 4. same hash value quantization value Otherwise such set operations simply wont work and become meaningless.