
Phil Endecott-48 wrote:
There is still no "adaptor" functionality, so I can't mmap() a file to use as the bloom filter's raw content, as I might want to do for e.g. a spellcheck, URL blocklist, etc. etc.
Could you describe this is more detail? I am familiar with the mmap() interface, but how would one go about providing an adaptor so that mmap() could be used as the Bloom filter's raw content? Would this be accomplished through a constructor, for example: // dynamic_basic_bloom_filter(void *addr, const size_t len); dynamic_basic_bloom_filter<std::string> bloom(address, length); Would you happen to know if there is any work being done on a Boost.Posix or any similar C++ project? Phil Endecott-48 wrote:
data() returns a std::bitset, but that doesn't provide access to its data in a form that I can write to a file (e.g. in preparing the data for the above examples). I consider this a fault of std::bitset. I believe you should use a std::vector or array instead.
data() returns the underlying type in each case. For the basic Bloom filter, this is a bitset (std:: or dynamic), and for counting Bloom filters, this is either a boost::array or an std::vector. I see the problem with std::bitset now. In order to serialize the bitset, it would take O(num_bits) operations, rather than the number of blocks (using operator[]). Using an std::vector, the serialization can be accomplished in O(num_elements). It also helps that boost.Serialization provides an implementation for std::vector. Thank you for the insight. I'll work on converting the underlying storage type next week. Phil Endecott-48 wrote:
I see that you still have intersection without discussing the error rate implications.
I will add a section in the tutorial that discusses the implications of union/intersect operations. Phil Endecott-48 wrote:
Your dynamic_bitset has a resize() method. Was that there before? I don't understand how it can work in any useful way.
Yes, this was there initially. The dynamic_basic_bloom_filter is the only dynamic Bloom filter that has the resize() method. I intend to remove this method, as users requiring a resize can instantiate a new dynamic Bloom filter to achieve the same effect. It was a design relic that should've been discarded much sooner, as it is more dangerous to users than useful. Phil Endecott-48 wrote:
The final documentation should mention the complexity of each method.
I've made sure to include the complexity of all methods. This is located at: http://svn.boost.org/svn/boost/sandbox/bloom_filter/trunk/libs/bloom_filter/... In the class references, do you think it would be convenient to make each function name clickable and make them link to the specific entry in the function reference? Thank you, -Alej -- View this message in context: http://boost.2283326.n4.nabble.com/Feedback-Towards-a-Better-Boost-BloomFilt... Sent from the Boost - Dev mailing list archive at Nabble.com.