
Alejandro Cabrera wrote:
Phil Endecott-48 wrote:
There is still no "adaptor" functionality, so I can't mmap() a file to use as the bloom filter's raw content, as I might want to do for e.g. a spellcheck, URL blocklist, etc. etc.
Could you describe this in more detail? I am familiar with the mmap() interface, but how would one go about providing an adaptor so that an mmap()'d region could be used as the Bloom filter's raw content? Would this be accomplished through a constructor, for example:
// dynamic_basic_bloom_filter(void *addr, const size_t len);
dynamic_basic_bloom_filter<std::string> bloom(address, length);
It is normally preferable to pass a begin-end pair rather than address and length, but fundamentally yes I would like to be able to construct a read-only bloom filter from a pair of const_iterators i.e. const pointers in this case.
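A minimal sketch of the begin-end construction described above, for illustration only. Every name here (bloom_filter_view, bloom_insert, bit_index, fnv1a, kNumHashes) is hypothetical and not part of the library under review. FNV-1a is swapped in as the hash only because it gives the same result on every run, which matters once the bits live in a file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical names throughout; none of this is the proposed library's API.
constexpr std::size_t kNumHashes = 3;

// FNV-1a (64-bit), seeded so that each hash function index gives a
// different bit position. Deterministic across runs and platforms.
inline std::uint64_t fnv1a(const std::string& s, std::uint64_t seed) {
    std::uint64_t h = 14695981039346656037ULL ^ seed;
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h;
}

inline std::size_t bit_index(const std::string& key, std::size_t i,
                             std::size_t num_bits) {
    return static_cast<std::size_t>(fnv1a(key, i) % num_bits);
}

// Writable side, used only to prepare the raw bytes.
inline void bloom_insert(std::vector<unsigned char>& bytes,
                         const std::string& key) {
    assert(!bytes.empty());
    const std::size_t num_bits = bytes.size() * 8;
    for (std::size_t i = 0; i < kNumHashes; ++i) {
        const std::size_t b = bit_index(key, i, num_bits);
        bytes[b / 8] |= static_cast<unsigned char>(1u << (b % 8));
    }
}

// Read-only side: a non-owning view over [begin, end). The caller must keep
// the range valid (for example, keep the file mapped) while the view is used.
class bloom_filter_view {
public:
    bloom_filter_view(const unsigned char* begin, const unsigned char* end)
        : begin_(begin),
          num_bits_(static_cast<std::size_t>(end - begin) * 8) {
        assert(begin < end);
    }

    // False means definitely absent; true means possibly present.
    bool probably_contains(const std::string& key) const {
        for (std::size_t i = 0; i < kNumHashes; ++i) {
            const std::size_t b = bit_index(key, i, num_bits_);
            if ((begin_[b / 8] & (1u << (b % 8))) == 0) return false;
        }
        return true;
    }

private:
    const unsigned char* begin_;
    std::size_t num_bits_;
};
```

With mmap(), the pair of pointers would presumably be the address returned for a PROT_READ mapping, cast to const unsigned char*, and that address plus the mapped length. The view never writes through them, so a read-only mapping is enough.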
Would you happen to know if there is any work being done on a Boost.Posix or any similar C++ project?
Not relevant.
Phil Endecott-48 wrote:
data() returns a std::bitset, but that doesn't provide access to its data in a form that I can write to a file (e.g. in preparing the data for the above examples). I consider this a fault of std::bitset. I believe you should use a std::vector or array instead.
data() returns the underlying type in each case. For the basic Bloom filter, this is a bitset (std:: or dynamic), and for counting Bloom filters, this is either a boost::array or an std::vector.
I see the problem with std::bitset now. Serializing the bitset bit by bit (using operator[]) would take O(num_bits) operations, rather than one operation per underlying block. With a std::vector, the serialization can be accomplished in O(num_elements). It also helps that Boost.Serialization provides an implementation for std::vector. Thank you for the insight. I'll work on converting the underlying storage type next week.
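To make the cost concrete, here is a rough sketch of what getting bytes out of a std::bitset involves. The helper names pack_bits and unpack_bits are hypothetical. Because std::bitset exposes no pointer to its blocks, both loops run once per bit.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Pack a std::bitset into bytes, least significant bit first.
// The loop runs N times: there is no way to copy whole blocks.
template <std::size_t N>
std::vector<unsigned char> pack_bits(const std::bitset<N>& bits) {
    std::vector<unsigned char> out((N + 7) / 8, 0);
    for (std::size_t i = 0; i < N; ++i) {
        if (bits[i]) out[i / 8] |= static_cast<unsigned char>(1u << (i % 8));
    }
    return out;
}

// Inverse of pack_bits; also one step per bit.
template <std::size_t N>
std::bitset<N> unpack_bits(const std::vector<unsigned char>& bytes) {
    assert(bytes.size() == (N + 7) / 8);
    std::bitset<N> bits;
    for (std::size_t i = 0; i < N; ++i) {
        if (bytes[i / 8] & (1u << (i % 8))) bits.set(i);
    }
    return bits;
}
```

A container with contiguous storage avoids both loops, since its blocks can be handed to the output routine directly.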
I don't care about serialisation. I just want to be able to

const T* p = &(*(bloom_filter.data().begin()));
size_t len = sizeof(T) * bloom_filter.data().size();
write(fd,p,len);

Regards, Phil.
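The three lines above could be fleshed out roughly as follows, assuming data() returned a std::vector of trivially copyable blocks (which is the change under discussion, not the current interface). write_all and dump_blocks are hypothetical helper names. The loop is there because POSIX write() may write fewer bytes than requested.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <type_traits>
#include <vector>
#include <fcntl.h>
#include <unistd.h>

// Write the whole buffer, retrying on short writes and EINTR.
// Returns false on any other error.
inline bool write_all(int fd, const void* data, std::size_t len) {
    const unsigned char* p = static_cast<const unsigned char*>(data);
    while (len > 0) {
        const ssize_t n = ::write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) continue;
            return false;
        }
        p += n;
        len -= static_cast<std::size_t>(n);
    }
    return true;
}

// Dump the contiguous storage of a vector of blocks to a file.
template <typename T>
bool dump_blocks(const char* path, const std::vector<T>& blocks) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "blocks are written as raw bytes");
    assert(path != nullptr);
    const int fd = ::open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return false;
    const bool ok = write_all(fd, blocks.data(), sizeof(T) * blocks.size());
    return (::close(fd) == 0) && ok;
}
```

A file written this way depends on sizeof(T) and on the byte order of the machine that wrote it, so it is only suitable for being read back (or mapped) on a compatible platform.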