
Alejandro Cabrera wrote:
Phil,
Phil Endecott-48 wrote:
It is normally preferable to pass a begin-end pair rather than an address and length, but fundamentally yes, I would like to be able to construct a read-only bloom filter from a pair of const_iterators, i.e. const pointers in this case.
The input iterator constructor is already available.
No, I don't think so. What you have is a ctor that iterates over a sequence of values and inserts them:

    template <typename InputIterator>
    basic_bloom_filter(const InputIterator start, const InputIterator end)
    {
        for (InputIterator i = start; i != end; ++i)
            this->insert(*i);
    }

What I want is a way to adapt a chunk of raw memory, which I've got from mmap(), and to treat it as a bloom filter. I haven't yet seen any "adaptor" functionality in your code.
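To illustrate what I mean by an adaptor, here is a rough sketch; the names and interface are made up for illustration, not a proposal for your exact API. The point is a non-owning view whose constructor is O(1):

    #include <climits>  // CHAR_BIT
    #include <cstddef>  // std::size_t

    // Sketch only: a non-owning "view" over a block of bits that already
    // exists somewhere else (e.g. a region returned by mmap()).
    // Construction just records a pointer and a size -- O(1), no copying.
    template <typename Value, typename Block, typename Hash>
    class bloom_filter_view
    {
    public:
        bloom_filter_view(const Block* bits, std::size_t num_blocks)
            : bits_(bits), num_blocks_(num_blocks) {}

        // Simplified to a single hash function for brevity; a real filter
        // would test each of its k hash positions in the same way.
        bool probably_contains(const Value& v) const
        {
            const std::size_t bits_per_block = sizeof(Block) * CHAR_BIT;
            const std::size_t bit = Hash()(v) % (num_blocks_ * bits_per_block);
            return (bits_[bit / bits_per_block] >> (bit % bits_per_block)) & 1;
        }

    private:
        const Block* bits_;
        std::size_t  num_blocks_;
    };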
All that remains is what you describe below.
Phil Endecott-48 wrote:
I don't care about serialisation. I just want to be able to
    const T* p = &(*(bloom_filter.data().begin()));
    size_t len = sizeof(T) * bloom_filter.data().size();
    write(fd, p, len);
Regards, Phil.
All this should take is changing the underlying storage type to something that supports the operations begin() and size(). The intent is to use std::vector, as with the dynamic counting Bloom filter implementations. The operations you've described above are serialization. This will be supported as soon as I can get back to working on the basic Bloom filters. I'll add a test to my suite to make sure this is supported in both directions (write/read, store/load).
I really don't want to "load" the filter, i.e. to do something that needs O(N) time and O(N) memory. I just want to mmap the file into memory and "declare" the bloom filter into existence in O(1) time. Then, when I eventually use it, pages will be faulted in on demand. The aim is to ensure that I can have a perhaps very large filter and still perform a small number of queries quickly. Regards, Phil.
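P.S. To make the intended usage concrete, here is a rough sketch. It assumes the hypothetical bloom_filter_view adaptor from my earlier message and a filter file produced by the write() call above; error handling is omitted:

    #include <sys/mman.h>                 // mmap, munmap
    #include <sys/stat.h>                 // fstat
    #include <fcntl.h>                    // open
    #include <unistd.h>                   // close
    #include <boost/cstdint.hpp>          // boost::uint64_t
    #include <boost/functional/hash.hpp>  // boost::hash

    int main()
    {
        typedef boost::uint64_t block_t;

        // Map the previously written filter file; nothing is read yet.
        int fd = open("filter.bin", O_RDONLY);
        struct stat st;
        fstat(fd, &st);
        void* mem = mmap(0, st.st_size, PROT_READ, MAP_SHARED, fd, 0);

        // "Declare" the filter into existence in O(1); pages are faulted
        // in on demand as queries touch them.
        // bloom_filter_view is the hypothetical adaptor sketched earlier.
        bloom_filter_view<int, block_t, boost::hash<int> >
            view(static_cast<const block_t*>(mem),
                 st.st_size / sizeof(block_t));

        bool maybe_present = view.probably_contains(42);

        munmap(mem, st.st_size);
        close(fd);
        return maybe_present ? 0 : 1;
    }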