
Tue, 13 May 2025 at 12:48, Arnaud Becheler via Boost <boost@lists.boost.org>:
Dear Boost community,
The review of the Boost.Bloom library begins today May 13th, 2025, and will run through May 22nd, 2025.
This is my review of the proposed Boost.Bloom library. First of all, I want to disclose that I work for the C++ Alliance. Also, I don't have any particular knowledge of probability theory or container implementations. Between reading the documentation and tinkering with the tests, examples, and benchmarks, I'd say I've spent around 6 hours.

The library is very much to the point: it implements a particular data structure and provides ways to customise it for your needs. The documentation also explains why one would need to customise it. Customisation is very easy to do, which is good.

The API is small, but I looked at other libraries that implement Bloom filters and couldn't find anything that is missing here. After thinking about it for several days I could only conceive of one possible extra operation: looking up several items at once could potentially be faster than looking them up one by one.

The container provided by this library can be used to optimise search. This has a broad range of applications (the docs link to a paper that lists several), but on the other hand the benefit is limited by how slow the search being optimised is compared to the Bloom filter lookup. This is why it is crucial for Bloom filters to be as fast as possible. The provided benchmarks compare Bloom filters with boost::unordered_flat_set (AFAIK the fastest dictionary container in the world), and there are indeed configurations in which bloom::filter wins even in that competition. Joaquin also assures us that competing libraries are at least not faster than his. So, I think that the library has utility.

My only criticism of the documentation is that I don't believe it mentions why Bloom filters are called that (they are named after their inventor, Burton H. Bloom). When I started doing my review I thought that there was some metaphorical connection between the container and flowers blooming. I am in general not a fan of one-page documentation, but this one is short enough to work well in this format.

I recommend we ACCEPT the library into Boost.
I want to thank Joaquin for submitting the library and Arnaud for managing the review process.
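
P.S. For readers who have not seen a Bloom filter used to optimise search, here is a minimal sketch of the idea. It deliberately does not use the Boost.Bloom API: toy_bloom_filter and guarded_store are hypothetical names made up for this illustration, the double-hashing scheme is just one simple choice, and std::unordered_set stands in for whatever slow lookup (disk, network, database) is being guarded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical toy filter for illustration only; NOT the Boost.Bloom API.
class toy_bloom_filter {
public:
    toy_bloom_filter(std::size_t bits, std::size_t hashes)
        : bits_(bits, false), hashes_(hashes) {
        assert(bits > 0);  // index() takes a remainder modulo the bit count
    }

    void insert(const std::string& s) {
        for (std::size_t i = 0; i < hashes_; ++i) bits_[index(s, i)] = true;
    }

    // false: definitely never inserted; true: possibly inserted.
    bool may_contain(const std::string& s) const {
        for (std::size_t i = 0; i < hashes_; ++i)
            if (!bits_[index(s, i)]) return false;
        return true;
    }

private:
    // Double hashing: position_i = h1 + i * h2 (mod number of bits).
    std::size_t index(const std::string& s, std::size_t i) const {
        const std::uint64_t h1 = std::hash<std::string>{}(s);
        // Derive a second, odd hash value from the first one.
        const std::uint64_t h2 = ((h1 * 0x9E3779B97F4A7C15ull) >> 17) | 1u;
        return static_cast<std::size_t>((h1 + i * h2) % bits_.size());
    }

    std::vector<bool> bits_;
    std::size_t hashes_;
};

// Hypothetical wrapper: consult the filter first, and only fall through
// to the (supposedly expensive) store when the filter says "maybe".
class guarded_store {
public:
    guarded_store(std::size_t bits, std::size_t hashes)
        : filter_(bits, hashes) {}

    void add(const std::string& key) {
        filter_.insert(key);
        store_.insert(key);
    }

    bool contains(const std::string& key) {
        if (!filter_.may_contain(key)) return false;  // slow lookup skipped
        ++slow_lookups_;
        return store_.count(key) != 0;
    }

    // Number of times the underlying store was actually queried.
    std::size_t slow_lookups() const { return slow_lookups_; }

private:
    toy_bloom_filter filter_;
    std::unordered_set<std::string> store_;
    std::size_t slow_lookups_ = 0;
};
```

A query for a key that was never added is usually answered by the filter alone, so the store is not touched. A key that was added always passes the filter and costs one extra filter check on top of the real lookup, which is why the filter only pays off when it is much cheaper than the search it guards.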