Re: [boost] [Bloom] Some questions

19 May 2025

      On 18/05/2025 at 23:38, Ivan Matek wrote:
...
Had a bit more time to think :) so here are my replies and a few more
questions.
    > 5. Why is BOOST_ASSERT(fpr>=0.0&&fpr<=1.0); not
    > BOOST_ASSERT(fpr>0.0&&fpr<=1.0);
    > i.e. is there a benefit to allowing calls with an impossible
    > fpr argument?
    fpr==0.0 is a legitimate (if uninteresting) argument value for
    capacity_for:
    capacity_for(0, 0.0) --> 24
    capacity_for(1, 0.0) --> 18446744073709549592
    The formal reason why fpr==0.0 is supported is symmetry:
    some calls to fpr_for actually return 0.0 (for instance,
    fpr_for(0, 100)).
This is a bit philosophical, but I actually do not feel this is correct.
First of all, is (0, 0.0) the only use case where an fpr of 0.0 makes sense?
I.e., any time n>0, an fpr of 0.0 is impossible (or I misunderstood
something).
Yes, it is impossible: the capacity would have to be infinite. The maximum
attainable value is returned instead, though this is of little value as
OOM would ensue (as you point out below).
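
To make the "capacity would have to be infinite" point concrete, here is a
minimal sketch based on the textbook estimate for a filter that sets k bits
per element, fpr ~ (1 - exp(-k*n/m))^k, solved for the number of bits m. This
is an assumption for illustration only: required_bits is a hypothetical
helper, and Boost.Bloom's capacity_for may well compute its result
differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>

// Textbook estimate (NOT Boost.Bloom's actual computation): a filter of m
// bits holding n elements, each setting k bits, has
//   fpr ~ (1 - exp(-k*n/m))^k
// Solving for m gives m = -k*n / log(1 - fpr^(1/k)).
inline double required_bits(std::size_t n, double fpr, int k)
{
  assert(k > 0 && fpr >= 0.0 && fpr <= 1.0);
  if (n == 0) return 0.0;  // an empty filter never yields a false positive
  if (fpr == 0.0)          // n > 0: no finite m reaches fpr == 0
    return std::numeric_limits<double>::infinity();
  // log1p(-x) stays accurate when x = fpr^(1/k) is tiny.
  return -static_cast<double>(k) * static_cast<double>(n) /
         std::log1p(-std::pow(fpr, 1.0 / k));
}
```

Under this model the required size grows without bound as fpr approaches 0
for any n > 0, so a size_t result can only saturate.
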
...
So the assert could use implies (funny, because we had a discussion about
implies on the ML a few months ago), something like:
BOOST_IMPLICATION(fpr == 0.0, n == 0);
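
For readers unfamiliar with the idiom, an implication check like the one
above can be sketched as follows. The names implies, valid_capacity_args and
check_capacity_args are hypothetical, chosen for illustration; this is not
code from Boost.Bloom, and BOOST_IMPLICATION itself is only the spelling
proposed here.

```cpp
#include <cassert>
#include <cstddef>

// Logical implication: "a implies b" fails only when a holds and b does not.
constexpr bool implies(bool a, bool b) { return !a || b; }

// Hypothetical precondition for a capacity_for-like function:
// fpr must lie in [0, 1], and fpr == 0.0 is accepted only when n == 0.
constexpr bool valid_capacity_args(std::size_t n, double fpr)
{
  return fpr >= 0.0 && fpr <= 1.0 && implies(fpr == 0.0, n == 0);
}

static_assert(valid_capacity_args(0, 0.0), "(0, 0.0) is accepted");
static_assert(!valid_capacity_args(1, 0.0), "(1, 0.0) is rejected");
static_assert(valid_capacity_args(1, 0.01), "(1, 0.01) is accepted");

// Runtime variant, in the spirit of an assertion at the top of the function.
inline void check_capacity_args(std::size_t n, double fpr)
{
  assert(valid_capacity_args(n, fpr));
  (void)n; (void)fpr;  // avoid unused-parameter warnings under NDEBUG
}
```
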
Similarly, for (1, 0.0) I do not believe the result should be the size_t max
value, as this is not a correct value. Now we both know you will OOM
before noticing this in reality,
but even if we imagine a magical computer that can allocate that much
memory, the fpr is still not 0.0.
I understand your point and can relate to it, but consider this:

capacity_for(1, 1.E-200)

Is this legit? OOM will happen here, too. Where do we put the
limit?
...
If this library were not C++11 I would suggest std::optional as the return
type, but as it is, boost::optional seems like the best option.
Now I know people might think I am making the API ugly for users, but I
really do not think a bit more typing is such a big problem when
the alternative is people messing up (probably not when using constants
in code, more likely when they compute something dynamically). Bloom
filters are important, but they are not like JSON or protobuf, which
are everywhere in a codebase; users needing to use a slightly uglier
API in 10 LOC of a 1M LOC codebase does not seem like a big deal to me.
So in examples above:
capacity_for(0, 0.0) --> min_possible_capacity
capacity_for(1, 0.0) --> nullopt
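
A rough sketch of what such an optional-returning interface could look like
is given below. Everything in it is an assumption for illustration:
checked_capacity_for is a hypothetical name, min_possible_capacity is a
placeholder value, and the size estimate is the textbook fixed-k model rather
than whatever Boost.Bloom computes. It uses std::optional (C++17) only to
stay free of dependencies; in a C++11 library boost::optional would play the
same role.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <optional>

// Placeholder: the real minimum depends on the filter configuration.
constexpr std::size_t min_possible_capacity = 0;

// Hypothetical wrapper: returns an empty optional when no representable
// capacity attains the requested fpr under the textbook model
//   fpr ~ (1 - exp(-k*n/m))^k.
inline std::optional<std::size_t>
checked_capacity_for(std::size_t n, double fpr, int k = 8)
{
  assert(k > 0 && fpr >= 0.0 && fpr <= 1.0);
  if (n == 0) return min_possible_capacity;  // nothing inserted: fpr is 0
  if (fpr == 0.0) return std::nullopt;       // unattainable for n > 0
  const double bits = -static_cast<double>(k) * static_cast<double>(n) /
                      std::log1p(-std::pow(fpr, 1.0 / k));
  // Reject results that do not fit in size_t (the cast below would overflow).
  const double limit =
      static_cast<double>(std::numeric_limits<std::size_t>::max());
  if (!(bits < limit)) return std::nullopt;
  return std::max(min_possible_capacity,
                  static_cast<std::size_t>(std::ceil(bits)));
}
```

With this shape, a caller that computes n or fpr dynamically has to test the
result before using it, which is the extra typing being argued for.
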
A few more questions I thought of recently; apologies in advance if I
misunderstood something.
[...]
I will address these comments tomorrow (out of time today),
but I felt like replying to the first part of your post now.

Joaquin M Lopez Munoz
