
On Thu, May 21, 2009 at 11:47 AM, John Maddock <john@johnmaddock.co.uk> wrote:
This is a feature request for the next version of the Math/Statistical Distributions lib.
Currently, due to lack of input type information, discrete distributions can only be "emulated" by using the discrete_quantile policy. However, even then the effective quantile type is still a real type.
In my opinion, this has at least two disadvantages:
I believe your disadvantages are more imagined than real.
1. Operations are slow since the underlying quantile type is still real, whereas operations on genuinely integral types are generally faster.
Unfortunately, there is no way the quantile of a discrete distribution can be calculated internally using all-integer arithmetic (at least I can't think of a case, other than maybe the trivial Bernoulli distribution). Normally the result of the quantile is calculated as a real number and then rounded according to the policy in effect. In a few cases the result is calculated directly as an integer by summing CDF values (hypergeometric, for example), but the internal calculations still have to be done using reals.
There's also no overhead from returning a real type (it's usually returned in a register, just like an integer type would be). There might be a tiny overhead if the user then casts to an integer, but if we internalised that cast by returning an integer type, then everyone would pay that cost no matter what the use case :-(
BTW, there are a few genuine use cases for returning a real-valued result from the quantile of a discrete distribution.
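To illustrate the point about internal calculations, here is a minimal sketch of a binomial quantile found by accumulating probabilities until the requested level is reached. The function name binomial_quantile_sketch is hypothetical and this is not how Boost.Math implements it; it only shows that an integer result can still require real-valued intermediates throughout.

```cpp
#include <cmath>

// Hypothetical helper, not part of Boost.Math.
// Returns the smallest k with P(X <= k) >= p for X ~ Binomial(n, q).
// Assumes 0 < q < 1 and 0 <= p <= 1.
// The return type is an integer, but the pmf terms and the running CDF
// are floating-point values: there is no all-integer way to form them.
int binomial_quantile_sketch(int n, double q, double p)
{
    double pmf = std::pow(1.0 - q, n);   // P(X = 0)
    double cdf = pmf;                    // running sum P(X <= k)
    int k = 0;
    while (cdf < p && k < n) {
        // Recurrence: P(X = k+1) = P(X = k) * (n - k)/(k + 1) * q/(1 - q)
        pmf *= static_cast<double>(n - k) / (k + 1) * q / (1.0 - q);
        cdf += pmf;
        ++k;
    }
    return k;
}
```

For example, binomial_quantile_sketch(10, 0.5, 0.5) yields 5, since P(X <= 4) is about 0.377 and P(X <= 5) is about 0.623. The only integer arithmetic here is the loop counter.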
2. Quantile comparison might be inaccurate since we are comparing real types
Nope, not if you've requested an integer result (which is the default policy), as integers are represented exactly in floating point types: unless
I've missed that, given two floating point numbers x and y and the related floating point machine numbers fl(x) and fl(y): if x==y then fl(y)-eps < fl(x) < fl(y)+eps; and if x==y then round(fl(x)) == round(fl(y)); where eps is the unit roundoff error. I've only considered the first relation. Thank you!! Cheers, -- Marco
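The exactness argument above can be checked with a small sketch. It assumes double is IEEE 754 binary64 (53-bit significand), which is typical but not guaranteed by the C++ standard; both function names are hypothetical and only for illustration.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: true if the integer n is represented exactly
// when stored in a double (checked by a round trip through double).
// Assumes binary64, where every integer with magnitude up to 2^53 is exact.
bool survives_double_round_trip(std::int64_t n)
{
    double d = static_cast<double>(n);
    return static_cast<std::int64_t>(d) == n;
}

// Hypothetical helper: compare two real-valued quantiles after rounding
// each to the nearest integer. Once rounded, both operands hold exact
// integer values, so == is an exact comparison.
bool same_integer_quantile(double a, double b)
{
    return std::round(a) == std::round(b);
}
```

Under that assumption, survives_double_round_trip(1LL << 53) is true while survives_double_round_trip((1LL << 53) + 1) is false, so the exact comparison holds for any quantile of realistic size.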