
At 05:11 AM 7/12/2006, Paul A Bristow wrote:
In other words, 1 - P. Right? One response is: why do you need to define it, given how easy it is to get from the cumulative distribution function? Perhaps not really needed? Is there an accuracy reason for both?
| It depends how accurate you want to be: calculating 1-P incurs cancellation | error if P is very near 1, whereas for most (all?) distributions we can | calculate Q directly without the subtraction from unity.
| I think the "Boostified" name would be in all lower case: students_t or whatever.
Agree with this.
This brings to mind another function that, though easily derived, would be good to have so that internal computations are less subject to round-off error. This is a two-parameter function giving the cumulative probability between a lower and an upper bound. Mathematically this can always be computed as "CDF(x[ub]) - CDF(x[lb])" (read the square brackets as mathematical subscript notation), but numerically, with very small intervals, you can easily end up with 0 when you want something close to "PDF((x[ub]+x[lb])/2)*(x[ub]-x[lb])". You don't need to make any general guarantee about precision, so initial implementations could be done as the difference of the cumulative functions, with better versions added later for individual distributions.

I don't know any standard term for this off the top of my head. I would suggest just using a two-argument version of whatever is decided on for the cumulative distribution. So, using my suggested function name, standard_normal.cdf(-1.0, 1.0) would return the probability that a random variate with a normal distribution is within one standard deviation of the mean.

The only problem I have with this is that if we look at the one-parameter version as the two-parameter version with one parameter defaulted, it is the *first* parameter that is defaulted, since: dist.cdf(x) = dist.cdf(-INFINITY, x). That would suggest using the complementary cdf instead, but that seems a lot less natural.

Topher