
| -----Original Message-----
| From: boost-bounces@lists.boost.org
| [mailto:boost-bounces@lists.boost.org] On Behalf Of Jeff Garland
| Sent: 08 July 2006 17:49
| To: boost@lists.boost.org
| Subject: Re: [boost] [math/staticstics/design] How best to
| name statistical functions?
|
| John Maddock wrote:
| > Paul Bristow has been toiling away producing some statistical
| > functions on top of some of my Math special functions, and we've
| > encountered a bit of a naming dilemma that I hope the ever
| > resourceful Boosters can solve for us :-)
|
| Possibly better, save him from writing them, possibly? Has
| he looked at Eric Niebler's statistical accumulators?

Indeed - on my TODO list.

Some further background, before you all leap in with your favourite names ;-)

This is to support my proposal

  A Proposal to add Mathematical Functions for Statistics to the C++ Standard Library
  Document number: JTC 1/SC22/WG14/N1069, WG21/N1668
  Date: 11 Aug 2004

Recent WG21 paper

  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2003.html

includes this response to my proposal

  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1668.pdf

(To be reissued revised as N2048 but missed this mailing):

"
N1668 A Proposal to add Mathematical Functions for Statistics to the C++ Standard Library
Date: 2004-08-11
Status: Open.

Lillehammer [2005-04]: The main argument against this proposal is that a high-quality implementation would be extremely hard; this is about 150 functions, most of which have several parameters. Issue: are we willing to standardize something with the expectation that most implementations will be low quality? Are these functions ones where poor accuracy is acceptable? (If so, we could do this for float only, and drop the double and long double versions.) Mixed interest. No consensus for bringing this forward at this meeting.

What might change people's mind:
  1. Reasoning for why to include these functions and exclude others.
  2. A smaller set of functions.
  3. If this is intended to support an easy-to-use statistical package, then show the interface for that statistical package first.
"

But I think after John's stunning work on the incomplete beta & gamma, the guts of the functions that you all need to get information from your data using statistics, we are close to meeting the WG21 'requirements' to accept this proposal.

His work in the sandbox is functionally complete. I am just doing some 'grunt' work on cosmetics and the wrappers to provide the statistics functions in a format that is best for the end users.

Before you jump to judgement on this issue, I invite (beg!) you to consider the end users' needs. They are NOT mathematicians, they are probably NOT professional statisticians, but are ordinary physicists, chemists, surgeons, social 'scientists', bee keepers, farmers ... Bear in mind too that these groups all have different customary names/jargons for many of these functions.

So IMO the names have to be as helpful as possible TO THE USERS - clarity before curtness.

There is also the complication that the distributions have both 'mass' values (so-called by some) and 'cumulative' values, and these two are confusing and confused, especially if they have the same name! Ideally we would have wrappers which provide BOTH of these variants.

For each function there are variants - complements, and inverses (more than one inverse if more than one argument - something I have NOT tackled in the list before and I have only realised the need when doing the wrappers!)

The inverse functions have been tackled by John mainly using root finding methods - the incomplete beta inverse is as usual MUCH more difficult and John has a state-of-the-art solution by Professor Temme.
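
To make the 'complement' and 'inverse' variants concrete, here is a minimal sketch for the standard normal distribution only. The names normal_cdf, normal_cdf_c and normal_cdf_inv are hypothetical and are not the names proposed in this message. The inverse uses plain bisection purely as an illustration of root finding; it is not John's implementation, which is not shown here.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical names, for illustration only; not the proposed interface.

// Cumulative probability P(Z <= z) for the standard normal distribution,
// expressed through the C99/C++11 complementary error function.
double normal_cdf(double z)
{
    return 0.5 * std::erfc(-z / std::sqrt(2.0));
}

// Complement P(Z > z), computed directly rather than as 1 - normal_cdf(z),
// so that very small upper-tail probabilities keep their significant digits.
double normal_cdf_c(double z)
{
    return 0.5 * std::erfc(z / std::sqrt(2.0));
}

// Inverse: find z such that normal_cdf(z) == p, by bisection on z.
// (Assumption: simple bisection stands in for a production root finder.)
double normal_cdf_inv(double p)
{
    assert(p > 0.0 && p < 1.0);
    double lo = -40.0; // normal_cdf(lo) is below any representable p > 0
    double hi = 40.0;  // normal_cdf(hi) rounds to 1
    for (int i = 0; i < 200; ++i)
    {
        double mid = 0.5 * (lo + hi);
        if (normal_cdf(mid) < p)
            lo = mid; // root lies above mid
        else
            hi = mid; // root lies at or below mid
    }
    return 0.5 * (lo + hi);
}
```

Under these assumptions, normal_cdf_inv(0.975) comes out close to the familiar 1.96, and normal_cdf_c(10.0) stays a small positive number where subtracting normal_cdf(10.0) from 1 would lose it to rounding. That loss of precision is the usual argument for providing complements as separate functions.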
[Example, the 'forward' functions are useful to tell you the probability of a hypothesis, the 'inverse' is useful to tell you what would be needed to achieve a certain probability, for example a number of measurements or samples, OR the variance (or accuracy of measurement)].

To complicate things further, there are also annoying C99 precedents in erf and erfc, which by Boost convention of using _ should be erf_c.

These are some of the reasons why I came up with the list of names below.

But as John has explained FOR ONE FUNCTION Student's t, it is not really enough.

Your suggestions are most welcome.

Paul

---
Paul A Bristow
Prizet Farmhouse, Kendal, Cumbria UK LA8 8AB
+44 1539561830 & SMS, Mobile +44 7714 330204 & SMS
pbristow@hetp.u-net.com

Mathematical 'special' functions
(only double versions are shown, overloads for float and long double will also be provided).

double beta_distribution(double a, double b, double x); // Beta distribution function.
double beta_incomplete (double a, double b, double x); // Incomplete beta integral.
double beta_incomplete_inv (double a, double b, double y); // Inverse of incomplete beta integral.
double binomial (unsigned int k, unsigned int n, double p); // Binomial distribution function.
double binomial_c (unsigned int k, unsigned int n, double p); // Binomial distribution function complemented.
double binomial_distribution_inv(unsigned int k, unsigned int n, double y); // Binomial distribution function inverse.
double binomial_neg_distribution (unsigned int k, unsigned int n, double p); // Negative binomial distribution.
double binomial_neg_distribution_c (unsigned int k, unsigned int n, double p); // Negative binomial distribution complement.
double binomial_neg_distribution_inv (unsigned int k, unsigned int n, double p); // Inverse of negative binomial distribution.
double chi_sqr_distribution(double df, double x); // Chi-squared distribution function.
double chi_sqr_distribution_c(double df, double x); // Chi-squared distribution function complemented.
double chi_sqr_distribution_c_inv(double df, double p); // Inverse of Chi-squared distribution function complemented.
double digamma(double x); // psi or digamma function.
double fisher_distribution(unsigned int ia, unsigned int ib, double c); // Fisher F distribution.
double fisher_distribution_c(unsigned int ia, unsigned int ib, double c); // Fisher F distribution complemented.
double fisher_distribution_c_inv(double dfn, double dfd, double y); // Inverse of complemented Fisher F distribution.
double gamma_distribution (double a, double b, double x); // Gamma probability distribution function.
double gamma_distribution_c (double a, double b, double x); // Gamma probability distribution function complemented.
double gamma_incomplete (double a, double x); // Incomplete gamma function.
double gamma_incomplete_c (double a, double x); // Incomplete gamma function complemented.
double gamma_incomplete_inv (double a, double y0); // Inverse of incomplete gamma integral.
double gamma_incomplete_c_inv (double a, double y0); // Inverse of complemented incomplete gamma integral.
double gamma (double x); // gamma function (or tgamma as in C99 math.h?)
double lgamma (double x); // log gamma function name as C99.
double normal_distribution (double a); // Normal distribution function.
double normal_distribution_inv (double a); // Inverse of normal distribution function.
double poisson_distribution (unsigned int k, double m); // Poisson distribution.
double poisson_distribution_c(unsigned int k, double m); // Complemented Poisson distribution.
double poisson_distribution_inv(unsigned int k, double y); // Inverse Poisson distribution.
double students_t (double df, double t); // Student's t.
double students_t_inv (double df, double p); // Inverse of Student's t.
double students_t (unsigned int df, double t); // Student's t.
double students_t_inv(unsigned int df, double p); // Inverse of Student's t.

Distribution function probabilities and quantiles

double normal_probability(double z); // Probability of quantile z.
double normal_quantile(double p); // Quantile of probability p.
double students_t_probability(double t, double df, double ncp); // Probability of quantile.
double students_t_quantile(double p, double df, double ncp); // Quantile of probability p.
double chi_sqr_probability(double x, double df, double ncp); // Probability of quantile.
double chi_sqr_quantile(double p, double df, double ncp); // Quantile of probability p.
double beta_probability(double x, double a, double b); // Probability of x, a, b.
double beta_quantile(double p, double a, double b); // Quantile of probability p.
double fisher_probability(double f, double dfn, double dfd, double ncp); // Probability of quantile.
double fisher_quantile(double p, double dfn, double dfd, double ncp); // Quantile of probability p.
double binomial_probability(double x, double n, double pr); // Probability of x.
unsigned int binomial_first(double p, unsigned int n, double r); // 1st k for probability >= p
double neg_binomial_probability(double x, double n, double pr); // Probability of quantile.
double poisson_probability(double x, double lambda); // Probability of quantile.
double poisson_quantile(double p, double lambda); // Quantile of probability p.
double gamma_probability(double x, double shape, double scale); // Probability of x.
double gamma_quantile(double p, double shape, double scale); // Quantile of probability p.
double smirnov(int n, double p); // Exact Smirnov statistic.
double smirnov_inv(int n, double x); // Exact Smirnov statistic inverse.
double kolmogorov (double); // Kolmogorov statistic.
double kolmogorov_inv (double p); // Kolmogorov statistic inverse.
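
As an illustration of the 'mass' versus 'cumulative' confusion mentioned in the message, the sketch below computes both for the binomial distribution, plus the smallest k whose cumulative probability reaches a target. All three names are hypothetical and deliberately differ from the names in the list above, and direct summation is an assumption made for brevity; a high-quality implementation would presumably go through the incomplete beta function instead.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical names, for illustration only; not the proposed interface.

// 'Mass': probability of exactly k successes in n trials,
// each trial succeeding with probability p.
double binomial_mass(unsigned int k, unsigned int n, double p)
{
    assert(k <= n && p >= 0.0 && p <= 1.0);
    // Log of the binomial coefficient C(n, k), via lgamma to avoid overflow.
    double log_coeff = std::lgamma(n + 1.0) - std::lgamma(k + 1.0)
                     - std::lgamma(n - k + 1.0);
    return std::exp(log_coeff) * std::pow(p, k) * std::pow(1.0 - p, n - k);
}

// 'Cumulative': probability of at most k successes, by direct summation.
double binomial_cumulative(unsigned int k, unsigned int n, double p)
{
    double sum = 0.0;
    for (unsigned int i = 0; i <= k; ++i)
        sum += binomial_mass(i, n, p);
    return sum;
}

// Smallest k whose cumulative probability is at least `target`.
unsigned int binomial_first_k(double target, unsigned int n, double p)
{
    double sum = 0.0;
    for (unsigned int k = 0; k < n; ++k)
    {
        sum += binomial_mass(k, n, p);
        if (sum >= target)
            return k;
    }
    return n; // the cumulative probability at k == n is 1
}
```

For n = 10 and p = 0.5, the mass at k = 5 is 252/1024 (about 0.246) while the cumulative value at k = 5 is 638/1024 (about 0.623), so a single name covering both would give very different answers depending on which the user expected.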