
| -----Original Message-----
| From: boost-bounces@lists.boost.org
| [mailto:boost-bounces@lists.boost.org] On Behalf Of Kevin Lynch
| Sent: 09 July 2006 11:49
| To: boost@lists.boost.org
| Subject: Re: [boost] [math/staticstics/design] How best to
| name statistical functions?
|
| John Maddock wrote:
| > Paul Bristow has been toiling away producing some statistical functions on
| > top of some of my Math special functions, and we've encountered a bit of a
| > naming dilemma that I hope the ever resourceful Boosters can solve for us :-)
|
| Why not hide the functions behind a class interface? After all, the
| various functions are "properties" of the distributions. Hence:
|
| class students_t {
|   students_t(double mu);
|   double P(double x);
|   double Q(double x);
|   double invP(double p); (or perhaps inverseP or Pinv or something)
|   .....
| }
|
| class normal {
|   normal(double mu, double sigma);
|   double P(double x);
|   double Q(double x);
|   double invP(double x);
|   ......
| }

Rather interesting idea.

| This interface has a few major benefits over raw functions:
|
| 1) Since Paul is using your C++ special functions library in the
| implementation, there's no argument on the implementation side for C
| compatibility. Without C compatibility as a driving force, you don't
| need to stick with free functions and the corresponding combinatorial
| explosion of hard to remember names.

Agreed.

| 2) A class interface also lets you carry around data specific to the
| current "in use" distribution in one place, rather than needing to stuff
| it into every call (the mean in the case of Student's t, the mean and
| deviation for the Normal, etc).
| 3) This "normalizes" the interface for the calls to the distribution
| functions - every call for "P" has exactly one argument, and
| not two or three or four depending on the distribution in use.

How would you envisage this working with Fisher, for example, which has
degrees of freedom 1 and 2, and a variance ratio?
Is this a 1D or 2D or 3D? Its inversion will return df1 (given df2, F and
probability), or df2 (given df1, F and probability), or F (given df1, df2
and probability).

Would you like to flesh out how you suggest handling all these?

| 4) The consistent interface is of course easier to document,
| teach and learn, and easier to use.

Yes, usability is a major requirement to allow all and sundry to USE this.

| You might also want to provide a
| function to obtain the non-cumulative distribution value (perhaps
| operator() or dist() or something).

Yes, most desirable, but this project is getting bigger, day by day ;-)

(As an aside, John has devised a way to avoid the bloat caused by the
expectation that one can provide degrees of freedom as an integer OR a
floating-point value. Without his meta-magic, a serious downside of a fully
templated version would be the instantiation of many variants of each
function.)

| Of course, you would probably templatize and you might want to inherit
| from 1D or 2D abstract base classes if you plan to provide
| multidimensional distributions (or maybe not ...) and functions that
| operate on distributions.
|
| In any case, I look forward to the results....

Watch this space...

Paul

---
Paul A Bristow
Prizet Farmhouse, Kendal, Cumbria UK LA8 8AB
+44 1539561830 & SMS, Mobile +44 7714 330204 & SMS
pbristow@hetp.u-net.com
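
For reference, a minimal sketch of the class-style interface Kevin proposes, worked out for the normal distribution only. The member names P, Q and invP come from his message; the class name normal_sketch, the use of std::erfc, and the bisection-based inverse are assumptions made for illustration, not the implementation being developed in this thread.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical sketch only: shows how the distribution parameters can live
// in the object so that P, Q and invP each take exactly one argument.
class normal_sketch {
public:
    normal_sketch(double mu, double sigma) : mu_(mu), sigma_(sigma) {
        assert(sigma > 0.0);  // assumed precondition
    }

    // Cumulative probability P(X <= x).
    double P(double x) const {
        return 0.5 * std::erfc(-(x - mu_) / (sigma_ * std::sqrt(2.0)));
    }

    // Complement Q(x) = P(X > x), computed directly rather than as 1 - P(x)
    // to keep precision in the upper tail.
    double Q(double x) const {
        return 0.5 * std::erfc((x - mu_) / (sigma_ * std::sqrt(2.0)));
    }

    // Inverse of P for 0 < p < 1, found by bisection on P.
    // Chosen for brevity, not speed (an assumption of this sketch).
    double invP(double p) const {
        double lo = mu_ - 40.0 * sigma_;  // P(lo) underflows to 0
        double hi = mu_ + 40.0 * sigma_;  // P(hi) rounds to 1
        for (int i = 0; i < 200; ++i) {
            const double mid = 0.5 * (lo + hi);
            if (P(mid) < p) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return 0.5 * (lo + hi);
    }

private:
    double mu_;
    double sigma_;
};
```

With normal_sketch n(0.0, 1.0), n.P(0.0) is 0.5 and n.invP(0.975) is approximately 1.96. Under the same design, a Fisher class would presumably take df1 and df2 in its constructor so that P still has a single argument; the sketch does not address the inversions for df1 or df2 raised above.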