
| -----Original Message-----
| From: boost-bounces@lists.boost.org
| [mailto:boost-bounces@lists.boost.org] On Behalf Of John Maddock
| Sent: 11 July 2006 17:26
| To: boost@lists.boost.org
| Subject: Re: [boost] [math/staticstics/design] How
| best to name statistical functions?
|
| >> As I mentioned before, these should be member functions,
| >> which could be called "density" (also called 'mass')
|
| Or distribution :-)

This seems quite clear to me - both density and mass sound too physical to me, though they are in common use. What is important is that the documentation gives ALL the other possible names.

| >> The inverse function could be called "inverse_cumulative"
| > But excessively long :-(
| True, how about "persentile", or is that too ambiguous?

Percentile might be better - it is in the dictionary ;-))

But quantile is a more modern term and doesn't raise any questions about multiplying or dividing by 100, a source of unnecessary confusion - as we have found with Boost.Test. So I'm strongly in favour of quantile.

But I also wonder if 'fraction' is a possible name?

| >> 1) Define ad hoc inverse functions for each specific
| >> distribution. So
| >> for the Students T distribution, you would define a member
| >> function of the form:
| >>
| >> double degrees_of_freedom(double cumulative_probability, double
| >> random_variable) const;
|
| That could be a static member function, since we're solving
| for the degrees of freedom parameter.

OK

| It would also be more natural to me for the
| cumulative_probability parameter to come last in the list.

Why? Quantile is also cumulative?

| > But I still worried that the whole scheme will lead to much bigger
| > code compared to a set of names of (template) functions
| > (because code that isn't in fact used will be generated).
|
| For template classes member functions are only instantiated
| when used, so if
| you only use one member, then that's the only one instantiated.

Well, that's what I thought - but I wanted expert reassurance before driving into a dead-end ;-)

So my worry turns into a killer feature - keeping the cost of calling a single Student's t down to reasonable levels is crucially important. Compared to linking to an "All_the_stats_functions_you_could_ever_want.dll" it should be easily 'affordable', as they say.

Which also means that the cost of a Q or complement function is nothing unless you use it (and you probably won't use the P version as well).
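To make the instantiation point concrete, here is a minimal sketch. The class students_t_sketch and its members are hypothetical names made up for illustration, not a proposed interface. The member unused() has a body that could not compile for double, yet the code builds, because that member is never called and so is never instantiated.

```cpp
#include <cassert>

// Hypothetical distribution class, for illustration only (not a Boost API).
template <class RealType>
class students_t_sketch
{
public:
    explicit students_t_sketch(RealType df) : df_(df) {}

    RealType degrees_of_freedom() const { return df_; }

    // Never called, so never instantiated: this body is ill-formed for
    // RealType = double (double has no member functions), but the
    // compiler only checks it if some code actually uses unused().
    RealType unused() const { return df_.no_such_member(); }

private:
    RealType df_;
};

// Usage: only the constructor and degrees_of_freedom() are instantiated.
inline void demo_lazy_instantiation()
{
    students_t_sketch<double> dist(5.0);
    assert(dist.degrees_of_freedom() == 5.0);
}
```

Adding a call to dist.unused() in the demo would turn the latent error into a compile failure, which is the same mechanism that keeps unused members out of the generated code.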
In other words, 1 - P. Right? One response is: why do you need to define it, given how easy it is to get from the cumulative distribution function? Perhaps not really needed? Is there an accuracy reason for both?
| It depends how accurate you want to be: calculating 1-P incurs cancellation
| error if P is very near 1, whereas for most (all?) distributions we can
| calculate Q directly without the subtraction from unity.

| I think the "Boostified" name would be in all lower case: students_t or whatever.

Agree with this.

Paul

---
Paul A Bristow
Prizet Farmhouse, Kendal, Cumbria UK LA8 8AB
+44 1539561830 & SMS, Mobile +44 7714 330204 & SMS
pbristow@hetp.u-net.com
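
P.S. A rough sketch of the cancellation issue with 1-P. It uses the exponential distribution rather than Student's t, purely because P and Q then have simple closed forms; the function names exp_cdf and exp_cdf_complement are assumptions for illustration only. Far out in the tail, P rounds to 1 in double precision, so 1 - P loses essentially all of its significant digits, while Q computed directly keeps them.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helpers for an exponential distribution with rate lambda.
// P(x) = 1 - exp(-lambda * x)
inline double exp_cdf(double lambda, double x)
{
    return -std::expm1(-lambda * x);
}

// Q(x) = exp(-lambda * x), computed directly with no subtraction from unity.
inline double exp_cdf_complement(double lambda, double x)
{
    return std::exp(-lambda * x);
}

inline void demo_cancellation()
{
    const double p = exp_cdf(1.0, 40.0);                    // rounds to (about) 1
    const double q_direct = exp_cdf_complement(1.0, 40.0);  // tiny but nonzero
    const double q_subtracted = 1.0 - p;                    // cancellation

    assert(q_direct > 0.0);
    // The subtracted value is wrong by more than half of the true Q.
    assert(std::fabs(q_subtracted - q_direct) > 0.5 * q_direct);
}
```

So a separate Q (complement) function is about accuracy in the tails, not just convenience.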