Re: [boost] [math/staticstics/design] How best to name statistical functions?

12 Jul 2006

|  -----Original Message-----
|  From: boost-bounces@lists.boost.org 
|  [mailto:boost-bounces@lists.boost.org] On Behalf Of John Maddock
|  Sent: 11 July 2006 17:26
|  To: boost@lists.boost.org
|  Subject: Re: [boost] [math/staticstics/design] How
|  best to name statistical functions?
|  
|  >>  As I mentioned before, these should be member functions,
|  >>  which could be called "density" (also called 'mass')
|  
|  Or distribution :-)

This seems quite clear to me - both density and mass sound too physical to me,
though they are in common use.

What is important is that the documentation gives ALL the other possible names.

|  >>  The inverse function could be called "inverse_cumulative"
|  > But excessively long :-(
|  True, how about "persentile", or is that too ambiguous?

Percentile might be better  - it is in the dictionary ;-))

But quantile is a more modern term and doesn't raise any questions about
multiplying or dividing by 100, a source of unnecessary confusion - as we
have found with Boost.Test.

So I'm strongly in favour of quantile.
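
To make the scaling point concrete, here is a minimal sketch of a quantile as
the inverse of the cumulative probability. The struct and member names
(exponential_sketch, cumulative, quantile) are hypothetical and only for
illustration; they are not a proposed or existing Boost interface. The
argument is a probability in [0, 1], so there is no factor of 100 to get wrong.

```cpp
#include <cmath>

// Hypothetical exponential distribution with rate lambda (illustration only).
struct exponential_sketch {
    double lambda;

    // Cumulative probability P(X <= x) = 1 - exp(-lambda * x).
    // expm1 avoids losing digits when lambda * x is small.
    double cumulative(double x) const { return -std::expm1(-lambda * x); }

    // Inverse of cumulative(): p is a probability in [0, 1], not a percentage.
    double quantile(double p) const { return -std::log1p(-p) / lambda; }
};
```

With lambda = 2, quantile(0.5) is the median, ln(2)/2, and feeding it back
into cumulative() returns 0.5 to within rounding.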

But I also wonder if 'fraction' is a possible name?

OK

|  It would also be more natural to me for the 
|  cumulative_probability parameter to come last in the list.

Why?  Quantile is also cumulative?

|  > But I still worried that the whole scheme will lead to much bigger
|  > code compared to a set of names of (template) functions
|  > (because code that isn't in fact used will be generated).
|  
|  For template classes member functions are only instantiated when used,
|  so if you only use one member, then that's the only one instantiated.

Well, that's what I thought - but I wanted expert reassurance before driving
into a dead-end ;-)
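
A small sketch of that instantiation rule, using a made-up class template
(students_t_sketch and its members are hypothetical, chosen only to show the
effect). The body of never_called() would not compile for double, but because
it is never used, an implicit instantiation of the class never generates it.
An explicit instantiation of the whole class would instantiate every member
and so would fail.

```cpp
// Hypothetical distribution class template (illustration only).
template <class Real>
struct students_t_sketch {
    Real degrees_of_freedom;

    // Used below, so this member is instantiated for Real = double.
    Real mean() const { return Real(0); }

    // Ill-formed for Real = double (double has no members), but the body is
    // only checked if the function is actually used with that type.
    Real never_called() const { return degrees_of_freedom.no_such_member(); }
};
```

Declaring students_t_sketch<double> and calling only mean() compiles and
links without any code being generated for never_called().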

So my worry turns into a killer feature - keeping the cost of calling a
single Student's t down to reasonable levels is crucially important.

Compared to linking to an "All_the_stats_functions_you_could_ever_want.dll",
it should be easily 'affordable', as they say.

Which also means that the cost of a Q or complement function is nothing
unless you use it.
(and you probably won't use the P version as well).
...
...
In other words, 1 - P. Right? One response is why do you need to define it,
given how easy it is to get from the cumulative density function?
Perhaps not really needed? Is there an accuracy reason for both?
| It depends how accurate you want to be: calculating 1-P incurs cancellation
| error if P is very near 1, whereas for most (all?) distributions we can
| calculate Q directly without the subtraction from unity.
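
The cancellation is easy to demonstrate with a minimal sketch, assuming a unit
rate exponential distribution and two hypothetical free functions (exp_cdf and
exp_ccdf are illustrative names, not a library interface). Far out in the
tail, P rounds to exactly 1 in double precision, so 1 - P loses everything,
while Q computed directly keeps its full precision.

```cpp
#include <cmath>

// P = P(X <= x) for an exponential distribution with rate 1.
double exp_cdf(double x) { return 1.0 - std::exp(-x); }

// Q = P(X > x), computed directly with no subtraction from unity.
double exp_ccdf(double x) { return std::exp(-x); }
```

At x = 50 the true tail probability is about 1.9e-22. That is far below half
the spacing of doubles just under 1, so exp_cdf(50.0) returns exactly 1 and
1 - exp_cdf(50.0) is 0, whereas exp_ccdf(50.0) returns the small positive
value.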

| I think the "Boostified" name would be in all lower case: students_t or whatever.

Agree with this.

Paul

---
Paul A Bristow
Prizet Farmhouse, Kendal, Cumbria UK LA8 8AB
+44 1539561830 & SMS, Mobile +44 7714 330204 & SMS
pbristow@hetp.u-net.com