
Paul A Bristow wrote:
Well, if you regard the degrees of freedom as fixed, or the probability as fixed, often 95%,
then yes,
but, I would say that they are 2D (and others 3D) distributions.
To keep it simpler, let's go back to the Student's t, which I have implemented (actually as templates, but ignore that for now) as
double students_t(double degrees_of_freedom, double t)
t is roughly a measure of difference between two things (means for example)
this returns the probability that the things are different.
If degrees_of_freedom are small (you only measured 3 times, say),
then t can be big, but it still doesn't mean much.
But if you made 100 measurements, it probably does.
When you do the inverse, you may want to say: I want to be 95% confident, and I have already fixed the degrees_of_freedom, so what is the corresponding value for t? This is what the ubiquitous Student's t tables do.
On the other hand, sometimes you may decide you want 95% confidence, and you have already made some measurements of t, but you want to know how many (more probably) measurements (degrees_of_freedom) you would have to make to get this 95%.
This is a common problem, and it often reveals, in drug trials for example, that there are not enough potential patients available to carry out a trial and achieve a 95% probability.
If you accept this, then the problem is how to name the two, or three 'inverses' (and complements).
students_t_inv_t and students_t_inv_df ???
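For concreteness, the two inverses described above could be sketched roughly as follows. This is only an illustration, not the implementation under discussion. It assumes that students_t returns the cumulative probability P(T <= t), it evaluates that integral with a plain Simpson's rule, and it solves both inverse problems by bisection. The helper students_t_density and the search bounds are made up for the example.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: density of Student's t with df degrees of freedom at x.
double students_t_density(double df, double x) {
    const double pi = std::acos(-1.0);
    const double log_norm = std::lgamma((df + 1) / 2) - std::lgamma(df / 2)
                          - 0.5 * std::log(df * pi);
    return std::exp(log_norm - (df + 1) / 2 * std::log1p(x * x / df));
}

// ASSUMPTION: students_t returns the cumulative probability P(T <= t).
// Simpson's rule on [0, |t|] plus symmetry about zero; adequate for
// moderate |t|, not for extreme tails.
double students_t(double degrees_of_freedom, double t) {
    assert(degrees_of_freedom > 0);
    const int n = 2000;  // number of Simpson intervals (must be even)
    const double a = std::fabs(t);
    const double h = a / n;
    double sum = students_t_density(degrees_of_freedom, 0.0)
               + students_t_density(degrees_of_freedom, a);
    for (int i = 1; i < n; ++i)
        sum += (i % 2 ? 4.0 : 2.0) * students_t_density(degrees_of_freedom, i * h);
    const double half = sum * h / 3;  // integral of the density over [0, |t|]
    return t >= 0 ? 0.5 + half : 0.5 - half;
}

// Fix degrees_of_freedom and probability, solve for t (what the tables do).
// Returns NaN if the bracket search gives up.
double students_t_inv_t(double degrees_of_freedom, double probability) {
    assert(probability > 0 && probability < 1);
    if (probability < 0.5)
        return -students_t_inv_t(degrees_of_freedom, 1 - probability);
    double lo = 0, hi = 1;
    while (students_t(degrees_of_freedom, hi) < probability) {
        hi *= 2;
        if (hi > 1024.0) return std::numeric_limits<double>::quiet_NaN();
    }
    for (int i = 0; i < 60; ++i) {
        const double mid = 0.5 * (lo + hi);
        if (students_t(degrees_of_freedom, mid) < probability) lo = mid; else hi = mid;
    }
    return 0.5 * (lo + hi);
}

// Fix t and probability, solve for the (real-valued) degrees of freedom.
// For t > 0 the cumulative probability grows with the degrees of freedom,
// so bisection applies. Returns NaN if no solution lies in [1, 2^20].
double students_t_inv_df(double t, double probability) {
    assert(t > 0 && probability > 0.5 && probability < 1);
    double lo = 1, hi = 2;
    if (students_t(lo, t) > probability)
        return std::numeric_limits<double>::quiet_NaN();
    while (students_t(hi, t) < probability) {
        hi *= 2;
        if (hi > 1048576.0) return std::numeric_limits<double>::quiet_NaN();
    }
    for (int i = 0; i < 60; ++i) {
        const double mid = 0.5 * (lo + hi);
        if (students_t(mid, t) < probability) lo = mid; else hi = mid;
    }
    return 0.5 * (lo + hi);
}
```

With these definitions, students_t_inv_t(10, 0.975) should come out close to the tabulated 2.228, and passing that t back to students_t_inv_df together with 0.975 should recover roughly 10 degrees of freedom. In practice the result of students_t_inv_df would be rounded up to a whole number of measurements.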
I think you're confusing *the* inverse cumulative distribution function with other possible inverse functions that can be defined for each specific distribution. This is why I really dislike a name like "students_t_inv_t", which tells me very little about what it is.

So let's use the Student's t distribution as an example. It is a *family* of 1-dimensional distributions that depend on a single parameter, called "degrees of freedom". Given a value, say D, for the degrees of freedom, you get a density function p_D, and integrating it gives you the cumulative distribution function P_D. As I mentioned before, these should be member functions, which could be called "density" and "cumulative". The cumulative distribution function is strictly increasing and can therefore be inverted. The inverse function could be called "inverse_cumulative", which is a completely unambiguous name.

I would say that these three member functions should be common to all implemented distributions. Other common member functions might include "mean", "variance", and possibly others.

Finally, you observe that it is often useful to specify the cumulative probability for a given value of the random variable and solve for the parameter (the "degrees of freedom" for a Student's t distribution) that determines the distribution. Since each family of distributions depends on a different set of parameters (for example, normal distributions depend on two parameters, the mean and variance), the interface for this is trickier to define. I can think of two possibilities (I prefer the first):

1) Define ad hoc inverse functions for each specific distribution. So for the Student's t distribution, you would define a member function of the form:

    double degrees_of_freedom(double cumulative_probability, double random_variable) const;

2) Always specify distribution parameters (other than the random variable itself) in the constructor using a tuple (a 1-tuple for the Student's t and a 2-tuple for the normal). You could then define templated inverse functions:

    template <unsigned int index>
    double inverse(double cumulative_probability, double random_variable) const;

Each function would hold all other parameters fixed (as set by the constructor) and solve for the parameter specified by the index. (I don't like using tuples as an input type, because it means I always have to be very careful about the order of the parameters.)

Deane
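P.S. To make the second possibility a little more concrete, here is a minimal sketch for the normal distribution. It is not a proposed implementation. The class name normal_sketch, the parameter order (mean, variance), and the private helpers are all assumptions made for the example, and the quantile is found by simple bisection rather than anything production-quality.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <tuple>

// Hypothetical class; parameters are given as the tuple (mean, variance).
class normal_sketch {
public:
    explicit normal_sketch(std::tuple<double, double> params) : params_(params) {
        assert(std::get<1>(params_) > 0);
    }

    double density(double x) const {
        const double pi = std::acos(-1.0);
        const double z = (x - mean()) / sd();
        return std::exp(-0.5 * z * z) / (sd() * std::sqrt(2 * pi));
    }

    double cumulative(double x) const {
        return standard_cumulative((x - mean()) / sd());
    }

    double inverse_cumulative(double cumulative_probability) const {
        return mean() + sd() * standard_quantile(cumulative_probability);
    }

    // Solve for parameter number `index` (0 = mean, 1 = variance), holding
    // the other one fixed, so that cumulative(random_variable) would equal
    // cumulative_probability. Returns NaN when no valid variance exists.
    template <unsigned int index>
    double inverse(double cumulative_probability, double random_variable) const {
        static_assert(index < 2, "normal_sketch has two parameters");
        const double z = standard_quantile(cumulative_probability);
        if (index == 0) return random_variable - sd() * z;
        const double s = (random_variable - mean()) / z;  // candidate std. deviation
        if (!(s > 0) || !std::isfinite(s))
            return std::numeric_limits<double>::quiet_NaN();
        return s * s;
    }

private:
    double mean() const { return std::get<0>(params_); }
    double sd() const { return std::sqrt(std::get<1>(params_)); }

    // Standard normal cumulative distribution function.
    static double standard_cumulative(double z) {
        return 0.5 * std::erfc(-z / std::sqrt(2.0));
    }

    // Standard normal quantile by bisection on [-40, 40].
    static double standard_quantile(double p) {
        assert(p > 0 && p < 1);
        double lo = -40, hi = 40;
        for (int i = 0; i < 200; ++i) {
            const double mid = 0.5 * (lo + hi);
            if (standard_cumulative(mid) < p) lo = mid; else hi = mid;
        }
        return 0.5 * (lo + hi);
    }

    std::tuple<double, double> params_;
};
```

A call such as normal_sketch(std::make_tuple(0.0, 1.0)).inverse<1>(0.975, 3.919928) solves for the variance and should give roughly 4. It also shows the drawback mentioned above: nothing in inverse<1> tells the reader which parameter is being solved for unless they remember the tuple order.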