Ann: Statistical distributions / Mathematical Special Functions

This package consists of 3 related components, and we're now seeking further feedback on these before the final push prior to a review:

Statistical Distributions
~~~~~~~~~~~~~~~~~~~~~~~~~

Following feedback from the previous preview, we have re-organised so that distributions are now C++ classes. See the docs, and especially the tutorial, for illustrations of how this will be useful in real life.

We would especially welcome feedback from those with experience of using statistics, and indeed from professional statisticians who are probably not Boosters.

Mathematical Special Functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

These are currently focused on those functions that are useful for statistical applications:

Floating point classification.
Gamma/lgamma/digamma/beta/erf/erfc
Ratios of gamma functions
Factorials
The incomplete gamma and beta functions.
The inverses of the gamma, beta, and error functions.
Derivatives of the incomplete gamma and beta functions.

Compared to the previous release, most of the changes involve accuracy and quality-of-implementation issues. Error handling has now been revised to provide a comprehensive package-wide error handling scheme, complete with meaningful error messages.

Toolkit
~~~~~~~

Provides tools to assist in the implementation of numerical methods:

Infinite series evaluation.
Continued fraction evaluation.
Polynomial and rational function evaluation.
Root finding with derivatives (Newton/Halley/Schroeder methods).
Root finding without derivatives.
Function minimisation.

Misc Tools
~~~~~~~~~~

These are strictly experimental, but are used in the development of approximations; they are provided in the hope of encouraging others to develop further special function implementations. Tools include:

Polynomial arithmetic and manipulation.
Minimax approximations (the Remez algorithm).
Helper functions used in testing etc.

Documentation and source downloads are available online at www.johnmaddock.co.uk/toolkit

Regards,

John Maddock.
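For illustration, using the class-based distributions might look something like the sketch below (a minimal sketch only: the names students_t_distribution, cdf and quantile, and the header location, are assumptions based on the description above, not a confirmed API):

    // Assumed header location for the preview package:
    #include <boost/math/distributions/students_t.hpp>
    #include <iostream>

    int main()
    {
        using namespace boost::math;

        // A Student's t distribution with 9 degrees of freedom,
        // constructed as an ordinary C++ object:
        students_t_distribution<double> dist(9);

        // Non-member accessors compute properties of the distribution:
        double p = cdf(dist, 2.262);      // P(t <= 2.262)
        double t = quantile(dist, 0.975); // critical value, 95% two-sided

        std::cout << "P(t <= 2.262) = " << p << "\n"
                  << "97.5% quantile = " << t << "\n";
    }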

Looks fine. Paul

This looks really good. After just a short look, 2 points: It's great to see that the functions section shows the algorithms used and gives an indication of accuracy. I didn't see this in the distributions though. Wish list - there is a central chi-square. I could also use a non-central chi-square.

Neal Becker wrote:
This looks really good. After just a short look, 2 points:
It's great to see that the functions section shows the algorithms used and gives an indication of accuracy. I didn't see this in the distributions though.
Yep, the lack of that info for the distributions has been vaguely nagging at the back of my brain. The implementations are mostly trivial, being written in terms of other functions, so it'll probably just be "accuracy: see incomplete beta" or whatever.
wish list - there is a central chi-square. I could also use a non-central chi-square.
I was afraid someone was going to ask for that :-) Paul and I have been deliberately avoiding the non-central distributions ('cos they're hard!), but if anyone has any references / implementation pointers I'll at least take a look and see if it's feasible. Thanks for the comments, John.

John Maddock said: (by the date of Wed, 4 Oct 2006 18:43:00 +0100)
Mathematical Special Functions
how about sigmoid, double sigmoid and their derivatives? Used for neural networks. Just a suggestion. They are so simple that perhaps you can assume that the user can write them himself (as was done in the past). Unless you know some optimized way to implement them.... -- Janek Kozicki

Janek Kozicki wrote:
John Maddock said: (by the date of Wed, 4 Oct 2006 18:43:00 +0100)
Mathematical Special Functions
how about sigmoid, double sigmoid and their derivatives? Used for neural networks.
Those are new ones to me.
Just a suggestion. They are so simple that perhaps you can assume that the user can write them himself (as was done in the past).
Unless you know some optimized way to implement them....
I suspect you will never improve on a call to exp()? Of course, if you want 1 - sigmoid(x) then you can rearrange things to avoid cancellation error, but I don't know how important such things are. One of the problems we have is that there is a virtually unlimited number of special functions, not least all those in TR1. So it's a case of trying to draw the line in the right place, but do feel free to lobby for your favorites :-) John.
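For what it's worth, a minimal sketch of the rearrangement John alludes to: since 1 - sigmoid(x) == sigmoid(-x), the complement can be computed without subtracting two nearly equal values (the function names here are illustrative, not part of any proposed library):

    #include <cmath>

    // sigmoid(x) = 1 / (1 + exp(-x)); for large positive x this is
    // very close to 1.
    double sigmoid(double x)
    {
        return 1.0 / (1.0 + std::exp(-x));
    }

    // Computing 1 - sigmoid(x) directly cancels for large positive x.
    // Rearranging gives 1 - sigmoid(x) == 1 / (1 + exp(x)) == sigmoid(-x),
    // which subtracts no nearly equal quantities.
    double sigmoid_complement(double x)
    {
        return 1.0 / (1.0 + std::exp(x));
    }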

On Fri, 2006-10-06 at 10:04 +0100, John Maddock wrote:
how about sigmoid, double sigmoid and their derivatives? Used for neural networks.
Those are new ones to me.
Well, this is not so strange. As far as I know, the term sigmoid does not describe a single function but merely a family of functions of a similar shape (the actual equation is not so important), e.g. 1 + tanh(x). IMHO, unless there is a more formal or accurate concept of what a sigmoid is, it should not be added to the library. Theo.

John Maddock said: (by the date of Fri, 6 Oct 2006 10:04:56 +0100)
Janek Kozicki wrote:
Mathematical Special Functions
how about sigmoid, double sigmoid and their derivatives? Used for neural networks.
Those are new ones to me.
yes, actually there is a wide family of sigmoid functions; they all have a similar shape. However, there is one function called "sigmoid", and unless another sigmoid function is specified, this one is assumed:

sigmoid(x) = 1/(1 + exp(-x))

Theodore, please note that:

(tanh(x/2) + 1)/2 = 1/(1 + exp(-x))

It's the same function, only written in a different way. The first parameter to tweak here is the steepness:

sigmoid(x, s) = 1/(1 + exp(-x/s))

But there are other sigmoid functions. The double sigmoid is this:

double_sigmoid(x, d, s) = sign(x - d) * (1 - exp(-((x - d)/s)^2))

d - function centre
s - steepness factor

Actually it's all here:
http://en.wikipedia.org/wiki/Sigmoid_function
http://en.wikipedia.org/wiki/Gaussian_curve

Both sigmoid functions and bell curves (e.g. Gaussian) are useful in neural networks. But it is also necessary to have their first derivatives: the most popular learning algorithm (called backpropagation) uses gradient descent in n-dimensional space, and there the derivatives are necessary.

Please note that it's the function shape that is important, not its exact values at any point. So if you know a 'faster' function that has a similar shape and has a first derivative, I'd like to know about it too! -- Janek Kozicki
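For concreteness, the formulas above transcribe directly into C++ like this (purely illustrative; parameter names follow the definitions in the message):

    #include <cmath>

    // sigmoid(x, s) = 1 / (1 + exp(-x/s)); s controls the steepness.
    double sigmoid(double x, double s = 1.0)
    {
        return 1.0 / (1.0 + std::exp(-x / s));
    }

    // double_sigmoid(x, d, s) = sign(x-d) * (1 - exp(-((x-d)/s)^2));
    // d centres the function and s controls the steepness. Note that
    // this variant ranges over (-1, 1) rather than (0, 1).
    double double_sigmoid(double x, double d, double s)
    {
        double t = (x - d) / s;
        double sgn = (x >= d) ? 1.0 : -1.0;
        return sgn * (1.0 - std::exp(-t * t));
    }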

Janek Kozicki wrote:
Please note that it's the function shape that is important, not its exact values at any point. So if you know a 'faster' function that has a similar shape and has a first derivative, I'd like to know about it too!
Oh. I guess if x is small, then you could use a truncated Taylor series, and either evaluate it as a polynomial, or maybe even just as:

1/(2 - x + x*x)

but other than that, I'd be surprised if there's anything much faster than exp()? John.
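As a sketch of the small-x idea: truncating the Taylor series of 1 + exp(-x) after the quadratic term gives 2 - x + x*x/2, so a comparison might look like this (only accurate near x = 0; purely illustrative):

    #include <cmath>
    #include <cstdio>

    double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

    // exp(-x) ~= 1 - x + x*x/2 for small |x|, so
    // 1 + exp(-x) ~= 2 - x + x*x/2.
    double sigmoid_small_x(double x)
    {
        return 1.0 / (2.0 - x + 0.5 * x * x);
    }

    int main()
    {
        for (double x = -1.0; x <= 1.0; x += 0.5)
            std::printf("x=%5.2f  exact=%.6f  approx=%.6f\n",
                        x, sigmoid(x), sigmoid_small_x(x));
    }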

John Maddock said: (by the date of Sun, 8 Oct 2006 16:12:02 +0100)
Janek Kozicki wrote:
Please note that it's the function shape that is important, not its exact values at any point. So if you know a 'faster' function that has a similar shape and has a first derivative, I'd like to know about it too!
I guess if x is small, then you could use a truncated Taylor series, and either evaluate it as a polynomial, or maybe even just as:
1/(2 - x + x*x)
but other than that, I'd be surprised if there's anything much faster than exp()?
(To readers: this function is a bell curve.)

hmm.. you are right. My knowledge of neural networks is limited; I haven't had a dedicated course, so it's only what I managed to learn myself from articles I have found. And I learned only enough to get my job done, and have some fun with it altogether.

My conclusion is that maybe I'm wrong to say that it is the shape only (although I believe one of the papers I read said exactly that). But then - why are people not using a simple polynomial for that? Oh, I know why.... its integral is this:

2*atan((2*x-1)/sqrt(7))/sqrt(7)

(this is a sigmoid function)

I believe that backpropagation will work correctly when the derivatives/integrals are exact. Neural networks use both the sigmoid and its derivative (a bell curve), and they both need to be fast.

Oh, and I found another reason right here in Wikipedia :)

"A reason for sigmoid popularity in neural networks is that the sigmoid function satisfies this property: sigmoid_derivative(x) = sigmoid(x)*(1-sigmoid(x))"

So I was wrong to say that it's only the shape that is important :)

So, are you going to add those?
- sigmoid,
- sigmoid_derivative,
- double_sigmoid,
- double_sigmoid_derivative

The derivative of the double sigmoid can be done as a combination of two sigmoid_derivatives :)

Another question: how about the Fast Fourier Transform? Is it by definition already out of the scope of your library, or maybe not? Or maybe it's already there but I haven't found it? :) -- Janek Kozicki
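The property quoted from Wikipedia is easy to see in code: once the forward pass has computed sigmoid(x), the derivative costs one multiply and one subtract, with no further call to exp(). A minimal sketch (function names illustrative):

    #include <cmath>

    double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

    // sigmoid'(x) = s * (1 - s), where s = sigmoid(x). In backpropagation
    // s is typically cached from the forward pass, so the derivative
    // needs no extra exp() call.
    double sigmoid_derivative_from_value(double s)
    {
        return s * (1.0 - s);
    }

    double sigmoid_derivative(double x)
    {
        return sigmoid_derivative_from_value(sigmoid(x));
    }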

Janek Kozicki wrote:
My conclusion is that maybe I'm wrong to say that it is the shape only (although I believe one of the papers I read said exactly that). But then - why are people not using a simple polynomial for that?
Right after I posted, I thought "why not use a rational approximation?", or better, a polynomial approximation: the derivatives of the latter are of course trivial. I did a quick Google and found http://lib.tkk.fi/Diss/2005/isbn9512275279/article8.pdf#search=%22sigmoid%20... which is basically about implementing the sigmoid in hardware, but Eq 15 suggests that:

1 - 0.5 * (1 - 0.25 * x)^2

is a viable approximation. If ldexp is faster than a multiply, then the multiplications by powers of 2 could be replaced with ldexp.
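A sketch of that piecewise quadratic, extended to negative x via the symmetry sigmoid(-x) = 1 - sigmoid(x) and clamped outside |x| <= 4 (the interval is an assumption about the paper's intent; treat this purely as an illustration). The powers-of-two multiplications are written with std::ldexp, as suggested:

    #include <cmath>

    double sigmoid_approx(double x)
    {
        bool negative = x < 0.0;
        double a = negative ? -x : x;
        double y;
        if (a >= 4.0)
            y = 1.0;                            // saturate: sigmoid(4) ~= 0.982
        else
        {
            double t = 1.0 - std::ldexp(a, -2); // 1 - a/4 via ldexp
            y = 1.0 - std::ldexp(t * t, -1);    // 1 - 0.5 * (1 - a/4)^2
        }
        return negative ? 1.0 - y : y;          // sigmoid(-x) = 1 - sigmoid(x)
    }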
Oh, I know why.... its integral is this:
2*atan((2*x-1)/sqrt(7))/sqrt(7)
(this is a sigmoid function)
I believe that backpropagation will work correctly when the derivatives/integrals are exact.
Neural networks use both the sigmoid and its derivative (a bell curve), and they both need to be fast.
Oh, and I found another reason right here in Wikipedia :)
" A reason for sigmoid popularity in neural networks is because the sigmoid function satisfies this property:
sigmoid_derivative(x) = sigmoid(x)*(1-sigmoid(x)) "
So I was wrong to say that it's only the shape that is important :)
So, are you going to add those?
- sigmoid,
- sigmoid_derivative,
- double_sigmoid,
- double_sigmoid_derivative
The derivative of the double sigmoid can be done as a combination of two sigmoid_derivatives :)
No, I believe it's outside the scope of the current library, which is to provide support for the statistical code. Or to put it another way: there's too much else to do!
Another question: how about Fast Fourier Transform? is it by definition already out of the scope of your library, or maybe not?
Yes, definitely outside the scope, and also patent protected I believe. John.

John Maddock said: (by the date of Sun, 8 Oct 2006 18:39:07 +0100)
So, are you going to add those?
- sigmoid,
- sigmoid_derivative,
- double_sigmoid,
- double_sigmoid_derivative
The derivative of the double sigmoid can be done as a combination of two sigmoid_derivatives :)
No, I believe it's outside the scope of the current library, which is to provide support for the statistical code.
Or to put it another way: there's too much else to do!
uhh... so maybe rename the component "Mathematical Special Functions" to "Statistical Special Functions"? I think it was natural for me to search for my favorites inside a component named like that... -- Janek Kozicki

Janek Kozicki wrote:
uhh... so maybe rename the component "Mathematical Special Functions" to "Statistical Special Functions"?
I think it was natural for me to search for my favorites inside a component named like that...
Fair comment; however, there's a fairly well defined set of special functions listed in A&S, most of which are hoisted into TR1, that (IMO) should be implemented before domain-specific functions like these. There's also the problem that, since I'm not familiar with the domain, I have no way of judging what constitutes a "good" implementation: what potential users would be looking for, in other words. Not that this stops you from providing a submission of those, of course :-) One of the things I'd like to do is encourage others to get involved in writing more special functions. Now that the toolkit has support for rational approximations (Remez), I'm hoping the bar might be a little lower... or maybe not, as Remez is so notoriously flaky to use :-( John.

"John Maddock" <john@johnmaddock.co.uk> wrote:
Another question: how about Fast Fourier Transform? is it by definition already out of the scope of your library, or maybe not?
Yes, definitely outside the scope, and also patent protected I believe.
FFT is way too old to be patent protected. There are a number of free, high quality implementations. Cheers, Walter Landry wlandry@ucsd.edu

At 01:43 PM 10/4/2006, you wrote:
Documentation and source downloads are available online at www.johnmaddock.co.uk/toolkit
Regards,
John Maddock.
I've just started reading the documentation but I want to point out that phrases like:

"we conclude that there is no significant difference, and accept the null hypothesis"

are likely to interfere with any statistician taking the package seriously (unjustly, I think -- the statistics may be weak but you've obviously worked hard at the numerics, which is what you are supplying).

One never, ever "accepts the null hypothesis." One collects evidence and on that basis either rejects the null hypothesis or fails to reject it. The point is that you don't ever really have evidence *for* the null hypothesis, only a lack of evidence against it. It is quite a different thing to say "any difference in the means in this test is statistically insignificant" than to say "this test gives me an objective reason to believe that the difference in the means is exactly 0 (i.e., the null hypothesis) rather than, say, 1.0E-23 (which is as much a part of the alternate hypothesis as is 1.0E+23)". Or in other words, the lack of evidence of a difference should not be taken as evidence of a lack of difference.

Topher

Topher Cooper wrote:
At 01:43 PM 10/4/2006, you wrote:
Documentation and source downloads are available online at www.johnmaddock.co.uk/toolkit
Regards,
John Maddock.
I've just started reading the documentation but I want to point out that phrases like:
"we conclude that there is no significant difference, and accept the null hypothesis"
are likely to interfere with any statistician taking the package seriously (unjustly, I think -- the statistics may be weak but you've obviously worked hard at the numerics, which is what you are supplying).
One never, ever "accepts the null hypothesis." One collects evidence and on that basis either rejects the null hypothesis or fails to reject it. The point is that you don't ever really have evidence *for* the null hypothesis, only a lack of evidence against it. It is quite a different thing to say "any difference in the means in this test is statistically insignificant" than to say "this test gives me an objective reason to believe that the difference in the means is exactly 0 (i.e., the null hypothesis) rather than, say, 1.0E-23 (which is as much a part of the alternate hypothesis as is 1.0E+23)". Or in other words, the lack of evidence of a difference should not be taken as evidence of a lack of difference.
Ah, very good point. Thanks for raising this; it looks like another editing session is needed. I'm fairly sure there will be other areas where we've got the terminology wrong as well. John.

It is really valuable to have professional statisticians' input (correction!) like this, because we do want acceptance by professionals. But we also have a much bigger audience of potential 'amateur' users who start off being massively repelled by statistics-speak words like 'null hypothesis': meeting the requirements of both at the same time is not so easy. And we'll try to put this right and expose it to your scrutiny again.

Paul

---
Paul A Bristow
Prizet Farmhouse, Kendal, Cumbria UK LA8 8AB
+44 1539561830 & SMS, Mobile +44 7714 330204 & SMS
pbristow@hetp.u-net.com
participants (7):
- Janek Kozicki
- John Maddock
- Neal Becker
- Paul A Bristow
- Theodore Papadopoulo
- Topher Cooper
- Walter Landry