
John Maddock wrote:
Which leads on to a quick comparison I ran against the "known good" data here: http://www.itl.nist.gov/div898/strd/univ/homepage.html The test program is attached, and outputs the relative error in the statistics calculated.
Oops, forgot the attachment, here it is.
After some more investigation:

1) I fell into the N vs N-1 standard deviation trap myself; corrected test file attached, hopefully right this time! The output is now:

   PI data:
     Error in mean is: 0
     Error in SD is:   3.09757e-016
   Lottery data:
     Error in mean is: 6.57202e-016
     Error in SD is:   0
   Accumulator 2 data:
     Error in mean is: 9.25186e-015
     Error in SD is:   0.000499625
   Accumulator 3 data:
     Error in mean is: 5.82076e-016
     Error in SD is:   0.071196
   Accumulator 4 data:
     Error in mean is: 9.87202e-015
     Error in SD is:   -1.#IND

2) I re-ran the last calculation using NTL::RR at 1000-bit precision; the final test case now gives a sensible answer rather than a NaN. But...

3) The results for standard deviation (taken as the square root of the variance) are still off. For the last "torture test" data set, from http://www.itl.nist.gov/div898/strd/univ/data/NumAcc4.dat, I see:

   Test       | Result        | Rel Error
   Your code  | 0.1046776164  | 0.04677616448
   Naive RMSD | 0.09995003803 | 0.0004996197271
   True Value | 0.1           | 0

The "Naive RMSD" just does a very naive "root mean square deviation from the mean" calculation. I believe (but haven't checked) that the remaining difference between this naive calculation and the true value comes from the inputs having inexact binary representations: to verify, I would need to lexical_cast everything from a string representation to an NTL::RR rather than storing through an intermediate double. Can't be bothered to test this at present I'm afraid :-(

It's still rather alarming, though, that the "naive" method appears to be 100 times more accurate than the accumulator. Hoping I'm doing something wrong,

yours, John.