
Long e-mail here. Lots of information, and I need your input.
My attempts with recursive Karatsuba in base-10 were a disaster: the expected benefit was not achieved, all the way up to 100,000 decimal digits.
:-(
I experimented with FFTs. I need your input on whether we *really* want to make very high precision available at this stage. The community might expect it. But I really don't know if you would like to introduce a potentially poorly tested FFT-based multiplication scheme at this late stage in the game, or if you would prefer to do it in a more controlled and better-tested fashion after review, potential acceptance and establishment.
A *dumb* FFT was slow, not outperforming O(N^2) until over 50,000 decimal digits. A merely *naive* FFT of Danielson-Lanczos type was better, outperforming O(N^2) at around 10,000 decimal digits. It's low-impact, clean code, maybe 30 lines. A rough sketch is below.
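For reference, the kernel is essentially the textbook iterative radix-2 transform built on the Danielson-Lanczos lemma. This is only an illustrative sketch, not the exact code I am testing; the function name and the use of std::complex<double> are just for this sketch, and the input length must be a power of two.

#include <cmath>
#include <complex>
#include <vector>

void fft_danielson_lanczos(std::vector<std::complex<double>>& a, bool inverse)
{
   const std::size_t n = a.size(); // must be a power of two

   // Bit-reversal permutation.
   for (std::size_t i = 1, j = 0; i < n; ++i)
   {
      std::size_t bit = n >> 1;
      for (; j & bit; bit >>= 1) { j ^= bit; }
      j ^= bit;
      if (i < j) { std::swap(a[i], a[j]); }
   }

   // Danielson-Lanczos butterflies: combine transforms of length len/2.
   const double pi = std::acos(-1.0);
   for (std::size_t len = 2; len <= n; len <<= 1)
   {
      const double ang = (inverse ? 2 : -2) * pi / static_cast<double>(len);
      const std::complex<double> wlen(std::cos(ang), std::sin(ang));

      for (std::size_t i = 0; i < n; i += len)
      {
         std::complex<double> w(1);
         for (std::size_t k = 0; k < len / 2; ++k)
         {
            const std::complex<double> u = a[i + k];
            const std::complex<double> v = a[i + k + len / 2] * w;
            a[i + k]           = u + v;
            a[i + k + len / 2] = u - v;
            w *= wlen;
         }
      }
   }

   // Scale on the inverse transform.
   if (inverse)
   {
      for (auto& x : a) { x /= static_cast<double>(n); }
   }
}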
Nice, had no idea it could be implemented in such a small space!
Still, the naive FFT gives a slow multiplication. Performing 100 individual 100,000-digit square-root calculations with the naive FFT requires 12 seconds on my machine. State-of-the-art MPFR does the same square-root test in about 0.8 sec. This puts us off the best by a factor of about 15. My goal with base-10, without Karatsuba or Toom-Cook, would be, say, 3-4 times slower than the best. Still, the naive FFT is a good start: not embarrassing; rather, it shows the potential and paves the way for improvement over the years.
My program crashed on a million-digit square-root calculation.
I could potentially make a much faster FFT (a factor of 3 or 4) in fall or winter.
The square-root algorithm required some slight modifications to converge properly at high digit counts (sketch of the basic recurrence below). Other Newton iterations may need similar tuning if we decide to make high precision available. In addition, ultra-high precision needs other algorithms for atan, exp, etc., since Taylor series lose their effectiveness there. These would need to be built up over years if you would like to go that way.
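For context, the multiplication itself is just the usual convolution trick: treat the decimal digit arrays as polynomial coefficients, transform both, multiply pointwise, transform back, and propagate carries. A hedged sketch follows, reusing the fft_danielson_lanczos sketch above; the name fft_multiply is illustrative, the digits are assumed least-significant first and one per element, and the real code would use larger limbs and guard against double rounding error on very long operands.

#include <complex>
#include <cstdint>
#include <vector>

void fft_danielson_lanczos(std::vector<std::complex<double>>& a, bool inverse); // from the sketch above

std::vector<std::uint32_t> fft_multiply(const std::vector<std::uint32_t>& a,
                                        const std::vector<std::uint32_t>& b)
{
   // Transform length: next power of two at or above the convolution length.
   std::size_t n = 1;
   while (n < a.size() + b.size()) { n <<= 1; }

   std::vector<std::complex<double>> fa(a.begin(), a.end());
   std::vector<std::complex<double>> fb(b.begin(), b.end());
   fa.resize(n);
   fb.resize(n);

   // Forward transforms, pointwise product, inverse transform.
   fft_danielson_lanczos(fa, false);
   fft_danielson_lanczos(fb, false);
   for (std::size_t i = 0; i < n; ++i) { fa[i] *= fb[i]; }
   fft_danielson_lanczos(fa, true);

   // Round back to integers and propagate decimal carries (base 10 here).
   std::vector<std::uint32_t> result(a.size() + b.size(), 0U);
   std::uint64_t carry = 0U;
   for (std::size_t i = 0; i < result.size(); ++i)
   {
      carry += static_cast<std::uint64_t>(fa[i].real() + 0.5);
      result[i] = static_cast<std::uint32_t>(carry % 10U);
      carry /= 10U;
   }
   return result;
}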
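For the square root, the underlying recurrence is the ordinary Newton iteration x_{n+1} = (x_n + s/x_n) / 2; the modifications I mention concern the stopping criterion and the working precision at high digit counts. A toy sketch in plain double, just to show the recurrence (not the cpp_dec_float implementation; in the real code the working precision is typically increased as the iteration converges):

#include <cmath>

double newton_sqrt(double s)
{
   if (s <= 0.0) { return 0.0; }          // domain guard for the sketch

   double x = (s > 1.0) ? (s / 2) : 1.0;  // crude initial guess
   for (int i = 0; i < 60; ++i)           // fixed cap for the sketch
   {
      const double next = 0.5 * (x + s / x);  // Newton step
      if (next == x) { break; }               // converged at this precision
      x = next;
   }
   return x;
}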
Sigh.... yes, I forgot about those, so it sounds like, for now at least, these more advanced algorithms aren't going to help much - at least not for cpp_dec_float's target audience: those wanting higher precision than long double, but not crazy digit counts? So my gut feeling is to hold off for now, and maybe do things properly later (dynamic allocation, better std lib support, etc.)?
So maybe examples are a higher priority for now?
We have some examples now. I am awaiting your input on whether you think I should contribute more floating-point examples.
I think the existing ones work out quite nicely now, thanks. I could use some ideas for integer and maybe rational number examples, but that's a whole other ball game ;-)
P.S. I also have a potential solution to my excessive guard digits.
I resolved the excessive guard-digit issue and reduced the guard digits to the necessary amount (with the rationale provided in a code comment).
It may require re-running the test cases because tiny tolerances might shift. I don't expect that, but it's possible.
With your permission, I can check it in on Tuesday.
Sure, or mail me a patch and I'll run the full tests (or you can - just cd into libs/multiprecision/test, run "bjam --enable-specfun", and then wait for a long time!). Thanks, John.