
Maybe. My understanding is that FFT doesn't become viable until the digit count grows truely huge? Would the Karatsuba algorithm be viable sooner and help more users? My gut feeling is that the current implementation is perfectly fine up to a few hundred (and maybe a thousand?) decimal places, which will probably take care of most use cases.... well my use cases anyway!
In my previous unpublished work (mp_cpp, also using base-10^8), I tuned to the following: static const INT32 mp_elem_karatsuba_min = static_cast< INT32>(40); // Approx. 280 digits. static const INT32 mp_elem_fft_min = static_cast< INT32>(129); // Approx. 900 digits.
I've got a recursive Karatsuba template available in my catalog already. Maybe we should try it out.
For sure!
The FFT mentioned above was FFTW (a sadly non-BPL wonder of mankind) so a vanilla FFT would be, maybe, 2 or 3 times slower.
Unless someone wants to implement the same optimisations over again as a template library? Way beyond our scope to do that though...
So maybe examples are a higher priority for now?
Yes, you are right. I just don't want to get into trouble with the community. Everyone wants to compute a million digits of pi, and they might get mad at us if boost can't do it.
Million digits of PI? Look 'em up I say :-)
We could also go through the current the code looking for optimisation oportunities - for example would the inversion code be more efficient using Halley rather than Newton iteration (see for example http://numbers.computation.free.fr/Constants/Algorithms/inverse.html)?
Interesting...
Lot's of good stuff explained on that site including binary splitting.
Also multiplication and division by integer (relatively common operations?) as special cases.
Fortunately, we already have this. The routines mul_unsigned_long_long() and div_unsigned_long_long() utilize O(N) linear mul/div by integer. And we support them in your interface functions eval_multiply() and eval_divide().
My mistake, excellent. Cheers, John.