
On Fri, 1 Jun 2012, John Maddock wrote:
> Support for fused multiply-add is on my TODO list.

Consider the case of modular arithmetic, where you use an int64_t to do the operations and then reduce modulo some 30-bit prime number. When you perform many operations in a row, it pays to avoid computing the modulo every time and to do it only often enough that there won't be an overflow. FMA is the special case where this removes one modulo. But in a sum of a dozen numbers, you would want a single modulo at the end. Expression templates should be able to help, but I guess this is a wrong fit for this library (which is completely fine, I just thought I'd mention it).

(In the special case where we compute modulo 2^30 instead of a prime number, and use a uint32_t for the computation, a finalization function would be enough.)

---------

I just had a look at the updated introduction, and I realize I said something wrong about the example with 11 temporaries: the optimization that gets 0 temporaries in gmp-5.1 applies only to mpz and mpq, not mpf. Sorry for the misunderstanding. I may try to add the same optimization for mpf some time (mixed precisions give me headaches when I try to handle that code), but that's not really a plan.

-- 
Marc Glisse
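
For illustration, here is a minimal C++ sketch of the delayed reduction described above. The modulus 1000000007 (a prime that fits in 30 bits), the batch size, and the function names are assumptions made for this example; none of them come from the library under discussion. Inputs are assumed to be already reduced, i.e. in [0, kPrime).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical modulus for illustration: a prime below 2^30.
constexpr std::int64_t kPrime = 1000000007;

// Baseline: reduce after every multiply-add.
std::int64_t dot_mod_eager(const std::vector<std::int64_t>& a,
                           const std::vector<std::int64_t>& b) {
    std::int64_t acc = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        acc = (acc + a[i] * b[i] % kPrime) % kPrime;
    return acc;
}

// Delayed reduction: each product is below 2^60, so a reduced accumulator
// (below 2^30) plus 7 products stays below 2^63 and cannot overflow int64_t.
std::int64_t dot_mod_lazy(const std::vector<std::int64_t>& a,
                          const std::vector<std::int64_t>& b) {
    constexpr std::size_t kBatch = 7;  // products accumulated between reductions
    std::int64_t acc = 0;
    std::size_t pending = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        acc += a[i] * b[i];
        if (++pending == kBatch) {
            acc %= kPrime;
            pending = 0;
        }
    }
    return acc % kPrime;
}

// Plain sum of a few reduced values: a single modulo at the end suffices,
// since a dozen values below 2^30 sum to far less than 2^63.
std::int64_t sum_mod(const std::vector<std::int64_t>& v) {
    std::int64_t acc = 0;
    for (std::int64_t x : v) acc += x;
    return acc % kPrime;
}
```

Both dot-product variants return the same residue; the lazy one performs roughly one modulo per seven terms instead of two per term. The batch size is tied to the 30-bit bound on the inputs and would need to be recomputed for a different modulus or accumulator width.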