Re: [boost] BOOST::SIMD - handling double : precision vs speed

2 Apr 2009

      On Thu, 2 Apr 2009, Joel Falcou wrote:
...
SIMD algorithms for double precision seem to be rather hard to do right.
It's difficult to get the right precision with respect to the scalar
reference as scalar algorithm take advantages of the internal 80 bits
floating points register, thus leading comparison between our
implementation and the reference to yields things like 3000 ulp (ie
10^-13 RMS instead of 10^16).
...
Discussions welcome.
My understanding is that that the problem lies with Intel's 80-bit 
"internal" precision.  I've seen people force a copy out of the FP 
registers to counteract this, but I forget the full logic behind why. 
Maybe just to achieve cross-platform repeatability.

For your purposes, it might be best to have "slow, IEEE-compliant" scalar 
ops for checking results and "fast, Intel-specific" scalars for comparing 
timings.

- Daniel

Re: [boost] BOOST::SIMD - handling double : precision vs speed

dherring＠ll.mit.edu