Re: [boost] BOOST::SIMD - handling double : precision vs speed

2 Apr 2009


On Thursday 02 April 2009, Joel Falcou wrote:
...
> SIMD algorithms for double precision seem to be rather hard to do right.
> It's difficult to match the precision of the scalar reference, because
> scalar algorithms take advantage of the internal 80-bit floating-point
> registers, so comparing our implementation against the reference
> yields things like 3000 ulp (i.e. 10^-13 RMS instead of 10^-16).
> Fixing this is difficult, and even where it's possible for some algorithms,
> the average speed-up then drops to less than 10%, i.e. about as fast as an
> unrolled scalar call over the SIMD vector elements.
> What should we enforce: precision or speed? Or is the 10^-13 RMS enough?

Why would you want your 64-bit SIMD floating-point calculations to act like
they are going through an x86 processor's 80-bit floating-point unit?  I
think the important thing is just conformance to the IEEE floating-point
standards.  My impression was that some of the early generations of SIMD
instructions were not compliant, but the newer versions all are.  If you're
just worried about comparison against some non-SIMD reference code, maybe it
would help to use compiler flags to disable internal 80-bit rounding when
compiling the reference code (I think you can do this on gcc at least).

Frank Mori Hess