Re: [boost] BOOST::SIMD - handling double : precision vs speed

2 Apr 2009


      ...
On Thu, 2 Apr 2009, Joel Falcou wrote:
...
SIMD algorithms for double precision seem to be rather hard to do right.
It's difficult to get the right precision with respect to the scalar
reference, as scalar algorithms take advantage of the internal 80-bit
floating-point registers, so comparing our implementation against the
reference yields things like 3000 ulp (i.e.
10^-13 RMS instead of 10^-16).
...
Discussions welcome.
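
For readers who want to reproduce an error figure such as "3000 ulp", here is a minimal sketch of one common way to count the ulp distance between two doubles. It is not the measurement code used in the thread; the names ordered_bits and ulp_distance are hypothetical, and the sketch assumes a 64-bit IEEE 754 double.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Map a double onto an integer scale that is monotonic in the value,
// so that adjacent representable doubles differ by exactly 1.
// Assumes double is a 64-bit IEEE 754 type.
inline std::int64_t ordered_bits(double x)
{
    static_assert(sizeof(double) == sizeof(std::int64_t), "64-bit double expected");
    std::int64_t i;
    std::memcpy(&i, &x, sizeof i);  // well-defined way to read the bit pattern
    // Negative doubles are stored sign-magnitude; mirror them below zero.
    return i < 0 ? std::numeric_limits<std::int64_t>::min() - i : i;
}

// Number of representable doubles between a and b (0 when they are equal).
inline std::uint64_t ulp_distance(double a, double b)
{
    assert(!std::isnan(a) && !std::isnan(b));  // NaN has no place on the scale
    const std::int64_t oa = ordered_bits(a);
    const std::int64_t ob = ordered_bits(b);
    // Subtract in unsigned arithmetic so opposite-sign inputs cannot overflow.
    const std::uint64_t ua = static_cast<std::uint64_t>(oa);
    const std::uint64_t ub = static_cast<std::uint64_t>(ob);
    return oa >= ob ? ua - ub : ub - ua;
}
```

Calling ulp_distance(simd_result, scalar_reference) on each output element gives a per-element error count that does not depend on the magnitude of the values.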
My understanding is that the problem lies with Intel's 80-bit 
"internal" precision.  I've seen people force a copy out of the FP 
registers to counteract this, but I forget the full logic behind why. 
Maybe just to achieve cross-platform repeatability.
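
A minimal sketch of the "force a copy out of the FP registers" trick, assuming the usual approach of a store through a volatile double. The helper names round_to_double and sum_rounded are hypothetical. On an x87 build the store rounds the 80-bit intermediate to 64 bits; on an SSE2 build it should change nothing.

```cpp
#include <cstddef>

// Store the value to memory as a 64-bit double and read it back.
// The volatile qualifier keeps the compiler from eliding the store,
// so any extra precision held in a wider register is dropped.
inline double round_to_double(double x)
{
    volatile double stored = x;
    return stored;
}

// Sum an array, rounding the accumulator to double after every addition.
inline double sum_rounded(const double* v, std::size_t n)
{
    double acc = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        acc = round_to_double(acc + v[i]);
    return acc;
}
```

Note that rounding first to 80 bits and then to 64 bits can still differ, in rare cases, from a single rounding to 64 bits, so this narrows the gap without guaranteeing bit-identical results. With GCC, the -ffloat-store and -mfpmath=sse options are compiler-level alternatives.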
For your purposes, it might be best to have "slow, IEEE-compliant" 
scalar ops for checking results and "fast, Intel-specific" scalars for 
comparing timings.
- Daniel
In reply to dherring@ll.mit.edu:
Yes, I agree. I am working with Joel, and this is one of the problems I
am trying to address. It seems to me that we must provide a fast
algorithm for SIMD double, because an accurate one will be slower than
simply mapping the scalar function over the elements of the SIMD vector.
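
To make "mapping the scalar function over the SIMD vector" concrete, here is a minimal sketch. The type pack2d is a hypothetical, portable stand-in for a two-lane double register such as SSE2's __m128d, and map_scalar and exp_accurate are illustrative names, not Boost.SIMD API.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for a two-lane double SIMD register.
using pack2d = std::array<double, 2>;

// Unpack each lane, apply the scalar function, and repack the results.
template <class F>
pack2d map_scalar(F f, const pack2d& v)
{
    pack2d r{};
    for (std::size_t i = 0; i < v.size(); ++i)
        r[i] = f(v[i]);  // one scalar call per lane: no SIMD speedup
    return r;
}

// Accurate fallback: exponential computed lane by lane with std::exp.
inline pack2d exp_accurate(const pack2d& v)
{
    return map_scalar([](double x) { return std::exp(x); }, v);
}
```

This is the baseline a native SIMD double implementation has to beat: it inherits the accuracy of the scalar routine, and its cost is roughly two scalar calls plus the unpack and repack.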
It also seems to me that SIMD is not yet mature for double: two elements
per vector are too few to expect a big gain from branchless algorithms
for math functions...

           Jean-thierry

jtl