
On 20-12-2012 12:52, Mathias Gaunard wrote:
You can choose which extension to generate code for at compile time.
The approach we recommend is to compile various versions of your function with different settings, and then choose the right one at runtime depending on the host capabilities. We provide functions to easily check whether an extension is supported.
That seems cool. Can illegal instructions be in a binary as long as they are not executed?
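To make the idea concrete, the compile-several-versions-and-pick-at-runtime approach could be sketched roughly as below. This is not the library's API: it assumes GCC or Clang on x86 and uses their `__builtin_cpu_supports` builtin and `target` attribute, and the names `scale`, `scale_generic`, `scale_avx2` and `pick_scale` are made up for illustration. As far as I know, a CPU only faults on an unsupported instruction when it actually tries to execute it, so the AVX2 version can sit in the binary unused on older hosts.

```cpp
#include <cstddef>

// Only enable the x86-specific path where the GCC/Clang builtins exist.
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAS_X86_DISPATCH 1
#else
#define HAS_X86_DISPATCH 0
#endif

// Baseline version, compiled with the default settings of the build.
static void scale_generic(float* data, std::size_t n, float factor) {
    for (std::size_t i = 0; i < n; ++i) data[i] *= factor;
}

#if HAS_X86_DISPATCH
// Same loop, but the compiler is allowed to emit AVX2 instructions here.
__attribute__((target("avx2")))
static void scale_avx2(float* data, std::size_t n, float factor) {
    for (std::size_t i = 0; i < n; ++i) data[i] *= factor;
}
#endif

using scale_fn = void (*)(float*, std::size_t, float);

// Query the host once and return the best version it can run.
static scale_fn pick_scale() {
#if HAS_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) return scale_avx2;
#endif
    return scale_generic;
}

// Public entry point: the choice is made on first call and then cached.
void scale(float* data, std::size_t n, float factor) {
    static const scale_fn impl = pick_scale();
    impl(data, n, factor);
}
```

Callers just use `scale(ptr, n, 2.0f)`; the AVX2 body is only ever reached when the host reports support for it.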
B. Floating-point precision is crucial in some of our software. Maybe you could provide kahan_sum and high-precision normalization functions? Do you know of any precision problems in SIMD code?
We took great care to guarantee high precision in all our functions, along with NaN/inf support and, when reasonably supported by the hardware, denormal support. Some of these have performance costs, and can be disabled.
The main difference when writing SSE code is that float operations are really done in single precision and double operations in double precision. That is not the case with the x87 FPU, for example, which may use up to 80 bits of precision for intermediate computations.
Sure, the 80 bits are one reason that a naive sum is often as precise as a Kahan sum on x87. Let me ask another way: compilers often provide flags that affect how floating-point code is generated. For example, we use /fp:precise on VS. Does /fp:precise affect SIMD code?

-Thorsten