
on 11.05.2010 at 20:20 joel falcou wrote :
> DE wrote:
>> here a question arises: since a compiler is able to generate very fast code involving simd instructions, is one supposed to provide a simd-enabled implementation of a generic library? personally i think now that it is worthless
> Yeah sure, let's look at how icc is able to vectorize arbitrarily large arithmetic functions (like cos, sqrt etc. that don't have a 1-1 SIMD intrinsic mapping).
i think i agree
> Other thing. If you work on dynamic memory, the compiler won't be doing any vectorization as the memory may not be aligned properly.
icc11 does it for dynamically allocated memory, both for C and C++ code. i think it's thanks to 'new', which allocates memory 8-byte aligned
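
for what it's worth, alignment of a given pointer can be checked directly instead of guessed. a minimal sketch, assuming the 16-byte alignment that aligned SSE loads require; `is_aligned` and `aligned_floats` are hypothetical names made up for illustration, not part of icc or any library discussed here:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// True when p is a multiple of `alignment` bytes.
// `alignment` must be a non-zero power of two.
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Hypothetical helper: over-allocates with malloc and rounds the
// pointer up to the next multiple of `alignment` (16 by default).
struct aligned_floats {
    void*  raw;   // pointer returned by malloc, passed to free
    float* data;  // aligned pointer into the same allocation

    explicit aligned_floats(std::size_t n, std::size_t alignment = 16)
        : raw(std::malloc(n * sizeof(float) + alignment)), data(nullptr) {
        if (raw) {
            const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
            const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(raw);
            // Rounding up skips at most alignment - 1 bytes,
            // so n floats still fit in the allocation.
            data = reinterpret_cast<float*>((addr + mask) & ~mask);
        }
    }
    ~aligned_floats() { std::free(raw); }

    aligned_floats(const aligned_floats&) = delete;
    aligned_floats& operator=(const aligned_floats&) = delete;
};
```

calling `is_aligned(p, 16)` on a pointer obtained from plain `new float[n]` would show whether a given platform hands out 16-byte-aligned blocks or only 8-byte-aligned ones.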
> Add to that loops with dependencies that humans know how to rewrite (but compilers don't),
maybe
> arbitrary function support, SoA support for things like complex or other ADS ...
what's SoA? and ADS?
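
SoA presumably stands for "structure of arrays", as opposed to "array of structures" (AoS). a minimal sketch of the difference for complex numbers; all type and function names below are hypothetical and only meant as illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// AoS: one struct per element, so memory holds re0 im0 re1 im1 ...
struct complex_aos { float re, im; };

// SoA: one array per field, so all real parts are contiguous
// and all imaginary parts are contiguous.
struct complex_soa {
    std::vector<float> re, im;
    explicit complex_soa(std::size_t n) : re(n), im(n) {}
    std::size_t size() const { return re.size(); }
};

// Element-wise product out[i] = a[i] * b[i].
// a and b must hold at least out.size() elements.
// Every load and store walks a plain float array with unit stride.
inline void multiply(const complex_soa& a, const complex_soa& b, complex_soa& out) {
    const std::size_t n = out.size();
    assert(a.size() >= n && b.size() >= n);
    for (std::size_t i = 0; i < n; ++i) {
        const float ar = a.re[i], ai = a.im[i];
        const float br = b.re[i], bi = b.im[i];
        out.re[i] = ar * br - ai * bi;
        out.im[i] = ar * bi + ai * br;
    }
}
```

with the AoS layout the same loop would have to separate the interleaved real and imaginary parts before doing the arithmetic, which is the usual argument for SoA in SIMD code.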
> Oh and, if your code was right, then both would have been inlined. That's what happens in our library (which led us to remove the unroll settings).
don't get it
-- Pavel