
DE wrote:
Here a question arises: since a compiler is able to generate very fast code involving SIMD instructions, is one supposed to provide a SIMD-enabled implementation of a generic library? Personally, I now think it is worthless.

Yeah, sure. Let's see how icc manages to vectorize arbitrarily large arithmetic functions (like cos, sqrt, etc., which don't have a 1-1 SIMD intrinsic mapping).

Another thing: if you work on dynamic memory, the compiler won't be doing any vectorization, as the memory may not be aligned properly. Add to that loops with dependencies that a human knows how to rewrite (but the compiler doesn't), arbitrary function support, SoA support for things like complex or other ADS...

Oh, and if your code was right, then both would have been inlined. That's what happens in our library (which led us to remove the unroll settings).

--
___________________________________________
Joel Falcou - Assistant Professor
PARALL Team - LRI - Universite Paris Sud XI
Tel : (+33)1 69 15 66 35
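
To illustrate the SoA point for complex numbers, here is a minimal sketch. It is not taken from our library: the names ComplexSoA and multiply are hypothetical, picked only for this example. The real and imaginary parts are kept in two separate contiguous arrays, so each statement in the loop body reads and writes unit-stride data with no shuffling between re/im lanes. Whether a given compiler actually vectorizes this loop still depends on the compiler, its flags, and what it can prove about alignment and aliasing.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical structure-of-arrays (SoA) container for complex numbers:
// real parts and imaginary parts live in two separate contiguous arrays.
struct ComplexSoA {
    std::vector<float> re;
    std::vector<float> im;

    explicit ComplexSoA(std::size_t n) : re(n), im(n) {}
    std::size_t size() const { return re.size(); }
};

// Element-wise complex product: out[i] = a[i] * b[i].
// Assumes a, b and out all hold the same number of elements.
void multiply(const ComplexSoA& a, const ComplexSoA& b, ComplexSoA& out) {
    const std::size_t n = out.size();
    for (std::size_t i = 0; i < n; ++i) {
        // Load into locals first so the result is correct even if
        // out is the same object as a or b.
        const float ar = a.re[i], ai = a.im[i];
        const float br = b.re[i], bi = b.im[i];
        out.re[i] = ar * br - ai * bi;  // real part
        out.im[i] = ar * bi + ai * br;  // imaginary part
    }
}
```

With an array of std::complex<float> (AoS), the same loop sees re/im interleaved in memory, so a vectorized version has to deinterleave and reinterleave the lanes around the arithmetic.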