Re: [boost] Boost SIMD beta release

25 Dec 2012

      On Tue, Dec 25, 2012 at 7:10 PM, Joel Falcou <joel.falcou@gmail.com> wrote:
...
On 25/12/2012 15:43, Peter Dimov wrote:
...
Mathias Gaunard wrote:
...
The shifted iterator and the shifted load let you do aligned loads if
you statically know the misalignment of the memory.
Does this have any performance advantage over just using an unaligned
load? I'd expect the microcode to do whatever the shifted load does, but
I haven't measured it.
A shifted load is a pair of aligned loads plus bit shuffling. This is a
technique stemming from way back on Altivec. Experiments done on 1D filtering
using both showed some benefit over unaligned loads on pre-Nehalem CPUs.
AFAIK, even on post-Nehalem CPUs unaligned loads (and stores) are
slower if the operation spans a cache line boundary. I don't
have the numbers though.

Will shifted_iterator use palignr from SSSE3?
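
To make the "two aligned loads + shuffling" idea concrete, here is a rough
sketch using plain SSE2 intrinsics. It is not taken from Boost.SIMD: the name
shifted_load and its signature are made up for illustration, and the
misalignment is assumed to be a compile-time constant between 1 and 15 bytes.
With SSSE3 available, the two byte shifts and the OR could presumably be
replaced by a single _mm_alignr_epi8(hi, lo, Shift), i.e. palignr.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Hypothetical helper (not the Boost.SIMD API): returns the 16 bytes that
// start at aligned_base + Shift, using only aligned loads.
// Assumes aligned_base is 16-byte aligned and 32 bytes are readable.
template <int Shift>
__m128i shifted_load(const std::uint8_t* aligned_base) {
    static_assert(Shift > 0 && Shift < 16, "misalignment must be 1..15 bytes");
    // Two aligned loads covering the 32-byte window around the target.
    __m128i lo = _mm_load_si128(reinterpret_cast<const __m128i*>(aligned_base));
    __m128i hi = _mm_load_si128(reinterpret_cast<const __m128i*>(aligned_base + 16));
    // Drop the first Shift bytes of lo, then fill the tail with the first
    // Shift bytes of hi. SSSE3 alternative: _mm_alignr_epi8(hi, lo, Shift).
    return _mm_or_si128(_mm_srli_si128(lo, Shift),
                        _mm_slli_si128(hi, 16 - Shift));
}

// Sanity check: the shifted load should agree with a plain unaligned load.
bool shifted_load_matches_unaligned() {
    alignas(16) std::uint8_t buf[32];
    for (int i = 0; i < 32; ++i) buf[i] = static_cast<std::uint8_t>(i);
    __m128i a = shifted_load<5>(buf);
    __m128i b = _mm_loadu_si128(reinterpret_cast<const __m128i*>(buf + 5));
    return _mm_movemask_epi8(_mm_cmpeq_epi8(a, b)) == 0xFFFF;
}
```

Whether this beats a plain unaligned load on a given CPU is exactly the open
question above; the sketch only shows that the two forms return the same data.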


Andrey Semashev