
On 10/06/11 15:16, David A. Greene wrote:
I don't think this presentation makes the case for this library. That said, I am very glad you and others are thinking about these problems.
Sorry, then.
Almost everything the compiler needs to vectorize well that it does not get from most language syntax can be summed up by two concepts: aliasing and alignment.
No. How can a compiler vectorize a function that lives in another binary .o? Who is going to vectorize cos and its ilk?
I don't see how pack<> addresses the aliasing problem in any way that is not similar to simply grabbing local copies of global data or parameters. Various C++ "restrict" extensions already address the latter. We desperately need something much better than "restrict" in standard C++. Manycore is the future and parallel processing is the new normal.
If you read the slides, you would have seen that pack is the messenger of the whole SIMD range system, which fits right into a *higher level of abstraction* rather than piggybacking on the compiler.
pack<> does address alignment, but it's overkill. It's also pessimistic. One does not always need aligned data to vectorize, so the conditions placed on pack<> are too restrictive. Furthermore, the alignment information pack<> does convey will likely get lost in the depths of the compiler, leading to suboptimal code generation unless that alignment information is available elsewhere (and it often is).
Well, my benchmarks disagree with this. See this old post of mine from one year ago on the same subject. If getting 95% of peak performance is pessimistic, then sorry.
I think a far more useful design of this library would be providing standard ways to assert certain conditions. For example:
No. Ranges that accept SIMD operations are a perfect high-level feature. We are writing a library, not an extension for compilers.
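To make the "range that accepts SIMD operations" idea concrete, here is a minimal sketch (my own illustration, not the actual Boost.SIMD API) of the shape such a range algorithm takes: the bulk of the data is processed one pack-width at a time, with a scalar epilogue for the tail that does not fill a full pack.

```cpp
#include <cstddef>

// Hypothetical sketch: a pack-width-aware transform over a contiguous
// range. W is the pack cardinal. A real implementation would load a
// SIMD register per iteration; here the inner loop just marks where
// that vector step would go.
template <std::size_t W, typename T, typename F>
void simd_transform(const T* in, T* out, std::size_t n, F f)
{
    std::size_t i = 0;
    // Main loop: one "pack" of W elements per iteration.
    for (; i + W <= n; i += W)
        for (std::size_t j = 0; j < W; ++j)
            out[i + j] = f(in[i + j]);
    // Scalar epilogue for the remaining n % W elements.
    for (; i < n; ++i)
        out[i] = f(in[i]);
}
```

The point of the design is that user code only ever sees the range-level algorithm; the pack width and the tail handling stay inside the library.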
What's under the operators on pack<>? Is it assembly code?
No, as naked assembly prevents proper inlining and other register-based compiler optimizations. We use whatever intrinsics are available for the current compiler/architecture at hand.
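As a rough illustration of that layering (a sketch of mine, not the real pack<> implementation), an operator on a small vector type can forward to an SSE intrinsic when one is available and fall back to a scalar loop otherwise, leaving the compiler free to inline and allocate registers:

```cpp
#if defined(__SSE__) || defined(_M_X64)
# include <xmmintrin.h>
# define PACK4F_HAVE_SSE 1
#endif

// Hypothetical 4-float pack; pack4f and its operator+ are illustrative
// names, not Boost.SIMD's.
struct pack4f
{
    float v[4];
};

inline pack4f operator+(const pack4f& a, const pack4f& b)
{
    pack4f r;
#ifdef PACK4F_HAVE_SSE
    // Intrinsic path: the compiler still sees ordinary C++, so this
    // inlines and participates in register allocation.
    _mm_storeu_ps(r.v, _mm_add_ps(_mm_loadu_ps(a.v), _mm_loadu_ps(b.v)));
#else
    // Portable fallback when no SIMD intrinsics are available.
    for (int i = 0; i < 4; ++i)
        r.v[i] = a.v[i] + b.v[i];
#endif
    return r;
}
```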
I wonder how pack<T> can know the best vector length. That is highly, highly code- and implementation-dependent.
No. On SSEx machines, SIMD vectors are 128 bits (16 bytes) wide, which means pack<T, 16/sizeof(T)> is optimal, so a simple meta-function finds it.
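Such a meta-function is a one-liner; this sketch assumes the 128-bit SSE case (the name optimal_cardinal is mine, not the library's):

```cpp
#include <cstddef>

// Optimal pack cardinal for a 128-bit (16-byte) SIMD register:
// register width in bytes divided by the element size.
template <typename T, std::size_t RegisterBytes = 16>
struct optimal_cardinal
{
    static const std::size_t value = RegisterBytes / sizeof(T);
};
```

So optimal_cardinal<float>::value is 4 and optimal_cardinal<double>::value is 2; a wider register is handled by changing the one default parameter.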
How does simd::where define pack<> elements of the result where the condition is false? Often the best solution is to leave them undefined but your example seems to require maintaining current values.
This makes no sense. False is [0 ... 0], true is [~0 ... ~0]. Period. SIMD is all about being branchless, so everything is computed over the whole vector. It seems to me you didn't get that pack is NOT a data container but a layer above SIMD registers, which then gets hidden under the concept of a ContiguousRange.
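A scalar sketch of the branchless selection described above (my illustration of the technique, not the simd::where implementation): both branches are computed for every lane, then the all-zeros/all-ones mask blends them bitwise as (mask & t) | (~mask & f).

```cpp
#include <cstdint>
#include <cstring>
#include <cstddef>

// Branchless per-lane select over 4 lanes. mask[i] must be 0x00000000
// (false) or 0xFFFFFFFF (true), as in the SIMD comparison convention.
inline void select4(const std::uint32_t mask[4],
                    const float t[4], const float f[4], float out[4])
{
    for (std::size_t i = 0; i < 4; ++i)
    {
        std::uint32_t ti, fi;
        std::memcpy(&ti, &t[i], sizeof(float)); // reinterpret lane bits
        std::memcpy(&fi, &f[i], sizeof(float));
        std::uint32_t ri = (mask[i] & ti) | (~mask[i] & fi);
        std::memcpy(&out[i], &ri, sizeof(float));
    }
}
```

On real hardware the same blend is a couple of bitwise vector instructions, with no branch anywhere, which is why the "false" lanes carry computed values rather than being left undefined.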
How portable is Boost.simd? By portable I mean, how easy is it to move the code from one machine to another get the same level of performance?
It works on gcc and msvc, on SSE and AltiVec, and we have started looking at ARM NEON. Most of these deliver the same level of performance.
I don't mean to be too discouraging. But a library to do this kind of stuff seems archaic to me. It was archaic when Intel introduced MMX. If possible, I would like to see this evolve into a library to convey information to the compiler.
I'll keep my archaic stuff that gives me a 4x-8x speed-up rather than wait for the compiler-based solution nobody has been able to give me since 1999. We already had this discussion two years ago, so I am not keen to go over it all again, as it clearly seems you are just retelling the same FUD as last time.