
Mathias Gaunard <mathias.gaunard@ens-lyon.org> writes:
> On 06/06/2011 12:10, Robert Jones wrote:
>> From reading a few of the nt2 webpages, and Wikipedia on SSE2, the business of exploiting SIMD capability seems to be in the domain of the compiler. How does this look from a library perspective? What are the mechanisms you'll use/consider?
>
> This has been discussed several times in the past on this mailing list already.
> I suggest you take a look at the BoostCon 2011 presentation: <https://github.com/boostcon/2011_presentations/raw/master/thu/simd.pdf>

I don't think this presentation makes the case for this library. That said, I am very glad you and others are thinking about these problems.

Almost everything the compiler needs to vectorize well that it does not get from most language syntax can be summed up by two concepts: aliasing and alignment.

I don't see how pack<> addresses the aliasing problem in any way that is not similar to simply grabbing local copies of global data or parameters. Various C++ "restrict" extensions already address the latter. We desperately need something much better than "restrict" in standard C++. Manycore is the future and parallel processing is the new normal.

pack<> does address alignment, but it's overkill. It's also pessimistic. One does not always need aligned data to vectorize, so the conditions placed on pack<> are too restrictive. Furthermore, the alignment information pack<> does convey will likely get lost in the depths of the compiler, leading to suboptimal code generation unless that alignment information is available elsewhere (and it often is).

I think a far more useful design of this library would be providing standard ways to assert certain conditions. For example:

    simd::assert(simd::is_aligned(&v[0], 16))
    for (...) {
    }

(Of course one can do the above today in standard C++ but the above is more readable.)

and:

    simd::assert(simd::no_overlap(v, w))
    for (...) {
    }

or even for the old vectorheads:

    simd::assert(simd::ivdep)
    for (...) {
    }

Provide simple things the compiler can recognize via pattern matching and we'll be a long way toward getting the compiler to autovectorize.

I like simd::allocator to provide certain guarantees to memory managed by containers. That plus some of the asserts described above could help generic code a lot.

Other questions about the library:

What's under the operators on pack<>? Is it assembly code?

I wonder how pack<T> can know the best vector length. That is highly, highly code- and implementation-dependent.
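
To make the alignment and overlap assertions proposed above a little more concrete, here is a minimal sketch of how such checks might be written in standard C++ today. The namespace simd_sketch and the functions is_aligned, no_overlap and saxpy are hypothetical names chosen for illustration; they are not part of Boost.SIMD or nt2. The sketch only performs runtime checks through the standard assert macro, and it makes no assumption about whether any particular compiler would treat them as vectorization hints.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

namespace simd_sketch {  // hypothetical namespace, not a real library

// True if p is a multiple of `alignment` bytes.
// Assumes `alignment` is a power of two.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// True if the ranges [a, a+n) and [b, b+m) share no element.
template <class T>
bool no_overlap(const T* a, std::size_t n, const T* b, std::size_t m) {
    if (n == 0 || m == 0) return true;  // an empty range overlaps nothing
    // std::less gives a total order even for pointers into different arrays.
    std::less<const T*> before;
    // Disjoint when one range ends at or before the start of the other.
    return !before(b, a + n) || !before(a, b + m);
}

// y[i] += alpha * x[i], with the preconditions stated ahead of the loop.
inline void saxpy(float alpha, const float* x, float* y, std::size_t n) {
    assert(is_aligned(x, 16) && is_aligned(y, 16));
    assert(no_overlap(x, n, y, n));
    for (std::size_t i = 0; i < n; ++i) {
        y[i] += alpha * x[i];
    }
}

}  // namespace simd_sketch
```

With NDEBUG defined the assert calls compile away entirely, so in this form the checks document and verify the preconditions but carry no information into an optimized build. Closing that gap is the kind of thing a library meant to convey information to the compiler would have to address.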

How does simd::where define pack<> elements of the result where the condition is false? Often the best solution is to leave them undefined but your example seems to require maintaining current values.

How portable is Boost.simd? By portable I mean, how easy is it to move the code from one machine to another and get the same level of performance?

I don't mean to be too discouraging. But a library to do this kind of stuff seems archaic to me. It was archaic when Intel introduced MMX. If possible, I would like to see this evolve into a library to convey information to the compiler.

-Dave