
----- "David A. Greene" <greened@obbligato.org> wrote:
Mathias Gaunard <mathias.gaunard@ens-lyon.org> writes:
> Making data parallelism simpler is the goal of NT2. And we do that by removing loops and pointers entirely.
First off, I want to apologize for sparking some emotions. That was not my intent. I am deeply sorry for not expressing myself well.
NT2 sounds very interesting! Does it generate the loops given calls into generic code?
Let me try to clarify my thinking about boost.simd a bit.
- There is a place for boost.simd. It's important that I emphasize this upfront.
- Here is how I see a typical programmer using it, given a loop nest:
  - Programmer tries to run the compiler on it, examines code
    - Code sometimes (maybe most of the time) executes poorly
    - If not, done
  - Programmer restructures loop nest to expose parallelism
    - Try compiler directives first, if available (tell compiler which loops to interchange, where to cache block, blocking factors, which loops to collapse, etc.)
    - Otherwise, hand-restructure (ouch!)
  - Programmer tries compiler again on restructured loop nest
    - Code may execute poorly
    - If not, done
  - Programmer adds directives to tell the compiler which loops to vectorize, which to leave scalar, etc.
    - Code may still execute poorly
    - If not, done
  - Programmer uses boost.simd to write vector code at a higher level than provided compiler intrinsics
Does that seem like a reasonable use case?
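To make the "restructure, then add directives" steps concrete, here is a minimal sketch of a hand loop interchange followed by a vectorization hint. The function names are hypothetical and chosen only for illustration. The example assumes a row-major matrix stored in a std::vector, and it uses the OpenMP `simd` directive as one example of a vectorization hint (active only when the compiler has OpenMP enabled). It does not use NT2 or boost.simd.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: scale every element of a row-major matrix by k.
// The inner loop walks down a column, so consecutive iterations are
// `cols` elements apart. Strided access like this often keeps the
// compiler from vectorizing the loop profitably.
void scale_strided(std::vector<float>& a, std::size_t rows,
                   std::size_t cols, float k) {
    for (std::size_t j = 0; j < cols; ++j)
        for (std::size_t i = 0; i < rows; ++i)
            a[i * cols + j] *= k;
}

// Same computation after interchanging the two loops by hand.
// The inner loop is now unit-stride, and the directive (an assumption:
// it takes effect only when OpenMP is enabled) asks for vectorization.
void scale_interchanged(std::vector<float>& a, std::size_t rows,
                        std::size_t cols, float k) {
    for (std::size_t i = 0; i < rows; ++i) {
#if defined(_OPENMP)
#pragma omp simd
#endif
        for (std::size_t j = 0; j < cols; ++j)
            a[i * cols + j] *= k;
    }
}
```

Both functions compute the same result; only the traversal order differs. Whether the second version actually runs faster depends on the compiler and target, which is exactly the "try the compiler again" step in the list.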
I hope I won't disappoint you, but in my case I'd put boost.simd a few steps higher in that list. From what I've seen in the slides, it's a fairly easy and really clean way to do things. I've always been taught to dive into the dirty details only as a last resort, so I'd use boost.simd first and then, if the performance target is still not met, try hand-written hints.

Regards,
Ivan