
Hi David,
- Programmer tries to run the compiler on it, examines code
- Programmer restructures loop nest to expose parallelism
- Try compiler directives first, if available (tell compiler which loops to interchange, where to cache block, blocking factors, which loops to collapse, etc.)
- Programmer tries compiler again on restructured loop nest
- Programmer adds directives to tell the compiler which loops
- Programmer uses boost.simd to write vector code at a higher level than provided compiler intrinsics
Does that seem like a reasonable use case?
In theory, yes. In practice, the vectorization can stop working when you change compilers (sometimes just to a new version of the same compiler) or when something changes in the code around the loop, not to mention the very common case where somebody makes a simple, innocent-looking change to the loop code and the auto-vectorizer silently switches off.

I think explicitly using Boost.SIMD (or any other explicit solution such as BLAS, MKL, or similar) is much more robust in practice.

Thanks,
Maxim
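
P.S. A minimal sketch of the kind of innocent-looking change meant above. The function names are hypothetical and chosen only for illustration. The first loop has independent iterations and is the sort of code compilers commonly auto-vectorize at higher optimization levels. The second differs by a single index and introduces a loop-carried dependency, so a vectorizer would typically give up on it without any warning by default. Whether either loop is actually vectorized depends on the compiler, its version, and the flags used; diagnostics such as GCC's -fopt-info-vec-missed or Clang's -Rpass-missed=loop-vectorize are one way to check.

```cpp
#include <cstddef>

// Independent iterations: y[i] depends only on x[i] and the old y[i].
// Assumption: x and y do not overlap. Compilers commonly vectorize this form.
void scale_add(float* y, const float* x, float a, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// One-character-style edit: y[i] now reads y[i - 1], which was written in the
// previous iteration. This loop-carried dependency usually blocks
// straightforward auto-vectorization, and nothing tells you so by default.
void scale_add_running(float* y, const float* x, float a, std::size_t n) {
    for (std::size_t i = 1; i < n; ++i) {
        y[i] = a * x[i] + y[i - 1];
    }
}
```

Both functions compile cleanly and look almost identical in review, which is why the loss of vectorization tends to go unnoticed until someone measures performance.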