
On 11/10/2010 17:54, DE wrote:
> hi there
> if you focus on simd-aware - and more specifically x86 simd - implementation _DON'T_ read further
> consider OpenCL as a general way to speed up computations (which uses simd as one of backends as well as gpu shader units etc.)
> i think it will be much more general and useful

OpenCL is a possible implementation, but we choose to call the various SIMD instructions ourselves to have more control over the toolchain and the end result. OpenCL will however be our main backend for the GPU targets we will be supporting in the future. Or maybe we will just target those through Clang and LLVM.

NT2 (also known as the crazy frenchman library), upon which this effort is based, currently only supports x86 (SSE, ..., SSE4, AVX) and PowerPC (AltiVec). ARM (NEON, VFP) is being added.

It is designed so that instructions can register themselves for a particular (type, cardinal) pair; the registered candidates are ranked according to a category, and the best one is selected. It relies heavily on meta-programming, including rewritten bits of MPL to improve compile-time performance. It also uses expression templates with Proto to detect certain patterns, such as fused multiply-add, for which x86 is introducing new instructions soon. As a result it is very tunable and extensible.

The bit I wanted to discuss here, however, is not the implementation, but rather the interface that the library provides to the programmer. We aim to provide an interface in modern C++ that integrates well with the standard library and Boost, so that developers can use SIMD in an easy and fairly high-level fashion, potentially using meta-programming to write an algorithm with parametric register and cache sizes. OpenCL is not an interface that satisfies those criteria.
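
To make the interface point a bit more concrete, here is a rough sketch of the kind of thing meant by "an algorithm with parametric register size". This is not NT2's actual API: `pack`, `splat`, `load`, `store` and `axpy` are hypothetical names made up for illustration, and the pack below is a plain portable fallback that loops over its lanes. The assumption is that a real library would specialise it per (type, cardinal) pair to wrap a native register.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical pack of N lanes of T (N is the "cardinal").
// Portable fallback only: every operation is a loop over the lanes.
template <typename T, std::size_t N>
struct pack {
    std::array<T, N> lanes{};

    // Broadcast one scalar to all lanes.
    static pack splat(T v) {
        pack r;
        for (std::size_t i = 0; i < N; ++i) r.lanes[i] = v;
        return r;
    }
    // Read N contiguous values starting at p.
    static pack load(const T* p) {
        pack r;
        for (std::size_t i = 0; i < N; ++i) r.lanes[i] = p[i];
        return r;
    }
    // Write the N lanes to contiguous memory starting at p.
    void store(T* p) const {
        for (std::size_t i = 0; i < N; ++i) p[i] = lanes[i];
    }
    friend pack operator+(const pack& a, const pack& b) {
        pack r;
        for (std::size_t i = 0; i < N; ++i) r.lanes[i] = a.lanes[i] + b.lanes[i];
        return r;
    }
    friend pack operator*(const pack& a, const pack& b) {
        pack r;
        for (std::size_t i = 0; i < N; ++i) r.lanes[i] = a.lanes[i] * b.lanes[i];
        return r;
    }
};

// y = a * x + y, written once and parameterised on the pack width N.
// Elements that do not fill a whole pack are handled by a scalar tail loop.
template <std::size_t N, typename T>
void axpy(T a, const std::vector<T>& x, std::vector<T>& y) {
    const std::size_t n = x.size() < y.size() ? x.size() : y.size();
    const pack<T, N> va = pack<T, N>::splat(a);
    std::size_t i = 0;
    for (; i + N <= n; i += N) {
        const pack<T, N> vx = pack<T, N>::load(x.data() + i);
        const pack<T, N> vy = pack<T, N>::load(y.data() + i);
        (va * vx + vy).store(y.data() + i);
    }
    for (; i < n; ++i) y[i] = a * x[i] + y[i];
}
```

A caller would pick the width at compile time, e.g. `axpy<4>(2.0f, x, y)` for four floats per pack or `axpy<8>(2.0f, x, y)` for eight, without touching the algorithm itself. The `va * vx + vy` expression is also the kind of pattern an expression-template layer could recognise as a fused multiply-add, though nothing in this sketch does so.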