
Simonson, Lucanus J wrote:
> It sounds to me like such a thing would need to employ architecture-specific techniques such as compiler intrinsics and inline assembly to be in any way competitive in terms of performance with what probably exists in closed-source development shops. Also, these things are best offloaded to the GPU if possible, and that code isn't written in C++.

SIMD intrinsics are actually enough, but the last time I proposed something along those lines, it was dismissed as useless.

GPUs are limited by the speed of the bus, and you'll end up spending too much time sending data back and forth.

-- 
___________________________________________
Joel Falcou - Assistant Professor
PARALL Team - LRI - Universite Paris Sud XI
Tel : (+33)1 69 15 66 35