
DE wrote:
> i'll run through those links

If more is needed, feel free to mail me in private ;)

> for now listen
> as i can see you are familiar with simd programming, among other things ;)
> so when i researched the expression-template-based implementation of
> vector operations i tried to use sse2 through compiler intrinsics (since
> they are de facto portable), but i did not get any benefit for doubles
> ...
> i'm interested in what you can say on this

For simple operations you can't get more than a 20 or 30% speedup, and most of the time you get none. You can, however, speed up things like transcendental or trigonometric functions by a factor of 2 or 3.
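To make the point about doubles concrete, here is a minimal sketch of an elementwise addition written once as a plain loop and once with SSE2 intrinsics. The function names `add_scalar` and `add_sse2` are made up for illustration and are not taken from NT2 or any other library. An `__m128d` register holds only two doubles, so the best case is a 2x gain, and for a loop this simple the loads and stores tend to dominate, which is one plausible reason for seeing little or no benefit. The compiler may also auto-vectorize the scalar loop, narrowing the gap further.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics (x86 / x86-64 only)
#include <cassert>
#include <cstddef>

// Scalar reference: one addition per iteration.
void add_scalar(const double* a, const double* b, double* out, std::size_t n) {
    assert(a && b && out);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}

// SSE2 version (illustrative name): two doubles per 128-bit register.
// Unaligned loads/stores are used so no alignment is assumed.
void add_sse2(const double* a, const double* b, double* out, std::size_t n) {
    assert(a && b && out);
    std::size_t i = 0;
    for (; i + 2 <= n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);            // load a[i], a[i+1]
        __m128d vb = _mm_loadu_pd(b + i);            // load b[i], b[i+1]
        _mm_storeu_pd(out + i, _mm_add_pd(va, vb));  // store both sums
    }
    // Scalar tail for an odd element count.
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Both functions perform the same IEEE additions, so their results should match exactly. Timing them on large arrays is the simplest way to check how much of the theoretical 2x actually shows up on a given machine.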
Search the archive for my extended SIMD performance chart using the SIMD layer from NT2. We also target multicore using OpenMP (rather trivial) and are starting on GPUs this year with a new post-doctoral grant. So I hope to get everything working together and get a complete, be-all end-all matrix library out of that.

-- 
___________________________________________
Joel Falcou - Assistant Professor
PARALL Team - LRI - Universite Paris Sud XI
Tel : (+33)1 69 15 66 35