
Also, after a rough glance through multi_array and uBLAS, it looks like GIL, uBLAS, and multi_array repeat very similar concepts (and use different syntax). Isn't that kind of bad? I have played with GIL in the past, but now I am scratching my head about these other libraries.
Beware that running "some tests" on arbitrarily large arrays of any kind must take cache misses into account, especially in the above example, where I don't think even a quarter of the data fits into the L1 cache. Since cache misses may lead to a tenfold increase in computing time, this may render performance tests unusable.

I have struggled quite a bit with optimized arrays and tried many ways of implementing them. Getting all of the following at once is not easy:
- chained operator[] access syntax
- good performance in L1 cache
- an easy way to perform tiled computation
- compatibility with other high-performance settings (padding and alignment for SIMD, for example)

The best way I have found so far is a compact, Numerical Recipes-like structure: for a D1*D2*..*Dn array, you allocate one contiguous block of D1*D2*..*Dn elements plus n-1 arrays of pointers, one per level of indirection. Returning the top-level pointer (a T**..* with n stars) then lets you access element (i1,i2,..,in) through n chained [] calls with no manual stride computation, which makes tiling easy to write as a nest of loops over the array. A container that cannot support tiling easily is useless in most real-life applications, since this simple technique can significantly improve runtime performance.

I actually wrote a thesis on the subject and still struggle to get a decent library out of all my experiments. There is a prototype (NT2 on SourceForge, for reference), but it is nowhere near complete; in it I use expression templates to evaluate a somewhat optimal tiling size at compile time. If needed, I can post my experimental results here so we can all brainstorm about this, but I think the problem of an "easy-to-use and good performance" numeric data container is hard and not currently addressed anywhere.
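
To make the Numerical Recipes-style layout described above concrete, here is a minimal sketch reduced to n = 2 dimensions, so there is one contiguous block and a single array of row pointers. The names array2d and transpose_tiled and the default tile size of 32 are hypothetical, chosen for illustration only; this is not code from NT2 or from any of the libraries mentioned. It also uses std::vector for storage instead of raw allocation, and it ignores padding and SIMD alignment.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical illustration: one contiguous block of d1*d2 elements plus
// one array of row pointers (n-1 = 1 pointer level for n = 2 dimensions).
template <typename T>
class array2d {
public:
    array2d(std::size_t d1, std::size_t d2)
        : d1_(d1), d2_(d2), data_(d1 * d2), rows_(d1) {
        for (std::size_t i = 0; i < d1; ++i)
            rows_[i] = data_.data() + i * d2;  // row i starts at offset i*d2
    }
    // The row pointers refer into data_, so copying is disabled in this sketch.
    array2d(const array2d&) = delete;
    array2d& operator=(const array2d&) = delete;

    // a[i] yields a T*, so a[i][j] works with no explicit stride arithmetic.
    T* operator[](std::size_t i) { return rows_[i]; }
    const T* operator[](std::size_t i) const { return rows_[i]; }

    std::size_t rows() const { return d1_; }
    std::size_t cols() const { return d2_; }

private:
    std::size_t d1_, d2_;
    std::vector<T> data_;   // the single contiguous memory block
    std::vector<T*> rows_;  // pointer level giving chained [] access
};

// Tiled transpose written as a plain loop nest: the two outer loops walk
// over tiles, the two inner loops walk inside one tile.
// dst is assumed to be src.cols() x src.rows(); tile = 32 is an arbitrary guess.
template <typename T>
void transpose_tiled(const array2d<T>& src, array2d<T>& dst,
                     std::size_t tile = 32) {
    if (tile == 0) tile = 1;  // avoid an infinite loop on a zero tile size
    const std::size_t n = src.rows();
    const std::size_t m = src.cols();
    for (std::size_t ii = 0; ii < n; ii += tile) {
        for (std::size_t jj = 0; jj < m; jj += tile) {
            const std::size_t iend = std::min(ii + tile, n);
            const std::size_t jend = std::min(jj + tile, m);
            for (std::size_t i = ii; i < iend; ++i)
                for (std::size_t j = jj; j < jend; ++j)
                    dst[j][i] = src[i][j];
        }
    }
}
```

Declaring array2d<float> a(1024, 1024) and array2d<float> t(1024, 1024) and calling transpose_tiled(a, t) would exercise it. A sensible tile size depends on the cache and the element type, so it should be measured rather than assumed.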