
On Thu, 7 August 2008 21:09, Andrea Denzler wrote:
Working with indexes is definitely easier and more comfortable. And probably in many situations the overhead is not worth optimizing at all.
I was only surprised when you said there is a performance gain using indexes.
That's what my experiments have shown for months. Again, my hands were faster than my thoughts. Apologies for the misunderstanding.
You are losing performance unless the compiler is intelligent enough to avoid the multiplication at each step.
I have yet to find a compiler that is not that intelligent. From VC6 to ICC and g++ since v3.x, I have never had to complain about the compiler output. Then again, most of the computation is done ahead of time during the array allocation (see the nrc_alloc_array function in my source).
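To make the allocation-time argument concrete, here is a minimal sketch of a row-pointer layout. It is not the nrc_alloc_array function from my source: the class name Array2D and its members are hypothetical, and the sketch only assumes one contiguous block plus a table of row pointers filled in once.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical illustration of a row-pointer 2D array.
// Not the nrc_alloc_array function referred to in the text.
template <typename T>
class Array2D {
public:
    Array2D(std::size_t h, std::size_t w)
        : h_(h), w_(w), data_(h * w), rows_(h) {
        // The i * w multiplications happen once, here, at allocation time.
        for (std::size_t i = 0; i < h; ++i)
            rows_[i] = data_.data() + i * w;
    }

    // Copying would leave rows_ pointing into the source object, so forbid it.
    Array2D(const Array2D&) = delete;
    Array2D& operator=(const Array2D&) = delete;

    // a[i][j] fetches the row pointer and offsets it by j:
    // no multiplication by the row width at access time.
    T* operator[](std::size_t i) { return rows_[i]; }
    const T* operator[](std::size_t i) const { return rows_[i]; }

    std::size_t height() const { return h_; }
    std::size_t width() const { return w_; }

    // The storage is contiguous, so a plain pointer walk is also possible.
    T* begin() { return data_.data(); }
    T* end() { return data_.data() + data_.size(); }

private:
    std::size_t h_, w_;
    std::vector<T> data_;   // one contiguous block of h * w elements
    std::vector<T*> rows_;  // rows_[i] points at the start of row i
};
```

With this layout, `Array2D<float> img(512, 512); img[y][x] = 1.0f;` indexes through the precomputed table, while `img.begin()` and `img.end()` still allow a flat traversal.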
When performance is really an issue, I first re-think the algorithm;
Algorithmic optimisations are always best anyway.
afterwards I check the produced assembler listing to see what the compiler did.
I used to do this, but I really think it is no longer necessary with modern compilers.
If you are lucky the compiler can produce code that does this in a single instruction; if not, you will get an overhead. Again, if h and w are low values you will not notice it.
And that's exactly what happens when you do loop tiling: the inner h and w are rather small (a tile often occupies less than half the L1 cache).
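For readers unfamiliar with loop tiling, a minimal sketch follows. The function name transpose_tiled, the transpose workload and the default tile size of 32 are assumptions chosen for illustration; with float data, a 32 x 32 source tile plus its destination tile take about 8 KB, which fits in less than half of a typical 32 KB L1 data cache.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: transpose an h x w row-major matrix tile by tile.
// dst must already hold h * w elements; it receives the w x h transpose.
void transpose_tiled(const std::vector<float>& src, std::vector<float>& dst,
                     std::size_t h, std::size_t w, std::size_t tile = 32) {
    if (tile == 0) tile = 1;  // guard against a non-terminating outer loop

    // Outer loops step from tile to tile.
    for (std::size_t i0 = 0; i0 < h; i0 += tile) {
        for (std::size_t j0 = 0; j0 < w; j0 += tile) {
            const std::size_t i1 = std::min(i0 + tile, h);
            const std::size_t j1 = std::min(j0 + tile, w);

            // Inner loops only cover one small tile, so the "inner h and w"
            // are at most `tile` and the working set stays cache resident.
            for (std::size_t i = i0; i < i1; ++i)
                for (std::size_t j = j0; j < j1; ++j)
                    dst[j * h + i] = src[i * w + j];
        }
    }
}
```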
This example is of course much faster, and yes, it is neither elegant nor clear.

    int *p = &array[0][0], *pend = p + h * w;   /* assumes contiguous storage */
    while (p < pend) *p++ = uni();
Well, I copy/pasted this into my array.cpp in place of the loop nest. My 2D array takes 9.938 s to iterate 10000 times over a 512*512 image of float, while your "while loop" took 9.891 s. I only lose ~0.5% by using NRC allocation + indexing. It is indeed faster, but not by much, and indeed far less elegant. If needed, I can try to implement a multi_array-like class using my method and compare it to the original one in terms of performance.