Jiří-

On 16:36 Tue 09 Apr, Jiří Vyskočil wrote:
I am unable to write a minimal test case - if I compare using multi_array directly vs. using this GridArray class in a simple test program (compiled with -O2), the performance is the same. But the real code shows about 20% slowdown - it's a numerical simulation, I create several arrays at the beginning and then do computations with them. The rest of the code hasn't been touched - I only changed the constructors for the arrays and replaced a header which contained some typedefs with a header containing this class.
Do you have any idea what might be going on? Is there a better way to write an interface for successive [] operators?
I suspect that the compiler isn't able to optimize certain accesses in the GridArray case. Could you give us some more details on the numerical kernel you're running? What kind of operation do you run on the members? Is it some kind of stencil code where you update each member depending on its neighbors?

I'm asking because a compiler will typically try to avoid the potentially costly repeated address computations by caching the results in registers. As compilers are easily confused (e.g. by possible aliasing) and registers are scarce, even seemingly inconspicuous code changes may lead to performance degradation.

Best
-Andreas

-- 
==========================================================
Andreas Schäfer
HPC and Grid Computing
Chair of Computer Science 3
Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
+49 9131 85-27910
PGP/GPG key via keyserver
http://www.libgeodecomp.org
==========================================================
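
To make the point about repeated address computations concrete, here is a minimal sketch of a stencil kernel written two ways. Everything in it is an assumption for illustration: Grid2D is a hypothetical stand-in for a wrapper with successive [] operators (it is not the GridArray class from the question and uses std::vector rather than multi_array), and the 4-point averaging kernel is only a guess at the kind of code involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2D wrapper with successive [] operators (row-major storage).
class Grid2D {
public:
    Grid2D(std::size_t rows, std::size_t cols, double init = 0.0)
        : rows_(rows), cols_(cols), data_(rows * cols, init) {}

    // The first [] yields a pointer to the start of row r, so the
    // second [] is ordinary pointer indexing.
    double* operator[](std::size_t r) {
        assert(r < rows_);
        return data_.data() + r * cols_;
    }
    const double* operator[](std::size_t r) const {
        assert(r < rows_);
        return data_.data() + r * cols_;
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_, cols_;
    std::vector<double> data_;
};

// Variant 1: every access goes through operator[]. The compiler has to
// prove that writes to `out` cannot change the layout of `in` before it
// may keep the row addresses in registers.
void stencil_naive(const Grid2D& in, Grid2D& out) {
    for (std::size_t i = 1; i + 1 < in.rows(); ++i)
        for (std::size_t j = 1; j + 1 < in.cols(); ++j)
            out[i][j] = 0.25 * (in[i - 1][j] + in[i + 1][j] +
                                in[i][j - 1] + in[i][j + 1]);
}

// Variant 2: row pointers and bounds are hoisted into locals by hand,
// so the inner loop contains only simple pointer-plus-index accesses.
void stencil_hoisted(const Grid2D& in, Grid2D& out) {
    const std::size_t rows = in.rows();
    const std::size_t cols = in.cols();
    for (std::size_t i = 1; i + 1 < rows; ++i) {
        const double* up = in[i - 1];
        const double* mid = in[i];
        const double* down = in[i + 1];
        double* dst = out[i];
        for (std::size_t j = 1; j + 1 < cols; ++j)
            dst[j] = 0.25 * (up[j] + down[j] + mid[j - 1] + mid[j + 1]);
    }
}
```

Both variants compute the same result. Whether the second one is actually faster depends on the compiler, the flags, and the surrounding code, so comparing the generated assembly or profiling the real kernel is the only reliable check.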