
on 18.08.2009 at 20:05 joel wrote :
>> fewer instructions doesn't mean faster code
>
> Thanks for teaching me my job.
>
>> to find ideal ratio you must take intel arch manual and count the clocks
>> or you can actually measure running time for a variety of cases
>
> Time benchmark is always better. With pipelining in an OoO CPU, counting
> clocks in a linear fashion is dumb.
>
>> i naively (lack of time) implemented nrc (how is it spelled btw?) for
>> the 2d case and it performs worse than computed indices in general
>> i'll try to reimplement it properly and measure again
>
> It shouldn't. 2D is the worst case and performs only about 3-5% worse than
> computed indices. A proper test case is up to 3D.

ah, ok
you might be right (actually i have no chance to prove the contrary)
so why not compute indices for the 2d case and use nrc for N>2?
i think it's trivial to implement

-- Pavel
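
For reference, a minimal sketch of the two 2D access schemes being compared. It assumes "nrc" means the Numerical Recipes in C style layout, i.e. a table of row pointers into one contiguous buffer; the thread does not spell that out. The names `Indexed2D`, `Nrc2D`, `sum_indexed` and `sum_nrc` are hypothetical and are not taken from any library discussed here.

```cpp
#include <cstddef>
#include <vector>

// Computed-index access: one contiguous buffer, offset = i * cols + j.
struct Indexed2D {
    std::size_t rows, cols;
    std::vector<double> data;

    Indexed2D(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c) {}

    double& operator()(std::size_t i, std::size_t j) {
        return data[i * cols + j];  // one multiply and one add per access
    }
};

// NRC-style access (assumed meaning of "nrc"): the same contiguous buffer
// plus a table of row pointers, so a[i][j] is two loads and no multiply.
struct Nrc2D {
    std::size_t rows, cols;
    std::vector<double> data;
    std::vector<double*> row;  // row[i] points at the first element of row i

    Nrc2D(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c), row(r) {
        for (std::size_t i = 0; i < r; ++i) row[i] = data.data() + i * c;
    }

    // Copying would leave row[] pointing into the source object's buffer.
    Nrc2D(const Nrc2D&) = delete;
    Nrc2D& operator=(const Nrc2D&) = delete;

    double* operator[](std::size_t i) { return row[i]; }
};

// Identical traversals, so any timing difference comes from the access scheme.
double sum_indexed(Indexed2D& a) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.rows; ++i)
        for (std::size_t j = 0; j < a.cols; ++j) s += a(i, j);
    return s;
}

double sum_nrc(Nrc2D& a) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.rows; ++i)
        for (std::size_t j = 0; j < a.cols; ++j) s += a[i][j];
    return s;
}
```

Timing `sum_indexed` against `sum_nrc` (for example with `std::chrono::steady_clock`) over a range of sizes is the kind of measurement argued for above. This sketch covers only the 2D case; the N>2 variant would add one more pointer table per extra dimension.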