
on Thu Oct 16 2008, "Simonson, Lucanus J" <lucanus.j.simonson-AT-intel.com> wrote:
Do we know that loop unrolling of iteration over a linked list is profitable?
Nope, not for sure.
I would think that a conscientious programmer interested in unrolling loops over linked lists should first measure the potential benefit on current compilers and hardware before implementing a general solution along those lines. That analysis would be somewhat challenging: a toy test case is likely to fit in L1 cache and show promising results, while in realistic usage a list's elements are fragmented across the heap, and L1 and L2 cache misses dominate the runtime of loop iteration. If the long-latency cache miss dominates the overhead of checking whether itr == list.end(), you will see no benefit from unrolling: a modern architecture with speculative execution can hide the latency of the comparison, and of the branch that depends on it, within the latency of the cache miss for the next iteration of the loop. It is conceivable that the benefit would be negligible in the common case. We won't know, however, unless someone does the analysis.
Good points. I first realized the importance of O(1) size in the context of TMP, and translated my conclusions to the runtime world without considering cache effects. Testing is definitely needed.

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com