
With the following code I am unable to induce the compiler to ever fail to inline and produce a symbol in the symbol table. It doesn't matter if you pass by const ref, or what you name the function if it is always inlined and that stuff is just compiled out of existence. I think this whole thread may be solving a problem that doesn't exist. If the compiler can succeed to inline all N recursive function calls it doesn't matter whether it is 4, or 8 or what they are named. In the end what (might) matter is whether you have many or few multiplication operations (with the split recursive algorithm.) That same algorithm can be implemented as a loop in a single function that that takes the exponent by value as a runtime argument and the compiler would still have the opportunity to unroll the loop when the exponent is a constant where the function is inlined. I tried that and the compiler (gcc 4.3.2 -O2) fails to unroll the loop upon inspection of the assembly. I also inspected the assembly generated for my template and noticed something else that is interesting. The compiler takes the O(N) multiplications my lazily implemented positive_power template generates and optimizes them upon instantiation and inlining for N=21 to only 6 multiplications, meaning that the compiler is actually rewriting the algorithm for you to reduce the number of multiply operations. In this thread we have just been speculating about how best to do for the compiler what it is perfectly capable of doing on its own. I vote we leave the pow template alone, since it isn't broken and doesn't need fixing. (Yes I know my vote doesn't count.) Thanks, Luke //recursive template version template <int N> struct positive_power { template <typename T> static T result(T base) { return positive_power<N-1>::result(base) * base; } }; template <> struct positive_power<0> { template <typename T> static T result(T base) { return 1; } }; template <> struct positive_power<1> { template <typename T> static T result(T base) { return base; } }; int main(int argc, char** argv) { std::cout << positive_power<21>::result(argc); //assembly generated (note only 6 multiplies) main: 0x89 movl %edi,%esi 0x004007a2: 0x48 subl $8,%rsp 0x004007a6: 0x0f imull %edi,%esi 0x004007a9: 0x0f imull %esi,%esi 0x004007ac: 0x0f imull %edi,%esi 0x004007af: 0x0f imull %esi,%esi 0x004007b2: 0x0f imull %esi,%esi 0x004007b5: 0x0f imull %edi,%esi 0x004007b8: 0xbf movl $0x600c20,%edi //loop version inline int pow(int base, int exponent) { if(exponent == 0) return 1; int result = base; for(int i = exponent; i > 1; i /= 2) { result *= result; if(i % 2) result *= base; } return result; } int main(int argc, char** argv) { std::cout << pow(2, 8) << std::endl; //assembly (note one multiply in a loop) main: pushl %r13 0x004008a2: movl $8,%ecx 0x004008a7: movl $3,%edx 0x004008ac: sarl 1,%ecx 0x004008ae: movl $4,%esi 0x004008b3: pushl %r12 0x004008b5: pushl %rbp 0x004008b6: pushl %rbx 0x004008b7: subl $264,%rsp 0x004008be: subl $1,%edx 0x004008c1: je 0x4008d6 0x004008c3: imull %esi,%esi 0x004008c6: testb $1,%cl 0x004008c7: roll $141,(%rcx) 0x004008ca: addb $54,%al 0x004008cc: cmovne %eax,%esi 0x004008cf: sarl 1,%ecx 0x004008d1: subl $1,%edx 0x004008d4: jne 0x4008c3 0x004008d6: movl $0x600e20,%edi