
John Moeller wrote:
> Because people should be able to use it as an example for how to write their own generic functions is precisely the reason that we should worry about how many template instantiations are performed. Template instantiations cost development time, and just because they aren't felt at runtime, doesn't mean they aren't important. If you can reduce them *in every case* (with a simple realignment of templates), then you should. At the same time, if you can reduce the number of excess temporaries in every case just by a wise choice of parameter type, then you should.
>
> It's not as though we're talking about creating compiler-specific code specializations just to squeeze an extra 5% speed out of pow<>. We're talking about writing good generic code, and good generic code considers the compile-time factor too.
Can we really predict the compile-speed difference between one implementation and another? Take, for instance, this template-based condition for an odd exponent:

    template <int N>
    struct even_odd {
        template <typename T>
        static const T& result(const T& t, const T& t2) { return t2; }
    };

    template <>
    struct even_odd<1> {
        template <typename T>
        static T result(const T& t, const T& t2) { return t * t2; }
    };

    template <int N>
    struct positive_power {
        template <typename T>
        static T result(T base) {
            return even_odd<N%2>::result(base,
                positive_power<N/2>::result(base) * positive_power<N/2>::result(base));
        }
    };

Is it slower to compile than this version, which depends on the if statement being compiled away because its condition is a compile-time constant?

    template <int N>
    struct positive_power {
        template <typename T>
        static T result(T base) {
            if(N%2)
                return base * positive_power<N/2>::result(base) * positive_power<N/2>::result(base);
            return positive_power<N/2>::result(base) * positive_power<N/2>::result(base);
        }
    };

I don't know. It depends on the order in which the compiler optimizes, and on whether it takes longer to instantiate even_odd or to compile away the compile-time conditional. It is subject to change as compilers improve. I don't think it matters. I would write it the second way if I were writing it, because it is less code, and I value that.

I would also put avoiding code duplication ahead of reducing the number of template instantiations. Here I rely on the compiler to compile away the multiply-by-1 identity:

    template <int N>
    struct positive_power {
        template <typename T>
        static T result(T base) {
            // The conditional needs its own parentheses: ?: binds more loosely than *.
            return positive_power<N/2>::result(base) * positive_power<N/2>::result(base)
                 * ((N%2) ? base : 1);
        }
    };

but it simplifies the code still further. Is that better? Again, it depends on what you think is important. So, as we weigh the value of achieving one objective at the expense of another, it comes down to subjective judgments about what is best.
It is subjective because we often can't quantify things, and also because we all value different things to different degrees. I probably undervalue compile speed.

Thanks,
Luke
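
P.S. For anyone who wants to compile and time the variants themselves, here is a minimal self-contained sketch of the if-based and conditional-operator versions. It is an illustration, not the code under review. The names pow_sketch, power_if and power_ternary are hypothetical, the x^1 base-case specializations are my assumption (the snippets above leave the terminating case out), and the sketch departs from the snippets in two ways: it stores the half power in a local instead of evaluating it twice, and it writes the identity as T(1), which assumes T is constructible from 1. It also uses static_assert, so it assumes a C++11 compiler.

```cpp
// Hypothetical names throughout: pow_sketch, power_if, power_ternary.
namespace pow_sketch {

// Variant A: a plain `if` on a compile-time constant; the optimizer is
// expected (not guaranteed) to drop the dead branch.
template <int N>
struct power_if {
    static_assert(N > 0, "exponent must be positive");

    template <typename T>
    static T result(T base) {
        const T half = power_if<N / 2>::result(base);  // evaluated once
        if (N % 2)
            return base * half * half;
        return half * half;
    }
};

// Assumed base case (not in the original snippets): x^1 == x.
// This is what ends the N/2 recursion.
template <>
struct power_if<1> {
    template <typename T>
    static T result(T base) { return base; }
};

// Variant B: conditional operator, relying on the compiler to fold the
// multiplication by T(1) for even N. The parentheses around the
// conditional are required because ?: binds more loosely than *.
template <int N>
struct power_ternary {
    static_assert(N > 0, "exponent must be positive");

    template <typename T>
    static T result(T base) {
        const T half = power_ternary<N / 2>::result(base);  // evaluated once
        return half * half * ((N % 2) ? base : T(1));
    }
};

// Assumed base case (not in the original snippets): x^1 == x.
template <>
struct power_ternary<1> {
    template <typename T>
    static T result(T base) { return base; }
};

}  // namespace pow_sketch
```

With these definitions, pow_sketch::power_if<5>::result(2) and pow_sketch::power_ternary<5>::result(2) should both give 32. Which of the two compiles faster is exactly the thing I would not want to guess at without measuring on a particular compiler.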