[math] pow instantiates a number of template proportional to the exponent

AMDG I just looked at the implementation of pow<N> in boost/math/special_function.hpp. Line 54: return (N%2) ? base*positive_power<N-1>::result(base) in positive_power causes positive_power to be instantiated with all the integers from N down to 0. I think that this should use template specialization instead of ?: to keep the number of instantiations down, even though this probably won't affect the runtime cost. In Christ, Steven Watanabe

Steven Watanabe wrote:
AMDG
I just looked at the implementation of pow<N> in boost/math/special_function.hpp. Line 54: return (N%2) ? base*positive_power<N-1>::result(base)
in positive_power causes positive_power to be instantiated with all the integers from N down to 0. I think that this should use template specialization instead of ?: to keep the number of instantiations down, even though this probably won't affect the runtime cost.
Good catch: will fix shortly. Cheers, John.

Johan Råde wrote:
John Maddock wrote:
Steven Watanabe wrote:
return (N%2) ? base*positive_power<N-1>::result(base)
Shouldn't positive_power<N>(base) be reduced to square(positive_power<N/2>(base) or base * square(positive_power<N/2>(base)) depending on N%2?
--Johan
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Now I see. Ignore my last mail. --Johan

That would also have been my implementation: template <int N> struct positive_power { template <typename T> static typename tools::promote_args<T>::type result(T base) { typename tools::promote_args<T>::type power = positive_power<N/2>::result(base); return (N%2) ? base * power *power : power * power; } }; with the two specializations: template <> struct positive_power<0> { template <typename T> static typename tools::promote_args<T>::type result(T base) { return 1; } }; template <> struct positive_power<1> { template <typename T> static typename tools::promote_args<T>::type result(T base) { return base; } }; (strictly speaking, <0> only is needed, but the compiler may not know to optimize multiplying by T(1), which is one criticism of previous post by John Moeller, who proposes essentially the same solution but with a factor ((N%2) ? base : 1) that I don't like). Johan: what's wrong with it? -- Hervé Brönnimann hervebronnimann@mac.com On May 14, 2008, at 1:49 PM, Johan Råde wrote:
John Maddock wrote:
Steven Watanabe wrote:
return (N%2) ? base*positive_power<N-1>::result(base)
Shouldn't positive_power<N>(base) be reduced to square(positive_power<N/2>(base) or base * square(positive_power<N/2>(base)) depending on N%2?
--Johan
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/ listinfo.cgi/boost

Hervé Brönnimann <hervebronnimann <at> mac.com> writes:
That would also have been my implementation:
template <int N> struct positive_power { template <typename T> static typename tools::promote_args<T>::type result(T base) { typename tools::promote_args<T>::type power = positive_power<N/2>::result(base); return (N%2) ? base * power *power : power * power; } };
with the two specializations:
template <> struct positive_power<0> { template <typename T> static typename tools::promote_args<T>::type result(T base) { return 1; } };
template <> struct positive_power<1> { template <typename T> static typename tools::promote_args<T>::type result(T base) { return base; } };
(strictly speaking, <0> only is needed, but the compiler may not know to optimize multiplying by T(1), which is one criticism of previous post by John Moeller, who proposes essentially the same solution but with a factor ((N%2) ? base : 1) that I don't like).
Fair enough. I think, though, if you're going to try to make sure that the compiler doesn't miss multiplication by 1, you may as well go all the way and add another template parameter to capture N%2, and get rid of the ternary statement: template <int N, int M = N%2> struct positive_power; template <int N> struct positive_power<N, 0> { template <typename T> static typename tools::promote_args<T>::type result(T base) { typename tools::promote_args<T>::type power = positive_power<N/2,(N/2)%2>::result(base); return power * power; } }; template <int N> struct positive_power<N, 1> { template <typename T> static typename tools::promote_args<T>::type result(T base) { typename tools::promote_args<T>::type power = positive_power<N/2,(N/2)%2>::result(base); return base * power * power; } }; with the two specializations: template <> struct positive_power<0,0> { template <typename T> static typename tools::promote_args<T>::type result(T base) { return 1; } }; template <> struct positive_power<1,1> { template <typename T> static typename tools::promote_args<T>::type result(T base) { return base; } }; Now all the compiler has to do is spit out multiplications, assuming that things are inlined well.

John Moeller <fishcorn <at> gmail.com> writes:
template <> struct positive_power<0,0> { template <typename T> static typename tools::promote_args<T>::type result(T base) { return 1; } };
In fact, you could get rid of the zero-power case of positive_power and make it a specialization of the power_if_positive template. That way you could also check for the 0^0 case, which just returns 1 right now.

On May 15, 2008, at 10:40 AM, John Moeller wrote:
Fair enough. I think, though, if you're going to try to make sure that the compiler doesn't miss multiplication by 1, you may as well go all the way and add another template parameter to capture N%2, and get rid of the ternary statement:
To be clear, I'm not concerned about optimizing multiplication by 1 for builtin types, of course. But it seems to me that the compiler shouldn't really be able to optimize for user-defined types (what if multiplication isn't inlined, for instance...) -- Hervé Brönnimann hervebronnimann@mac.com

Hervé Brönnimann <hervebronnimann <at> mac.com> writes:
On May 15, 2008, at 10:40 AM, John Moeller wrote:
Fair enough. I think, though, if you're going to try to make sure that the compiler doesn't miss multiplication by 1, you may as well go all the way and add another template parameter to capture N%2, and get rid of the ternary statement:
To be clear, I'm not concerned about optimizing multiplication by 1 for builtin types, of course. But it seems to me that the compiler shouldn't really be able to optimize for user-defined types (what if multiplication isn't inlined, for instance...)
And regardless of whether the type is builtin or not, branching with template parameters will accomplish the same task as letting the compiler decide to inline a ternary statement, except that it will be explicit. There is zero chance of the compiler generating more instructions than are needed (unless inlining breaks down). I know that this ternary statement's condition may be evaluated at compile time, so there is little chance of more than one branch being generated, but why not just make it explicit? (From an aesthetic standpoint, I think branching with a template parameter expresses the recursion very elegantly (two branches, one base condition), and doesn't get in the way at all. Elegance isn't really what matters here, though.) My response wasn't motivated emotionally; I agree with your argument about UDT multiplication (as I did when I wrote my response in the first place). I'm just saying it would be advantageous to follow the process further. -- John Moeller

Hi,
Line 54: return (N%2) ? base*positive_power<N-1>::result(base)
in positive_power causes positive_power to be instantiated with all the integers from N down to 0.
Yep you're right. At that moment I was a bit disturbed by this runtime evaluation but I didn't think precisely about this issue so I left it like that. Adding a second integral template argument receiving N%2 to specialize on it works (see attachment) but I feel there's a more elegant solution. Maybe using enable_if? Regards Bruno

Bruno Lalande <bruno.lalande <at> gmail.com> writes:
Hi,
Line 54: return (N%2) ? base*positive_power<N-1>::result(base)
in positive_power causes positive_power to be instantiated with all the integers from N down to 0.
Yep you're right. At that moment I was a bit disturbed by this runtime evaluation but I didn't think precisely about this issue so I left it like that.
Adding a second integral template argument receiving N%2 to specialize on it works (see attachment) but I feel there's a more elegant solution. Maybe using enable_if?
What about: template <int N> struct positive_power { template <typename T> static typename tools::promote_args<T>::type result(T base) { return ((N%2) ? base : 1) * positive_power<2>::result( positive_power<N/2>::result(base) ); } }; John Moeller

Bruno Lalande wrote:
Hi,
Line 54: return (N%2) ? base*positive_power<N-1>::result(base)
in positive_power causes positive_power to be instantiated with all the integers from N down to 0.
Yep you're right. At that moment I was a bit disturbed by this runtime evaluation but I didn't think precisely about this issue so I left it like that.
Adding a second integral template argument receiving N%2 to specialize on it works (see attachment) but I feel there's a more elegant solution. Maybe using enable_if?
That's more or less what I did in the Sandbox version, but Hervé Brönnimann's solution looks more elegant maybe? John.

That's more or less what I did in the Sandbox version, but Hervé Brönnimann's solution looks more elegant maybe?
Yep that's exactly what I was just answering :-) It's much smarter. I'll be off until sunday evening unfortunately, but since I don't have trunk access I wouldn't be of much help I guess. I can test performances on my computer when I come back though (I remember the fact to add or remove a specialization could have big effects on perfs for some exponents). BTW, with this implementation we should be able to remove the specialization in <2>, I remember it was there only to avoid a compile-time infinite loop due to the explicit call to positive_power<2>. Bruno
participants (6)
-
Bruno Lalande
-
Hervé Brönnimann
-
Johan Råde
-
John Maddock
-
John Moeller
-
Steven Watanabe