
On Sun, Jan 17, 2010 at 10:36 AM, John Maddock wrote:
For a (cell) simulation code I have to calculate lots of cube roots in every timestep. So I tried to improve performance on my first guess, namely pow(x, 1/3), by using boost::math::cbrt(). To my astonishment, this is much slower than the original code. I wrote a small test program to check this claim; compiling with g++ (Ubuntu 4.4.1-4ubuntu9) 4.4.1 (-O3) on an Intel Core i7 CPU, I get the following timings (averaged over 10 trials):

average time to compute 5000000 roots with pow(): 0.603 s
average time to compute 5000000 roots with boost::cbrt(): 1.087 s
average time to compute 5000000 roots with improved boost::cbrt(): 1.015 s
average time to compute 5000000 roots with exp(1/3*log()): 0.541 s
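The test program itself is not included in the message. A minimal sketch of how such a comparison could be timed is below; the helper seconds_per_call and the root_* wrappers are hypothetical names, and std::cbrt (C++11) stands in here for the C99 ::cbrt. One thing worth checking in any real version: in C++ the expression 1/3 is integer division and evaluates to 0, so the exponent has to be written as 1.0/3.0.

```cpp
#include <chrono>
#include <cmath>
#include <vector>

// Hypothetical helper: average wall-clock seconds per call of f over xs.
// xs must be non-empty. The running sum is stored in `sink` so the
// compiler cannot discard the loop as dead code.
template <class F>
double seconds_per_call(F f, const std::vector<double>& xs, double& sink)
{
    const auto start = std::chrono::steady_clock::now();
    double sum = 0.0;
    for (double x : xs)
        sum += f(x);
    const auto stop = std::chrono::steady_clock::now();
    sink = sum;
    return std::chrono::duration<double>(stop - start).count() / xs.size();
}

// The candidates being compared. Note 1.0 / 3.0: writing 1 / 3 would be
// integer division and give pow(x, 0) == 1.
inline double root_pow(double x)    { return std::pow(x, 1.0 / 3.0); }
inline double root_cbrt(double x)   { return std::cbrt(x); }
inline double root_explog(double x) { return std::exp(std::log(x) / 3.0); } // x > 0 only
```

Calling seconds_per_call(root_pow, xs, sink) with a vector of 5000000 positive inputs, repeated 10 times and averaged, would roughly mirror the setup described above; the absolute numbers will depend on compiler, flags and hardware.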
FYI SVN Trunk now has an updated algorithm that is very competitive - within 1-2% of ::cbrt (gcc-4.4.1 on Ubuntu Linux):
Testing cbrt      1.025e-07
Testing cbrt-c99  1.001e-07
Testing cbrt-pow  1.611e-07
Unfortunately msvc performance compares less well (even though it's much better than before, and does at least outperform the cephes lib):
Testing cbrt         1.970e-007
Testing cbrt-cephes  2.676e-007
Testing cbrt-pow     1.072e-007
This reflects the poor performance of std::frexp on that compiler... annoying that :-(
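To make the std::frexp dependency concrete: one common way to compute a cube root is to split the argument into mantissa and exponent with std::frexp, take the root of the reduced mantissa iteratively, and scale back with std::ldexp. The sketch below illustrates that general technique only; it is not the code in SVN trunk, and the name cbrt_sketch, the linear starting guess and the fixed count of four Halley steps are all assumptions made for the example.

```cpp
#include <cmath>

// Illustrative only: cube root via std::frexp range reduction plus
// Halley iteration. Not the Boost.Math implementation.
double cbrt_sketch(double x)
{
    if (x == 0.0 || !std::isfinite(x))
        return x;                               // pass through 0, infinities, NaN

    int e = 0;
    const double m = std::frexp(std::fabs(x), &e);  // |x| = m * 2^e, m in [0.5, 1)

    // Split e = 3*q + r with r in {0, 1, 2}, so cbrt(|x|) = cbrt(m * 2^r) * 2^q.
    int r = e % 3;
    if (r < 0)
        r += 3;
    const int q = (e - r) / 3;
    const double a = std::ldexp(m, r);          // reduced argument, a in [0.5, 4)

    // Crude linear starting guess (an assumption of this sketch), then
    // Halley steps for y^3 = a: y <- y * (y^3 + 2a) / (2y^3 + a).
    double y = 0.7 + 0.25 * a;
    for (int i = 0; i < 4; ++i) {
        const double y3 = y * y * y;
        y = y * (y3 + 2.0 * a) / (2.0 * y3 + a);
    }
    return std::copysign(std::ldexp(y, q), x);  // rescale and restore the sign
}
```

Every call goes through std::frexp once, so in a scheme like this a slow std::frexp adds directly to the per-call cost.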
As I recall, MSVC saves and restores the floating-point rounding mode on most floating-point calls like that, which causes the speed hit. You can work around it by writing your own version in assembly, or by using SSE or so, as I recall that is...
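As a portable alternative to hand-written assembly or SSE (so a different technique from the one suggested above), the exponent can also be pulled out of the IEEE-754 bit pattern directly. This is only a sketch under stated assumptions: frexp_bits is a hypothetical name, double is assumed to be a 64-bit IEEE-754 type, and only normal, finite, non-zero inputs are handled. Whether it actually avoids the MSVC overhead has not been measured here.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for std::frexp, valid only for normal, finite,
// non-zero IEEE-754 doubles. Zero, subnormals, infinities and NaN are
// NOT handled.
inline double frexp_bits(double x, int* exp)
{
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);            // well-defined way to read the bits

    const int biased = static_cast<int>((bits >> 52) & 0x7FF);
    *exp = biased - 1022;                           // chosen so the result is in [0.5, 1)

    bits = (bits & ~(std::uint64_t(0x7FF) << 52))   // clear the exponent field
         | (std::uint64_t(1022) << 52);             // replace it with the exponent of 0.5
    std::memcpy(&x, &bits, sizeof x);               // sign and mantissa bits are unchanged
    return x;
}
```

For example, frexp_bits(8.0, &e) yields 0.5 with e == 4, matching std::frexp for that input.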