[multiprecision] Some Performance gains

Following the review of the multiprecision lib, I've been experimenting with some of the issues raised, currently focusing on performance improvements. Anyhow, I now have some results for improved *front end* performance, I haven't tried to seriously optimize the cpp_int backend yet (which was one of the criticisms), but for more on that see below. So... by way of measuring front end performance - the "abstraction penalty" if you like I've used a special backend type "arithmetic_backend<A>" where A can be any builtin arithmetic type. For tests I've used Boost.Math's special function test data for floating point types, and Phil Endecott's Delaunay code for integer performance. Win32 VC10 first: Delaunay test: int64_t : 9.45241 mp_number<arithmetic_backend<int64_t>, false> : 9.66815 Bessel Functions Test: Time for Bessel Functions - double = 0.015859 seconds Time for Bessel Functions - arithmetic_backend<double> - no expression templates = 0.0225565 seconds Time for Bessel Functions - arithmetic_backend<double> = 0.0376449 seconds Non Central T: Time for Non-central T - double = 0.456739 seconds Time for Non-central T - arithmetic_backend<double> - no expression templates = 0.459957 seconds Time for Non-central T - arithmetic_backend<double> = 1.08256 seconds Ubutunu Linux x64, GCC-4.6.1: Delaunay Test: int64_t 2.3969 mp_number<arithmetic_backend<int64_t>, false> 2.38334 Bessel Functions Test: Time for Bessel Functions - double = 0.0224617 seconds Time for Bessel Functions - arithmetic_backend<double> - no expression templates = 0.020384 seconds Time for Bessel Functions - arithmetic_backend<double> = 0.0220476 seconds Non central T: Time for Non-central T - double = 0.622985 seconds Time for Non-central T - arithmetic_backend<double> - no expression templates = 0.554527 seconds Time for Non-central T - arithmetic_backend<double> = 0.800129 seconds So to all intents and purposes there's now no abstraction penalty at all, except when using expression templates (which should be reserved for heavyweight backends anyway). Notice how the mp_number code is sometimes slightly faster, and sometimes slightly slower than the native type - I've observed that completely trivial changes to the test/measuring code can change this around - I suspect, but can't prove that the abstraction layer changes the decisions made by the compiler's optimizer resulting in somewhat capricious differences between results. With regard to the cpp_int backend, I still have work to do, however things do actually look pretty good on Linux x64 for the Delaunay test: Running calculations for: int64_t, int128_t (this is Phil's 128-bit special-case code) Number of flips = 23892000 Total execution time = 3.28337 Running calculations for: int64_t, mp_int128_t Number of flips = 23892000 Total execution time = 3.05434 Running calculations for: int64_t, mp_int128_t (ET) Number of flips = 23892000 Time per calculation = 3.02904e-08 Running calculations for: int64_t, cpp_int Number of flips = 23892000 Total execution time = 5.41737 On Win32, I'm still a touch behind Phil's code, however, I haven't written a dedicated NxN->2N multiplication routine (yet): Running calculations for: int64_t, int128_t Number of flips = 23892000 Total execution time = 14.2189 Running calculations for: int64_t, mp_int128_t Number of flips = 23892000 Total execution time = 24.5114 Running calculations for: int64_t, mp_int128_t (ET) Number of flips = 23892000 Total execution time = 19.741 Running calculations for: int64_t, cpp_int Number of flips = 23892000 Total execution time = 33.5 All the new code is in the sandbox, but I haven't documented any of the changes yet... Regards, John.

John wrote:
Following the review of the multiprecision lib, I've been experimenting with some of the issues raised, currently focusing on performance improvements.
Many thanks, John. What an amazing piece of work! But with all your excellent coding, what happened to your *plot* to watch more Olympics? <snip>
All the new code is in the sandbox, but I haven't documented any of the changes yet... Regards, John.
I ran the tests with --enable-specfun, for a "second opinion". All of the tests passed on my PC for VS2010. Sorry, I can't do much more than that right now. Best regards, Chris.

Following the review of the multiprecision lib, I've been experimenting with some of the issues raised, currently focusing on performance improvements.
Many thanks, John. What an amazing piece of work!
Thanks, actually I realised while dozing of to sleep last night that there are actually a couple of buglets in the test code... shouldn't effect the overall results though.... but will fix for the next round of tests.
But with all your excellent coding, what happened to your *plot* to watch more Olympics?
For some reason the commentators seem to be talking more than the athletes are competing :-( John.

John Maddock wrote:
Following the review of the multiprecision lib, I've been experimenting with some of the issues raised, currently focusing on performance improvements.
Many thanks, John. What an amazing piece of work!
Thanks, actually I realised while dozing of to sleep last night that there are actually a couple of buglets in the test code... shouldn't effect the overall results though.... but will fix for the next round of tests.
But with all your excellent coding, what happened to your *plot* to watch more Olympics?
For some reason the commentators seem to be talking more than the athletes are competing :-(
lol - perhaps boost and the olympics have someting in common. Robert Ramey
John.
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

John wrote:
Following the review of the multiprecision lib, I've been experimenting with some of the issues raised, currently focusing on performance improvements.
<snip>
All the new code is in the sandbox, but I haven't documented any of the changes yet... Regards, John.
I ran the tests with --enable-specfun, for a "second opinion". All of the tests passed on my PC for VS2010. Sorry, I can't do much more than that right now.
Best regards, Chris.
Updated at revision 79871. All tests pass with VS2010, SP1. Thank you for your work, John. I can't keep up with your pace but from what I can perceive, you have brought the codes into a much more well organized form. Best regards, Chris.
participants (3)
-
Christopher Kormanyos
-
John Maddock
-
Robert Ramey