[Review Request] Multiprecision Arithmetic Library

Folks, I'd like to ask for a formal review of the Multiprecision Arithmetic Library, currently in the sandbox.

Features:
* Expression template enabled front end.
* Support for Integer, Rational and Floating Point types.

Supported Integer backends:
* GMP.
* libtommath.
* cpp_int.

cpp_int is an all-C++, Boost-licensed backend; it supports both arbitrary precision types (with Allocator support) and signed and unsigned fixed precision types (with no memory allocation). There are also some integer-specific functions - for Miller-Rabin testing, bit fiddling, random numbers. Plus interoperability with Boost.Rational (though that loses the expression template frontend).

Supported Rational backends:
* GMP
* libtommath
* cpp_int (as above)

Supported Floating Point backends:
* GMP
* MPFR
* cpp_dec_float

cpp_dec_float is an all-C++, Boost-licensed type, adapted from Christopher Kormanyos' e_float code (published in TOMS last year). All the floating point types have full std lib support (cos, sin, exp, pow etc.), as well as full interoperability with Boost.Math. There's nothing in principle to prevent extension to complex number and interval arithmetic types (plus any other number types I've forgotten!), but I've run out of energy for now ;-)

Code is in the sandbox under /big_number/. Docs can be viewed online here: http://svn.boost.org/svn/boost/sandbox/big_number/libs/multiprecision/doc/ht...

And of course, I'm looking for a review manager ;-)

Many thanks, John.

Hello John,

That is some impressive-looking work, especially the Performance Comparison pages. I have briefly looked through the documentation and have a few questions/comments:

1. Is there a fairly complete example program that uses Boost.Multiprecision? I saw the Miller-Rabin test program, which is good. Perhaps a few more examples will help users to understand the basics of working with the library.
2. Do the round functions of mp_number (round, iround, lround, llround) behave like the POSIX round functions in that they always round away from zero? That might be good to document.
3. Does Boost.Multiprecision support a choice of rounding mode?

Overall, the API looks sensible and complete. Excellent work!

Daniel Trebbien

On 2012-04-01, John Maddock <boost.regex@virgin.net> wrote:
<snip>

-----Original Message-----
From: Daniel Trebbien
Sent: Sunday, April 01, 2012 6:30 PM
To: John Maddock
Cc: boost@lists.boost.org
Subject: Re: [boost] [Review Request] Multiprecision Arithmetic Library
That is some impressive-looking work, especially the Performance Comparison pages.
I have briefly looked through the documentation and have a few questions/comments:
1. Is there a fairly complete example program that uses Boost.Multiprecision? I saw the Miller-Rabin test program, which is good. Perhaps a few more examples will help users to understand the basics of working with the library.
I'm impressed by this - both Chris's and John's work. It proves my thesis that no Maddock program is complete without lots of templates and some MPL as condiment!

Boost has needed this for a long time, both integer and real. And now we have both. To be able to use the GMP and other packages optionally is a big bonus.

I've also said that we must have more examples. This is going to be used by the 'great unwashed' when they run out of bits on the built-in types. So they won't read the manual (probably not even if all else fails!) and will rely on examples to get them going, and to help out when they fall into some of the pits awaiting the unwary user (I fell :-( ).

Suggestions for examples?

Paul

---
Paul A. Bristow, Prizet Farmhouse, Kendal LA8 8AB UK
+44 1539 561830 07714330204
pbristow@hetp.u-net.com

<snip>
I'm impressed by this - both Chris's and John's work.
Thank you. There's still a long way to go in terms of digits, but it's *definitely* a good start. <snip>
Boost has needed this for a long time, both integer and real. And now we have both. To be able to use the GMP and other packages optionally is a big bonus.
Yeah, for *real*! <snip>
Suggestions for examples?
Well, I did work out 8 examples for the original publication in the ACM. We could adapt some of these for big_number to lend a more example-oriented quick start to the hands-on user. Best regards, Chris.

Well, I did work out 8 examples for the original publication in the ACM. We could adapt some of these for big_number to lend a more example-oriented quick start to the hands-on user.
Good thinking - I just pinched example 3, and reworked it here: http://svn.boost.org/svn/boost/sandbox/big_number/libs/multiprecision/doc/ht... John.

<snip>
While researching generic numeric programming, I recently added two examples to my catalog that I find quite intuitive while remaining non-trivial and still example-like in character. These are a generic function derivative using a 3-point central difference rule and a generic recursive trapezoid numerical integration. I will work out the code on the weekend and jot down descriptive notes. Looks like the Easter Bunny is hiding algorithms now! Best regards, Chris

Le 01/04/12 17:55, John Maddock a écrit :
Folks,
I'd like to ask for a formal review of the Multiprecision Arithmetic Library, currently in the sandbox.
Features:
* Expression template enabled front end. * Support for Integer, Rational and Floating Point types.
Hi John, very good work.
Supported Integer backends:
* GMP. * Libtommath. * cpp_int.
cpp_int is an all C++ Boost licensed backend, supports both arbitrary precision types (with Allocator support), and signed and unsigned fixed precision types (with no memory allocation).
I'm wondering if the following compiles, and what the type of c is:

mp_uint128_t a;
mp_uint256_t b;
auto c = a + b;

From the documentation I have the impression that we cannot mix back-ends.

It seems to me that fixed precision and allocation are orthogonal features. Why can the fixed precision types not use an allocator?
There's nothing in principal to prevent extension to complex numbers and interval arithmetic types (plus any other number types I've forgotten!), but I've run out of energy for now ;-)
I'm working on a fixed_point library (see the sandbox code - sorry, there is no doc yet, but http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3352.html is quite close). I would like to see if your front-end could also improve the performance in this case. I will try to adapt it to the back-end constraints soon.
And of course, I'm looking for a review manager ;-)
Good luck, Vicente

Hi John,

Nice work! It is a great first step for Boost on having its own multiprecision library.

* libtommath

Is this library really required? From your benchmarks I see that the cpp_int type you've implemented is much better on almost all tests.

* Support for Integer, Rational and Floating Point types.

Are integer and floating-point types interconnected within the library? Wouldn't it be better to come up with two separate libraries?

cpp_int is an all C++ Boost licensed backend, supports both arbitrary precision types (with Allocator support), and signed and unsigned fixed precision types (with no memory allocation).

I would like to clarify the following question (I didn't notice this in the docs). Let's say I want to create a cpp_int fixed type to handle a 129-bit signed integer (1 bit for the sign and 128 to represent the value itself). Am I fine with cpp_int_backend<128, true, void>? What is the size of the cpp_int_backend<128, true, void> structure? Does the unary minus operator overflow for the smallest possible fixed integer value?

All the floating point types, have full std lib support (cos sin exp, pow etc), as well as full interoperability with Boost.Math.

What is the precision of the following operations for the cpp_dec_float type: *, /, +, -, sqrt? What is the memory usage? Any limitations on the maximum/minimum value of the exponent?

Thanks,
Andrii Sydorchuk

Nice work! It is a great first step for Boost on having its own multiprecision library.
* libtommath
Is this library really required? From your benchmarks I see that cpp_int type you've implemented is much better on almost all tests.
You need to read the notes and caveats - libtommath performance on *Linux x64* is on a par with cpp_int (as it should be, really). So it may be of use if:
* You need something more mature than the Boost code.
* You're working with very big numbers, where libtommath's more advanced multiplication routines may come into their own.
* Support for Integer, Rational and Floating Point types.
Are integer and floating-point types interconnected within the library? Wouldn't it be better to come up with two separate libraries?
Interconnected in the sense that they use the same front end code. If the library was to be split it would have to be in three:
* The front end (unusable by end users just wanting a big number type).
* The integer backends.
* The floating point backends.
Personally I'd rather have a single coherent library; I think ultimately that's easier for end users to understand (once you've used one type from the library, the others all behave in the same way).
cpp_int is an all C++ Boost licensed backend, supports both arbitrary
precision types (with Allocator support), and signed and unsigned fixed precision types (with no memory allocation).
I would like to clarify following question (didn't notice this in docs). Let's say I want to create cpp_int fixed type to handle 129 bit signed integer (1 bit for a sign and 128 to represent the value itself). Am I fine with cpp_int_backend<128, true, void>?
Yes, it's sign magnitude representation.
What is the size of cpp_int_backend< 128, true, void> structure?
Implementation detail ;-)

In practice there's always one extra limb compared to the number you would expect (even for unsigned types), so the actual size depends on the size of the limbs, which in turn varies by platform. So it's:

sizeof(limb_type) * (1 + (bits_requested / bits_per_limb) + (bits_requested % bits_per_limb ? 1 : 0))

If you really really want to know!
Does the unary minus operator overflow for the smallest possible fixed integer value?
No. std::numeric_limits would tell you that as well, as max() == -min()
All the floating point types, have full std lib support (cos sin exp, pow
etc), as well as full interoperability with Boost.Math.
What is the precision of the following operations: *, /, +, -, sqrt for the cpp_dec_float type?
Good question, we don't really guarantee any at present. In practice they're *usually* accurate to 1 eps, but they're not intended to compete with MPFR's guarantees. But Chris would know more?
What is the memory usage?
cpp_dec_float doesn't allocate any memory - everything is stored within the class - so it's best suited to small-medium digit counts.
Any limitations on the maximum/minimum value of exponent?
Must fit in a long long. That still gives some pretty big numbers ;-) Just thinking out loud... is there ever any use case for big-integer exponents? HTH, John.

All the floating point types, have full std lib support (cos sin exp, pow
etc), as well as full interoperability with Boost.Math.
What is the precision of the following operations: *, /, +, -, sqrt for the cpp_dec_float type?
Good question, we don't really guarantee any at present. In practice they're *usually* accurate to 1eps, but they're not intended to compete with MPFR's guarantees. But Chris would know more?
My basic operations (add, sub, mul, div) are on par or faster for a variety of reasons. For example, I don't allocate---just use std::array "on the stack". Those GMP and MPFR guys cannot program in a professionally adherent fashion. But they sure do have efficient algorithms! Their square root, exponential, logarithmic and trigonometric functions beat mine hands down. I'd like to get hold of their algorithms and program them in a readable, portable fashion. I think they use, for example, binary splitting for trigonometric functions, whereas I use Taylor series.
Any limitations on the maximum/minimum value of exponent?
Must fit in a long long. That still gives some pretty big numbers ;-)
In other words, approx. 10^(10^19) for 64-bit long long. The Python language MP stuff has unlimited exponents. I forgot what that thing is called, but it's another cool tool. Best regards, Chris.

Personally I'd rather have a single coherent library, I think ultimately that's easier for end users to understand (once you've used one type from the library, the others all behave in the same way).
My point is that big integer and floating-point types are kind of two separate concepts. They have different application areas and use different external interfaces (e.g. there is no point in defining a cos function for a big integer class, or the modulus operator for a floating-point type). Also, if the community decides that it should be a single library, I would suggest that the library should include type converters from integer to floating-point types and vice versa.

So it's: sizeof(limb_type) * (1 + (bits_requested / bits_per_limb) + (bits_requested % bits_per_limb ? 1 : 0)) If you really really want to know!

Well, that's what I expected and I don't think that there is anything bad in this. However the user should be aware that a fixed integer of 31 bits is 8 bytes in size (if sizeof(limb_type) = 4).

Good question, we don't really guarantee any at present. In practice they're *usually* accurate to 1 eps, but they're not intended to compete with MPFR's guarantees. But Chris would know more?

This is an important point, so it would be good at least to mention it in the docs. I'd also like to see more details on this type in the docs (e.g. examples, areas of application).

Must fit in a long long. That still gives some pretty big numbers ;-)

Just thinking out loud... is there ever any use case for big-integer exponents?

Yes, those are very large numbers :-). I would say that an int32 exponent should be good enough (I don't mean that it should be used). I am eager to see applications of the int64 exponent range.

Andrii

On 4/2/2012 6:29 PM, Andrii Sydorchuk wrote:
Yes, those are very large numbers :-). I would say that an int32 exponent should be good enough (I don't mean that it should be used). I am eager to see applications of the int64 exponent range.
Andrii
Yeah -- the diameter of the observable universe (not the observed universe -- the physically observable universe) in Planck units (the smallest physically meaningful length according to QM) is about 1.5*2^204, so 9 bits (positive and negative exponents) is all that is required to express the exponent for any physical distance in any physically meaningful units. Volumes would reach a binary exponent of 613, so 11 bits covers this with generous room to spare. I doubt you will find any physical quantities that will be much larger.

Combinatorics can easily generate very large quantities, of course, but if you are getting into those ranges you probably will be using logarithms. So 16 bits is probably as much as you can reasonably expect to need (lots of bits in the mantissa is, of course, justifiable to preserve precision in intermediate results), and 32 would be as much as you can *un*reasonably expect to need -- and more.

I'm not actually arguing against the choice of int64 representation for the exponents either. There are other issues than utility in efficient design, and a few unused bits are worth it for performance or even just code simplicity. Just thought I would put things in perspective.

(I'm partially responding with amusement here, because I remember when we wondered whether there really was any point to the 15-bit exponents when the HFloat (H = Huge) format was introduced to the VAX architecture on the request/recommendation of some members of the scientific community; the GFloat (G = Grand) format, introduced at the same time for the same reason, had 11-bit exponents.)

Topher

Hi,
Just thinking out loud... is there ever any use case for big-integer exponents?
Yes, those are very large numbers :-). I would say that int32 exponent should be good enough (I don't mean that it should be used). I am eager to see applications of int64 exponent range.
Mathematics has some need to work with really very big numbers. That is whenever you try to disprove a hypothesis like e.g. the Riemann hypothesis by experimentally searching for a case where it does not hold. That is nothing I am doing, but it came to my mind when I thought about where to use exponents bigger than int32.

Is there any reason not to make the exponent type a template parameter, so that the user can choose arbitrarily instead of trying to anticipate future uses of the library?

Christof

--
okunah gmbh - Software nach Maß
Werner-Haas-Str. 8, 86153 Augsburg
www.okunah.de  cd@okunah.de
Registergericht Augsburg HRB 21896
Geschäftsführer Christof Donat
UStID: DE 248 815 055

On Tue, Apr 3, 2012 at 8:12 AM, Christof Donat <cd@okunah.de> wrote:
Hi,
Just thinking out loud... is there ever any use case for big-integer exponents?
Yes, those are very large numbers :-). I would say that int32 exponent should be good enough (I don't mean that it should be used). I am eager to see applications of int64 exponent range.
Mathematics has some need to work with really very big numbers. That is whenever you try to disprove a hypothesis like e.g. the Riemann hypothesis by experimentally searching for a case where it does not hold.
That is nothing I am doing, but it came to my mind when I thought about where to use exponents bigger than int32.
Is there any reason not to make the exponent type a template parameter, so that the user can choose arbitrarily instead of trying to anticipate future uses of the library?
Just a crazy idea ... what about if the exponent is a big integer as well? And the mantissa ... guess you can go all crazy, hope this is supported by the library!
Christof

Just a crazy idea ... what about if the exponent is a big integer as well? And the mantissa ... guess you can go all crazy, hope this is supported by the library!
My main point against this idea would be that this might drop performance a lot. As far as I know the GMP library also uses a 64-bit exponent for floating-point types.

Is there any reason not to make the exponent type a template parameter, so that the user can choose arbitrarily instead of trying to anticipate future uses of the library?

I would suggest this to be a good solution. At least support of int32 and int64 via a template parameter would be vital.

Yeah -- the diameter of the observable universe (not the observed universe -- the physically observable universe) in Planck units (the smallest physically meaningful length according to QM) is about 1.5*2^204, so 9 bits (positive and negative exponents) is all that is required to express the exponent for any physical distance in any physically meaningful units.

I have some experience with implementing multiprecision geometric predicates. At some point I needed to cast a 2048-bit big integer to a 64-bit double, and the problem is that double is not capable of handling it. This is so far the only issue I have encountered with the limitation of an 11-bit exponent. Also we might consider the fact that the IEEE 754 architecture for the 128-bit double has a 15-bit exponent, the same as for the 80-bit double.

Andrii

Hi,
Just a crazy idea ... what about if the exponent is a big integer as well? And the mantissa ... guess you can go all crazy, hope this is supported by the library!
My main point against this idea would be that this might drop performance a lot.
I think that with specialization it should at least be possible to have special, high-performance implementations for e.g. int32 and int64 as exponent. With the type of the exponent as a template parameter that should work well.

Christof

Just a crazy idea ... what about if the exponent is a big integer as
well? And the mantissa ... guess you can go all crazy, hope this is supported by the library!
My main point against this idea would be that this might drop performance a lot.
I think that with specialization it should at least be possible to have special, high-performance implementations for e.g. int32 and int64 as exponent. With the type of the exponent as a template parameter that should work well.
Christof
Well, I like it... And I don't like it---right now. The increased testing effort would throw this library back, maybe, almost a year (at least my float back-end part of it). I tested that back-end with automatically generated test cases from Mathematica in combination with state-of-the-art code coverage, off and on, for the better part of a year. Realistically, one would recommend keeping a flexible exponent on the *very interesting* list, yet moving on with the tested implementation that we have today. If the exponent were to be templated in the future, a default template parameter could be used to retain backward compatibility. Best regards, Chris.

Just a crazy idea ... what about if the exponent is a big integer as well? And the mantissa ... guess you can go all crazy, hope this is supported by the library!
Not yet, but not impossible (other than it messing up std::numeric_limits<>::max_exponent). Of course making the mantissa arbitrary precision is definitely not possible - unless you can tell me exactly how many digits Pi has ;-) John.

On Tue, Apr 3, 2012 at 4:49 AM, John Maddock <boost.regex@virgin.net> wrote:
Just a crazy idea ... what about if the exponent is a big integer as
well? And the mantissa ... guess you can go all crazy, hope this is supported by the library!
Not yet, but not impossible (other than it messing up std::numeric_limits<>::max_exponent). Of course making the mantissa arbitrary precision is definitely not possible - unless you can tell me exactly how many digits Pi has ;-)
Well I'd imagine that maybe you could evaluate the precision of, e.g., pi on-demand and lazily. I would consider that a form of arbitrary precision. I'd also imagine such a technique outside the (present) scope of this library. - Jeff

Just a crazy idea ... what about if the exponent is a big integer as
well? And the mantissa ... guess you can go all crazy, hope this is supported by the library!
Not yet, but not impossible (other than it messing up std::numeric_limits<>::max_exponent). Of course making the mantissa arbitrary precision is definitely not possible - unless you can tell me exactly how many digits Pi has ;-) John.
I wouldn't want to slow down the thing for a big exponent. But your mileage may vary... Best regards, Chris.

Is there any reason not to make the exponent type a template parameter so that
the user can chose arbittrarily instead of trying to anticipate future uses of the library?
Probably not actually a bad idea, or hard to do, other than upping testing times yet more.... John.
Probably best, then, if the exponent has a known and fixed compile-time width, resulting in a known maximum, which is mandatory for things such as std::numeric_limits<my_type>::max_exponent10, etc. Best regards, Chris

Personally I'd rather have a single coherent library, I think ultimately that's easier for end users to understand (once you've used one type from the library, the others all behave in the same way).
My point is that big integer and floating-point types are kind of two separate concepts.
I think that's where we disagree somewhat, as I believe there is also enormous overlap (in the way that arithmetic is handled).
They have different application areas and use different external interfaces (e.g. there is no point to define cos function for big integer class or modulus operator for floating-point type).
True, in this library non-member functions that are number-kind specific (like cos and sin etc) are protected by enable_if so they can only be called with numbers of the correct "kind".
Also if the community decides that it should be a single library I would suggest that library should include type converters from integer to floating-point types and vice versa.
Conversions are supported: http://svn.boost.org/svn/boost/sandbox/big_number/libs/multiprecision/doc/ht...
So it's:
sizeof(limb_type) * (1 + (bits_requested / bits_per_limb) + (bits_requested % bits_per_limb ? 1 : 0)) If you really really want to know!
Well, that's what I expected and I don't think that there is anything bad in this. However the user should be aware that a fixed integer of 31 bits is 8 bytes in size (if sizeof(limb_type) = 4).
Nod.
Good question, we don't really guarantee any at present. In practice they're *usually* accurate to 1 eps, but they're not intended to compete with MPFR's guarantees. But Chris would know more?
This is an important point, so would be good at least to mention it in the docs. I'd like also to see more details on this type in the docs (e.g. examples, areas of application).
Nod.
Must fit in a long long. That still gives some pretty big numbers ;-)
Just thinking out loud... is there ever any use case for big-integer exponents?
Yes, those are very large numbers :-). I would say that int32 exponent should be good enough (I don't mean that it should be used). I am eager to see applications of int64 exponent range.
Thanks for the comments, John.

I'm wondering if the following compiles and what is the type of c
mp_uint128_t a; mp_uint256_t b; auto c = a + b;
From the documentation I have the impression that we can not mix back-ends.
Correct, it's a compiler error. Allowing mixed type arithmetic opens up a whole can of worms it seems to me, plus I don't even want to think about how complex the operator overloads would have to be to support that! So for now, if you want to conduct mixed arithmetic, you must explicitly cast one of the types.
It seems to me that fixed precision and allocation are orthogonal features. Why the fixed precision types can not use an allocator?
Good question. It seemed to me that the most common use case for fixed precision types was for types "just a bit wider than long long", where it was important not to be allocating memory. Plus if you're going to allow memory allocation, wouldn't it be safer to use the arbitrary precision type and avoid the risk of overflow altogether?
I'm working on a fixed_point library (see the sandbox code, sorry, but there is no doc yet but http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3352.html is quite close ).
That paper is interesting, particularly in its allowance of mixed-mode arithmetic.
I would like to see if your front-end could also improve the performances in this case. I will try to adapt it to the back-end constraints soon.
Unless your number types perform memory allocation, the chances are that expression templates may well not be an optimization (or even a slight dis-optimization). Even when memory allocation is performed and temporaries are expensive to construct, subjectively, the big performance gains only really happen when the system is under load - makes sense when you think about it - not sure how to go about quantifying that though :-(

BTW, it's not documented, and may be slightly out of date now, but there's a trivial adapter template in boost/multiprecision/arithmetic_backend.hpp that allows any type with the usual arithmetic operators to be used with the expression template front end, so use would be:

mp_number<arithmetic_backend<YourType> >

It certainly won't give you optimized performance, but it may give you a quick and dirty idea as to whether writing a proper backend is worthwhile.

HTH, John.

Le 02/04/12 10:53, John Maddock a écrit :
I'm wondering if the following compiles and what is the type of c
mp_uint128_t a; mp_uint256_t b; auto c = a + b;
From the documentation I have the impression that we can not mix back-ends.
Correct, it's a compiler error.
Allowing mixed type arithmetic opens up a whole can of worms it seems to me, plus I don't even want to think about how complex the operator overloads would have to be to support that!
So for now, if you want to conduct mixed arithmetic, you must explicitly cast one of the types.

Humm, casting seems not to be a good option for fixed-point arithmetic. The type of a fixed-point operation depends on the quantization of its arguments. With fixed-point arithmetic, the type of fixed_point<7,0> + fixed_point<8,0> is fixed_point<9,0>. Note that no overflow handling is needed, as the resulting type has enough range.

Another example:

fixed_point<10,2> a;
fixed_point<2,3> b;
fixed_point<2,2> c;
auto d = (a + b) * c;

The type of d should be something like fixed_point<13,5>. If the fixed_point operations work only with parameters with the same range and resolution, the overflow problem appears on temporaries. Using casts, the expression results in something unreadable:

auto d = (fixed_point<13,5>(a) + fixed_point<13,5>(b)) * fixed_point<13,5>(c);

Overflow and rounding are needed only when you assign types with different quantizations. For example:

fixed_point<12,4> d = (a + b) * c;

will need to round from 5 to 4 resolution bits and to reduce the scale from 13 to 12.
It seems to me that fixed precision and allocation are orthogonal features. Why the fixed precision types can not use an allocator?
Good question. It seemed to me that the most common use case for fixed precision types was for types "just a bit wider than long long", where it was important not to be allocating memory. Plus if you're going to allow memory allocation, wouldn't it be safer to use the arbitrary precision type and avoid the risk of overflow altogether?
Knowing the range and resolution (i.e. precision) at compile time is a must in some domains. In the context where I would use this, I will not use the allocator as the size stays reasonable, but to my mind the interface should leave the choice to the user.
I'm working on a fixed_point library (see the sandbox code, sorry, but there is no doc yet but http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3352.html is quite close ).
That paper is interesting, particularly in its allowance of mixed-mode arithmetic.
I would like to see if your front-end could also improve the performances in this case. I will try to adapt it to the back-end constraints soon.
Unless your number types perform memory allocation, the chances are that expression templates won't be an optimization (they may even be a slight pessimization). Even when memory allocation is performed and temporaries are expensive to construct, subjectively the big performance gains only really happen when the system is under load - which makes sense when you think about it - though I'm not sure how to go about quantifying that :-(
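John's point can be illustrated with a minimal expression-template sketch. This is hypothetical code, not the library's actual front end: operator+ returns a lightweight proxy instead of a value, so evaluation happens only on assignment, straight into the destination's storage, and the temporary (with its allocation) that a plain operator+ would construct never exists. When the element type doesn't allocate, the saved temporary is worth little or nothing.

```cpp
#include <vector>
#include <cstdint>
#include <cstddef>
#include <cassert>

// A toy "big number": a vector of 32-bit limbs; constructing one allocates.
struct num {
    std::vector<std::uint32_t> limbs;
    explicit num(std::size_t n = 0) : limbs(n, 0) {}
};

// Deferred addition: holds references to the operands and does no work yet.
struct add_expr {
    const num& a;
    const num& b;
};

add_expr operator+(const num& a, const num& b) { return {a, b}; }

// Evaluation on assignment: the sum is written directly into dst's storage,
// so no temporary num (and hence no extra allocation) is ever created.
// For brevity the sketch assumes both operands have the same limb count.
num& assign(num& dst, add_expr e) {
    dst.limbs.resize(e.a.limbs.size());
    std::uint32_t carry = 0;
    for (std::size_t i = 0; i < dst.limbs.size(); ++i) {
        std::uint64_t s = std::uint64_t(e.a.limbs[i]) + e.b.limbs[i] + carry;
        dst.limbs[i] = std::uint32_t(s);        // low 32 bits
        carry = std::uint32_t(s >> 32);         // carry into the next limb
    }
    return dst;
}
```

The real front end generalizes this to whole expression trees, but the cost model is the same: the win is the eliminated temporaries, which only matter when they allocate.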
Are you saying that the front end is not useful for cpp_int when no allocator is used? Note also that some processors offer intrinsics for (a + b) * c. But this may be out of the scope of your library.
BTW, it's not documented, and may be slightly out of date now, but there's a trivial adapter template in boost/multiprecision/arithmetic_backend.hpp that allows any type with the usual arithmetic operators to be used with the expression template front end, so use would be:
mp_number<arithmetic_backend<YourType> >
It certainly won't give you optimized performance, but it may give you a quick and dirty idea as to whether writing a proper backend is worthwhile.
I'll think a little more about how, and whether, expression templates could improve the performance of fixed_point arithmetic that doesn't use memory allocation. I'll study your expression template work to see if adapting it to mixed range+resolution could be an option. Best, Vicente

Vicente J. Botet Escriba wrote:
Le 02/04/12 10:53, John Maddock a écrit :
I'm wondering if the following compiles and what is the type of c
mp_uint128_t a; mp_uint256_t b; auto c = a + b;
From the documentation I have the impression that we can not mix back-ends.
Correct, it's a compiler error.
Allowing mixed type arithmetic opens up a whole can of worms it seems to me, plus I don't even want to think about how complex the operator overloads would have to be to support that!
So for now, if you want to conduct mixed arithmetic, you must explicitly cast one of the types. Hmm, casting doesn't seem to be a good option for fixed point arithmetic. The type of a fixed_point operation depends on the quantization of its arguments.
With fixed_point arithmetic the type of fixed_point<7,0> + fixed_point<8,0> is fixed_point<9,0>. Note that no overflow is needed as the resulting type has enough range.
Another example
fixed_point<10,2> a; fixed_point<2,3> b; fixed_point<2,2> c; auto d = (a + b) * c;
I have implemented my own fixed_point, and I debated whether the result of operations should automatically be a wider type. In the end, I decided it was best to use explicit casting. Explicit is better than implicit. It's a bit more verbose, but I can live with that. Would you advocate that in C++, int8 * int8 -> int16??

Le 03/04/12 13:13, Neal Becker a écrit :
Vicente J. Botet Escriba wrote:
Le 02/04/12 10:53, John Maddock a écrit :
I'm wondering if the following compiles and what is the type of c
mp_uint128_t a; mp_uint256_t b; auto c = a + b;
From the documentation I have the impression that we can not mix back-ends. Correct, it's a compiler error.
Allowing mixed type arithmetic opens up a whole can of worms it seems to me, plus I don't even want to think about how complex the operator overloads would have to be to support that!
So for now, if you want to conduct mixed arithmetic, you must explicitly cast one of the types. Hmm, casting doesn't seem to be a good option for fixed point arithmetic. The type of a fixed_point operation depends on the quantization of its arguments.
With fixed_point arithmetic the type of fixed_point<7,0> + fixed_point<8,0> is fixed_point<9,0>. Note that no overflow is needed as the resulting type has enough range.
Another example
fixed_point<10,2> a; fixed_point<2,3> b; fixed_point<2,2> c; auto d = (a + b) * c;
I have implemented my own fixed_point, and I debated whether the result of operations should automatically be a wider type. In the end, I decided it was best to use explicit casting. Explicit is better than implicit. It's a bit more verbose, but I can live with that.
Would you advocate that in C++, int8 * int8 -> int16??
Hi, I cannot change C/C++. But why not do it for user-defined types such as a fixed point class? This at least avoids unexpected overflows while remaining efficient. Vicente

On Tue, Apr 3, 2012 at 1:13 PM, Neal Becker <ndbecker2@gmail.com> wrote:
Vicente J. Botet Escriba wrote:
Le 02/04/12 10:53, John Maddock a écrit :
I'm wondering if the following compiles and what is the type of c
mp_uint128_t a; mp_uint256_t b; auto c = a + b;
From the documentation I have the impression that we can not mix back-ends.
Correct, it's a compiler error.
Allowing mixed type arithmetic opens up a whole can of worms it seems to me, plus I don't even want to think about how complex the operator overloads would have to be to support that!
So for now, if you want to conduct mixed arithmetic, you must explicitly cast one of the types. Hmm, casting doesn't seem to be a good option for fixed point arithmetic. The type of a fixed_point operation depends on the quantization of its arguments.
With fixed_point arithmetic the type of fixed_point<7,0> + fixed_point<8,0> is fixed_point<9,0>. Note that no overflow is needed as the resulting type has enough range.
Another example
fixed_point<10,2> a; fixed_point<2,3> b; fixed_point<2,2> c; auto d = (a + b) * c;
I have implemented my own fixed_point, and I debated whether the result of operations should automatically be a wider type. In the end, I decided it was best to use explicit casting. Explicit is better than implicit. It's a bit more verbose, but I can live with that.
Would you advocate that in C++, int8 * int8 -> int16??
I would say that the result of the expression should not grow automatically. Consider the example of summing 1000 int32 values. If the type grew with every addition, the result would end up as a 1031-bit type, while the actual value it holds would fit in at most an int42. There is also another template trick that makes the resulting type at least as big as each of the operands:

template <int N, int M>
big_int<(N>M?N:M)> add(const big_int<N>& a, const big_int<M>& b)
{
    big_int<(N>M?N:M)> result;
    // Implementation follows.
    return result;
}

I used this approach for my personal fixed int routines. The main point of a fixed integer type is its performance. It allows the user to write complex math formulas without thinking about slow memory allocations. I would say that users are responsible for ensuring that a fixed int doesn't overflow. If somebody is not satisfied with that behavior, they can use big integer types with dynamic memory allocation.
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
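Andrii's "result is as wide as the wider operand" trick can be made concrete with a self-contained sketch. The big_int here is hypothetical, storing its value in a plain 64-bit field purely so the example compiles and runs:

```cpp
#include <cstdint>
#include <type_traits>
#include <cassert>

// Hypothetical fixed-width integer: N is the nominal bit width; the value
// lives in a 64-bit field only to keep this sketch runnable.
template <int N>
struct big_int {
    std::int64_t value = 0;
};

// The result is as wide as the wider operand -- no automatic growth to N+1
// bits; the caller is responsible for avoiding overflow.
template <int N, int M>
big_int<(N > M ? N : M)> add(const big_int<N>& a, const big_int<M>& b) {
    big_int<(N > M ? N : M)> result;
    result.value = a.value + b.value;
    return result;
}
```

Because the width saturates at max(N, M), a chain of additions keeps a fixed type instead of growing by one bit per operation, which is exactly the design trade-off being debated in this thread.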

Le 03/04/12 20:07, Andrii Sydorchuk a écrit :
On Tue, Apr 3, 2012 at 1:13 PM, Neal Becker<ndbecker2@gmail.com> wrote:
Vicente J. Botet Escriba wrote:
Le 02/04/12 10:53, John Maddock a écrit :
I'm wondering if the following compiles and what is the type of c
mp_uint128_t a; mp_uint256_t b; auto c = a + b;
From the documentation I have the impression that we can not mix back-ends. Correct, it's a compiler error.
Allowing mixed type arithmetic opens up a whole can of worms it seems to me, plus I don't even want to think about how complex the operator overloads would have to be to support that!
So for now, if you want to conduct mixed arithmetic, you must explicitly cast one of the types. Hmm, casting doesn't seem to be a good option for fixed point arithmetic. The type of a fixed_point operation depends on the quantization of its arguments.
With fixed_point arithmetic the type of fixed_point<7,0> + fixed_point<8,0> is fixed_point<9,0>. Note that no overflow is needed as the resulting type has enough range.
Another example
fixed_point<10,2> a; fixed_point<2,3> b; fixed_point<2,2> c; auto d = (a + b) * c;
I have implemented my own fixed_point, and I debated whether the result of operations should automatically be a wider type. In the end, I decided it was best to use explicit casting. Explicit is better than implicit. It's a bit more verbose, but I can live with that.
Would you advocate that in C++, int8 * int8 -> int16??
I would say that the result of the expression should not grow automatically. Consider the example of summing 1000 int32 values. If the type grew with every addition, the result would end up as a 1031-bit type, while the actual value it holds would fit in at most an int42.
Well, all this depends on how you write the summation. Just note that int32+int32+int32+int32+int32+int32+int32+int32 could be written as ((int32+int32)+(int32+int32))+((int32+int32)+(int32+int32)), and in this case has type int35 instead of int39. This is where expression templates could perhaps help.
There is also another template trick that makes the resulting type at least as big as each of the operands:
template<int N, int M> big_int<(N>M?N:M)> add(const big_int<N>& a, const big_int<M>& b) { big_int<(N>M?N:M)> result; // Implementation follows. return result; } I used this approach for my personal fixed int routines.
The main point of a fixed integer type is its performance. It allows the user to write complex math formulas without thinking about slow memory allocations.
I agree. My library doesn't provide arbitrary ranges, so in the end I need to fix a bound. What is the result type of operator+ at this upper-bound range? Currently the compiler fails. I will explore your idea of saturating the result range at this upper bound. Note that operator+= already does something similar in my library, but the user needs to know which operand has the larger range.
I would say that users are responsible for ensuring that a fixed int doesn't overflow. If somebody is not satisfied with such behavior they could use big integer types with dynamic memory allocation.
I agree that the user must master overflow. That doesn't mean the library cannot help him make good decisions. My fixed_point class has an overflow template parameter that the user can set to ignore when it is known that overflow cannot occur. See the example below. I recognize that writing a loop summing 1000 int32 values (when the result of operator+ grows) is not so evident. The question is what the range of the result should be.

fixed_point<32,0> v[1000]; fixed_point<range,0> sum; // ... for (int i=0; i<1000;++i) sum+=v[i];

When the size of the vector is known at compile time, range is around log2(1000)+32 in this case. A library could provide a meta-function giving the correct type, for example

accumulate<fixed_point<32,0>, 1000 /*times*/>::type sum;

The problem with the previous code is that overflow will be checked at every addition. With the ignore-overflow template parameter:

std::array<fixed_point<32,0>,1000> v; fixed_point<range,0,overflow::ignore> sum;

A less friendly function could also be used to request no overflow check:

for (int i=0; i<1000;++i) sum.add(no_overflow_check,v[i]);

Note that the library could still check for overflow in debug mode. If the size of the collection is known only at run-time, you will need to use a compile-time bound, and overflow checking will manage the issue.

std::vector<fixed_point<32,0>> v; fixed_point<64,0> sum; // ... for (int i=0; i<v.size();++i) sum+=v[i];

I hope this answers your argument against arithmetic operators automatically growing the range of the result. Best, Vicente
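Vicente's "range is around log2(1000)+32" can be checked with a tiny constexpr helper. The names here are hypothetical, invented for illustration; they are not part of any library discussed in this thread:

```cpp
#include <cassert>

// Smallest k such that 2^k >= n (ceil(log2(n)) for n >= 1).
constexpr int ceil_log2(long long n) {
    int bits = 0;
    for (long long v = 1; v < n; v <<= 1) ++bits;
    return bits;
}

// Bits needed to sum `count` values of `elem_bits` bits each:
// the element width plus enough carry bits for the worst case.
constexpr int accumulated_bits(int elem_bits, long long count) {
    return elem_bits + ceil_log2(count);
}

static_assert(accumulated_bits(32, 1000) == 42, "matches log2(1000)+32 from the text");
```

An accumulate meta-function like the one Vicente sketches would compute exactly this number to pick the result type at compile time.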

Here is my current run-time fixed-pt. In this version, fixed-pt bit widths are set at run-time, because I use this from Python. It is simple to convert this to compile-time if desired. The code is based on boost::constrained_value. http://pastebin.com/tXkeGAMf

Le 04/04/12 13:17, Neal Becker a écrit :
Here is my current run-time fixed-pt. In this version, fixed-pt bit widths are set at run-time, because I use this from Python. It is simple to convert this to compile-time if desired.
The code is based on the boost::constrained_value
Hi, I have no doubt about the utility of a fixed point library with constraints given at run-time, but I would prefer that the interface be based on Range and Resolution (as in http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3352.html) instead of IntegerBits and FractionalBits. I sometimes need fixed points that have fewer bits than fractional bits, and giving negative numbers for IntegerBits or FractionalBits seems counterintuitive to me. Of course this is just a point of view, and conversion between the different formats could be provided by the library. Best, Vicente

Vicente J. Botet Escriba wrote:
Le 04/04/12 13:17, Neal Becker a écrit :
Here is my current run-time fixed-pt. In this version, fixed-pt bit widths are set at run-time, becuase I use this from python. It is simple to convert this to compile-time if desired.
The code is based on the boost::constrained_value
Hi,
I have no doubt about the utility of a fixed point library with constraints given at run-time, but I would prefer that the interface be based on Range and Resolution (as in http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3352.html) instead of IntegerBits and FractionalBits.
I sometimes need fixed points that have fewer bits than fractional bits, and giving negative numbers for IntegerBits or FractionalBits seems counterintuitive to me. Of course this is just a point of view, and conversion between the different formats could be provided by the library.
Best, Vicente
1. It's easy to convert to compile-time constraints. Actually, this code was compile-time, and I converted it to run-time. 2. I use this code for modelling fixed point hardware designs (FPGAs). In that case, we must be very explicit about the bit widths of every operation and every temporary result. That is one reason I prefer not to have any automatic promotion. Also, only in simple cases can the compiler guess the correct sizes needed for results, so it's better for the user to be forced to be explicit.

Vicente J. Botet Escriba wrote:
Le 04/04/12 13:17, Neal Becker a écrit :
Here is my current run-time fixed-pt. In this version, fixed-pt bit widths are set at run-time, becuase I use this from python. It is simple to convert this to compile-time if desired.
The code is based on the boost::constrained_value
Hi,
I have no doubt about the utility of a fixed point library with constraints given at run-time, but I would prefer that the interface be based on Range and Resolution (as in http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3352.html) instead of IntegerBits and FractionalBits.
I sometimes need fixed points that have fewer bits than fractional bits, and giving negative numbers for IntegerBits or FractionalBits seems counterintuitive to me. Of course this is just a point of view, and conversion between the different formats could be provided by the library.
Best, Vicente
Le 05/04/12 00:48, Neal Becker a écrit :
1. It's easy to convert to compile-time constraints. Actually, this code was compile-time, and I converted it to run-time.
2. I use this code for modelling fixed point hardware designs (FPGAs). In that case, we must be very explicit about the bit widths of every operation and every temporary result.
I understand why in your case it could be better not to adapt the result type automatically to the arguments, and why you need to master the result type. Maybe in your FPGA design fp<2,5> + fp<2,5> -> fp<2,5>, but others could design it as fp<2,5> + fp<2,5> -> fp<3,5>. Note also that automatic promotion doesn't force you to use temporaries, so

fp<2,5> a,b,c; c = a+b;

will do what is expected for fp<2,5> + fp<2,5> -> fp<2,5> (taking overflow into account). If the conversion between different fixed points is explicit, we need a cast:

c = fp<2,5>(a+b);

The other FPGA design needs

fp<2,5> a,b; fp<3,5> c; c = a+b;

The single problem is that this design doesn't prevent you from using temporaries.
One reason I prefer not to have any automatic promotion. Also because, only in simple cases can the compiler guess the correct sizes needed for results, so it's better for the user to be forced to be explicit.
I agree that the result deduction cannot be done in general, that each algorithm becomes specific to its input/output quantizations, and that every time you change the input/output quantizations you need to review the internal quantization. This is one of the major drawbacks of fixed-point algorithms. What I don't see is how your design improves this. Could you give an example? Best, Vicente

On Sun, Apr 1, 2012 at 11:55 AM, John Maddock <boost.regex@virgin.net> wrote:
Folks,
I'd like to ask for a formal review of the Multiprecision Arithmetic Library, currently in the sandbox.
...
And of course, I'm looking for a review manager ;-)
This seems to me to be a particularly timely library for Boost, and perhaps the C++ standard library. Thus I'd like to encourage someone to volunteer as review manager, and get the review process started as soon as possible. --Beman

Hi John, I have received your request and will add it to the review schedule. Best, Ron On Apr 1, 2012, at 8:55 AM, John Maddock wrote:
Folks,
I'd like to ask for a formal review of the Multiprecision Arithmetic Library, currently in the sandbox.
Features:
* Expression template enabled front end. * Support for Integer, Rational and Floating Point types.
Supported Integer backends:
* GMP. * Libtommath. * cpp_int.
cpp_int is an all-C++, Boost-licensed backend; it supports both arbitrary precision types (with Allocator support) and signed and unsigned fixed precision types (with no memory allocation).
There are also some integer specific functions - for Miller Rabin testing, bit fiddling, random numbers. Plus interoperability with Boost.Rational (though that loses the expression template frontend).
Supported Rational Backends:
* GMP * libtommath * cpp_int (as above)
Supported Floating point backends:
* GMP * MPFR * cpp_dec_float
cpp_dec_float is an all C++ Boost licensed type, adapted from Christopher Kormanyos' e_float code (published in TOMS last year).
All the floating point types have full std lib support (cos, sin, exp, pow etc.), as well as full interoperability with Boost.Math.
There's nothing in principle to prevent extension to complex number and interval arithmetic types (plus any other number types I've forgotten!), but I've run out of energy for now ;-)
Code is in the sandbox under /big_number/.
Docs can be viewed online here: http://svn.boost.org/svn/boost/sandbox/big_number/libs/multiprecision/doc/ht...
And of course, I'm looking for a review manager ;-)
Many thanks, John.

On Sun, Apr 1, 2012 at 8:55 AM, John Maddock <boost.regex@virgin.net> wrote:
Folks,
I'd like to ask for a formal review of the Multiprecision Arithmetic Library, currently in the sandbox.
Features:
* Expression template enabled front end. * Support for Integer, Rational and Floating Point types.
Supported Integer backends:
* GMP. * Libtommath. * cpp_int.
cpp_int is an all-C++, Boost-licensed backend; it supports both arbitrary precision types (with Allocator support) and signed and unsigned fixed precision types (with no memory allocation).
There are also some integer specific functions - for Miller Rabin testing, bit fiddling, random numbers. Plus interoperability with Boost.Rational (though that loses the expression template frontend).
Supported Rational Backends:
* GMP * libtommath * cpp_int (as above)
Supported Floating point backends:
* GMP * MPFR * cpp_dec_float
cpp_dec_float is an all C++ Boost licensed type, adapted from Christopher Kormanyos' e_float code (published in TOMS last year).
All the floating point types have full std lib support (cos, sin, exp, pow etc.), as well as full interoperability with Boost.Math.
There's nothing in principle to prevent extension to complex number and interval arithmetic types (plus any other number types I've forgotten!), but I've run out of energy for now ;-)
Code is in the sandbox under /big_number/.
Docs can be viewed online here: http://svn.boost.org/svn/boost/sandbox/big_number/libs/multiprecision/doc/html/index.html
And of course, I'm looking for a review manager ;-)
Many thanks, John.
I will have to take a look at it this week, sounds great! My initial question/request pertains to putting this library within the context of past attempts at getting an extended precision arithmetic library into Boost. I'm specifically thinking of Chad Nelson's XInt library reviewed approximately a year ago [1]. I'm wondering what, if any, influence the XInt library and the XInt review process/result had on your design decisions for MAL. I also hope that if issues similar to those brought up during the XInt review also apply to MAL that you'll be prepared to address them. That said, I'm pretty confident you've done your homework :) - Jeff [1] http://lists.boost.org/boost-announce/2011/04/0305.php

I will have to take a look at it this week, sounds great!
My initial question/request pertains to putting this library within the context of past attempts at getting an extended precision arithmetic library into Boost. I'm specifically thinking of Chad Nelson's XInt library reviewed approximately a year ago [1]. I'm wondering what, if any, influence the XInt library and the XInt review process/result had on your design decisions for MAL. I also hope that if issues similar to those brought up during the XInt review also apply to MAL that you'll be prepared to address them.
That said, I'm pretty confident you've done your homework :)
Nod. I looked back over the review comments; the main things I spotted were: * No expression template support. * Only a GMP backend. * No support for fixed precision ints. Those I hope are fixed in this library; if I missed anything else, then no doubt I'll soon find out! Regards, John.

Folks,
I'd like to ask for a formal review of the Multiprecision Arithmetic Library, currently in the sandbox.
<snip>
cpp_dec_float is an all C++ Boost licensed type, adapted from Christopher Kormanyos' e_float code (published in TOMS last year).
Great work, John! And let's not forget to mention in the documentation that the cpp_dec_float back-end (in the present implementation) is limited to < 500 (need to check) decimal digits of precision. Best regards, Chris.

And let's not forget to mention in the documentation that the cpp_dec_float back-end (in the present implementation) is limited to < 500 (need to check) decimal digits of precision.
Nope, I didn't set an upper limit on the number of digits - except of course that the type will grow too large to instantiate / allocate on the stack if too many are used. I should note that though. Thanks, John.

And let's not forget to mention in the documentation
that the cpp_dec_float back-end (in the present implementation) is limited to < 500 (need to check) decimal digits of precision.
Nope, I didn't set an upper limit on the number of digits - except of course that the type will grow too large to instantiate / allocate on the stack if too many are used. I should note that though. Thanks, John.
Thanks, John. Unfortunately, it's worse than that. My O(N^2), base-10^8 school-method multiplication stores multiple carries in the rows of the multiplication algorithm. These can potentially overflow 64 bits. Also, since cpp_dec_float does not climb the digit scale elegantly, switching from the school O(N^2) multiply to Karatsuba to FFT, the N^2 crushes efficiency for more than a few hundred digits. I had originally set an upper limit of 300 decimal digits. It's so critical that I would maybe go so far as to use a static_assert on it. This is one of the weakest parts of my original design, and one that I would be most interested in eliminating in future versions. Best regards, Chris.

Unfortunately, it's worse than that. My O(N^2), base-10^8 school-method multiplication stores multiple carries in the rows of the multiplication algorithm. These can potentially overflow 64 bits.
Also, since cpp_dec_float does not climb the digit scale elegantly switching from school O(N^2) multiply to Karatsuba to FFT, the N^2 crushes efficiency for more than a few hundred digits. I had originally set an upper limit of 300 decimal digits.
It's so critical that I would maybe go so far as to use a static_assert on it.
For sure, looking at the code, this is the Comba algorithm, yes? Assuming that's correct, I'm adding:

//
// There is a limit on how many limbs this algorithm can handle without dropping digits
// due to overflow in the carry, it is:
//
// FLOOR( (2^64 - 1) / (10^8 * 10^8) ) == 1844
//
BOOST_STATIC_ASSERT_MSG(mp_elem_number < 1800, "Too many limbs in the data type for the multiplication algorithm - unsupported precision in cpp_dec_float.");

That still allows 14000 digits, which would be grindingly slow at O(N^2), but at least the basic algorithm should work ;-) Thanks for the heads up! John.
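John's carry bound is straightforward to verify: each column of the Comba-style multiply accumulates partial products of two base-10^8 limbs (each product below 10^16) into a 64-bit unsigned accumulator, so at most FLOOR((2^64 - 1) / 10^16) = 1844 of them fit before a carry could be dropped. A quick constexpr check of that arithmetic (illustrative only, not the library's code):

```cpp
#include <cstdint>
#include <cassert>

// Each partial product of two base-10^8 limbs is < 10^16; a 64-bit
// accumulator can therefore sum at most (2^64 - 1) / 10^16 of them
// before the carry could overflow.
constexpr std::uint64_t limb_product_bound = 100000000ULL * 100000000ULL; // 10^16
constexpr std::uint64_t max_safe_columns = UINT64_MAX / limb_product_bound;

static_assert(max_safe_columns == 1844, "matches the bound in the static assert above");
```

The patch asserts mp_elem_number < 1800 to stay comfortably below this limit, which at 8 decimal digits per limb gives the 14000-digit figure John quotes.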

Also, since cpp_dec_float does not climb the
digit scale elegantly switching from school O(N^2) multiply to Karatsuba to FFT, the N^2 crushes efficiency for more than a few hundred digits. I had originally set an upper limit of 300 decimal digits.
It's so critical that I would maybe go so far as to use a static_assert on it.
For sure, looking at the code, this is the Comba algorithm, yes? Assuming that's correct, I'm adding:
<snip>
That still allows 14000 digits which would be grindingly slow at O(N^2), but at least the basic algorithm should work ;-) Thanks for the heads up! John.
Yes. Thanks! Did I ever mention that I love what you have done with this stuff? It's simply great programming! But let's face it, my O(N^2) is very slowwww for digits. I've got to do better than that! What would you say if I suggested, late in the game, that I add *at least* a naive FFT multiplication based on complex radix-2 decimation? We won't break any speed records or catch GMP at a million digits, but at least we won't end up at a standstill at 10,000 decimal digits. In your opinion, should I take a crack at this if I find a spare afternoon? Of course, in the future we need to cooperate with FFT specialists who could get the speed of big-num types (with BPL licensing) up to state-of-the-art. But maybe we can get just a bit more functionality in now before review and a potential first distribution. Best regards, Chris

Did I ever mention that I love what you have done with this stuff? It's simply great programming!
All your fault for writing the arithmetic code though ;-)
But let's face it, my O(N^2) is very slowwww for digits.
Up to a point - for smaller digit counts it's usually the best you can do in practice.
I've got to do better than that!
What would you say if I suggested, late in the game, that I add *at least* a naive FFT multiplication based on complex radix-2 decimation? We won't break any speed records or catch GMP at a million digits, but at least we won't end up at a standstill at 10,000 decimal digits.
In your opinion, should I take a crack at this if I find a spare afternoon?
Maybe. My understanding is that FFT doesn't become viable until the digit count grows truly huge? Would the Karatsuba algorithm be viable sooner and help more users? My gut feeling is that the current implementation is perfectly fine up to a few hundred (and maybe a thousand?) decimal places, which will probably take care of most use cases.... well, my use cases anyway! So maybe examples are a higher priority for now? We could also go through the current code looking for optimisation opportunities - for example, would the inversion code be more efficient using Halley rather than Newton iteration (see for example http://numbers.computation.free.fr/Constants/Algorithms/inverse.html)? Also multiplication and division by integer (relatively common operations?) as special cases.
Of course, in the future we need to cooperate with FFT specialists who could get the speed of big-num types (with BPL licensing) up to state-of-the-art. But maybe we can get just a bit more functionality in now before review and a potential first distribution.
For sure, there's always much more that can be done, it's just a case of knowing when to stop! ;-) Cheers, John.
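On the Halley-versus-Newton inversion question raised above: both are division-free reciprocal iterations. With residual h = 1 - a·x, Newton's update x ← x(1 + h) (equivalently x(2 - ax)) squares the residual each step, while the third-order update x ← x(1 + h + h²) cubes it, at the cost of one extra multiplication per step. A double-precision sketch of both (the right trade-off for multiprecision depends on the cost of each full-precision multiply):

```cpp
#include <cmath>
#include <cassert>

// Newton reciprocal: quadratic convergence, two multiplications per step.
double newton_recip(double a, double x, int steps) {
    for (int i = 0; i < steps; ++i)
        x = x * (2.0 - a * x);
    return x;
}

// Third-order (Halley-style) reciprocal: cubic convergence,
// roughly three multiplications per step.
double halley_recip(double a, double x, int steps) {
    for (int i = 0; i < steps; ++i) {
        double h = 1.0 - a * x;      // current residual
        x = x * (1.0 + h + h * h);
    }
    return x;
}
```

If x = (1 - e)/a then one Newton step gives (1 - e²)/a and one third-order step gives (1 - e³)/a, which is exactly the doubling versus tripling of correct digits per iteration.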

But let's face it, my O(N^2) is very slowwww for digits.
In your opinion, should I take a crack at this if I find a spare afternoon?
Maybe. My understanding is that FFT doesn't become viable until the digit count grows truly huge? Would the Karatsuba algorithm be viable sooner and help more users? My gut feeling is that the current implementation is perfectly fine up to a few hundred (and maybe a thousand?) decimal places, which will probably take care of most use cases.... well, my use cases anyway!
In my previous unpublished work (mp_cpp, also using base-10^8), I tuned to the following: static const INT32 mp_elem_karatsuba_min = static_cast< INT32>(40); // Approx. 280 digits. static const INT32 mp_elem_fft_min = static_cast< INT32>(129); // Approx. 900 digits. I've got a recursive Karatsuba template available in my catalog already. Maybe we should try it out. The FFT mentioned above was FFTW (a sadly non-BPL wonder of mankind) so a vanilla FFT would be, maybe, 2 or 3 times slower.
So maybe examples are a higher priority for now?
Yes, you are right. I just don't want to get into trouble with the community. Everyone wants to compute a million digits of pi, and they might get mad at us if boost can't do it. But you're right below. We need to stop and get a work of high quality out there to the world---maybe improving it later, like in the fall of 2012. In fact, as I told you and Paul Bristow (both of whom have helped so much in this project), I'm actually booked solid through summer.
We could also go through the current code looking for optimisation opportunities - for example, would the inversion code be more efficient using Halley rather than Newton iteration (see for example http://numbers.computation.free.fr/Constants/Algorithms/inverse.html)?
Interesting...
Also multiplication and division by integer (relatively common operations?) as special cases.
Fortunately, we already have this. The routines mul_unsigned_long_long() and div_unsigned_long_long() utilize O(N) linear mul/div by integer. And we support them in your interface functions eval_multiply() and eval_divide().
Of course, in the future we need to cooperate with FFT specialists who could get the speed of big-num types (with BPL licensing) up to state-of-the-art. But maybe we can get just a bit more functionality in now before review and a potential first distribution.
For sure, there's always much more that can be done, it's just a case of knowing when to stop! ;-) Cheers, John.
You are right. We need to stop real soon and review what we've got. If I can contribute anything like Karatsuba or FFT relatively quickly, yet adequately reliable, I'll let you know. Best regards, Chris
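For readers who haven't met the Karatsuba multiplication Chris mentions: it replaces the four half-width products of the schoolbook method with three, via (a1·B + a0)(b1·B + b0) = z2·B² + (z1 − z2 − z0)·B + z0 with z2 = a1·b1, z0 = a0·b0, z1 = (a1 + a0)(b1 + b0). A single-level 32×32-bit sketch; real multiprecision code recurses on limb vectors, so this is illustrative only:

```cpp
#include <cstdint>
#include <cassert>

// One level of Karatsuba on 32-bit operands split into 16-bit halves:
// three multiplications (z2, z0, z1) instead of the schoolbook four.
std::uint64_t karatsuba32(std::uint32_t a, std::uint32_t b) {
    std::uint32_t a1 = a >> 16, a0 = a & 0xFFFF;
    std::uint32_t b1 = b >> 16, b0 = b & 0xFFFF;
    std::uint64_t z2 = std::uint64_t(a1) * b1;
    std::uint64_t z0 = std::uint64_t(a0) * b0;
    std::uint64_t z1 = std::uint64_t(a1 + a0) * (b1 + b0);
    // z1 - z2 - z0 equals the cross terms a1*b0 + a0*b1.
    return (z2 << 32) + ((z1 - z2 - z0) << 16) + z0;
}
```

Applied recursively, this turns O(N²) limb multiplication into O(N^1.585), which is why the thresholds Chris quotes kick in at only a few dozen limbs.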

-----Original Message----- From: boost-bounces@lists.boost.org [mailto:boost-bounces@lists.boost.org] On Behalf Of Christopher Kormanyos Sent: Friday, April 06, 2012 11:57 AM To: boost@lists.boost.org Subject: Re: [boost] [Review Request] Multiprecision Arithmetic Library
But let's face it, my O(N^2) is very slowwww for digits.
In your opinion, should I take a crack at this if I find a spare afternoon?
Maybe. My understanding is that FFT doesn't become viable until the digit count grows truly huge? Would the Karatsuba algorithm be viable sooner and help more users? My gut feeling is that the current implementation is perfectly fine up to a few hundred (and maybe a thousand?) decimal places, which will probably take care of most use cases.... well, my use cases anyway!
In my previous unpublished work (mp_cpp, also using base-10^8), I tuned to the following:
static const INT32 mp_elem_karatsuba_min = static_cast<INT32>(40);  // Approx. 280 digits.
static const INT32 mp_elem_fft_min       = static_cast<INT32>(129); // Approx. 900 digits.
I've got a recursive Karatsuba template available in my catalog already. Maybe we should try it out. The FFT mentioned above was FFTW (a sadly non-BPL wonder of mankind) so a vanilla FFT would be, maybe, 2 or 3 times slower.
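For readers following along, the Karatsuba identity that such a recursive template exploits can be illustrated on machine words (a hedged toy, not Chris's catalog code): splitting each operand at a base B turns four multiplications into three.

```cpp
#include <cstdint>

// x*y = hh*B^2 + ((xh+xl)*(yh+yl) - hh - ll)*B + ll,
// using three multiplications instead of four. Here B = 2^16, so the
// recursion bottoms out immediately; a real big-num version recurses on
// limb arrays until below a tuned threshold like mp_elem_karatsuba_min.
std::uint64_t karatsuba32(std::uint32_t x, std::uint32_t y) {
    const std::uint64_t B = 1u << 16;
    std::uint64_t xh = x >> 16, xl = x & 0xFFFF;
    std::uint64_t yh = y >> 16, yl = y & 0xFFFF;
    std::uint64_t hh = xh * yh;               // 1st multiply
    std::uint64_t ll = xl * yl;               // 2nd multiply
    std::uint64_t mm = (xh + xl) * (yh + yl); // 3rd multiply
    return hh * B * B + (mm - hh - ll) * B + ll;
}
```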
So maybe examples are a higher priority for now?
Yes, you are right. I just don't want to get into trouble with the community. Everyone wants to compute a million digits of pi, and they might get mad at us if boost can't do it.
But you're right below. We need to stop and get a work of high quality out there to the world---maybe improving it later, like in the fall of 2012.
Agreed - it needs user exposure.
In fact, as I told you and Paul Bristow (both of whom have helped so much in this project), I'm actually booked solid through summer.
I fully understand that. But I'd be willing to work up the handful of examples that you mentioned if that would help (I think it would - I think novices need much more help). Novice views on whether they find the package useful to them would be a (small) factor in a review. Paul --- Paul A. Bristow, Prizet Farmhouse, Kendal LA8 8AB UK +44 1539 561830 07714330204 pbristow@hetp.u-net.com

I fully understand that. But I'd be willing to work up the handful of examples that you mentioned if
that would help (I think it would - I think novices need much more help). Novice views on whether they find the package useful to them would be a (small) factor in a review.
Paul
Paul, we need a few examples. My generic programming examples are pretty high up on the totem pole. Maybe the users need some real rudimentary stuff like including the headers, doing add/sub/mul/div and sqrt, sin, log, exp, etc. Setting the precision of the output stream and getting the numbers. You might want to provide some intuitive, easy-to-understand "four-banger" stuff. Do you remember what a four-banger is? (And the rest of you, stop giggling! It was a calculator.) Best regards, Chris.

On Mon, Apr 2, 2012 at 10:49 PM, Christopher Kormanyos <e_float@yahoo.com> wrote:
My basic operations (add, sub, mul, div) are on-par or faster for a variety of reasons. For example, I don't allocate---just use std::array "on the stack".
That's great that they are faster, but what are the precision guarantees for the following operations: add, sub, mul, div, sqrt? Andrii Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

My basic operations (add, sub, mul, div) are on-par or faster
for a variety of reasons. For example, I don't allocate---just use std::array "on the stack".
That's great that they are faster, but what are the precision guarantees for the following operations: add, sub, mul, div, sqrt? Andrii
Andrii, In our work, we support three floating-point back-ends:
* boost::multiprecision::cpp_dec_float<unsigned Digits10>
* boost::multiprecision::gmp_float_backend<unsigned Digits10>
* boost::multiprecision::mpfr_float_backend<unsigned Digits10>
The cpp_dec_float back-end is entirely under the BPL, whereas the other two are restricted by the GPL. The floating-point infrastructure also supports *any* floating-point back-end, provided it adheres to the requirements of the interface. Each of our back-ends is designed to maintain precision for add, sub, mul, div, sqrt. After all, these are precision-maintaining algorithms. If, however, you compute a divergent series or, maybe, a numerical derivative in your own development, then you might lose precision if the algorithm is ill-conditioned. This is just like with ordinary float, double, etc. Actually, I am re-benchmarking this stuff. I would say the speeds of all supported back-ends are comparable from 30-75 decimal digits on a 64-bit machine, whereas above 100 decimal digits, the GNU-based back-ends are faster. So I had better be more precise with performance comments in the future. John has provided benchmark results in his documentation. Best regards, Chris.

Paul, we need a few examples. My generic programming
examples are pretty high up on the totem pole.
Maybe, but if folks can't cope with the "area of a circle example", then they're probably beyond hope ;-) John.
Hey man, don't underestimate the area_of_a_circle()! That thing has layers... ;-) Best regards, Chris.
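For readers who haven't seen it, the "area of a circle" example the two are joking about is the canonical generic-programming demo: one template that serves builtin types and (by substituting the type parameter) multiprecision types alike. A hedged sketch, shown here with double since it illustrates the pattern rather than the library's shipped example:

```cpp
#include <cmath>

// One function template for float, double, long double, or a hypothetical
// mp_number: unqualified atan() call lets ADL pick up a user-defined
// overload, while `using std::atan` covers the builtin types.
template <class T>
T area_of_a_circle(const T& r) {
    using std::atan;
    const T pi = 4 * atan(T(1)); // pi computed at the working precision
    return pi * r * r;
}
```

The "layers" are real: the using-declaration plus ADL dance is exactly what trips up novices, which is why an example file spelling it out earns its keep.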

Each of our back-ends is designed to maintain precision for add, sub, mul, div, sqrt.
Excellent, that's exactly what I wanted to know. So I would expect the relative error for those operations to be equal to 1 machine epsilon.
From a brief view of your implementation, it seems that there is no way to get access to the internal data, exp, neg, fpclass members. Am I right? Could you expose those as read-only through const accessor methods? I need this functionality to be able to compare two floating point values within a specified ULP range (units in the last place).
Thanks, Andrii

Each of our back-ends is designed to maintain precision for add, sub, mul,
div, sqrt.
Excellent, that's exactly what I wanted to know. So I would expect the relative error for those operations to be equal to 1 machine epsilon.
From a brief view of your implementation, it seems that there is no way to get access to the internal data, exp, neg, fpclass members. Am I right? Could you expose those as read-only through const accessor methods? I need this functionality to be able to compare two floating point values within a specified ULP range (units in the last place).
Thanks, Andrii
Hi Andrii, I believe the relative error in our basic operations is *less than* or equal to one ULP. However, in portable C++, we cannot hack the floating-point type---never. One of our back-ends uses base-10. The other two use intriguing combinations of base-2. Hacks of the internal details of these representations would not be portable. Consider ldexp() and frexp(). With our strategy, the user is not granted access to the internal details of the floating-point type. The facility std::numeric_limits<T> has a member called epsilon(), which returns the difference between 1 and the next representable number above 1 in a given floating-point system. To test for relative (in)equality within one ULP, you take the ratio of two quantities, subtract this ratio from 1, and then take the absolute value of the result to obtain a basis for comparison with 1 ULP. This is the normal way for any floating-point type. It is a lot to take in at first. But no portable code can rely on the internal representation of the floating-point type. You can only use library functions, limits and/or comparison. I hope this helps. Best regards, Chris.
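Chris's recipe translates directly into a small generic helper. A hedged sketch (shown on double; a multiprecision type with a numeric_limits specialization would slot into the same template):

```cpp
#include <cmath>
#include <limits>

// Ratio-based relative comparison, as described: take a/b, subtract from 1,
// compare the absolute value against epsilon (~1 ULP near 1).
// Precondition for this sketch: b != 0.
template <class T>
bool close_within_one_eps(const T& a, const T& b) {
    using std::fabs; // ADL for user-defined types, std:: for builtins
    const T eps = std::numeric_limits<T>::epsilon();
    return fabs(a / b - 1) <= eps;
}
```

Andrii's later objection applies: the division makes this slower than comparing integer representations, but it is the only portable formulation.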

From a brief view on your implementation it seems that there is no way to get access to the internal data, exp, neg, fpclass members. Am I right? Could you expose those as readonly through a const accessor methods? I need this functionality to be able to compare two floating point values within specified ulp range (units in the last place).
We can never expose internal details as such.... since they're details and change from backend to backend. But: exponent - available via frexp, cheap for binary types, expensive for decimal types. neg - Use mp_number::sign(). Comparison with a literal 0 is cheap too (optimized in terms of sign()). fpclass - use boost::math::fpclassify - it's as cheap as it's going to get for that operation. In other words, use the same operations you would as with a builtin floating point type. HTH, John.
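John's point is that the portable accessors already exist; the same calls work on a builtin floating-point type. A hedged illustration on double (for mp_number one would substitute boost::math::fpclassify and mp_number::sign(), which this sketch merely stands in for):

```cpp
#include <cmath>

// Exponent without touching internals: frexp is cheap for binary types,
// more expensive for decimal ones, as John notes.
int exponent_of(double x) {
    int e;
    std::frexp(x, &e);
    return e;
}

// Stand-in for mp_number::sign(): -1, 0, or +1.
int sign_of(double x) {
    return (x > 0) - (x < 0);
}

// fpclass via the standard classifier rather than a data member.
bool is_finite(double x) {
    int c = std::fpclassify(x);
    return c != FP_INFINITE && c != FP_NAN;
}
```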

I believe the relative error in our basic operations is *less than* or equal to one ULP. However, in portable C++, we cannot hack the floating-point type---never. One of our back-ends uses base-10. The other two use intriguing combinations of base-2. Hacks of the internal details of these representations would not be portable.
I guess my statement was too ambiguous. I am not asking for this functionality through the generic mp_number interface, but via the backend interface of cpp_dec_float. To test for relative (in)-equality within one ULP, you need to take the
ratio of two quantities, subtract this ratio from 1 and subsequently take the absolute value of the result to obtain a basis for comparison with 1 ULP.
The approach you mentioned requires division and would be quite slow. It is a lot faster to compare internal integer representations. This is the normal way for any floating-point type. It is a lot to take at
first. But no portable code can rely on the internal representation of the floating-point type. You can only use library functions, limits and/or comparison.
OK, that's another way to go. It would be great to have ULP comparison functionality as part of the library. Particularly I am interested in the following prototype: bool ulp_equal(const mp_number<fpt_type> &a, const mp_number<fpt_type> &b, unsigned int ulps); Thanks, Andrii
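Such a function never made it into this thread, but the requested prototype can be sketched portably with frexp/ldexp, i.e. without the internal-representation access John ruled out. A hedged sketch on double (mp_number would substitute its own frexp, ldexp and epsilon):

```cpp
#include <cmath>
#include <limits>

// Hypothetical ulp_equal(): true when a and b differ by at most `ulps`
// units in the last place of the larger operand. Illustrative only.
bool ulp_equal(double a, double b, unsigned ulps) {
    if (a == b) return true;
    int ea, eb;
    std::frexp(a, &ea);
    std::frexp(b, &eb);
    // One ULP at exponent e is epsilon scaled by 2^(e-1).
    int e = (ea > eb) ? ea : eb;
    double tol =
        ulps * std::ldexp(std::numeric_limits<double>::epsilon(), e - 1);
    return std::fabs(a - b) <= tol;
}
```

This still costs a couple of frexp calls rather than a division, splitting the difference between Andrii's speed concern and Chris's portability constraint.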

Maybe. My understanding is that FFT doesn't become viable until the digit count grows truly huge? Would the Karatsuba algorithm be viable sooner and help more users? My gut feeling is that the current implementation is perfectly fine up to a few hundred (and maybe a thousand?) decimal places, which will probably take care of most use cases.... well my use cases anyway!
In my previous unpublished work (mp_cpp, also using base-10^8), I tuned to the following:
static const INT32 mp_elem_karatsuba_min = static_cast<INT32>(40);  // Approx. 280 digits.
static const INT32 mp_elem_fft_min       = static_cast<INT32>(129); // Approx. 900 digits.
I've got a recursive Karatsuba template available in my catalog already. Maybe we should try it out.
For sure!
The FFT mentioned above was FFTW (a sadly non-BPL wonder of mankind) so a vanilla FFT would be, maybe, 2 or 3 times slower.
Unless someone wants to implement the same optimisations over again as a template library? Way beyond our scope to do that though...
So maybe examples are a higher priority for now?
Yes, you are right. I just don't want to get into trouble with the community. Everyone wants to compute a million digits of pi, and they might get mad at us if boost can't do it.
Million digits of PI? Look 'em up I say :-)
We could also go through the current code looking for optimisation opportunities - for example would the inversion code be more efficient using Halley rather than Newton iteration (see for example http://numbers.computation.free.fr/Constants/Algorithms/inverse.html)?
Interesting...
Lots of good stuff explained on that site, including binary splitting.
Also multiplication and division by integer (relatively common operations?) as special cases.
Fortunately, we already have this. The routines mul_unsigned_long_long() and div_unsigned_long_long() utilize O(N) linear mul/div by integer. And we support them in your interface functions eval_multiply() and eval_divide().
My mistake, excellent. Cheers, John.

<snip>
I've got a recursive Karatsuba template available in my catalog already.
Maybe we should try it out.
For sure!
I will put the Karatsuba into my local copy and take it to the test over the holidays. I will report on its performance as the data become available. <snip>
So maybe examples are a higher priority for now?
I added a brief PDF report on generic numeric programming and included three sample files in the SVN repo today. https://svn.boost.org/svn/boost/sandbox/big_number/libs/multiprecision/examp... I can't write quickbook so I wrote LaTeX. I can't add *.tex to SVN (my bad on some kind of mime or whatever). I added the LaTeX text to this e-mail. If you approve of these examples, may I ask you to translate my doc to yours? In my opinion, Boost.Multiprecision is a wonderful creation from John and Paul and my preliminary contributions. There is nothing like it in compiled code for MP numerical analysis. I have a lot more ideas for this thing than time. My dream list is *way* bigger than the time we have right now:
* 1,000 digits high performance
* Transition to a competitive base-2 back-end
* Up to 1,000,000 digits for the hunters
* PSLQ algorithm for experimental mathematics
* Seamless interface to Python and Mathematica (in my original research paper)
*But we need to go to review with what we got soon.* <snip> Best regards, Chris. P.S. I also have a potential solution to my excessive guard digits. I can show it to you with the Karatsuba stuff and, based on your approval, commit it.

I added a brief PDF report on generic numeric programming and included three sample files in the SVN repo today. https://svn.boost.org/svn/boost/sandbox/big_number/libs/multiprecision/examp...
I can't write quickbook so I wrote LaTeX. I can't add *.tex to SVN (my bad on some kind of mime or whatever). I added the LaTeX text to this e-mail.
If you approve of these examples, may I ask you to translate my doc to yours?
Wilco.
In my opinion, Boost.Multiprecision is a wonderful creation from John and Paul and my preliminary contributions. There is nothing like it in compiled code for MP numerical analysis. I have a lot more ideas for this thing than time.
My dream list is *way* bigger than the time we have right now:
* 1,000 digits high performance
* Transition to a competitive base-2 back-end
* Up to 1,000,000 digits for the hunters
* PSLQ algorithm for experimental mathematics
* Seamless interface to Python and Mathematica (in my original research paper)
*But we need to go to review with what we got soon.*
<snip>
Best regards, Chris.
P.S. I also have a potential solution to my excessive guard digits. I can show it to you with the Karatsuba stuff and, based on your approval, commit it.
Great! Thanks, John.

If you approve of these examples, may I ask you to translate my doc to yours?
Now done, can you check that it all looks OK: http://svn.boost.org/svn/boost/sandbox/big_number/libs/multiprecision/doc/ht... I extended the derivative example slightly to include a partial derivative of the incomplete gamma - I assume I'm not doing anything "stupid" in doing that? Thanks for those, John.

If you approve of these examples, may I ask you to translate my doc
to yours?
Now done, can you check that it all looks OK: http://svn.boost.org/svn/boost/sandbox/big_number/libs/multiprecision/doc/ht... I extended the derivative example slightly to include a partial derivative of the incomplete gamma - I assume I'm not doing anything "stupid" in doing that? Thanks for those, John.
Thank you John, It's looking good! Would you like more examples of Floating-point stuff? I have a really nice one for numerical root finding in generic math. Maybe we should stop where we are, as it's a pretty good level, probably with enough examples for this stage in the game. Your opinion? Best regards, Chris.

It's looking good! Would you like more examples of Floating-point stuff? I have a really nice one for numerical root finding in generic math.
Root finding is good for an example, but I'm slightly wary because we have some nice boilerplate routines in Boost.Math for that which handle all the corner cases etc, and I'd rather folks called those routines than cut and paste from a - necessarily simplified - example?
Maybe we should stop where we are, as it's a pretty good level, probably with enough examples for this stage in the game.
We probably have enough to be going on with: I might just add a polynomial evaluation example because it shows off the expression templates ability to eliminate temporaries very nicely! :-) Cheers, John.
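The polynomial example John has in mind presumably centers on Horner's scheme, where an expression-template frontend can fuse the multiply-add chain without materializing temporaries. A hedged sketch of the core (plain C++, not the example John later wrote):

```cpp
#include <cstddef>

// Horner evaluation of c[0] + c[1]*x + ... + c[n-1]*x^(n-1).
// With an expression-template number type, `result * x + c[i]` can be
// evaluated without allocating a temporary for the product.
template <class T>
T horner(const T* c, std::size_t n, const T& x) {
    T result = c[n - 1];
    for (std::size_t i = n - 1; i-- > 0; )
        result = result * x + c[i];
    return result;
}
```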

I have a really nice one for numerical root finding in generic math.
Root finding is good for an example, but I'm slightly wary because we have some nice boilerplate routines in Boost.Math for that which handle all the corner cases etc, and I'd rather folks called those routines than cut and paste from a - necessarily simplified - example?
Right!
Maybe we should stop where we are, as it's a pretty good level, probably with enough examples for this stage in the game.
Yeah. Let's stop (Almost! See below).
We probably have enough to be going on with: I might just add a polynomial evaluation example because it shows off the expression templates ability to eliminate temporaries very nicely! :-)
Cheers, John.
I was playing around with polynomial expansions for my microcontroller book. I just extended one to multiprecision. Very cool! In the attachment is an order-63 polynomial expansion to sin(x pi/2) for -1 <= x <= 1. It's good to go for better than 64 decimal digits of precision.

<snip>
I've got a recursive Karatsuba template available in my catalog already.
Maybe we should try it out.
For sure!
I will put the Karatsuba into my local copy and take it to the test over the holidays. I will report on its performance as the data become available.
Hi John, Long e-mail here. Lots of information, and I need your inputs. My attempts with recursive Karatsuba in base-10 were a disaster. The expected benefit was not achieved---all the way up to 100,000 decimal digits. I experimented with FFTs. We need your inputs if we *really* want to make very high precision available at this stage. The community might expect it. But I really don't know if you would like to introduce a potentially poorly tested FFT-based multiplication scheme at this late stage in the game, or if you would prefer to do it in a more controlled and better tested fashion after review, potential acceptance and establishment. A *dumb* FFT was slow, not out-performing O(N^2) until over 50,000 decimal digits. A merely *naive* FFT of Danielson-Lanczos type was better, out-performing O(N^2) at around 10,000 decimal digits. It's low-impact, clean code, maybe about 30 lines. Still, the naive FFT gives a slow multiplication. Performing 100 individual 100,000 digit square-root calculations with the naive FFT requires 12 seconds on my machine. State-of-the-art MPFR does the same test square-root in about 0.8 sec. This puts us off the best by a factor of 15. My goal with base-10, no Karatsuba and no Toom-Cook, would be, say, 3-4 times worse. Still, the naive FFT is a good start---not embarrassing; rather, it shows the potential and paves the way for improvement over the years. My program crashed for a million digit square-root calculation. I could potentially make a much faster FFT (factor of 3 or 4) in fall or winter. The square-root algorithm required some slight modifications to converge properly for high digit counts. Other Newton iterations may need similar tuning if we decide to make high precision available. Plus, ultra-high precision needs other algorithms for atan, exp, etc., as Taylor series lose effectiveness. We would need to build these up over years if you would like to go that way. <snip>
So maybe examples are a higher priority for now?
We have some examples now. I am awaiting your input if you think I should contribute more floating-point examples.
P.S. I also have a potential solution to my excessive guard digits.
I resolved the excessive guard digit issue and reduced the guard digits to the necessary amount (and provided rationale in a code comment). It may require re-running the test cases because tiny tolerances might shift. Not expected, but maybe. With your permission, I can check it in on Tuesday.

See previous long e-mail. <snip>
I experimented with FFTs.
<snip>
My program crashed for a million digit square-root calculation.
It was stack overflow, as expected.
* 10^6 digits10, naive FFT, base-10: time = 3.0 s
* 10^6 digits10, MPFR: time = 0.23 s
Lots of information here. But in short, big digit counts require an allocator, as in your other back-end classes. Duh! Best regards, Chris.

Long e-mail here. Lots of information, and I need your inputs.
My attempts with recursive Karatsuba in base-10 were a disaster. The expected benefit was not achieved---all the way up to 100,000 decimal digits.
:-(
I experimented with FFTs. We need your inputs if we *really* want to make very high-precision available at this stage. The community might expect it. But I really don't know if you would like to introduce a potentially poorly tested FFT-based multiplication scheme at this late stage in the game or if you would prefer to do it in a more controlled and better tested fashion after review, potential acceptance and establishment.
A *dumb* FFT was slow, not out-performing O(N^2) until over 50,000 decimal digits. A merely *naive* FFT of Danielson-Lanczos type was better, out-performing O(N^2) at around 10,000 decimal digits. It's low-impact, clean code, maybe about 30 lines of code.
Nice, had no idea it could be implemented in such a small space!
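The ~30-line figure is plausible: a textbook recursive Danielson-Lanczos-style FFT really is that compact. A hedged sketch of the idea (illustrative, not Chris's code; base-10 digit arrays rather than base-10^8 limbs, and a recursive rather than in-place transform):

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

typedef std::complex<double> cd;

// Recursive radix-2 FFT; `invert` selects the inverse transform.
// a.size() must be a power of two.
void fft(std::vector<cd>& a, bool invert) {
    const std::size_t n = a.size();
    if (n == 1) return;
    std::vector<cd> even(n / 2), odd(n / 2);
    for (std::size_t i = 0; i < n / 2; ++i) {
        even[i] = a[2 * i];
        odd[i]  = a[2 * i + 1];
    }
    fft(even, invert);
    fft(odd, invert);
    const double ang = 2 * std::acos(-1.0) / n * (invert ? -1 : 1);
    for (std::size_t i = 0; i < n / 2; ++i) {
        cd w = std::polar(1.0, ang * i);
        a[i]         = even[i] + w * odd[i];
        a[i + n / 2] = even[i] - w * odd[i];
        if (invert) { a[i] /= 2; a[i + n / 2] /= 2; }
    }
}

// Multiply two little-endian base-10 digit arrays via FFT convolution.
std::vector<int> fft_multiply(const std::vector<int>& x,
                              const std::vector<int>& y) {
    std::vector<cd> fa(x.begin(), x.end()), fb(y.begin(), y.end());
    std::size_t n = 1;
    while (n < x.size() + y.size()) n <<= 1;
    fa.resize(n);
    fb.resize(n);
    fft(fa, false);
    fft(fb, false);
    for (std::size_t i = 0; i < n; ++i) fa[i] *= fb[i];
    fft(fa, true);
    std::vector<int> res(n);
    long long carry = 0;
    for (std::size_t i = 0; i < n; ++i) {
        long long t = std::llround(fa[i].real()) + carry;
        res[i] = int(t % 10);
        carry  = t / 10;
    }
    return res; // may contain trailing zero digits
}
```

The recursion here is also why Chris's million-digit run blew the stack; an iterative in-place transform (and an allocator-backed back-end) removes that limit.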
Still, the naive FFT gives a slow multiplication. Performing 100 individual 100,000 digit square-root calculations with the naive FFT requires 12 seconds on my machine. State-of-the-art MPFR does the same test square-root in about 0.8 sec. This puts us off the best by a factor of 15. My goal with base-10, no Karatsuba and no Toom-Cook, would be, say, 3-4 times worse. Still, the naive FFT is a good start---not embarrassing; rather, it shows the potential and paves the way for improvement over the years.
My program crashed for a million digit square-root calculation.
I could potentially make a much faster FFT (factor 3 or 4) in fall or winter.
The square-root algorithm required some slight modifications to converge properly for high digit counts. Other Newton iterations may need similar tuning if we decide to make high precision available. Plus, ultra-high precision needs other algorithms for atan, exp, etc., as Taylor series lose effectiveness. We would need to build these up over years if you would like to go that way.
Sigh.... yes, I forgot about those, so it sounds like for now at least these more advanced algorithms aren't going to help much - at least not for cpp_dec_float's target audience - those wanting higher precision than long double, but not crazy digit counts? So my gut feeling is to hold off for now, and maybe do things properly later (dynamic allocation, better std lib support etc)?
So maybe examples are a higher priority for now?
We have some examples now. I am awaiting your input if you think I should contribute more floating-point examples.
I think the existing ones work out quite nicely now thanks, I could use some ideas for integer and maybe rational number examples, but that's a whole other ball game ;-)
P.S. I also have a potential solution to my excessive guard digits.
I resolved the excessive guard digit issue and reduced the guard digits to the necessary amount (and provided rationale in a code comment).
It may require re-running the test cases because tiny tolerances might shift. Not expected, but maybe.
With your permission, I can check it in on Tuesday.
Sure, or mail me a patch and I'll run the full tests (or you can - just cd into libs/multiprecision/test and do a "bjam --enable-specfun" and then wait for a long time!) Thanks, John.

Folks,
I'd like to ask for a formal review of the Multiprecision Arithmetic Library, currently in the sandbox.
When you do something like "get_constant_pi()", I see you are "warm-caching" the result from the first computation. Great. Just to double-check... I see the static bool flag, right? It's warm-cached, right? Yes. OK. Please confirm: Would you like to "seed" common constants such as pi, e, sqrt2, ln2, etc. with a known string representation of, say, a few hundred digits? Or is reading the string slower than computing the value? Best regards, Chris.

What is the result type of operator+ on this upper bound range? Currently the compiler fails.
The following code compiles for me:

#include <iostream>

template <int N>
struct big_int {
    big_int() { val = N; }
    template <int M>
    big_int<(N > M ? N : M)> operator+(const big_int<M>& that) const {
        return big_int<(N > M ? N : M)>();
    }
    int val;
};

int main() {
    big_int<1> a;
    big_int<2> b;
    std::cout << (a + b).val << std::endl;
    std::cout << (b + a).val << std::endl;
}
I hope this answers to your argument against arithmetic operators growing automatically the range of the result.
And do you think this is worth the effort? Also I wouldn't like the idea of fixed_point<range,0> structures. It is going to be too complicated from the user perspective. All I need from a fixed integer library is an efficient fixed integer type (behaving like C++ integral types) and no complex structures to compute addition, multiplication, etc. Andrii

Le 04/04/12 00:49, Andrii Sydorchuk a écrit :
What is the result type of operator+ on this upper bound range? Currently the compiler fails.
Following code compiles for me:
template <int N>
struct big_int {
    big_int() { val = N; }
    template <int M>
    big_int<(N > M ? N : M)> operator+(const big_int<M>& that) const {
        return big_int<(N > M ? N : M)>();
    }
    int val;
};

int main() {
    big_int<1> a;
    big_int<2> b;
    std::cout << (a + b).val << std::endl;
    std::cout << (b + a).val << std::endl;
}
My comment was with respect to my library ;-)
I hope this answers to your argument against arithmetic operators growing automatically the range of the result.
And do you think this is worth the effort?
Yes. At least for my needs.
Also I wouldn't like the idea of fixed_point<range,0> structures. It is going to be too complicated from the user perspective. All I need from a fixed integer library is an efficient fixed integer type (behaving like C++ integral types) and no complex structures to compute addition, multiplication, etc.
I've not yet defined a specific class for fixed point integers. Note that the resolution parameter defaults to 0, so that fixed_point<range> defines an integer. I'll include it once the generic class is stable enough. Best, Vicente

I'd like to ask for a formal review of the Multiprecision Arithmetic Library, currently in the sandbox.
When you do something like "get_constant_pi()", I see you are "warm-caching" the result from the first computation. Great.
Just to double-check... I see the static bool flag, right? It's warm-cached, right? Yes. OK. Please confirm:
The constants are initialized prior to the start of main - so they're thread safe as long as only one thread executes prior to main.
Would you like to "seed" common constants such as pi, e, sqrt2, ln2, etc. with a known string representation of, say, a few hundred digits? Or is reading the string slower than computing the value?
Ah, good point, oversight on my part. I've just committed an update that uses your 1100 digit values to initialize when possible. John.

When you do something like "get_constant_pi()", I see you
are "warm-caching" the result from the first computation. Great.
Just to double-check... I see the static bool flag, right? It's warm-cached, right? Yes. OK. Please confirm:
The constants are initialized prior to the start of main - so they're thread safe as long as only one thread executes prior to main.
I have struggled with this issue before. Maybe you can help me out. The bool "is_initialized" flags are local statics in template subroutines. A subroutine-local static will be initialized once, but not prior to main(); rather, the first time the subroutine is executed, i.e., the first time it's called by a user. Is this even right, or are templates somehow special? So if thread1 starts initializing the warm-cached representation of pi and thread2 of higher priority interrupts this halfway through, then thread2 will complete the initialization fully and return. Thread1 subsequently continues its initialization of the, well, already initialized representation of pi. I think it's harmless, though highly confusing. But when push comes to shove, I think I have historically made a subtle mistake in this matter. The is_initialized flag (which is actually a kind of manual "guard variable") should be set to true *after* initializing the representation of pi. Am I just plain wrong here, or is it really this confusing/subtle, etc.? I don't want to introduce <atomic> or <mutex>. Thanks, Chris.
Would you like to "seed" common constants such as pi, e, sqrt2, ln2, etc. with a known string representation of, say, a few hundred digits? Or is reading the string slower than computing the value?
Ah, good point, oversight on my part. I've just committed an update that uses your 1100 digit values to initialize when possible. John.
Excellent! Thanks.

The constants are initialized prior to the start of main - so they're thread safe as long as only one thread executes prior to main.
I have struggled with this issue before. Maybe you can help me out.
<snip>
The is_initialized flag (which is actually a kind of manual "guard variable") should be set to true *after* initializing the representation of pi.
Ahhhh! But you got it right. Excellent. You initialize "b" after initializing the representation of pi, ln2, etc. I think that's looking OK for threads. Sorry for the chatter... Best regards, Chris.

I have struggled with this issue before. Maybe you can help me out.
<snip>
The is_initialized flag (which is actually a kind of manual "guard variable") should be set to true *after* initializing the representation of pi.
Ahhhh! But you got it right. Excellent. You initialize "b" after initializing the representation of pi, ln2, etc.
I think that's looking OK for threads.
Actually no, that code *on its own* is most definitely not thread safe. Anywhere you have a function-scope: static const UDT my_object; where UDT is not a POD, the initialization of my_object is subject to race conditions (in C++03 anyway; C++11 and recent GCC versions get this right, I believe). What's more, if you try to fix the race condition using the "double-checked locking idiom", it's still not thread safe; you actually do have to grab a mutex with every call to the function :-( However, there's a trick I stole from Boost.Pool employed here, in the: constant_initializer<T, &get_constant_e<T> >::do_nothing(); Which ultimately calls the global object constant_initializer<>::initializer::do_nothing() - purely to make sure that that object is instantiated. And once it is instantiated, because it's a global object it gets constructed before main(), and its constructor makes a call to get_constant_e<T>, thus initializing our constant, also before main is called. Tricky, and convoluted, but effective! HTH, John.
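The Boost.Pool-style trick John describes can be sketched in a few lines. A hedged reconstruction (names and the exp-based getter are illustrative, not the library's actual code): a static member of a class template whose constructor calls the getter, so merely referencing that member forces the constant to be computed during pre-main dynamic initialization.

```cpp
#include <cmath>

// A cached-constant getter: the function-local static is filled on the
// first call (the very call the trick arranges to happen before main()).
template <class T>
const T& get_constant_e() {
    static const T e = T(std::exp(1.0));
    return e;
}

template <class T, const T& (*F)()>
struct constant_initializer {
    struct initializer {
        initializer() { F(); }        // touch the constant
        void do_nothing() const {}    // exists only to force instantiation
    };
    static const initializer init;    // a global object per instantiation
};

template <class T, const T& (*F)()>
const typename constant_initializer<T, F>::initializer
    constant_initializer<T, F>::init;

double e_value() {
    // Referencing init ensures the global object is instantiated; its
    // constructor then ran before main(), filling the cache race-free
    // (as long as only one thread runs before main).
    constant_initializer<double, &get_constant_e<double> >::init.do_nothing();
    return get_constant_e<double>();
}
```

Note the caveat from earlier in the thread still applies: this is only safe when a single thread executes prior to main().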

I think that's looking OK for threads.
Actually no, that code *on its own* is most definitely not thread safe. Anywhere you have a function-scope: static const UDT my_object; where UDT is not a POD, the initialization of my_object is subject to race conditions (in C++03 anyway; C++11 and recent GCC versions get this right, I believe). What's more, if you try to fix the race condition using the "double-checked locking idiom", it's still not thread safe; you actually do have to grab a mutex with every call to the function :-(
However, there's a trick I stole from Boost.Pool employed here, in the:
constant_initializer<T, &get_constant_e<T> >::do_nothing();
<snip> Thanks, John. That constant_initializer sure is a nice programming element. So, let me get this straight. The instantiation of a template such as get_constant_ln2() causes the compiler to notice that it needs to instantiate one (and only one) instantiation of template "long_type"::init in the first translation unit in which it is encountered. C++ has a one-time deal for template instantiation of non-local class-type statics, and the thing ends up being initialized once with ctor code in the pre-main() initialization. Is that what you are talking about? OK. But finally, what if we bombard get_constant_ln2() with an MP type of run-time dynamic precision, first with 100 decimal digits, then with 200 decimal digits? Who tells get_constant_ln2() that it needs to re-calculate ln2? Or is this just a matter of documentation, saying, in effect, "don't do this!". Best regards, Chris.

So, let me get this straight. The instantiation of a template such as get_constant_ln2() causes the compiler to notice that it needs to instantiate one (and only one) instantiation of template "long_type"::init in the first translation unit in which it is encountered. C++ has a one-time deal for template instantiation of non-local class-type statics, and the thing ends up being initialized once with ctor code in the pre-main() initialization.
Is that what you are talking about?
I think so ;-)
OK. But finally, what if we bombard get_constant_ln2() with an MP type of run-time dynamic precision, first with 100 decimal digits, then with 200 decimal digits? Who tells get_constant_ln2() that it needs to re-calculate ln2? Or is this just a matter of documentation, saying, in effect, "don't do this!".
For cpp_dec_float they're all different types, so cpp_dec_float<50> instantiates a different set of templates (and a different value/cache for ln2) to cpp_dec_float<100>.

mpf_float<0> is the only type we have that has variable precision at runtime, and the std lib routines that call these constants aren't supported in that use case (which should be documented somewhere - I'll check). And they're not supported largely because of the caching mechanism. You should get a compiler error if you try and use that type with those templates, but I'll double check that.

John.

PS Although it's possible to support variable runtime precision, it complicates things enough not to support it unless folks get really worked up about it ;-)
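The per-type caching being described can be sketched like this (illustrative code only, not the library's own; cpp_dec_float_like and get_constant_ln2 here are stand-in names): because each digit count is a distinct template argument, each instantiation owns its own function-local static, so a 50-digit cache can never be handed out where a 100-digit value is expected.

```cpp
#include <cassert>

// Stand-in for a fixed-precision multiprecision type: each Digits value
// is a completely distinct C++ type.
template <unsigned Digits>
struct cpp_dec_float_like
{
   double value;  // placeholder for the real multi-limb representation
};

// One cached constant *per instantiation*: get_constant_ln2<50>() and
// get_constant_ln2<100>() each have their own static, initialized once.
template <unsigned Digits>
const cpp_dec_float_like<Digits>& get_constant_ln2()
{
   static const cpp_dec_float_like<Digits> ln2 = { 0.6931471805599453 };
   return ln2;
}
```

A runtime-variable-precision type (like mpf_float<0>) breaks this scheme precisely because all precisions share one type, and hence one cache slot, which is why the constants machinery rejects it rather than silently returning a stale low-precision value.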

<snip>
mpf_float<0> is the only type we have that has variable precision at runtime, and the std lib routines that call these constants aren't supported in that use case (which should be documented somewhere - I'll check). And they're not supported largely because of the caching mechanism. You should get a compiler error if you try and use that type with those templates, but I'll double check that. John.
Got it!
PS Although it's possible to support variable runtime precision, it complicates things enough not to support it unless folks get really worked up about it ;-)
I categorically reject dynamic digit modification but do use precision control in limited instances such as Newton iteration. But you're right. It's a can of worms. Best regards, Chris.

<snip>
Ah, good point, oversight on my part. I've just committed an update that uses your 1100 digit values to initialize when possible. John.
Hi John,

This caused a weird repercussion. When we define constants with hundreds or thousands of digits of precision in Boost.Multiprecision, this is in conflict with boost::math::constants, which has only tens or maybe a hundred digits. I instantiated an mp_number<cpp_dec_float> from the value of boost::math::constants::e() today and got only maybe 100 decimal digits of precision.

This is a very subtle, hard-to-catch detail. It's easy to see in the source, but confusing when interacting with Boost.Multiprecision and Boost.Math. I don't know where to patch this one. Unfortunately, we might need to add more digits to Boost.Math. This seems to be string-based and can potentially be synchronized with our new 1100 digit strings. Then we might want to consider a common string pool for Boost.Math and Boost.Multiprecision so as to avoid multiple upkeep of the strings.

Eventually, I will *hammer* Boost.Math with 1,000 digits. We will find issues.

Best regards, Chris.

This caused a weird repercussion. When we define constants with hundreds or thousands of digits of precision in Boost.Multiprecision, this is in conflict with boost::math::constants, which has only tens or maybe a hundred digits.
That's true of the last release, but has been fixed in SVN Trunk to include both many more constants (thanks to yourself and Paul), and effectively unlimited precision (by calculating on the fly when needed).
Eventually, I will *hammer* Boost.Math with 1,000 digits. We will find issues.
Bring it on ;-) Cheers, John.

Hi,

I am not sure whether this fits into the scope of the Multiprecision Library. In my graduation work I implemented a small library that did transformations on expression templates. The expression templates supported matrix operations, so you could write something like

result = A*b + C*d;

where A and C are matrices and b and d are vectors. If nothing is specified, the expression is evaluated at the assignment. But one could also write:

result[gmp<1024>] = A*b + C*d;
result[k_fold<6>] = A*b + C*d;

or even

result[accurate] = A*b + C*d;

gmp stands for the GMP backend, and k_fold for k-fold precision of the source type, implemented by k-fold accumulation of the rounding errors. "accurate" refers to an algorithm invented and published by S. M. Rump, T. Ogita et al. in a paper called "Accurate Floating-Point Summation". The accurate implementation performs the summation by repeatedly capturing only the values whose mantissas are relevant in the current exponent range, until no further values contribute. As a result, "normal" input performs nearly as fast as a recursive sum, while ill-conditioned input takes as many iterations as needed to reach a result correct in the source precision.

In my tests, k_fold<8> performed better than gmp, and accurate performed similarly to k_fold<4>. So in my opinion, if the range of 64- and 32-bit floating point is sufficient, tuning the calculations that are prone to ill-conditioned floating-point summation is usually better than using a library that does the floating-point calculations at a higher precision.

regards
Andreas Pokorny
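The compensated-summation idea Andreas refers to is built on the error-free TwoSum transformation; a minimal sketch of a 2-fold ("cascaded") sum in that family follows. This is an illustration of the general technique, assuming IEEE-754 doubles with round-to-nearest, and is not Andreas's or Rump/Ogita's exact implementation:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Knuth's TwoSum: s is the rounded sum of a and b, e is the *exact*
// rounding error, so a + b == s + e holds exactly in IEEE-754 arithmetic.
static void two_sum(double a, double b, double& s, double& e)
{
   s = a + b;
   double z = s - a;
   e = (a - (s - z)) + (b - z);
}

// A 2-fold compensated sum: accumulate the exact rounding errors in a
// side channel and fold them back in at the end.
double sum2(const std::vector<double>& v)
{
   double s = 0.0, c = 0.0;
   for (std::size_t i = 0; i < v.size(); ++i)
   {
      double e;
      two_sum(s, v[i], s, e);  // note: 'a' is taken by value, so this is safe
      c += e;                  // collect the lost low-order bits
   }
   return s + c;
}
```

On the ill-conditioned input {1e16, 1.0, -1e16} a naive left-to-right sum returns 0.0 (the 1.0 is absorbed), whereas sum2 recovers 1.0, which is the kind of behaviour that makes these algorithms competitive with switching to a higher-precision type.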
participants (14)
- Andreas Pokorny
- Andrii Sydorchuk
- Beman Dawes
- Christof Donat
- Christopher Kormanyos
- Daniel Trebbien
- Jeffrey Lee Hellrung, Jr.
- John Maddock
- Neal Becker
- Paul A. Bristow
- Ronald Garcia
- Thomas Heller
- Topher Cooper
- Vicente J. Botet Escriba