[xint] Boost.Move vs Copy-on-Write timings

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I've done some further testing, eliminating the Boost.Thread locking on the random number generator as a factor, as promised earlier today. Here are the raw timings: (Debug build) Baseline: 29.4, 36.0, 32.8 (32.73) GCC move: 32.0, 35.3, 31.2 (32.83) Emulated Move: 28.4, 33.8, 28.4 (30.20) Copy-on-write: 27.3, 25.0, 26.9 (26.40) (Release build) Baseline: 10.5, 12.5, 10.6 (11.20) GCC move: 11.3, 13.0, 10.4 (11.57) Emulated move: 10.0, 10.8, 9.7 (10.17) Copy-on-write: 8.5, 8.4, 9.4 ( 8.77) Times are measured in seconds, using two sets of ten 2048-bit integers. The tests consisted of adding each of those pairs of numbers together 10,000 times, multiplying them 10,000 times, and doing a "mulmod" operation on them (with a third randomly-generated 2048-bit modulus argument) 1,000 times. Each test was run three times, under as close to identical situations as I could manage, with the raw results (rounded to the nearest tenth of a second) in the first three columns; the fourth column is the average of the three. The compiler used was GCC 4.4.3, under Ubuntu Linux 10.04. The release build used -O2 optimization; the debug build used none. Otherwise, the settings were left at their defaults, except where noted below. The baseline configuration had neither copy-on-write nor any kind of move semantics enabled. The "emulated move" rows used the baseline configuration plus Boost.Move in C++03 emulation mode, with no copy-on-write; the "GCC move" rows indicate the exact same configuration, except that I used "-std=c++0x" as well (so presumably Boost.Move would switch to compiler-supplied move semantics). The copy-on-write rows used the baseline configuration, but enabled my copy-on-write system. Using the calculation... (baseline_time - time) / baseline_time ... copy-on-write provided a 19.3% and 21.7% speed increase over the baseline settings. Boost.Move emulation provided 7.7% and 9.2%. GCC-provided move semantics were, surprisingly, the worst performers, coming in at 0.3% and 3.3% *worse* than the baseline performance. I'm assuming that that code hasn't been optimized yet, and that it will eventually be a lot faster. Conclusions: * The copy-on-write system is faster than Boost.Move emulation by a respectable amount, and an even greater one than my earlier tests indicated. * The non-thread-safe version is *not* twice the speed of the thread-safe one, as I erroneously indicated before. Much of the difference must have come from the random number locking. It's still markedly faster, as it uses copy-on-write and the thread-safe version can only use Boost.Move, but only by about 12%. Am I missing any statistics? Does anyone want the test code to reproduce this and look for errors in my test setup? It's only a small modification from the code in the sandbox and vault, plus the test function. - -- Chad Nelson Oak Circle Software, Inc. * * * -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkvd5ZwACgkQp9x9jeZ9/wTnnQCfYt1MrUNFGnYKNJImYKuuryDf HfYAoIrXeBgsT25cCwI/oiIAShSyvyh0 =nZxq -----END PGP SIGNATURE-----

----- Original Message ----- From: "Chad Nelson" <chad.thecomfychair@gmail.com> To: <boost@lists.boost.org> Sent: Sunday, May 02, 2010 10:50 PM Subject: [boost] [xint] Boost.Move vs Copy-on-Write timings
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
I've done some further testing, eliminating the Boost.Thread locking on the random number generator as a factor, as promised earlier today. Here are the raw timings:
(Debug build) Baseline: 29.4, 36.0, 32.8 (32.73) GCC move: 32.0, 35.3, 31.2 (32.83) Emulated Move: 28.4, 33.8, 28.4 (30.20) Copy-on-write: 27.3, 25.0, 26.9 (26.40)
(Release build) Baseline: 10.5, 12.5, 10.6 (11.20) GCC move: 11.3, 13.0, 10.4 (11.57) Emulated move: 10.0, 10.8, 9.7 (10.17) Copy-on-write: 8.5, 8.4, 9.4 ( 8.77)
Times are measured in seconds, using two sets of ten 2048-bit integers. The tests consisted of adding each of those pairs of numbers together 10,000 times, multiplying them 10,000 times, and doing a "mulmod" operation on them (with a third randomly-generated 2048-bit modulus argument) 1,000 times. Each test was run three times, under as close to identical situations as I could manage, with the raw results (rounded to the nearest tenth of a second) in the first three columns; the fourth column is the average of the three.
Could you share the test programm so we can test with other compilers or architectures? Could you add some figures? * How many xint::integer copies? * How many xint::integer moves? * How many writes? 0?
The compiler used was GCC 4.4.3, under Ubuntu Linux 10.04. The release build used -O2 optimization; the debug build used none. Otherwise, the settings were left at their defaults, except where noted below.
Could you give figures with -O3? or bjam variant=release? Best, Vicente

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/02/2010 05:10 PM, vicente.botet wrote:
Could you share the test programm so we can test with other compilers or architectures?
Certainly. I've just uploaded the updated code to the sandbox, at <https://svn.boost.org/svn/boost/sandbox/xint>. See the new PERFORMANCE TESTING.txt file in the main directory for information on reproducing the tests. Let me know if you have any trouble with the setup. I don't use bjam for my own testing (too slow, I've got a Code::Blocks project that duplicates its functionality for my own local testing), so the jamfiles might need some additional tweaking.
Could you add some figures? * How many xint::integer copies? * How many xint::integer moves? * How many writes? 0?
It's not set up to report any of that yet. I'll see if I can get that into the code tonight; if not, it may be Tuesday before I can get to it.
The compiler used was GCC 4.4.3, under Ubuntu Linux 10.04. The release build used -O2 optimization; the debug build used none. Otherwise, the settings were left at their defaults, except where noted below.
Could you give figures with -O3? or bjam variant=release?
Probably not tonight, sorry. But once you've got the code, you should be able to test those easily enough. - -- Chad Nelson Oak Circle Software, Inc. * * * -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkveAlsACgkQp9x9jeZ9/wQwEwCguDqQlZrVdo3xsHW5u1T3fMB4 q+AAnA3Pr8KCPxYrJ3ZGaApbwnNJua6W =x9ww -----END PGP SIGNATURE-----

[I seem to have lost the original email with the timings comparing COW and move semantics, which was probably the proper email to reply to keep the thread-tree correct, but I'm hoping this is close enough...] On 05/02/2010 03:53 PM, Chad Nelson wrote:
On 05/02/2010 05:10 PM, vicente.botet wrote:
Could you share the test programm so we can test with other compilers or architectures?
Certainly. I've just uploaded the updated code to the sandbox, at <https://svn.boost.org/svn/boost/sandbox/xint>. See the new PERFORMANCE TESTING.txt file in the main directory for information on reproducing the tests. [...]
I've looked briefly through the code. I'm not entirely sure where all the "base" arithmetic functions are defined, but I do see that primitives.cpp contains at least some of them. It seems like the general template you're using is void function(base_integer& target, const base_integer& argument0, const base_integer& argument1, ...) { integer result; // ...populate result with the...well...result of the function... result._cleanup(); target._attach(result); } With only guesses as to the semantics of _cleanup and _attach (perhaps you can help in that regard), I suspect, in the absence of COW and regardless of movability of integer and base_integer, you'd be making spurious and unneeded copies of the result. Can you please explain the rationale for the intermediate "result" integer, when you have the "target" reference sitting right there from the get-go? Does this have to do with base_integer trying to account for both fixed-width and variable-width integers? - Jeff

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/04/2010 01:32 AM, Jeffrey Lee Hellrung, Jr. wrote:
I've looked briefly through the code. I'm not entirely sure where all the "base" arithmetic functions are defined, but I do see that primitives.cpp contains at least some of them.
They're in the files that most logically support them. For example, detail::gcd and detail::lcm are in gcd.cpp.
It seems like the general template you're using is
void function(base_integer& target, const base_integer& argument0, const base_integer& argument1, ...) { integer result; // ...populate result with the...well...result of the function... result._cleanup(); target._attach(result); }
With only guesses as to the semantics of _cleanup and _attach (perhaps you can help in that regard),
Certainly. _cleanup reduces the length by skipping any zeros that are now in the upper portion of the integer, ensures that zero values are not marked negative, and (for fixed_integer types) ensures that the topmost digit_t is limited to the proper number of bits by ANDing it with a mask value. It's a very fast function. You can find the definition in base_integer.cpp, if you want to confirm its efficiency. _attach attaches the result to the target by adjusting the pointers where possible, and copying if necessary. Copying is usually only necessary for fixed_integer types, and only in some of the detail functions.
I suspect, in the absence of COW and regardless of movability of integer and base_integer, you'd be making spurious and unneeded copies of the result.
I don't think so, but if you can show me otherwise, please do.
Can you please explain the rationale for the intermediate "result" integer, when you have the "target" reference sitting right there from the get-go?
'target' may refer to a fixed_integer that is not large enough to hold the entire result. For functions where this limitation might severely reduce the time it takes to come up with the answer (such as multiply), I've taken steps to limit the calculations, but for others (such as addition, where the result could be at most one extra digit_t) it is faster and cleaner to simply let the _attach function deal with truncating the result to the proper size.
Does this have to do with base_integer trying to account for both fixed-width and variable-width integers?
That's the primary reason for that design, yes. Since the detail functions all take base_integer references, they can be shared by all the integer types, and the _attach call at the end is often the only piece that needs to worry about the differences. - -- Chad Nelson Oak Circle Software, Inc. * * * -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkvfu3UACgkQp9x9jeZ9/wTclQCffHxg+943lbpu3Dta1C3sb0hu 1jwAoO5uHs3RUSfFWdI+zGkPSlBFunW1 =Bf4i -----END PGP SIGNATURE-----

On 5/3/2010 11:15 PM, Chad Nelson wrote:
On 05/04/2010 01:32 AM, Jeffrey Lee Hellrung, Jr. wrote:
I've looked briefly through the code. I'm not entirely sure where all the "base" arithmetic functions are defined, but I do see that primitives.cpp contains at least some of them.
They're in the files that most logically support them. For example, detail::gcd and detail::lcm are in gcd.cpp.
Actually I was speaking more toward base_integer::_add, base_integer::_subtract, etc., which I don't see defined in base_integer.hpp or base_integer.cpp...
It seems like the general template you're using is
void function(base_integer& target, const base_integer& argument0, const base_integer& argument1, ...) { integer result; // ...populate result with the...well...result of the function... result._cleanup(); target._attach(result); }
With only guesses as to the semantics of _cleanup and _attach (perhaps you can help in that regard), [...] _attach attaches the result to the target by adjusting the pointers where possible, and copying if necessary. Copying is usually only necessary for fixed_integer types, and only in some of the detail functions.
I suspect, in the absence of COW and regardless of movability of integer and base_integer, you'd be making spurious and unneeded copies of the result.
I don't think so, but if you can show me otherwise, please do.
Well, _attach does have some allocation function calls, but I really have no idea how the logic branches when COW is turned off (and hence whether those allocations are actually invoked). Perhaps you can convince me that those allocations are never called when COW is turned off...?
Can you please explain the rationale for the intermediate "result" integer, when you have the "target" reference sitting right there from the get-go?
'target' may refer to a fixed_integer that is not large enough to hold the entire result. For functions where this limitation might severely reduce the time it takes to come up with the answer (such as multiply), I've taken steps to limit the calculations, but for others (such as addition, where the result could be at most one extra digit_t) it is faster and cleaner to simply let the _attach function deal with truncating the result to the proper size.
I see. I feel like a better design would give more abstraction to the arithmetic algorithms. E.g., if I want to add 2 integers, the actual implementation of addition would take a couple of (pointer, length) pairs and a pointer for output (possibly with an optional maximal length), it being the callee's responsibility (of course) to ensure the output has enough space (up to the maximal length, if given). This might seem too C-ish, but it is the common denominator among all your integer types (they all boil down to points to an array of digits), and it would make your arithmetic algorithms usable for a statically-allocated integer type as well (e.g., one built on a boost::array< digit_t, n >). Carrying on this discussion might be more appropriate in a separate thread, however; one topic at a time. - Jeff

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/04/2010 02:38 AM, Jeffrey Lee Hellrung, Jr. wrote:
I've looked briefly through the code. I'm not entirely sure where all the "base" arithmetic functions are defined, but I do see that primitives.cpp contains at least some of them.
They're in the files that most logically support them. For example, detail::gcd and detail::lcm are in gcd.cpp.
Actually I was speaking more toward base_integer::_add, base_integer::_subtract, etc., which I don't see defined in base_integer.hpp or base_integer.cpp...
They're not defined there because they don't exist. :-) base_integer is mostly concerned with small shared member functions (like _log2 and _get_length) and memory handling, it doesn't have any mathematical operations code in it, other than _increment and _decrement. That's all handled by free functions in namespace detail, like detail::add (in primitives.cpp).
I suspect, in the absence of COW and regardless of movability of integer and base_integer, you'd be making spurious and unneeded copies of the result.
I don't think so, but if you can show me otherwise, please do.
Well, _attach does have some allocation function calls, but I really have no idea how the logic branches when COW is turned off (and hence whether those allocations are actually invoked). Perhaps you can convince me that those allocations are never called when COW is turned off...?
They aren't supposed to be, but tracing through the code, I don't see any way they won't be. :-( The problem is that the 'adopt' variable, at the top, isn't being set to true because the data that it's supposed to adopt never has a copy-count of zero... after getting distracted by other things, I never finished the move-emulation code that I wrote that for. Well, that explains at least part of why the copy-on-write code measures faster. I'll fix that tomorrow and re-measure everything.
Can you please explain the rationale for the intermediate "result" integer, when you have the "target" reference sitting right there from the get-go?
'target' may refer to a fixed_integer that is not large enough to hold the entire result. [...]
I see. I feel like a better design would give more abstraction to the arithmetic algorithms. E.g., if I want to add 2 integers, the actual implementation of addition would take a couple of (pointer, length) pairs and a pointer for output (possibly with an optional maximal length), it being the callee's responsibility (of course) to ensure the output has enough space (up to the maximal length, if given). This might seem too C-ish, but it is the common denominator among all your integer types (they all boil down to points to an array of digits), and it would make your arithmetic algorithms usable for a statically-allocated integer type as well (e.g., one built on a boost::array< digit_t, n >).
That might indeed be a better design. And if I do adopt the CRTP design, it might be the only way, since if I understand it correctly, there won't be a non-template base class that's shared by all of the integer types anymore.
Carrying on this discussion might be more appropriate in a separate thread, however; one topic at a time.
Feel free to start one if you like. I don't think there's any need though, I know exactly what you're describing. I'll finish this one by reporting the new timings, hopefully tomorrow. - -- Chad Nelson Oak Circle Software, Inc. * * * -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkvfy5EACgkQp9x9jeZ9/wSQ8gCfdhLP0LCjbZHj/6d01Ibh8aYC XC4An1mk+vtxc9trsjzZ4+Xl3KQIVPMR =I+FP -----END PGP SIGNATURE-----

Chad Nelson wrote:
That might indeed be a better design. And if I do adopt the CRTP design, it might be the only way, since if I understand it correctly, there won't be a non-template base class that's shared by all of the integer types anymore.
Using CRTP doesn't preclude a non-templated base class. It would be the base of the class template with its derivate as a parameterizing type. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com IMPORTANT: The information contained in this email and/or its attachments is confidential. If you are not the intended recipient, please notify the sender immediately by reply and immediately delete this message and all its attachments. Any review, use, reproduction, disclosure or dissemination of this message or any attachment by an unintended recipient is strictly prohibited. Neither this message nor any attachment is intended as or should be construed as an offer, solicitation or recommendation to buy or sell any security or other financial instrument. Neither the sender, his or her employer nor any of their respective affiliates makes any warranties as to the completeness or accuracy of any of the information contained herein or that this message or any of its attachments is free of viruses.

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/04/2010 07:22 AM, Stewart, Robert wrote:
That might indeed be a better design. And if I do adopt the CRTP design, it might be the only way, since if I understand it correctly, there won't be a non-template base class that's shared by all of the integer types anymore.
Using CRTP doesn't preclude a non-templated base class. It would be the base of the class template with its derivate as a parameterizing type.
If I understand it correctly, then yes, I could use a non-templated base class, but that base class wouldn't be usable the way that base_integer is used now, for the parameters to the functions that implement the math operations, so that those functions could be shared between all three types. Or am I missing something? - -- Chad Nelson Oak Circle Software, Inc. * * * -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkvgQhcACgkQp9x9jeZ9/wST/gCgpq0X9wt8JW8Vs1jo6G9y1j9F gbgAoMPV0cpJ5PoLX4LE7N2LEQt7pmPk =DdRu -----END PGP SIGNATURE-----

Jeffrey Lee Hellrung, Jr. wrote:
I see. I feel like a better design would give more abstraction to the arithmetic algorithms. E.g., if I want to add 2 integers, the actual implementation of addition would take a couple of (pointer, length) pairs and a pointer for output (possibly with an optional maximal length), it being the callee's responsibility (of course) to ensure the output has enough space (up to the maximal length, if given). This might seem too C-ish, but it is the common denominator among all your integer types (they all boil down to points to an array of digits), and it would make your arithmetic algorithms usable for a statically-allocated integer type as well (e.g., one built on a boost::array< digit_t, n >). Carrying on this discussion might be more appropriate in a separate thread, however; one topic at a time.
It might be interesting to make all your types convertible to views (same as your integers, but only behaving as references to them), and have your algorithms take views. That way you wouldn't necessarily need the dynamic allocation.

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/04/2010 06:07 AM, Mathias Gaunard wrote:
It might be interesting to make all your types convertible to views (same as your integers, but only behaving as references to them), and have your algorithms take views. That way you wouldn't necessarily need the dynamic allocation.
I'm not sure I understand what you're suggesting. If views are the same as my integers, then they're essentially the same as base_integer as well, so that's what I'm already doing. Or am I missing something? - -- Chad Nelson Oak Circle Software, Inc. * * * -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkvgP6QACgkQp9x9jeZ9/wSxJQCgojWaNdSh3U/x0v4aQKPAWWtI rigAoPTxfKcLsvG6Pf9EwdGxekw1WO5F =5icD -----END PGP SIGNATURE-----

Le 04/05/2010 16:39, Chad Nelson a écrit :
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 05/04/2010 06:07 AM, Mathias Gaunard wrote:
It might be interesting to make all your types convertible to views (same as your integers, but only behaving as references to them), and have your algorithms take views. That way you wouldn't necessarily need the dynamic allocation.
I'm not sure I understand what you're suggesting. If views are the same as my integers, then they're essentially the same as base_integer as well, so that's what I'm already doing. Or am I missing something?
They do not own anything, they're simply thin wrappers on top of the pointer, and you can construct them from the pointer and whatever information you need as well. That way the user does not need to allocate his memory with dynamic memory allocation -- or even with your type -- to run your arbitrary precision algorithms.

On 5/4/2010 11:01 AM, Mathias Gaunard wrote:
Le 04/05/2010 16:39, Chad Nelson a écrit :
On 05/04/2010 06:07 AM, Mathias Gaunard wrote:
It might be interesting to make all your types convertible to views (same as your integers, but only behaving as references to them), and have your algorithms take views. That way you wouldn't necessarily need the dynamic allocation.
I'm not sure I understand what you're suggesting. If views are the same as my integers, then they're essentially the same as base_integer as well, so that's what I'm already doing. Or am I missing something?
They do not own anything, they're simply thin wrappers on top of the pointer, and you can construct them from the pointer and whatever information you need as well.
That way the user does not need to allocate his memory with dynamic memory allocation -- or even with your type -- to run your arbitrary precision algorithms.
+1, and going the natural step further, build these algorithms off a view *concept*, rather then hard-coding a specific view class, for maximum flexibility. The requirements you need *might* be as simple as existing range requirements in many cases (see Boost.Range). - Jeff

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/04/2010 02:28 PM, Jeffrey Lee Hellrung, Jr. wrote: [...]
That way the user does not need to allocate his memory with dynamic memory allocation -- or even with your type -- to run your arbitrary precision algorithms.
+1, and going the natural step further, build these algorithms off a view *concept*, rather then hard-coding a specific view class, for maximum flexibility. The requirements you need *might* be as simple as existing range requirements in many cases (see Boost.Range).
I'm afraid they'll be slightly more complex, as the view will need to know the sign of the integer as well (and be able to alter it, for mutable arguments). So I'll probably need to hard-code at least one view class anyway. We'll see when I get to it, which I hope will be tonight. - -- Chad Nelson Oak Circle Software, Inc. * * * -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkvgufwACgkQp9x9jeZ9/wRx2QCeNi6zfWRFmfIhuV3CFTVjNa1D JlsAoPRKs3raRP54ZihNogVqkukZ0p3O =+k/l -----END PGP SIGNATURE-----

On 05/04/2010 05:21 PM, Chad Nelson wrote:
On 05/04/2010 02:28 PM, Jeffrey Lee Hellrung, Jr. wrote:
[...]
That way the user does not need to allocate his memory with dynamic memory allocation -- or even with your type -- to run your arbitrary precision algorithms.
+1, and going the natural step further, build these algorithms off a view *concept*, rather then hard-coding a specific view class, for maximum flexibility. The requirements you need *might* be as simple as existing range requirements in many cases (see Boost.Range).
I'm afraid they'll be slightly more complex, as the view will need to know the sign of the integer as well (and be able to alter it, for mutable arguments). So I'll probably need to hard-code at least one view class anyway. We'll see when I get to it, which I hope will be tonight.
Yes, you're right, start with a concrete view class, just with the additional abstraction to a view concept in the back of your mind. Later you can add support for a more concept-based implementation it if it's deemed useful. - Jeff

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/05/2010 12:06 AM, Jeffrey Lee Hellrung, Jr. wrote:
[...]
+1, and going the natural step further, build these algorithms off a view *concept*, rather then hard-coding a specific view class, for maximum flexibility. The requirements you need *might* be as simple as existing range requirements in many cases (see Boost.Range).
I'm afraid they'll be slightly more complex, as the view will need to know the sign of the integer as well (and be able to alter it, for mutable arguments). So I'll probably need to hard-code at least one view class anyway. We'll see when I get to it, which I hope will be tonight.
Yes, you're right, start with a concrete view class, just with the additional abstraction to a view concept in the back of your mind. [...]
Did so. I've got a mutable_view_t and a const_view_t now, and I've switched three interlocking functions to using them. I've got some debugging to do on them though, and no time to finish it this morning. :-( - -- Chad Nelson Oak Circle Software, Inc. * * * -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkvhdSgACgkQp9x9jeZ9/wSndACg0ORGb4ujco7SUGS1aCqgD5XP dvUAn38VHyCMs03UZTLNYpO6S0VlCl32 =t8zN -----END PGP SIGNATURE-----

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/04/2010 02:01 PM, Mathias Gaunard wrote:
I'm not sure I understand what you're suggesting. If views are the same as my integers, then they're essentially the same as base_integer as well, so that's what I'm already doing. Or am I missing something?
They do not own anything, they're simply thin wrappers on top of the pointer, and you can construct them from the pointer and whatever information you need as well.
That way the user does not need to allocate his memory with dynamic memory allocation -- or even with your type -- to run your arbitrary precision algorithms.
I see. Close to what I have, but not exactly. Pretty much the same as what Jeffrey Hellrung was suggesting, then. I'd planned on something like that... with any luck, I can get some of that done tonight, I won't have much time to work on the library tomorrow. - -- Chad Nelson Oak Circle Software, Inc. * * * -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkvgrhIACgkQp9x9jeZ9/wRP9ACePAOjw1Ruo78aT0jeOye5Ongf 97cAn3afAdgwKIjHoxzDFVAOrfFv3PGB =MY4L -----END PGP SIGNATURE-----

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/04/2010 02:38 AM, Jeffrey Lee Hellrung, Jr. wrote:
I've looked briefly through the code. I'm not entirely sure where all the "base" arithmetic functions are defined, but I do see that primitives.cpp contains at least some of them.
They're in the files that most logically support them. For example, detail::gcd and detail::lcm are in gcd.cpp.
Actually I was speaking more toward base_integer::_add, base_integer::_subtract, etc., which I don't see defined in base_integer.hpp or base_integer.cpp...
Ah, now I see why you expected to see them... I didn't realize they were still declared in base_integer. Sorry about that, that's a leftover from a previous version of the library. - -- Chad Nelson Oak Circle Software, Inc. * * * -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkvgGggACgkQp9x9jeZ9/wTNhgCbBybVDqKNT+ARZ3olJkq0q3Mm K1QAnA//WRH5+u3hrcEEX7F66TZNeUY6 =TFGG -----END PGP SIGNATURE-----

On 5/4/2010 5:58 AM, Chad Nelson wrote:
On 05/04/2010 02:38 AM, Jeffrey Lee Hellrung, Jr. wrote:
Actually I was speaking more toward base_integer::_add, base_integer::_subtract, etc., which I don't see defined in base_integer.hpp or base_integer.cpp...
Ah, now I see why you expected to see them... I didn't realize they were still declared in base_integer. Sorry about that, that's a leftover from a previous version of the library.
Yes, that is why I asked about them. And I see the free functions "add", "subtract", etc., are just farther down primitives.cpp. - Jeff

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/04/2010 06:08 AM, Mathias Gaunard wrote:
_attach attaches the result to the target by adjusting the pointers where possible, and copying if necessary. Copying is usually only necessary for fixed_integer types, and only in some of the detail functions.
Sounds like what operator= does.
operator= is implemented by a call to _attach, in all three of the integer types. - -- Chad Nelson Oak Circle Software, Inc. * * * -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkvgP9cACgkQp9x9jeZ9/wRxjACgvXK/ssy/yxW4euC4g+DnYZod ydEAn1wUX6u4cDvTc17TSdr+f5WQ/4iC =+xAD -----END PGP SIGNATURE-----

Le 04/05/2010 16:40, Chad Nelson a écrit :
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 05/04/2010 06:08 AM, Mathias Gaunard wrote:
_attach attaches the result to the target by adjusting the pointers where possible, and copying if necessary. Copying is usually only necessary for fixed_integer types, and only in some of the detail functions.
Sounds like what operator= does.
operator= is implemented by a call to _attach, in all three of the integer types.
Then what's the point of providing two names for the same function?

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/04/2010 02:02 PM, Mathias Gaunard wrote:
On 05/04/2010 06:08 AM, Mathias Gaunard wrote:
_attach attaches the result to the target by adjusting the pointers where possible, and copying if necessary. Copying is usually only necessary for fixed_integer types, and only in some of the detail functions.
Sounds like what operator= does.
operator= is implemented by a call to _attach, in all three of the integer types.
Then what's the point of providing two names for the same function?
_attach works across all three integer types, and doesn't need to return anything. operator= needs to return the type that it converts to, and thus has to be defined separately for all three types. - -- Chad Nelson Oak Circle Software, Inc. * * * -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkvgrnEACgkQp9x9jeZ9/wROogCfX8q99B6pX+WRdcRt4fq1OkAa SgsAoPgAYjIwfBYajFrQiY05ZJFEmCZ7 =gzTc -----END PGP SIGNATURE-----

Chad Nelson wrote:
Times are measured in seconds, using two sets of ten 2048-bit integers. The tests consisted of adding each of those pairs of numbers together 10,000 times, multiplying them 10,000 times, and doing a "mulmod" operation on them (with a third randomly-generated 2048-bit modulus argument) 1,000 times. Each test was run three times, under as close to
Move semantics can optimize away some allocations that copy on write can't, such as for example w = x * y + z; (assuming that x*y already has enough bits to store x*y+z). But it doesn't have to be either-or; you can move even when copy on write is used.

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/02/2010 06:43 PM, Peter Dimov wrote:
Times are measured in seconds, using two sets of ten 2048-bit integers. The tests consisted of adding each of those pairs of numbers together 10,000 times, multiplying them 10,000 times, and doing a "mulmod" operation on them (with a third randomly-generated 2048-bit modulus argument) 1,000 times. Each test was run three times, under as close to
Move semantics can optimize away some allocations that copy on write can't, such as for example
w = x * y + z;
(assuming that x*y already has enough bits to store x*y+z).
The copy-on-write code should optimize that allocation away too, under those circumstances.
But it doesn't have to be either-or; you can move even when copy on write is used.
I tested that in my original tests, but it didn't seem to be useful. The results were halfway between move-only and copy-on-write-only, so I eliminated it for this round. - -- Chad Nelson Oak Circle Software, Inc. * * * -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkveBP0ACgkQp9x9jeZ9/wTrGgCgi5rtDWC1TJczPq3/dtqMnmbz i8cAoNLkc6oqgPbEb5AtojghAr3WLi0Z =2m94 -----END PGP SIGNATURE-----

On 05/02/2010 04:04 PM, Chad Nelson wrote:
On 05/02/2010 06:43 PM, Peter Dimov wrote:
Move semantics can optimize away some allocations that copy on write can't, such as for example
w = x * y + z;
(assuming that x*y already has enough bits to store x*y+z).
The copy-on-write code should optimize that allocation away too, under those circumstances.
I don't think so, unless your arithmetic operators use by-value parameters (allowing copies to be elided). Assuming you're not using expression templates, I would expect an implementation with no move semantics to translate "w = x * y + z" to be, effectively, T t0 = x * y; w = t0 + z; which requires 2 allocations minimum (one to store the result of x * y, another to store the result of t0 + z). But, again, if operator+ takes its parameters by-value, you can probably get some copies elided in practice and detect that the first argument to operator+ has unique ownership, hence is modifiable. What's your allocation count for this operation? Note that move semantics, on the other hand, can directly detect that t0 is just a temporary (by overloading operator+ on both a reference-to-const and an (emulated) rvalue reference), hence the t0 + z can be safely replaced with t0 += z (which will possibly avoid a reallocation), and the then the new t0 would be moved into w. - Jeff

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/02/2010 07:17 PM, Jeffrey Lee Hellrung, Jr. wrote:
Move semantics can optimize away some allocations that copy on write can't [...]
The copy-on-write code should optimize that allocation away too, under those circumstances.
I don't think so, unless your arithmetic operators use by-value parameters (allowing copies to be elided). Assuming you're not using expression templates, I would expect an implementation with no move semantics to translate "w = x * y + z" to be, effectively,
T t0 = x * y; w = t0 + z;
which requires 2 allocations minimum (one to store the result of x * y, another to store the result of t0 + z).
Hm... looking at it more closely, I think you're right. There's no way for the copy code to know that t0 isn't going to be used again, so it would still allocate a temporary for the addition.
But, again, if operator+ takes its parameters by-value, you can probably get some copies elided in practice and detect that the first argument to operator+ has unique ownership, hence is modifiable.
I'm not sure that's right, because if I set it to take its parameters by value, it would either have to deep-copy those parameters or use copy-on-write, making the reference count for copy-on-write two, and preventing it from detecting that either parameter is unique (and thus temporary). The variables used for the parameters could be used again later on, so it's still not safe to modify them.
What's your allocation count for this operation?
Exactly what you said: two allocations.
Note that move semantics, on the other hand, can directly detect that t0 is just a temporary (by overloading operator+ on both a reference-to-const and an (emulated) rvalue reference), hence the t0 + z can be safely replaced with t0 += z (which will possibly avoid a reallocation), and the then the new t0 would be moved into w.
I must be missing a trick then, because operator+ just takes constant references. The library would do those same allocations for the move stuff too, at present. operator+ and operator- would be the only ones that could use that trick, if I'm seeing things right. And it seems to me that I could emulate move semantics with my current design too, without going through Boost.Move... I was thinking about that before, when I thought I shouldn't use Boost.Move because it wasn't yet official. It might be worth reviving the thought. - -- Chad Nelson Oak Circle Software, Inc. * * * -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkveJzoACgkQp9x9jeZ9/wRAqgCglNdO/gFhjcBctCrz1kgYD7IL NDEAoLJksHKJOZaEpv4H3jMdT7TI+OJI =YIBU -----END PGP SIGNATURE-----

On 05/02/2010 06:30 PM, Chad Nelson wrote:
On 05/02/2010 07:17 PM, Jeffrey Lee Hellrung, Jr. wrote:
I don't think so, unless your arithmetic operators use by-value parameters (allowing copies to be elided). Assuming you're not using expression templates, I would expect an implementation with no move semantics to translate "w = x * y + z" to be, effectively,
T t0 = x * y; w = t0 + z;
which requires 2 allocations minimum (one to store the result of x * y, another to store the result of t0 + z).
Hm... looking at it more closely, I think you're right. There's no way for the copy code to know that t0 isn't going to be used again, so it would still allocate a temporary for the addition.
But, again, if operator+ takes its parameters by-value, you can probably get some copies elided in practice and detect that the first argument to operator+ has unique ownership, hence is modifiable.
I'm not sure that's right, because if I set it to take its parameters by value, it would either have to deep-copy those parameters or use copy-on-write, making the reference count for copy-on-write two, and preventing it from detecting that either parameter is unique (and thus temporary). The variables used for the parameters could be used again later on, so it's still not safe to modify them.
If the compiler does a decent job at copy elision (and most recent ones do, as I understand), then your reference count will be just 1. E.g., if you have T f(); void g(T); and you call g(f()), any decent compiler will elide the copy from the result of f to the argument of g (MSVC even does this in debug mode, last time I investigated it). It's far from guaranteed, but that's why I said "in practice".
Note that move semantics, on the other hand, can directly detect that t0 is just a temporary (by overloading operator+ on both a reference-to-const and an (emulated) rvalue reference), hence the t0 + z can be safely replaced with t0 += z (which will possibly avoid a reallocation), and the then the new t0 would be moved into w.
I must be missing a trick then, because operator+ just takes constant references. The library would do those same allocations for the move stuff too, at present.
Precisely; you're not taking full advantage of (emulated) rvalue references and moving from / modifying temporaries. There's no rule that says you can't have 4 overloads of operator+: T operator+(const T&, const T&); T operator+(const T&, T&&); T operator+(T&&, const T&); // this would be called for x * y + z T operator+(T&&, T&&); Would be even better when we can overload member functions on rvalue-ness ;) But I have no idea how mature that proposal is... :/
operator+ and operator- would be the only ones that could use that trick, if I'm seeing things right. And it seems to me that I could
All the free operators / functions could be overloaded on lvalue-/rvalue-ness, not just operator+ and operator-. Any operation where you can potentially save reallocations by modifying the arguments directly qualifies for consideration.
emulate move semantics with my current design too, without going through Boost.Move... I was thinking about that before, when I thought I shouldn't use Boost.Move because it wasn't yet official. It might be worth reviving the thought.
That's certainly an option, though it sounds like Ion is prepping Boost.Move for final submission, so the review for it shouldn't be *too* far down the line... - Jeff

[Jeffrey Lee Hellrung, Jr.]
Would be even better when we can overload member functions on rvalue-ness ;) But I have no idea how mature that proposal is... :/
That's http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2439.htm which added ref-qualifiers and was voted into the Working Paper. However, it is not yet implemented in either VC10 or GCC 4.5, as you can see from http://blogs.msdn.com/vcblog/archive/2010/04/06/c-0x-core-language-features-... and http://gcc.gnu.org/gcc-4.5/cxx0x_status.html . Stephan T. Lavavej Visual C++ Libraries Developer

On 05/02/2010 06:50 PM, Stephan T. Lavavej wrote:
[Jeffrey Lee Hellrung, Jr.]
Would be even better when we can overload member functions on rvalue-ness ;) But I have no idea how mature that proposal is... :/
That's http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2439.htm which added ref-qualifiers and was voted into the Working Paper. However, it is not yet implemented in either VC10 or GCC 4.5, as you can see from http://blogs.msdn.com/vcblog/archive/2010/04/06/c-0x-core-language-features-... and http://gcc.gnu.org/gcc-4.5/cxx0x_status.html .
Stephan T. Lavavej Visual C++ Libraries Developer
Thanks for the references Stephan! You think in VC11, then??? - Jeff

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/02/2010 09:42 PM, Jeffrey Lee Hellrung, Jr. wrote:
I'm not sure that's right, because if I set it to take its parameters by value, it would either have to deep-copy those parameters or use copy-on-write, making the reference count for copy-on-write two, and preventing it from detecting that either parameter is unique (and thus temporary). The variables used for the parameters could be used again later on, so it's still not safe to modify them.
If the compiler does a decent job at copy elision (and most recent ones do, as I understand), then your reference count will be just 1. [...]
Hm. I can see how that would be possible. I'll have to check into it.
operator+ and operator- would be the only ones that could use that trick, if I'm seeing things right. And it seems to me that I could
All the free operators / functions could be overloaded on lvalue-/rvalue-ness, not just operator+ and operator-. Any operation where you can potentially save reallocations by modifying the arguments directly qualifies for consideration.
Exactly what I was saying -- operator+ and operator- are the only ones where it would be likely to save any allocations. The results from operator*, for example, will be larger than the bit-length of the largest parameter. Most other operations can't safely operate directly on their parameters, because they have to have the entire parameters available until the end. It's possible that the parameters (to something like operator*) would have enough allocated space from previous operations that one of them could be re-used, but it wouldn't happen very often compared to addition and subtraction, and it would save very little time compared to the time required to do the operation itself. I doubt that it would be worth the effort to implement. It could be iffy even for addition, where the answer only requires one bit more than the largest parameter at most.
emulate move semantics with my current design too, without going through Boost.Move... I was thinking about that before, when I thought I shouldn't use Boost.Move because it wasn't yet official. It might be worth reviving the thought.
That's certainly an option, though it sounds like Ion is prepping Boost.Move for final submission, so the review for it shouldn't be *too* far down the line...
Yup. We'll see. I won't worry about it until there's a good reason to, I've got plenty of other stuff I could usefully do. - -- Chad Nelson Oak Circle Software, Inc. * * * -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkveL4cACgkQp9x9jeZ9/wSafACgtxvDpyoO22Ctahe+k5cQQUvD 1awAoNEPTFlv0uQs1tvPTEinT0JfBYH1 =K8G0 -----END PGP SIGNATURE-----

Chad Nelson wrote:
Exactly what I was saying -- operator+ and operator- are the only ones where it would be likely to save any allocations. The results from operator*, for example, will be larger than the bit-length of the largest parameter.
I take you don't support in-place expansion then. A shame.

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/04/2010 06:16 AM, Mathias Gaunard wrote:
Exactly what I was saying -- operator+ and operator- are the only ones where it would be likely to save any allocations. The results from operator*, for example, will be larger than the bit-length of the largest parameter.
I take you don't support in-place expansion then. A shame.
I wasn't aware that it was possible to use in-place expansion when allocating memory via new. - -- Chad Nelson Oak Circle Software, Inc. * * * -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkvgQC4ACgkQp9x9jeZ9/wTr+gCgqw/+/MK+XrC5JQRAkBXV7s/R gS8AoMKW9VrTYOeC2oAsPfW8ZeYvzfGx =aiB3 -----END PGP SIGNATURE-----

Le 04/05/2010 16:41, Chad Nelson a écrit :
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 05/04/2010 06:16 AM, Mathias Gaunard wrote:
Exactly what I was saying -- operator+ and operator- are the only ones where it would be likely to save any allocations. The results from operator*, for example, will be larger than the bit-length of the largest parameter.
I take [it] you don't support in-place expansion then. A shame.
I wasn't aware that it was possible to use in-place expansion when allocating memory via new.
You allocate the memory with new? That severely restricts the library. You should use allocators. There are nice extensions to standard allocators that Ion Gaztañaga has made that allow two-way in-place expansion and shrinking. <http://www.drivehq.com/web/igaztanaga/allocplus/> I would have loved to see that in that kind of project, but of course, that's no more than a wish of mine.

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/04/2010 02:10 PM, Mathias Gaunard wrote:
I take [it] you don't support in-place expansion then. A shame.
I wasn't aware that it was possible to use in-place expansion when allocating memory via new.
You allocate the memory with new? That severely restricts the library. You should use allocators.
That was recommended (I think by Jeffrey Hellrung again). I hadn't looked into the implementation details yet, I've got a lot of internal work to finish before I can restructure it that way.
There are nice extensions to standard allocators that Ion Gaztañaga has made that allow two-way in-place expansion and shrinking. <http://www.drivehq.com/web/igaztanaga/allocplus/>
Noted. I'm not sure whether I can use it or not, but I'll certainly take a look.
I would have loved to see that in that kind of project, but of course, that's no more than a wish of mine.
One that may be granted. :-) - -- Chad Nelson Oak Circle Software, Inc. * * * -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkvgr0cACgkQp9x9jeZ9/wQANACffBdulgTNtQD0OcoxZYCx5R8t 7pQAnjfBDIZkCFB1aKmAUXW0rzypBbcd =+Vie -----END PGP SIGNATURE-----

----- Original Message ----- From: "Chad Nelson" <chad.thecomfychair@gmail.com> To: <boost@lists.boost.org> Sent: Monday, May 03, 2010 1:04 AM Subject: Re: [boost] [xint] Boost.Move vs Copy-on-Write timings
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 05/02/2010 06:43 PM, Peter Dimov wrote:
Times are measured in seconds, using two sets of ten 2048-bit integers. The tests consisted of adding each of those pairs of numbers together 10,000 times, multiplying them 10,000 times, and doing a "mulmod" operation on them (with a third randomly-generated 2048-bit modulus argument) 1,000 times. Each test was run three times, under as close to
Move semantics can optimize away some allocations that copy on write can't, such as for example
w = x * y + z;
(assuming that x*y already has enough bits to store x*y+z).
The copy-on-write code should optimize that allocation away too, under those circumstances.
This expression should construct two -rvalues and one move and no assignement. x*y must result in a r-value which is passed to operator + r-value (x*y) + z must result in a r-value w = r-value(r-value (x*y) + z) restult in one move operation. If your implmentation doesnt do that you it is not surprissing that the move emulation is less performant than COW.
But it doesn't have to be either-or; you can move even when copy on write is used.
I tested that in my original tests, but it didn't seem to be useful. The results were halfway between move-only and copy-on-write-only, so I eliminated it for this round.
I agree with Peter that both mechanism are orthogonal. For example. x=y; y=a+b; ... should result in a COW for x=y; and a move for y=a+b; Best, Vicente
participants (7)
-
Chad Nelson
-
Jeffrey Lee Hellrung, Jr.
-
Mathias Gaunard
-
Peter Dimov
-
Stephan T. Lavavej
-
Stewart, Robert
-
vicente.botet