Re: [boost] [xint] Boost.XInt formal review

On 06.03.2011 05:23, Chad Nelson wrote:
I believe that arbitrary precision arithmetic library is very useful. But I don't think that in it's current status XInt is appropriate for boost.
After your reply I changes my mind. As I see you had considered all my suggestions long before I posted them. If they really lead to unavoidable performance penalty I think current implementation is a good trade-off. I vote for inclusion XInt into boost.
One more thought about "secure" option. There are no standard containers with this option. I think either both standard containers and integer_t should have such option or both shouldn't have. This is yet another argument for container parametrization.
I'm not sure I understand your argument. Are you saying that because there are no standard containers with a zero-memory-before-releasing option, that the library shouldn't provide one either?
Yes. Why integer_t should be an exception?

On Mar 6, 2011, at 5:22 AM, Ivan Sorokin wrote:
After your reply I changes my mind. As I see you had considered all my suggestions long before I posted them. If they really lead to unavoidable performance penalty I think current implementation is a good trade-off.
I vote for inclusion XInt into boost.
Seems like Chad's reply was off-list? Could the rest of us see what convinced Ivan?

On Sun, 6 Mar 2011 06:01:24 -0500 Gordon Woodhull <gordon@woodhull.com> wrote:
On Mar 6, 2011, at 5:22 AM, Ivan Sorokin wrote:
After your reply I changes my mind. As I see you had considered all my suggestions long before I posted them. If they really lead to unavoidable performance penalty I think current implementation is a good trade-off.
I vote for inclusion XInt into boost.
Seems like Chad's reply was off-list?
Not deliberately. I hit reply as usual, and didn't notice that it wasn't sent to the list like it usually is. Makes me wonder how many other replies of mine have gotten misaddressed. :-(
Could the rest of us see what convinced Ivan?
On Sun, 06 Mar 2011 02:10:50 +0300 Ivan Sorokin <sorokin@rain.ifmo.ru> wrote:
I believe that arbitrary precision arithmetic library is very useful. But I don't think that in it's current status XInt is appropriate for boost.
Regardless, thank you for your comments.
Below is the list of features which I believe should be improved. The list is sorted by importance.
1. Parametrize with container type. [...]
I actually tested earlier iterations with containers such as vector for their storage. The performance numbers were noticeably worse.
Another option could be wrapping containers into some entity with interface suitable for integer_t needs.
I haven't tried that, but it did occur to me as a way to allow the range interface that has been requested. Although again, I'd prefer not to do something like that until the internals have stabilized some.
One more thought about "secure" option. There are no standard containers with this option. I think either both standard containers and integer_t should have such option or both shouldn't have. This is yet another argument for container parametrization.
I'm not sure I understand your argument. Are you saying that because there are no standard containers with a zero-memory-before-releasing option, that the library shouldn't provide one either?
2. Signed Zero.
I'm sure that long integer arithmetic should behave as close to native integer numbers as possible. Native integer types don't have negative zero. As the documentation mentions, this is done "for use in a planned future unlimited-precision floating point library". Although the only function that behaves differently on negative and positive zero is sign(true), I still believe that this is redundant and planned floating point library should use it's own sign bit.
This was argued over passionately last year. I was on the don't-include-it side, but some very compelling reasons convinced me to do so, reluctantly. The clincher was that any implementation of the aforementioned floating-point library would be seriously hampered without it.
3. Alternative representation of negative numbers
Currently integer_t supports bit arithmetic, but its result differ from one of native integer types: [...]
For negative numbers, yes, because sign/magnitude is far more efficient than two's complement for most operations in this kind of library. (Addition and subtraction are the exceptions.)
I propose the following solution. Currently integer_t stores the sign bit and value bits (absolute value). Instead of the sign bit let integer_t hold a bit "inverse_value" with the following meaning: [...]
As an option, there might be some call for that. Using that as the default would adversely affect the performance, because many of the algorithms -- even basic ones like multiplication and division -- are geared toward sign/magnitude, necessitating conversion to it before every such operation and back after.
The implementation of addition and subtraction for this representation is not more complicatedthan for current one. But it allows bit arithmetic to work in the same fashion as it works with native integer types.
Do you foresee a lot of need for such negative value bit-manipulation in a large-integer library?
4. About options naming.
The option name "threadsafe" is misleading as it implies something about concurrency. Instead it means just disabling copy-on-write. [...]
Copy-on-write is the only thing that *isn't* completely thread-safe about the library. Thread-safety is the important feature; copy-on-write is an optimization that affects it.
xint::nothrow_integer really is "xint::integer or nan". Maybe the name that reflects "integer or nan" representation is better?
Sorry, but I don't understand. xint::nothrow doesn't throw exceptions; that is its reason for existing. The ability to hold a Not-a-Number value is a secondary aspect, to give it a value that is never valid, as a way to report errors. -- Chad Nelson Oak Circle Software, Inc. * * * -- Chad Nelson Oak Circle Software, Inc.
Certainly, here's the entire reply: * * *

Le dimanche 06 mars 2011 à 08:20 -0500, Chad Nelson a écrit :
1. Parametrize with container type. [...]
I actually tested earlier iterations with containers such as vector for their storage. The performance numbers were noticeably worse.
You're restricting this to std::vector against your array, when many container could fit. I did following modifications on reviewed XInt: - naively turn magnitude_manager_t into a boost::array wrapper - add byref args in integer, raw_integer (and others, to compile) - nothing else (NB: I used the original with byref too as this was faster) I used dummy code introduced by Phil Endecott with various fixed lengths. Even at 20000 bits, boost::array version is almost twice faster. At 2000 bits, thrice faster. I did not optimize boost::array usage, using plain zero init in constructor, and I probably missed some copies. Oh, by the way, boost::array version should be threadsafe. My point: You're making choices in place of your user. This is a dangerous way. If you like container genericity idea suggested by the reviewer, I might try to provide my humble and limited help. I do not consider this exposing internals, but rather keeping the internals out. There is too much code in your library, actually. Having said that, your library works well and seems easy to hack. A very good point for the future. I'm not experienced enough to properly review it but that's definitely a good one IMHO. Ivan

On Sun, 06 Mar 2011 23:50:55 +0100 Ivan Le Lann <ivan.lelann@free.fr> wrote:
Le dimanche 06 mars 2011 à 08:20 -0500, Chad Nelson a écrit :
1. Parametrize with container type. [...]
I actually tested earlier iterations with containers such as vector for their storage. The performance numbers were noticeably worse.
You're restricting this to std::vector against your array, when many container could fit.
std::vector is the most logical STL container, given the code's needs. Though std::deque might be almost as good, perhaps better in some cases because it doesn't require large blocks of contiguous memory to store massive numbers.
I did following modifications on reviewed XInt:
- naively turn magnitude_manager_t into a boost::array wrapper - add byref args in integer, raw_integer (and others, to compile) - nothing else
(NB: I used the original with byref too as this was faster)
I'm nearly finished converting most arguments to by-reference myself, that alone should improve the speed some.
I used dummy code introduced by Phil Endecott with various fixed lengths. Even at 20000 bits, boost::array version is almost twice faster. At 2000 bits, thrice faster. I did not optimize boost::array usage, using plain zero init in constructor, and I probably missed some copies. Oh, by the way, boost::array version should be threadsafe.
I'd have to see the actual changes you've made, but I'd be interested to see what difference that same change will make once I've finished the current modifications. For fixed-length integers, it might be a cheap and useful speed improvement.
My point: You're making choices in place of your user. This is a dangerous way. If you like container genericity idea suggested by the reviewer, I might try to provide my humble and limited help. I do not consider this exposing internals, but rather keeping the internals out. There is too much code in your library, actually.
I do like the idea, just not quite yet. :-)
Having said that, your library works well and seems easy to hack. A very good point for the future. I'm not experienced enough to properly review it but that's definitely a good one IMHO.
Thanks, and thanks for your input. -- Chad Nelson Oak Circle Software, Inc. * * *

On Sun, 06 Mar 2011 13:22:51 +0300 Ivan Sorokin <sorokin@rain.ifmo.ru> wrote:
On 06.03.2011 05:23, Chad Nelson wrote:
I believe that arbitrary precision arithmetic library is very useful. But I don't think that in it's current status XInt is appropriate for boost.
After your reply I changes my mind. As I see you had considered all my suggestions long before I posted them. If they really lead to unavoidable performance penalty I think current implementation is a good trade-off.
I vote for inclusion XInt into boost.
Thank you (again) for your comments.
One more thought about "secure" option. There are no standard containers with this option. I think either both standard containers and integer_t should have such option or both shouldn't have. This is yet another argument for container parametrization.
I'm not sure I understand your argument. Are you saying that because there are no standard containers with a zero-memory-before-releasing option, that the library shouldn't provide one either?
Yes. Why integer_t should be an exception?
The contents of standard containers rarely need to be protected from recovery and analysis. The contents of a large integer may well be used for cryptography, and hold sensitive information like private keys, so they might. That was the rationale for the feature. (For some reason, your messages don't include a reply-to field in the header, so when I try to answer them, the reply always defaults to going to you alone. The version of Thunderbird you're using doesn't normally have that problem, so I'm not sure why it's happening.) -- Chad Nelson Oak Circle Software, Inc. * * *

AMDG On 03/06/2011 02:22 AM, Ivan Sorokin wrote:
On 06.03.2011 05:23, Chad Nelson wrote:
One more thought about "secure" option. There are no standard containers with this option. I think either both standard containers and integer_t should have such option or both shouldn't have. This is yet another argument for container parametrization.
I'm not sure I understand your argument. Are you saying that because there are no standard containers with a zero-memory-before-releasing option, that the library shouldn't provide one either?
Yes. Why integer_t should be an exception?
Just a thought: Maybe the Allocator should handle this, since it's related to memory management? In Christ, Steven Watanabe

On Wed, 09 Mar 2011 11:47:00 -0800 Steven Watanabe <watanabesj@gmail.com> wrote:
I'm not sure I understand your argument. Are you saying that because there are no standard containers with a zero-memory-before-releasing option, that the library shouldn't provide one either?
Yes. Why integer_t should be an exception?
Just a thought: Maybe the Allocator should handle this, since it's related to memory management?
I've been debating that since I read your message last night, and I don't have an answer. It would make sense, but it would also make it more difficult for someone who just wants to clear the memory when it's released. I can see uses where that would be sufficient. -- Chad Nelson Oak Circle Software, Inc. * * *

On Thu, Mar 10, 2011 at 05:49, Chad Nelson <chad.thecomfychair@gmail.com> wrote:
On Wed, 09 Mar 2011 11:47:00 -0800 Steven Watanabe <watanabesj@gmail.com> wrote:
Just a thought: Maybe the Allocator should handle this, since it's related to memory management?
I've been debating that since I read your message last night, and I don't have an answer. It would make sense, but it would also make it more difficult for someone who just wants to clear the memory when it's released. I can see uses where that would be sufficient.
Can you elaborate on why someone would want to clear the memory, but not want to actually be secure? ~ Scott

On Thu, 10 Mar 2011 09:15:28 -0800 Scott McMurray <me22.ca+boost@gmail.com> wrote:
Just a thought: Maybe the Allocator should handle this, since it's related to memory management?
I've been debating that since I read your message last night, and I don't have an answer. It would make sense, but it would also make it more difficult for someone who just wants to clear the memory when it's released. I can see uses where that would be sufficient.
Can you elaborate on why someone would want to clear the memory, but not want to actually be secure?
Airtight security is a hard problem that requires massive amounts of time and attention to get right, and is best reserved for programs that absolutely require it. Barring extremely sensitive information like government-level secrets, there are generally only two things that a developer needs to worry about: that sensitive data might be written to disk by the OS, and that it might be retrieved from memory, either by malware while the machine is running or by physical means immediately after removing power. Depending on the OS involved, setting a piece of memory to never be paged to disk could be easy or impossible. But other than a very small window of opportunity, clearing memory prevents retrieving data from it regardless of the means, and makes it much less likely that such data would get written to disk. If the system has a sufficiently large amount of memory, the person using it could even disable the swapfile completely, in which case clearing memory would provide all the security he's likely to need. -- Chad Nelson Oak Circle Software, Inc. * * *

On Thu, Mar 10, 2011 at 16:24, Chad Nelson <chad.thecomfychair@gmail.com> wrote:
On Thu, 10 Mar 2011 09:15:28 -0800 Scott McMurray <me22.ca+boost@gmail.com> wrote:
Can you elaborate on why someone would want to clear the memory, but not want to actually be secure?
Barring extremely sensitive information like government-level secrets, there are generally only two things that a developer needs to worry about: that sensitive data might be written to disk by the OS, and that it might be retrieved from memory, either by malware while the machine is running or by physical means immediately after removing power.
I think that makes a flawed assumption that the secret information will only reside in the one place in memory, and never be copied elsewhere. Clearing the memory from the bigint doesn't help when iostream cached the bytes of the file from which it was read, nor does it protect the information that the NIH implementation of RSA was used to decrypt. Any useful attempt at security will involve more than a single number, so any number that wants to be used securely should have a way to hook into an existing system. An allocator might be a reasonable way to do this, since it could handle clearing, telling the OS not to swap the memory, or whatever the user decides is important enough, and be applied to the xint, to the vector used in a custom streambuf, etc. Still, I think that the idea of even implying that doing home-grown security is an acceptable idea is a terrible one. Even if someone doesn't need NSA-resistent security, why would doing custom RSA with a big number library ever be a better idea than using a proper crypto toolkit? ~ Scott

On Thu, 10 Mar 2011 17:05:56 -0800 Scott McMurray <me22.ca+boost@gmail.com> wrote:
Can you elaborate on why someone would want to clear the memory, but not want to actually be secure?
Barring extremely sensitive information like government-level secrets, there are generally only two things that a developer needs to worry about: [...]
[...] Clearing the memory from the bigint doesn't help when iostream cached the bytes of the file from which it was read, nor does it protect the information that the NIH implementation of RSA was used to decrypt.
If you're reading and writing sensitive values to disk or across the network, you've got bigger security concerns than the library can deal with. If you're sending them through another library, obviously the other library must also have measures in place to keep them secure. As I said, airtight security is a hard problem.
Any useful attempt at security will involve more than a single number, so any number that wants to be used securely should have a way to hook into an existing system. An allocator might be a reasonable way to do this, since it could handle clearing, telling the OS not to swap the memory, or whatever the user decides is important enough, and be applied to the xint, to the vector used in a custom streambuf, etc.
Certainly. But forcing anyone who wants even low-level security to write an allocator, when the library itself can handle that much very easily, seems foolish.
Still, I think that the idea of even implying that doing home-grown security is an acceptable idea is a terrible one. Even if someone doesn't need NSA-resistent security, why would doing custom RSA with a big number library ever be a better idea than using a proper crypto toolkit?
XInt's predecessor was written because my company had a commercial project that needed public-key cryptography. For reasons I won't go into, it couldn't use non-system DLLs. This was around 1998; Microsoft's crypto API didn't exist at that point (so far as I know), commercial crypto libraries were well beyond our shoe-string budget, and open-source software was almost universally full-GPL and unavailable for our purposes. On top of that, we didn't need strong security in the implementation. The part of the program that would run on the end user's machine didn't have any sensitive information -- that was the whole point of using public-key encryption and signatures -- and the part that did would run on machines that we controlled. I'd also previously written such a library for research purposes. The decision would probably be different today, but at the time, writing our own was only logical. My point is that you can't predict what needs a program might have that would lead its developers to want to implement their own crypto code. Yes, it's a really bad idea in general, because it's extremely hard to get right, but sometimes the hard parts aren't needed, and there are other legitimate reasons for doing it. -- Chad Nelson Oak Circle Software, Inc. * * *

On 3/11/2011 6:28 AM, Chad Nelson wrote:
On Thu, 10 Mar 2011 17:05:56 -0800 Scott McMurray<me22.ca+boost@gmail.com> wrote: [...]
Any useful attempt at security will involve more than a single number, so any number that wants to be used securely should have a way to hook into an existing system. An allocator might be a reasonable way to do this, since it could handle clearing, telling the OS not to swap the memory, or whatever the user decides is important enough, and be applied to the xint, to the vector used in a custom streambuf, etc.
Certainly. But forcing anyone who wants even low-level security to write an allocator, when the library itself can handle that much very easily, seems foolish.
I agree with Scott here. Chad, if you feel particularly tied to this "low-level security" feature, perhaps it would be best (if this is possible) to supply an allocator adaptor that does this memory zero'ing upon release of a block. - Jeff

On Fri, 11 Mar 2011 10:58:26 -0800 "Jeffrey Lee Hellrung, Jr." <jhellrung@ucla.edu> wrote:
Any useful attempt at security will involve more than a single number [...]
Certainly. But forcing anyone who wants even low-level security to write an allocator, when the library itself can handle that much very easily, seems foolish.
I agree with Scott here. Chad, if you feel particularly tied to this "low-level security" feature, perhaps it would be best (if this is possible) to supply an allocator adaptor that does this memory zero'ing upon release of a block.
Perhaps. Maybe I can supply default allocator code with that option, which people could copy and modify if they want to make their own. That would satisfy my ease-of-use concerns. -- Chad Nelson Oak Circle Software, Inc. * * *

On 10 March 2011 18:24, Chad Nelson <chad.thecomfychair@gmail.com> wrote:
Airtight security is a hard problem that requires massive amounts of time and attention to get right, and is best reserved for programs that absolutely require it.
Forget about airtight. What guarantees are you making that the memory has been zeroed in the presence of an aggressive optimizer? For instance (reworded from recent C++0x committee discussions), the following: struct EraseOnDtor { ~EraseOnDtor() {s .assign(s.size(), '\0'); } std::string s; }; is not guaranteed to zero the string on destruction, because the call to assign can be optimized out since it has no observable behavior when the string is immediately destroyed afterwards. This stuff is hard to get right. You are better off not implementing it. -- Nevin ":-)" Liber <mailto:nevin@eviloverlord.com> (847) 691-1404

On Fri, 11 Mar 2011 11:28:25 -0600 Nevin Liber <nevin@eviloverlord.com> wrote:
Airtight security is a hard problem that requires massive amounts of time and attention to get right, and is best reserved for programs that absolutely require it.
Forget about airtight. What guarantees are you making that the memory has been zeroed in the presence of an aggressive optimizer? [...]
That was brought up during the review this week. I plan to implement much safer zeroing code than is presently in there now, and provide a way for people to add their own if they feel that my implementation is insufficient.
This stuff is hard to get right. You are better off not implementing it.
On the contrary. It's *because* it's hard to get right that it belongs in a library. -- Chad Nelson Oak Circle Software, Inc. * * *

On 11 March 2011 18:51, Chad Nelson <chad.thecomfychair@gmail.com> wrote:
That was brought up during the review this week. I plan to implement much safer zeroing code than is presently in there now,
How? Short of platform-specific extensions (which basically inhibit the optimizer) or waiting around for the next C standard (with memset_s), I'm not seeing it. A discussion of this issue can be found at < https://www.securecoding.cert.org/confluence/display/cplusplus/MSC06-CPP.+Be+aware+of+compiler+optimization+when+dealing+with+sensitive+data>, where their conclusion is "However, it should be noted that both calling functions and accessing volatile qualified objects can still be optimized out (while maintaining strict conformance to the standard), so the above may still not work." and provide a
way for people to add their own if they feel that my implementation is insufficient.
Either you are making a guarantee or you aren't. If you aren't, it worse than not doing anything at all, as it gives people a false sense of security.
This stuff is hard to get right. You are better off not implementing it.
On the contrary. It's *because* it's hard to get right that it belongs in a library.
By experts (and even they aren't perfect). When non-experts do it, we get vulnerabilities. -- Nevin ":-)" Liber <mailto:nevin@eviloverlord.com> (847) 691-1404

On Sat, 12 Mar 2011 00:51:31 -0600 Nevin Liber <nevin@eviloverlord.com> wrote:
That was brought up during the review this week. I plan to implement much safer zeroing code than is presently in there now,
How? Short of platform-specific extensions (which basically inhibit the optimizer) or waiting around for the next C standard (with memset_s), I'm not seeing it.
<https://groups.google.com/group/boost-list/msg/fea48e8173d6b411>
and provide a way for people to add their own if they feel that my implementation is insufficient.
Either you are making a guarantee or you aren't. If you aren't, it worse than not doing anything at all, as it gives people a false sense of security.
As you yourself noted above, anything I do is limited by being platform-specific, of necessity. I can make plenty of guarantees, but only for the specific platforms and compilers I have access to. For everything else, the person using the library will have to check that the implementation does what it's intended to do, a point I will make abundantly clear.
This stuff is hard to get right. You are better off not implementing it.
On the contrary. It's *because* it's hard to get right that it belongs in a library.
By experts (and even they aren't perfect). When non-experts do it, we get vulnerabilities.
If an expert comes forward and volunteers his services, I'll gladly accept them. Until then, if someone wants that feature, either they get it from the library or they write it themselves. I don't have a great deal of knowledge on the subject, but the odds are high that they have even less than I do, so your argument is exactly the reason it should be in there. -- Chad Nelson Oak Circle Software, Inc. * * *

On 20:59, Chad Nelson wrote:
On Fri, 11 Mar 2011 11:28:25 -0600 Nevin Liber<nevin@eviloverlord.com> wrote:
This stuff is hard to get right. You are better off not implementing it.
On the contrary. It's *because* it's hard to get right that it belongs in a library.
Yes, it belong in *a* library, but XInt is probably not the correct one. It should probably be done by the allocator, as someone suggested earlier. Regards, Anders Dalvander -- WWFSMD?

Anders Dalvander wrote:
On 20:59, Chad Nelson wrote:
On Fri, 11 Mar 2011 11:28:25 -0600 Nevin Liber<nevin@eviloverlord.com> wrote:
This stuff is hard to get right. You are better off not implementing it.
On the contrary. It's *because* it's hard to get right that it belongs in a library.
Yes, it belong in *a* library, but XInt is probably not the correct one.
It should probably be done by the allocator, as someone suggested earlier.
I think that anyone serious about security will probably choose to implement his own - the work required is the same as auditing an existing library, and with xint, with its multitude of options that obscure the code to an extent, it may be even less. The existing "secure" option may be better than nothing, of course. On the other hand, it creates the famous "false sense of security".
participants (10)
-
Anders Dalvander
-
Chad Nelson
-
Gordon Woodhull
-
Ivan Le Lann
-
Ivan Sorokin
-
Jeffrey Lee Hellrung, Jr.
-
Nevin Liber
-
Peter Dimov
-
Scott McMurray
-
Steven Watanabe