On 12/12/15 9:11 AM, Phil Endecott wrote:
Hi Robert,
Robert Ramey wrote:
creating types which carry their ranges around with them through expressions.
I'm curious to know how well this works in practice, i.e. how much of the time do you need run-time checking in real programs.
That makes two of us.
Here are a few examples that have occurred to me:
1. Arguments to functions
void f(safe<int> i) {
    int8_t j = i; // Needs a run-time check unless we know i < 128
}

correct.
f(3); // Run-time check happens, even though value is known.
The constructor of safe<int> doesn't do a runtime check when it's being initialized from an int. The assignment to int8_t is what requires the runtime check.
Even though the body of f is visible at the point of the call,
In general this is not true. f could well be compiled separately.
I don't think an approach based on types can avoid the run-time check unless the function args are templates:

template<typename RANGE_TYPE>
void f(RANGE_TYPE i) {
    int8_t j = i;
}

f(3);
here is a simpler way:

void f(safe<int> i) {
    int j = i; // No run-time check needed - any safe<int> value fits in an int
}

also

void f(safe<int> i) {
    auto j = i; // j is safe<int>; no run-time check needed
}
Here you can presumably arrange things so that the type of i inside f includes the range information and the run-time check can be avoided. But this does rely on the body of f being visible at the point of the call.
nope - it relies upon picking types which are large enough and carry their range information with them. The "automatic" promotion policy can do this automatically.
The verbosity of the function declaration should improve a bit with concepts, but it's not really a drop-in replacement.
This is true - used as a drop-in replacement, one will get more runtime checking.
Related to this is function return types:
safe<int> incr(safe<int> i) {
    return i + 1;
}

int8_t j = incr(42); // Run-time check or not?
No runtime check when the safe<int> argument is constructed on the stack. Yes, a runtime check when calculating the result of i + 1, since that addition could overflow. No runtime check when returning the result: C++ expression rules mean that the result of i + 1 is an int, and no check is required to construct a safe<int> from an int. In fact, probably zero overhead due to NRVO. Yes, a runtime check on the assignment int8_t j = incr(42). Unless incr is made constexpr, at which point everything moves to compile time: zero runtime checking - actually zero execution for the whole thing!
Maybe auto return types help a bit but what about
return (i%2==0) ? i/2 : 3*i+1;
what about it?
2. Merging constraints
Example:
void g(int j);
void f(safe<int> i) {
    for (int a = 0; a < 1000000; ++a) {
        g(i + a);
    }
}
Does that generate a million run-time checks?
yep (note: on machines where int is 16 bits, a itself will overflow)
If my expectation is that i will be "small", most likely I would be happy with just one check at the start of f:
assert(i < maxint - 1000000);
of course. But maybe you'd consider the following equivalent:

void f(safe_integer_range<0, maxint> i) {
    safe_integer_range<0, 1000000> a;
    for (a = 0; a < 1000000; ++a) {
        g(i + a);
    }
}

If maxint is less than 2^32 - 1000000 then no runtime checking will occur in the function. Note how this differs from the usage of assert: with assert we check during the debug build, but when we ship we throw away the seat belts. With safe types, in this case, we get zero runtime overhead and we're guaranteed not to crash at runtime. At this point we're beyond the "drop-in" replacement stage. But we're also getting safety and performance guarantees we've never been able to get before.
3. Implied constraints:
vector<S> v;
f(v); // fills v
safe<size_t> bytes = v.size() * sizeof(S);

I know that the multiplication can't overflow, because if v.size() was large enough that it would overflow then I would have run out of memory already.
Hmmm - I don't actually know that for a fact. I've always found the usage of size type confusing. I don't really know that it's not possible to allocate a vector whose total space occupies more than 2^32 bytes (or is it 2^64).
But in any case, since construction safe
So as I say, I'm curious to hear whether you've applied your safe<int> to any real applications and measured how many of the potential run-time checks are avoidable.
Only my tests and examples. I'm hoping someone will create some performance tests.
My fear is that it could be fewer than you hope, and that the checks remain inside loops rather than being hoisted outside them.
We'll just have to see.
Another reservation about this is the question of what should happen when an overflow is detected. Exceptions are not great, because you have to worry about what you'll do when you get one.
The library has the concept of an exception policy. Throwing an exception is the default, but you can select a different one or make your own. It's not hard, but you really have to spend time thinking about what you want to do.

One thing you might want to look at is the usage of the "automatic" type promotion policy (another policy - the default is "native"). When using this policy, types returned from binary expressions are almost always large enough to hold the expression results - so we're back to zero runtime checking.

Another thing you might look into is safe_..._range, which does the opposite: it attaches range information to your variables at compile time so that runtime checking can be skipped when it's known not to be necessary.

Then there is the fact that all this is constexpr friendly, which can have a huge impact on performance.
One ambitious idea is to find a way to fall back to arbitrary precision arithmetic:
template<typename INT>
void f(INT i) {
    INT i2 = i * i;
    INT j = (i2 + 4096) << 2;
    g(j % 7);
}
try {
    f<int>(n);
} catch (overflow&) {
    f<bigint>(n); // some arbitrary-precision integer type
}

How much syntactic sugar can we hide that in?
This is too ambitious for me. But you could do it with your own custom promotion policy. The requirements for such a policy are in the documentation, and you can look at how the "automatic" promotion policy does it. It specifies the return types from binary operations depending on the types of the operands, in such a way as to minimize the need for runtime checking. If you make your own policy and name it "phil" then using it would be:
template<typename T>
using safe_t = safe<T, phil>;
I recently experimented with LLVM's thread safety checking (see http://clang.llvm.org/docs/ThreadSafetyAnalysis.html ); it lets you annotate variables to indicate that locks must be held when they are accessed:
mutex m;
int i GUARDED_BY(m);
I've been impressed by that, and I wonder if attributes defining expected ranges and static analysis might be a more practical way to find potential overflows.
using safe_..._range(min, max) seems equivalent to me.

I'm now trying to make the pièce de résistance work: an error trapping policy which uses static_assert, so that for things like embedded systems one could get a compile time error anywhere something could possibly overflow - no runtime checking and no exception handling required. I'm having difficulty with this as we speak.

My expectation is to see the library used in the following manner:

a) some safety critical program has a bug which is suspected to be addressable by safe types
b) in desperation, someone just changes all the int, ... types via an alias to their safe equivalents
c) the program is compiled and run - and a bug is found. Hooray.
d) the alias is redefined back to the built-in equivalents
e) but wait - some idiot consults a lawyer. He says: well, now you know how to guarantee that the program has no bugs (remember, he's a lawyer), so you'd better make sure you include this facility in case we get sued. We have to be able to prove we did everything humanly possible to avoid bugs.
f) someone complains that the program is going to be too slow. But under pressure from the authorities they produce a benchmark, and it IS slower - though not by as much as was feared.
g) so now someone says: what would it take to make it faster? At this point some changes are made so that we're not doing as much shrinking of types. Perhaps the library's "automatic" policy is used, which expands intermediate and auto variables to be big enough to hold the results. Perhaps safe_..._range is used to specify real ranges. This eliminates the performance bottleneck. But we're no longer using safe types as "drop-in" replacements - we've created new types selected to (almost) never trap.

It's just a speculative scenario, but it's not unbelievable. I hope it can be seen that I've spent considerable time thinking about this. Now I'm afraid to get on an airplane or get an MRI. It's that bad.

Robert Ramey