Re: [boost] [review][constrained_value] Review of Constrained Value Library begins today

At 2:23 PM +0100 12/6/08, Robert Kawulak wrote:
... maybe the problem could be somehow solved if we have a function float exact(float) that, given a floating point value (that may have greater precision because of caching in a register), returns a value that is truncated (has exactly the precision of float, not greater).
I think that something along the lines of the following will likely work:

inline double exact(double x) { struct { volatile double x; } xx = { x }; return xx.x; }

The idea is to force the value to make a round trip through a memory location of the "correct" size. The use of volatile should prevent the compiler from optimizing away the trip through memory.

If worried about whether a compiler will really do what's needed here, then I think the only other option is inline assembler to produce the equivalent behavior. Looking through the x86 instruction set, I don't see any other way to cause the desired rounding.

Kim Barrett skrev:
At 2:23 PM +0100 12/6/08, Robert Kawulak wrote:
... maybe the problem could be somehow solved if we have a function float exact(float) that, given a floating point value (that may have greater precision because of caching in a register), returns a value that is truncated (has exactly the precision of float, not greater).
I think that something along the lines of the following will likely work:
inline double exact(double x) { struct { volatile double x; } xx = { x }; return xx.x; }
The idea is to force the value to make a round trip through a memory location of the "correct" size. The use of volatile should prevent the compiler from optimizing away the trip through memory.
If worried about whether a compiler will really do what's needed here, then I think the only other option is inline assembler to produce the equivalent behavior. Looking through the x86 instruction set, I don't see any other way to cause the desired rounding.
For some ABIs I think using class instead of struct will have the desired effect of passing the value on the stack. So maybe something like

template< class T > class rounder { ... };

template< class T > inline T exact( rounder<T> r ) { return r.val(); }

-Thorsten

On Tue, Dec 9, 2008 at 1:23 PM, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
Kim Barrett skrev:
At 2:23 PM +0100 12/6/08, Robert Kawulak wrote:
... maybe the problem could be somehow solved if we have a function float exact(float) that, given a floating point value (that may have greater precision because of caching in a register), returns a value that is truncated (has exactly the precision of float, not greater).
I think that something along the lines of the following will likely work:
inline double exact(double x) { struct { volatile double x; } xx = { x }; return xx.x; }
Please no. Not only might it prevent other optimizations, but it is also completely thread unsafe.
The idea is to force the value to make a round trip through a memory location of the "correct" size. The use of volatile should prevent the compiler from optimizing away the trip through memory.
If worried about whether a compiler will really do what's needed here, then I think the only other option is inline assembler to produce the equivalent behavior. Looking through the x86 instruction set, I don't see any other way to cause the desired rounding.
For some ABIs I think using class instead of struct will have the desired effect of passing the value on the stack. So maybe something like
AFAIK it is usually the presence of a constructor that determines whether a value is passed via the stack or via a register, not the struct/class keyword itself...
template< class T > class rounder { ... };
template< class T > inline T exact( rounder<T> r ) { return r.val(); }
... and if the call itself is inlined, I doubt that the object is actually forced to the stack. -- gpd

AMDG

Giovanni Piero Deretta wrote:
I think that something along the lines of the following will likely work:
inline double exact(double x) { struct { volatile double x; } xx = { x }; return xx.x; }
Please no. Not only might it prevent other optimizations, but it is also completely thread unsafe.
How so?

In Christ,
Steven Watanabe

On Tue, Dec 9, 2008 at 5:26 PM, Steven Watanabe <watanabesj@gmail.com> wrote:
AMDG
Giovanni Piero Deretta wrote:
I think that something along the lines of the following will likely work:
inline double exact(double x) { struct { volatile double x; } xx = { x }; return xx.x; }
Please no. Not only might it prevent other optimizations, but it is also completely thread unsafe.
How so?
Ignore me. Somehow I parsed that struct as 'static'. Sorry for the noise. -- gpd

At 2:40 PM +0100 12/9/08, Giovanni Piero Deretta wrote:
Kim Barrett skrev:
I think that something along the lines of the following will likely work:
inline double exact(double x) { struct { volatile double x; } xx = { x }; return xx.x; }
Please no. Not only might it prevent other optimizations,
It might, but that's kind of the point. The key problem is deciding when such a mechanism might be needed, and I regret that I don't have a simple answer for that. Obviously one doesn't want to use this at all on a platform that doesn't have extended precision registers. One also doesn't want to use this if the value is already in memory and not in one of those extended precision registers. However, I also don't see any technique not involving inline assembler that can solve the latter. Well, perhaps I do:

inline double exact(double const& x) { return *const_cast<double const volatile*>(&x); }

This should be an improvement over my earlier version, since it avoids a store if the value is already in memory. Some experiments with gcc seem to bear that out. The notion is to model the "m"(emory) argument qualifier from gcc's inline assembler syntax. Again here the use of volatile is to prevent a trip through memory from being optimized away, and experiments with gcc demonstrate that leaving off the volatile qualifier can result in the undesired optimization and failure to get the desired rounding.
but it is also completely thread unsafe.
I disagree; there is no thread safety issue here.

At 2:40 PM +0100 12/9/08, Giovanni Piero Deretta wrote:
AFAIK it is usually the presence of a constructor that determines whether a value is passed via the stack or via a register, not the struct/class keyword itself...
On Tue, Dec 9, 2008 at 1:23 PM, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
template< class T > class rounder { ... };
template< class T > inline T exact( rounder<T> r ) { return r.val(); }
... and if the call itself is inlined, I doubt that the object is actually forced to the stack.
And similarly for the constructor, if that is indeed a discriminant for passing via stack or register. Hence the need for volatile or something with similar impact.
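To make the suggestion above concrete, here is a self-contained sketch of how such a helper might be wired into a bounds check. The within_upper_bound name is purely illustrative, not anything from the library under review:

inline double exact(double const& x)
{
    // Read the value back through a volatile-qualified lvalue so the
    // compiler cannot substitute an extended-precision register copy.
    return *const_cast<double const volatile*>(&x);
}

inline bool within_upper_bound(double x, double upper)
{
    // Both operands are forced through 64-bit memory before comparing,
    // so the test sees the same values that would later be stored.
    return exact(x) <= exact(upper);
}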

Kim Barrett wrote:
At 2:23 PM +0100 12/6/08, Robert Kawulak wrote:
... maybe the problem could be somehow solved if we have a function float exact(float) that, given a floating point value (that may have greater precision because of caching in a register), returns a value that is truncated (has exactly the precision of float, not greater).
I think that something along the lines of the following will likely work:
inline double exact(double x) { struct { volatile double x; } xx = { x }; return xx.x; }
The idea is to force the value to make a round trip through a memory location of the "correct" size. The use of volatile should prevent the compiler from optimizing away the trip through memory.
The following should work:

inline double exact(double x) { double y; memcpy(&y, &x, sizeof(double)); return y; }

But I'm not sure the library should do this at all. It seems like forcing a policy upon the user. And it may be inefficient.

The root of the problem is that Intel processors may store double values in an 80-bit register. These values may later be truncated to 64 bits. However, you can force doubles to always be stored with 64-bit precision by changing the processor floating point control word. On Visual Studio, this can be done through the command

_controlfp(_PC_53, MCW_PC)

(53 = number of significand (mantissa) bits in the 64-bit format).

--Johan Råde
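For concreteness, a hedged sketch of that call in context (MSVC-only; _controlfp and the _PC_53/_MCW_PC macros come from <float.h>):

#include <float.h>

void force_53_bit_doubles()
{
    // Set x87 precision control to a 53-bit significand, so intermediate
    // double arithmetic is rounded as if every result were stored to
    // memory.  Note that this changes the floating point state for all
    // subsequent code in the thread, not just the library's comparisons,
    // and it does not narrow the exponent range of the 80-bit registers.
    _controlfp(_PC_53, _MCW_PC);
}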

Johan Råde wrote:
Kim Barrett wrote:
At 2:23 PM +0100 12/6/08, Robert Kawulak wrote:
... maybe the problem could be somehow solved if we have a function float exact(float) that, given a floating point value (that may have greater precision because of caching in a register), returns a value that is truncated (has exactly the precision of float, not greater).
I think that something along the lines of the following will likely work:
inline double exact(double x) { struct { volatile double x; } xx = { x }; return xx.x; }
The idea is to force the value to make a round trip through a memory location of the "correct" size. The use of volatile should prevent the compiler from optimizing away the trip through memory.
The following should work:
inline double exact(double x) { double y; memcpy(&y, &x, sizeof(double)); return y; }
But I'm not sure the library should do this at all. It seems like forcing a policy upon the user. And it may be inefficient.
The root of the problem is that Intel processors may store double values in an 80-bit register. These values may later be truncated to 64 bits. However, you can force doubles to always be stored with 64-bit precision by changing the processor floating point control word. On Visual Studio, this can be done through the command
_controlfp(_PC_53, MCW_PC)
(53 = number of significand (mantissa) bits in the 64-bit format).
--Johan Råde
Ideas that force the floating point value out of the register and into cache will, on average, cost more than an order of magnitude increase in time for the comparison (forcing to main memory is far, far worse). This isn't acceptable for anyone who needs performance from the comparison.

Reducing the bits used in the mantissa for the registers can be far worse. Most accuracy guarantees made for the calculations assume the existence of guard bits in the register. Without them, the calculation loses significance faster than expected. This is not acceptable for anyone who needs high precision.

In general, these are difficult and subtle problems you are approaching when you try to compare close floating point numbers. The past is littered with examples of how not to do it, and doing it well is very fiddly. It is possible to find all of the details for this, but it should only be done if someone wants to put in the time and effort to find out the best available methods. "This is probably good enough" ideas almost never are in this setting.

John

John Phillips skrev:
Ideas that force the floating point value out of the register and into cache will, on average, cost more than an order of magnitude increase in time for the comparison (forcing to main memory is far, far worse). This isn't acceptable for anyone who needs performance from the comparison.
Reducing the bits used in the mantissa for the registers can be far worse. Most accuracy guarantees made for the calculations assume the existence of guard bits in the register. Without them, the calculation loses significance faster than expected. This is not acceptable for anyone who needs high precision.
In general, these are difficult and subtle problems you are approaching when you try to compare close floating point numbers. The past is littered with examples of how not to do it, and doing it well is very fiddly. It is possible to find all of the details for this, but it should only be done if someone wants to put in the time and effort to find out the best available methods. "This is probably good enough" ideas almost never are in this setting.
Well, problem one would be to find someone who has the knowledge to find the details. John Maddock gave two examples: one where we accepted wrong values and one where we rejected correct values. I found the latter somewhat better (because the invariant of bounded_float would be preserved), and if we are able to get that behavior consistently, I would be quite happy. So maybe most of our problems will be fixed by using >= and <= instead of < and >?

-Thorsten

Thorsten Ottosen wrote:
John Maddock gave two examples: one where we accepted wrong values and one where we rejected correct values. I found the latter somewhat better (because the invariant of bounded_float would be preserved), and if we are able to get that behavior consistently, I would be quite happy.
So maybe most of our problems will be fixed by using >= and <= instead of < and >?
-Thorsten
Is the following true: if x and y satisfy x <= y (or x >= y) then they will still do so after any truncation from 80 to 64 bits? Then, as Thorsten suggests, <= and >= will never erroneously accept any value. We could then maybe implement < and > using <=, >= and the std::nextafter function, and achieve the same guarantee there. --Johan Råde
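For illustration, a sketch of that last idea, assuming C99's nextafter from <math.h> (the helper names are mine): a strict test against a finite double bound can be rewritten as a non-strict test against the adjacent representable value, so the only comparisons ever performed are the ones whose one-sided guarantee is discussed above.

#include <math.h>

// x < b  is equivalent to  x <= (largest double strictly below b),
// for finite double b and non-NaN x.
inline bool lt_bound(double x, double b)
{
    return x <= nextafter(b, -HUGE_VAL);
}

// x > b  is equivalent to  x >= (smallest double strictly above b).
inline bool gt_bound(double x, double b)
{
    return x >= nextafter(b, HUGE_VAL);
}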

On Dec 10, 2008, at 7:24 AM, Johan Råde wrote:
Thorsten Ottosen wrote:
John Maddock gave two examples: one where we accepted wrong values and one where we rejected correct values. I found the latter somewhat better (because the invariant of bounded_float would be preserved), and if we are able to get that behavior consistently, I would be quite happy. So maybe most of our problems will be fixed by using >= and <= instead of < and >? -Thorsten

Is the following true: if x and y satisfy x <= y (or x >= y) then they will still do so after any truncation from 80 to 64 bits?
Nope, as John pointed out later in his message, if the test is x<=y and x is just slightly greater than y, then x can flip either way. Epsilon doesn't help here; it just moves the problem around: the test x<=y+eps is going to have the same problem with x very close to y+eps. The only nicer thing about it is that if you are testing *strict* inequality x<y then you can be sure that it will only accept x<y if x really is less than y and will remain so after rounding (even though the predicate may later fail). This is the distinction Stjepan was making between the invariant and the test.
Then, as Thorsten suggests, <= and >= will never erroneously accept any value.
Rejecting correct values is indeed better than accepting incorrect values. It seems that you will have to be careful which predicate you choose when dealing with floats.
We could then maybe implement < and > using <=, >= and the std::nextafter function, and achieve the same guarantee there.
That's interesting. I'm really glad to see the discussion of the problems with floating point, and I'll take a stab at a few predicates implementing epsilon and forced truncation in a couple of weeks when school is out - or if anyone else wants to start this, please do. I believe that this group of people has the ability to get this problem right. Gordon

Gordon Woodhull wrote:
On Dec 10, 2008, at 7:24 AM, Johan Råde wrote:
Is the following true: if x and y satisfy x <= y (or x >= y) then they will still do so after any truncation from 80 to 64 bits?
Nope, as John pointed out later in his message, if the test is x<=y and x is just slightly greater than y, then x can flip either way.
The statement is "x <= y before truncation implies x <= y after truncation". (Note the word "implies", not "equivalent to".) The example you mention is not a counter-example to that statement. --Johan Råde

On Dec 10, 2008, at 4:16 PM, Johan Råde wrote:
Gordon Woodhull wrote:
On Dec 10, 2008, at 7:24 AM, Johan Råde wrote:
Is the following true: if x and y satisfy x <= y (or x >= y) then they will still do so after any truncation from 80 to 64 bits?

Nope, as John pointed out later in his message, if the test is x<=y and x is just slightly greater than y, then x can flip either way.
The statement is "x <= y before truncation implies x <= y after truncation". (Note the word "implies", not "equivalent to".) The example you mention is not a counter-example to that statement.
Oops, you are quite right. In other words, perhaps truncation can merge two values but it can't switch them, if I understand you this time. I hope this is the case.

From: Johan Rade
The statement is "x <= y before truncation implies x <= y after truncation".

Or more specifically: "x < y before truncation implies x <= y after truncation". Is this right?

Robert Kawulak wrote:
From: Johan Rade
The statement is "x <= y before truncation implies x <= y after truncation".

Or more specifically: "x < y before truncation implies x <= y after truncation". Is this right?
Thought about it again. None of these statements are true, unfortunately, if you consider the case where one number is truncated to 64 bits and the other is kept at 80 bits. Then you can even have x < y before and x > y after. --Johan Råde

Johan Råde skrev:
Robert Kawulak wrote:
From: Johan Rade
The statement is "x <= y before truncation implies x <= y after truncation".

Or more specifically: "x < y before truncation implies x <= y after truncation". Is this right?
Thought about it again. None of these statements are true, unfortunately, if you consider the case where one number is truncated to 64 bits and the other is kept at 80 bits. Then you can even have x < y before and x > y after.
Hm. This sucks.

Do we know whether 80-bit floating point values are guaranteed to be a superset of 64-bit and 32-bit floats (or whether 64-bit is a superset of 32-bit floats)? I suspect the answer might be yes. If so, I don't think

x < y before ==> x > y after

can be true. If x is rounded upwards, y would be an upper bound.

-Thorsten

Thorsten Ottosen wrote:
Johan Råde skrev:
Robert Kawulak wrote:
From: Johan Rade
The statement is "x <= y before truncation implies x <= y after truncation".

Or more specifically: "x < y before truncation implies x <= y after truncation". Is this right?
Thought about it again. None of these statements are true, unfortunately, if you consider the case where one number is truncated to 64 bits and the other is kept at 80 bits. Then you can even have x < y before and x > y after.
Hm. This sucks.
Do we know whether 80-bit floating point values are guaranteed to be a superset of 64-bit and 32-bit floats (or whether 64-bit is a superset of 32-bit floats)?
I suspect the answer might be yes. If so, I don't think
x < y before ==> x > y after
can be true. If x is rounded upwards, y would be an upper bound.
Imagine the following case:

1) x and y are stored in 80-bit registers
2) x < y
3) both x and y are just below 1.0, so close that they would be rounded to 1.0 if truncated to 64 bits

If now x is truncated to 64 bits, and then moved back to an 80-bit register, while y is unchanged, then x > y.

--Johan Råde
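For what it's worth, here is a small program exhibiting this scenario, assuming an x86 target where long double maps to the 80-bit extended format (true for gcc, not for MSVC, where long double is the same as double):

#include <cstdio>

int main()
{
    // Both values lie within 2^-54 of 1.0, so rounding either of them
    // to double yields exactly 1.0, yet they are distinct and ordered
    // in 80-bit extended precision.
    long double y = 1.0L - 1e-18L;
    long double x = 1.0L - 2e-18L;         // x < y in extended precision

    double xt = static_cast<double>(x);    // truncate x to 64 bits: 1.0

    std::printf("before: x < y  -> %d\n", (int)(x < y));   // prints 1
    std::printf("after:  xt > y -> %d\n", (int)(xt > y));  // prints 1
}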

AMDG Johan Råde wrote:
Imagine the following case,
1) x and y are stored in 80-bit registers
2) x < y
3) both x and y are just below 1.0, so close that they would be rounded to 1.0 if truncated to 64 bits
If now x is truncated to 64 bits, and then moved back to an 80-bit register, while y is unchanged, then x > y.
In other words, we always have to force at least one of x and y to be truncated in order to have any guarantee. If we can do that, we might as well force both values to be truncated, right? I think that so far in this discussion it has been assumed that the bounds are truncated.

In Christ,
Steven Watanabe

Steven Watanabe wrote:
AMDG
Johan Råde wrote:
Imagine the following case,
1) x and y are stored in 80-bit registers
2) x < y
3) both x and y are just below 1.0, so close that they would be rounded to 1.0 if truncated to 64 bits
If now x is truncated to 64 bits, and then moved back to an 80-bit register, while y is unchanged, then x > y.
In other words, we always have to force at least one of x and y to be truncated in order to have any guarantee. If we can do that, we might as well force both values to be truncated, right? I think that so far in this discussion it has been assumed that the bounds are truncated.
Forcing truncation may have a bad effect on performance. That will be unacceptable to some users. --Johan Råde

On Thu, Dec 11, 2008 at 9:07 AM, Johan Råde <rade@maths.lth.se> wrote:
In other words, we always have to force at least one of x and y to be truncated in order to have any guarantee. If we can do that, we might as well force both values to be truncated, right? I think that so far in this discussion it has been assumed that the bounds are truncated.
Forcing truncation may have a bad effect on performance. That will be unacceptable to some users.
I think the idea is to force truncation on the bounds when they are set in the constraint object, which for many use cases happens only once per bounded value. Stjepan

Stjepan Rajko wrote:
On Thu, Dec 11, 2008 at 9:07 AM, Johan Råde <rade@maths.lth.se> wrote:
In other words, we always have to force at least one of x and y to be truncated in order to have any guarantee. If we can do that, we might as well force both values to be truncated, right? I think that so far in this discussion it has been assumed that the bounds are truncated.

Forcing truncation may have a bad effect on performance. That will be unacceptable to some users.
I think the idea is to force truncation on the bounds when they are set in the constraint object, which for many use cases happens only once per bounded value.
Stjepan
If you want to say it a different way, he wants the process of setting bounds to save something that is exactly representable in the floating point type that is being used for the bound. Then moving it from main memory to a register that may have more bits will not affect the value. This would add a little to the cost of runtime changes to the bounds, but that should be an uncommon enough operation to make the cost acceptable. John

John Phillips wrote:
Stjepan Rajko wrote:
On Thu, Dec 11, 2008 at 9:07 AM, Johan Råde <rade@maths.lth.se> wrote:
In other words, we always have to force at least one of x and y to be truncated in order to have any guarantee. If we can do that, we might as well force both values to be truncated, right? I think that so far in this discussion it has been assumed that the bounds are truncated.

Forcing truncation may have a bad effect on performance. That will be unacceptable to some users.
I think the idea is to force truncation on the bounds when they are set in the constraint object, which for many use cases happens only once per bounded value.
Stjepan
If you want to say it a different way, he wants the process of setting bounds to save something that is exactly representable in the floating point type that is being used for the bound. Then moving it from main memory to a register that may have more bits will not affect the value. This would add a little to the cost of runtime changes to the bounds, but that should be an uncommon enough operation to make the cost acceptable.
Yes, I think that if we force truncation of the bounds, but not of the actual values, and only allow <= and >= bounds, then we should be fine. We may also consider implementing < and > bounds in terms of <= and >= bounds and the std::nextafter function. --Johan Råde
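A minimal sketch of that scheme (the class and its names are illustrative, not the library's API): the bound makes one forced round trip through 64-bit memory when the constraint is constructed, and only a non-strict comparison is used afterwards.

class upper_bound_leq
{
public:
    explicit upper_bound_leq(double b) : bound_(truncate(b)) {}

    // Non-strict test against a bound that is exactly representable
    // as a 64-bit double, per the discussion above.
    bool operator()(double x) const { return x <= bound_; }

private:
    static double truncate(double x)
    {
        // Kim Barrett's volatile round trip: force the value through a
        // 64-bit memory location once, at construction time.
        volatile double v = x;
        return v;
    }

    double bound_;  // already truncated
};

Since the truncation happens once per constraint object rather than once per comparison, the performance concern raised earlier should not apply to the common case.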

Johan Råde skrev:
Thorsten Ottosen wrote:
Johan Råde skrev:
Robert Kawulak wrote:
From: Johan Rade
The statement is "x <= y before truncation implies x <= y after truncation".

Or more specifically: "x < y before truncation implies x <= y after truncation". Is this right?
Thought about it again. None of these statements are true, unfortunately, if you consider the case where one number is truncated to 64 bits and the other is kept at 80 bits. Then you can even have x < y before and x > y after.
Hm. This sucks.
Do we know whether 80-bit floating point values are guaranteed to be a superset of 64-bit and 32-bit floats (or whether 64-bit is a superset of 32-bit floats)?
I suspect the answer might be yes. If so, I don't think
x < y before ==> x > y after
can be true. If x is rounded upwards, y would be an upper bound.
Imagine the following case,
1) x and y are stored in 80-bit registers
2) x < y
3) both x and y are just below 1.0, so close that they would be rounded to 1.0 if truncated to 64 bits
If now x is truncated to 64 bits, and then moved back to an 80-bit register, while y is unchanged, then x > y.
We need to make either x or y one of the bounds of our interval. Let us assume it is y. Furthermore, we know y has been specified such that it is representable exactly in float or double. Since the 80-bit floating point values are a superset, we know y can also be represented exactly in this format, and we know that truncation of y will have no effect. Basically, y will have the same value no matter if it is stored in a register or at some memory location.

The question is now: can

x < y

pass, and then x is assigned to a bounded_float (where it will be stored in x'), only to discover that

x' < y

no longer holds? We know y will always have the same value, so only x may switch location.

Case 1: x is in memory, x' is in a register. Then x == x' by the superset property.
Case 2: x is in a register, x' is in a register. Then x == x'.
Case 3: x is in a register, x' is in memory. Then we might have x' == y, but not x' > y, since that would violate the fact that y is exactly representable in both formats.

Have I overlooked something?

-Thorsten

2008/12/11 Thorsten Ottosen <thorsten.ottosen@dezide.com>:
We need to make either x or y one of the bounds of our interval. Let us assume it is y.
Furthermore, we know y has been specified such that it is representable exactly in float or double.
Not necessarily:

typedef bounded<float>::type bfloat;

bfloat ouch( sin(x), bfloat::constraint_type( cos(y), tan(z) ) );
ouch = log(7);

Any of the bounds or the underlying value may be extended during computations involved in both the construction and the assignment (if I understand correctly); there's no guarantee that the compiler will not optimise away storage of the values.

And actually, what is "truncation" here -- really a truncation (rounding towards 0), or rounding to the nearest representable value?

Regards,
Robert

Robert Kawulak skrev:
2008/12/11 Thorsten Ottosen <thorsten.ottosen@dezide.com>:
We need to make either x or y one of the bounds of our interval. Let us assume it is y.
Furthermore, we know y has been specified such that it is representable exactly in float or double.
Not necessarily:
typedef bounded<float>::type bfloat;
bfloat ouch( sin(x), bfloat::constraint_type( cos(y), tan(z) ) );
ouch = log(7);
Any of the bounds or the underlying value may be extended during computations involved in both the construction and the assignment (if I understand correctly); there's no guarantee that the compiler will not optimise away storage of the values.
Well, in most of my use cases the bounds are defined at compile time. As for your example, I hope we find a way to force this rounding.

And actually, what is "truncation" here -- really a truncation (rounding towards 0), or rounding to the nearest representable value?

I think it is usually rounding to nearest. -Thorsten

On Dec 11, 2008, at 11:45 AM, Thorsten Ottosen wrote:
We need to make either x or y one of the bounds of our interval. Let us assume it is y.
Furthermore, we know y has been specified such that it is representable exactly in float or double. Since the 80-bit floating point values are a superset, we know y can also be represented exactly in this format, and we know that truncation of y will have no effect.
Basically, y will have the same value no matter if it is stored in a register or at some memory location.
The question is now: can
x < y
pass, and then x is assigned to a bounded_float (where it will be stored in x'), only to discover that
x' < y
no longer holds? We know y will always have the same value, so only x may switch location.
Case 1: x is in memory, x' is in a register. Then x == x' by the superset property.
Case 2: x is in a register, x' is in a register. Then x == x'.
Case 3: x is in a register, x' is in memory. Then we might have x' == y, but not x' > y, since that would violate the fact that y is exactly representable in both formats.
Well said. It seems that a truncated bound is key. Surely it will be if it's stored as a runtime field; I am not so sure when it's one of those nice optimized-away functors.

So the next questions are:
1. how to implement
2. how to test

I will inquire of a local float enthusiast, one of my profs at NYU.

Gordon

Thorsten Ottosen wrote:
Do we know whether 80-bit floating point values are guaranteed to be a superset of 64-bit and 32-bit floats (or whether 64-bit is a superset of 32-bit floats)?
They are. --Johan Råde