Stream input and output of NaN and infinity


Paul A Bristow

15 May 2006

9:47 a.m.

Jeff Garland proposed a solution to the problem of serialization of NaN and infinity, which had been discussed at

http://article.gmane.org/gmane.comp.lib.boost.devel/141006/match=langer+nan

Although this is probably the right expedient solution which could be done now (if anyone has the time and skill - SoC?), I also feel this is a general problem that needs a Standard solution (with some impact on lexical_cast), so I have drafted a proposal for TR2 which is attached for comment.

Briefly, it is a KISS solution.

All infinities are output (and input) as one string like "Inf",

And all the various NaNs are also a single string "NaN",

And on input you get the numeric_limits<FPtype>::quiet_NaN() or infinity(),

Is the same for ALL FP types (including UDTs) and layouts.

Paul

---
Paul A Bristow
Prizet Farmhouse, Kendal, Cumbria UK LA8 8AB
+44 1539561830 & SMS, Mobile +44 7714 330204 & SMS
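As a rough illustration of the KISS scheme described in this message, the sketch below writes one fixed token for infinities and one for NaNs, and maps them back to numeric_limits values on input. The helper names write_fp and read_fp are hypothetical, the handling of a leading minus sign on "Inf" is an assumption, and the code only covers built-in floating-point types; it is not taken from the attached proposal.

```cpp
#include <cmath>
#include <istream>
#include <limits>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: write a value, using one token for every NaN and
// one (optionally signed) token for every infinity.
template <class T>
void write_fp(std::ostream& os, T x) {
    if (std::isnan(x)) {
        os << "NaN";
    } else if (std::isinf(x)) {
        os << (x < 0 ? "-Inf" : "Inf");
    } else {
        os << x;  // default stream precision; finite values may lose digits
    }
}

// Hypothetical helper: read back a whitespace-delimited token written by
// write_fp. Returns false if the token is neither special nor a number.
template <class T>
bool read_fp(std::istream& is, T& x) {
    std::string token;
    if (!(is >> token)) return false;
    if (token == "NaN")  { x = std::numeric_limits<T>::quiet_NaN(); return true; }
    if (token == "Inf")  { x = std::numeric_limits<T>::infinity();  return true; }
    if (token == "-Inf") { x = -std::numeric_limits<T>::infinity(); return true; }
    std::istringstream number(token);
    return static_cast<bool>(number >> x);
}
```

Because the tokens do not depend on the bit layout of the type, the same text can be read back into float, double or long double.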

Attachments:

NaN_infinite_IO.pdf (application/pdf — 21.2 KB)


Guillaume Melquiond

15 May

12:30 p.m.

On Monday 15 May 2006 at 10:47 +0100, Paul A Bristow wrote:

...

Jeff Garland proposed a solution to the problem of serialization of NaN and infinity, which had been discussed at

http://article.gmane.org/gmane.comp.lib.boost.devel/141006/match=langer+nan

Although this is probably the right expedient solution which could be done now (if anyone has the time and skill - SoC?), I also feel this is a general problem that needs a Standard solution (with some impact on lexical_cast), so I have drafted a proposal for TR2 which is attached for comment.

Briefly, it is a KISS solution.

All infinities are output (and input) as one string like "Inf",

And all the various NaNs are also a single string "NaN",

And on input you get the numeric_limits<FPtype>::quiet_NaN() or infinity(),

Is the same for ALL FP types (including UDTs) and layouts.

Just a few comments from the IEEE-754 point of view (I know that the C++ standard is supposed to accommodate a wider arithmetic than just IEEE-754, but still). The revision of the IEEE-754 standard happens to contain a paragraph on representing infinities and NaN. There still are motions on this paragraph and the situation should get clearer by the end of the week.

As it stands, the paragraph mainly says that representing infinities and NaN is language-defined. Some additions are being considered though. First the case should not matter, so inf or Inf or INF should not make a difference (but what case means is not clear). Second, any outputted value should be inputted correspondingly. In particular, writing a payload after NaN is allowed, e.g. the C99 syntax "nan(payload)", and hence it should be parsed till the right parenthesis. It would be a bit annoying if C++ was not able to read back C99 values.

Finally, the sign of a NaN is not part of its payload: the way it is computed or propagated is explicitly left undefined. As a consequence, I don't think there is much point in displaying plus or minus before NaN since it could have a "random" value.

Best regards,

Guillaume
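To make the case-insensitivity point concrete, here is a minimal sketch that classifies a token as an infinity or a NaN regardless of case, and accepts a C99-style "nan(payload)" form. The names SpecialKind and classify_special are hypothetical, and dropping the sign before classification is an assumption made for brevity.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

enum class SpecialKind { None, Infinity, NaN };

// Hypothetical helper: classify a token, ignoring case and an optional sign.
SpecialKind classify_special(std::string token) {
    std::transform(token.begin(), token.end(), token.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    if (!token.empty() && (token[0] == '+' || token[0] == '-')) token.erase(0, 1);
    if (token == "inf" || token == "infinity") return SpecialKind::Infinity;
    if (token == "nan") return SpecialKind::NaN;
    // "nan(...)": anything between the parentheses is treated as a payload.
    if (token.size() >= 5 && token.compare(0, 4, "nan(") == 0 && token.back() == ')')
        return SpecialKind::NaN;
    return SpecialKind::None;
}
```

With this sketch, "inf", "Inf" and "INF" all classify the same way, and "NaN(0x1234)" is accepted as a NaN.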

Paul A Bristow

3:22 p.m.

Guillaume Melquiond wrote:

...

Thanks for this helpful comment, which is much better informed than I am.

Does this mean that the

// C99 macros defined as C++ templates
template<class T> bool signbit(T x);

is undefined for a NaN?

I note that

ISO/IEC 9899:1999 (E) 7.19.6.1p8:

"A double argument representing an infinity is converted in one of the styles [-]inf or [-]infinity - which style is implementation-defined. A double argument representing a NaN is converted in one of the styles [-]nan or [-]nan(n-char-sequence)"

explicitly allows a preceding - sign.

It seems hard to justify prohibiting the - sign - but clearly its meaning is 'implementation defined'?

Similarly, the interpretation of the suffix (n-char-sequence) is too - but how is the stream expected to know when it has ended?

This sounds excessively complicated for little gain in serialization when the real need is just to indicate a 'bad' number or a 'missing' number - being able to differentiate these two would be much more useful.

Paul

---
Paul A Bristow
Prizet Farmhouse, Kendal, Cumbria UK LA8 8AB
+44 1539561830 & SMS, Mobile +44 7714 330204 & SMS

Guillaume Melquiond

5:03 p.m.

On Monday 15 May 2006 at 16:22 +0100, Paul A Bristow wrote:

...

Does this mean that the

// C99 macros defined as C++ templates
template<class T> bool signbit(T x);

is undefined for a NaN?

No, it is perfectly defined as the value of the highest bit of the binary representation. But whether or not it is actually carrying any useful information is a whole other matter. NaNs have sign bits but no sign. :)
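A small sketch of the "sign bit but no sign" distinction, assuming a C++11 or later library where std::signbit and std::copysign are available in <cmath> and an IEEE-754 style representation. The function names are hypothetical.

```cpp
#include <cmath>
#include <limits>

// Returns true when the sign bit of x is set, even if x is a NaN.
bool has_sign_bit(double x) { return std::signbit(x); }

// Builds a NaN whose sign bit is set by copying the sign of -1.0 onto it.
double nan_with_sign_bit() {
    return std::copysign(std::numeric_limits<double>::quiet_NaN(), -1.0);
}
```

The value returned by nan_with_sign_bit() reports a set sign bit, yet it still compares false against zero in both directions, so the bit carries no ordering information.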

...

I note that

ISO/IEC 9899:1999 (E) 7.19.6.1p8:

"A double argument representing an infinity is converted in one of the styles [-]inf or [-]infinity - which style is implementation-defined. A double argument representing a NaN is converted in one of the styles [-]nan or [-]nan(n-char-sequence)"

Explicitly allows a preceding - sign.

Seems like it is hard to justify prohibiting the - sign - but clearly its meaning is 'implementation defined'?

Right. My point was mainly about your sentence: "This allows for positive and negative infinities, and for both positive and negative quiet_NaN." I would simply avoid speaking of the sign for a NaN.

...

Similarly interpretation of the suffix (n-char-sequence) is too - but how is the stream expected to know when it has ended?

I may be wrong, but it seems to me the parentheses are present in the output, so the sequence trivially ends when the right parenthesis is reached. Best regards, Guillaume

Robert Ramey

3:27 p.m.

Paul A Bristow wrote:

...

...
Jeff Garland proposed a solution to the problem of serialization of NaN and infinity, which had been discussed at

http://article.gmane.org/gmane.comp.lib.boost.devel/141006/match=langer+nan

Although this is probably the right expedient solution which could be done now (if anyone has the time and skill - SoC?), I also feel this is a general problem that needs a Standard solution (with some impact on lexical_cast), so I have drafted a proposal for TR2 which is attached for comment.

Briefly, it is a KISS solution.

All infinities are output (and input) as one string like "Inf",

And all the various NaNs are also a single string "NaN",

And on input you get the numeric_limits<FPtype>::quiet_NaN() or infinity(),

As far as I'm concerned it's not simple enough. Once you admit Inf then someone is going to insist on -Inf (rightly so in my opinion). The whole idea of trying to overload a floating point number with non-floating-point values is the idea that has to go. I'm sure someone is going to chime in that this has legitimate uses but it causes lots of problems. I'm sure it's the source of a lot of software bugs. So I would propose only one thing other than a float - a NaN.

Actually, taking the idea to its logical conclusion, the best would be just to say that serialization of a NaN is undefined. The serialization library could implement this by trapping all attempts to serialize a NaN as errors. If a user wants to save/load other stuff he could define a "rich float" which would parse the float into a flag + an optional floating point value. This would "reduce" the problem to making a portable implementation of "rich float" which your proposal would address.

I doubt your proposal will be acceptable though. Those who think overloading floats with non-numeric indicators is a good idea will want more, not less, variety in the types. If the "rich float" were to be implemented, people would want to start adding their own overloads - like special indicators for pi and e.

I think we should go in the opposite direction. A float is a legitimate floating point value. A union of float and some other special non-floating point values is something else.

Robert Ramey
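The two alternatives in this message, a "rich float" made of a flag plus an optional value and a serializer that rejects non-finite values outright, might be sketched like this. The type RichFloat and the functions to_rich and require_finite are hypothetical names, not part of the serialization library.

```cpp
#include <cmath>
#include <stdexcept>

// Hypothetical "rich float": a flag plus an optional ordinary value.
struct RichFloat {
    enum Kind { Finite, PositiveInfinity, NegativeInfinity, NotANumber };
    Kind kind;
    double value;  // meaningful only when kind == Finite
};

// Split a double into the flag and, for finite inputs, the value itself.
RichFloat to_rich(double x) {
    if (std::isnan(x)) return {RichFloat::NotANumber, 0.0};
    if (std::isinf(x))
        return {x > 0 ? RichFloat::PositiveInfinity : RichFloat::NegativeInfinity, 0.0};
    return {RichFloat::Finite, x};
}

// The stricter alternative: refuse to pass on anything that is not finite.
double require_finite(double x) {
    if (!std::isfinite(x))
        throw std::domain_error("cannot serialize a non-finite value");
    return x;
}
```

An archive could then store the flag as a small integer and write the value only for the Finite case, which keeps the special cases out of the numeric text format.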

David Abrahams

4:27 p.m.

"Robert Ramey" <ramey@rrsd.com> writes:

...

I think we should go in the opposite direction. A float is a legitimate floating point value. A union of float and some other special non-floating point values is something else.

Notwithstanding the fact that NaN is "not a number," it is a legitimate floating-point value, i.e. a legitimate value for the type float. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Robert Ramey

6:14 p.m.

David Abrahams wrote:

...

"Robert Ramey" <ramey@rrsd.com> writes:

...
I think we should go in the opposite direction. A float is a legitimate floating point value. A union of float and some other special non-floating point values is something else.

Notwithstanding the fact that NaN is "not a number," it is a legitimate floating-point value, i.e. a legitimate value for the type float.

I guess it depends what one means by "legitimate". It is certainly "legal" in C++ - no question about that fact. It's certainly not a number - no question about that either. The problem is that C++ makes float a union of two different things and that is what creates all the problems we have with it. C++ should correct this by making the result of arithmetic operations which result in NaNs either undefined or throw exceptions. Oh I know that this would break a lot of existing code. I would argue that such code is already broken anyway - it just looks like it's working. Robert Ramey

Peter Dimov

6:27 p.m.

Robert Ramey wrote:

...

David Abrahams wrote:

...
"Robert Ramey" <ramey@rrsd.com> writes:

...
I think we should go in the opposite direction. A float is a legitimate floating point value. A union of float and some other special non-floating point values is something else.

Notwithstanding the fact that NaN is "not a number," it is a legitimate floating-point value, i.e. a legitimate value for the type float.

I guess it depends what one means by "legitimate". It is certainly "legal" in C++ - no question about that fact. It's certainly not a number - no question about that either.

It depends. Where do you draw the line? Is inf a number? Is -0.0 a number? You have to have NaN if you want to be able to represent x/y as a float.

Robert Ramey

7:10 p.m.

Peter Dimov wrote:

...

It depends. Where do you draw the line? Is inf a number? Is -0.0 a number? You have to have NaN if you want to be able to represent x/y as a float.

That's the problem. x/y is not a valid operation if y is equal to 0. So it can't be represented as a number. The fact that C++ permits such an operation makes C++ different than arithmetic. The fact that C++ uses operators like "/" and defines them similar to - but not identical to - the way they are defined by standard arithmetic is the source of all these problems. I say that C++ should be changed so that the floats and operators which apply to them should implement what people expect from arithmetic operators. Robert Ramey

David Abrahams

7:29 p.m.

"Robert Ramey" <ramey@rrsd.com> writes:

...

Peter Dimov wrote:

...
It depends. Where do you draw the line? Is inf a number? Is -0.0 a number? You have to have NaN if you want to be able to represent x/y as a float.

That's the problem. x/y is not a valid operation if y is equal to 0. So it can't be represented as a number.

The fact that C++ permits such an operation makes C++ different than arithmetic. The fact that C++ uses operators like "/" and defines them similar to - but not identical to - the way they are defined by standard arithmetic is the source of all these problems.

Well, you'd better come up with a new number representation, then.

assert(2/3*3 == 2); // boom
assert(2.0/3.0*3.0 == 2.0); // boom

...

I say that C++ should be changed so that the floats and operators which apply to them should implement what people expect from arithmetic operators.

That would be great, wouldn't it, LOL? (there, I get to use that one this time)

There are two kinds of programmers in the world: those who expect floating values to behave like idealized Real numbers, and those whose programs using floating values actually work. In fact, among those whose programs work, C++ floats and operators that apply to them *do* implement what people expect from arithmetic operators. The only way they could get more predictable for these people would be to mandate that they be IEEE standard floats (which do include NaN, FWIW). C++ floating types fit solidly into a long tradition of floating point numbers as supported in many other languages.

I just want to close by saying that the domain of floating point isn't really amenable to glib pronouncements about what is "broken." If you need more evidence of that, read http://docs.sun.com/source/806-3568/ncg_goldberg.html

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

Guillaume Melquiond

7:30 p.m.

On Monday 15 May 2006 at 12:10 -0700, Robert Ramey wrote:

...

Peter Dimov wrote:

...
It depends. Where do you draw the line? Is inf a number? Is -0.0 a number? You have to have NaN if you want to be able to represent x/y as a float.

That's the problem. x/y is not a valid operation if y is equal to 0. So it can't be represented as a number.

The fact that C++ permits such an operation makes C++ different than arithmetic. The fact that C++ uses operators like "/" and defines them similar to - but not identical to - the way they are defined by standard arithmetic is the source of all these problems. I say that C++ should be changed so that the floats and operators which apply to them should implement what people expect from arithmetic operators.

"Standard arithmetic" defines infinitely precise operators on real numbers. You cannot be seriously suggesting that C++ should stop using floating-point numbers and switch to such an arithmetic. The C++ float type is a *floating-point* type and as such it obeys the rules of *floating-point* arithmetic. Whatever your expectations, you cannot do anything about the fact that floating-point numbers have a limited range and a limited precision. Because of those, they are unable to obey the rules of a standard arithmetic (whatever it is). Best regards, Guillaume

Robert Ramey

16 May

2:49 a.m.

Guillaume Melquiond wrote:

...

On Monday 15 May 2006 at 12:10 -0700, Robert Ramey wrote:

...
Peter Dimov wrote:

...
It depends. Where do you draw the line? Is inf a number? Is -0.0 a number? You have to have NaN if you want to be able to represent x/y as a float.

That's the problem. x/y is not a valid operation if y is equal to 0. So it can't be represented as a number.

The fact that C++ permits such an operation makes C++ different than arithmetic. The fact that C++ uses operators like "/" and defines them similar to - but not identical to - the way they are defined by standard arithmetic is the source of all these problems. I say that C++ should be changed so that the floats and operators which apply to them should implement what people expect from arithmetic operators.

"Standard arithmetic" defines infinitely precise operators on real numbers. You cannot be seriously suggesting that C++ should stop using floating-point numbers and switch to such an arithmetic.

of course not.

...

The C++ float type is a *floating-point* type and as such it obeys the rules of *floating-point* arithmetic. Whatever your expectations, you cannot do anything about the fact that floating-point numbers have a limited range and a limited precision. Because of those, they are unable to obey the rules of a standard arithmetic (whatever it is).

I don't expect undefined operations to produce results. That's what most C++ implementations do. I expect that - as in "standard arithmetic" I get some sort of exception when I invoke an undefined operation. That's what my desk calculator does and it seems reasonable to me. Much better than just returning a bogus value to be used in the next operation. Robert Ramey
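The "desk calculator" behaviour asked for here can be approximated in user code without changing the language. The following is a minimal sketch with a hypothetical function name; it checks the divisor and the result explicitly rather than relying on any hardware trap.

```cpp
#include <cmath>
#include <stdexcept>

// Hypothetical wrapper: divide, but report problems as C++ exceptions
// instead of returning an infinity or a NaN.
double checked_divide(double x, double y) {
    if (y == 0.0) throw std::domain_error("division by zero");
    const double q = x / y;
    // Also catches overflow and NaN operands, not just a zero divisor.
    if (!std::isfinite(q)) throw std::range_error("result is not a finite number");
    return q;
}
```

A caller that wants the stricter behaviour uses checked_divide(x, y) in place of x / y and handles the exceptions at whatever level makes sense for the program.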

Martin Bonner

15 May

7:47 p.m.

Robert Ramey wrote:

...

Peter Dimov wrote:

...
It depends. Where do you draw the line? Is inf a number? Is -0.0 a number? You have to have NaN if you want to be able to represent x/y as a float.

...

That's the problem. x/y is not a valid operation if y is equal to 0. So it can't be represented as a number.

...

The fact that C++ permits such an operation makes C++ different than arithmetic.

"Different than SOME arithmetics". The arithmetic C++ (well, actually IEEE) defines is a perfectly sensible arithmetic.

...

The fact that C++ uses operators like "/" and defines them similar to - but not identical to - the way they are defined by standard arithmetic

Granted. It isn't the standard arithmetic.

...

is the source of all these problems.

What problems? (Apart from the extra work involved for authors of libraries like serialization I mean!)

...

I say that C++ should be changed so that the floats and operators which apply to them should implement what people expect from arithmetic operators.

Yes, but realistically that is never going to happen. Wouldn't it be better to support the use-cases that people have in real code? -- Martin Bonner Pi Technology, Milton Hall, Ely Road, Milton, Cambridge, CB4 6WZ +44 1223 203894

Robert Ramey

16 May

2:43 a.m.

Martin Bonner wrote:

...

Robert Ramey wrote:

...
Peter Dimov wrote:

...
It depends. Where do you draw the line? Is inf a number? Is -0.0 a number? You have to have NaN if you want to be able to represent x/y as a float.

...
That's the problem. x/y is not a valid operation if y is equal to 0. So it can't be represented as a number.

...
The fact that C++ permits such an operation makes C++ different than arithmetic.

"Different than SOME arithmetics". The arithmetic C++ (well, actually IEEE) defines is a perfectly sensible arithmetic.

...
The fact that C++ uses operators like "/" and defines them similar to - but not identical to - the way they are defined by standard arithmetic

Granted. It isn't the standard arithmetic.

...
is the source of all these problems.

What problems? (Apart from the extra work involved for authors of libraries like serialization I mean!)

Suppose that z = x * y generates a NaN or +Inf or whatever on some machine for some x and y. Now z contains an undefined value which is used in some other operations which presumably result in other types of NaNs. This behavior has the following problems:

a) it's undefined

b) it varies from machine to machine. On some machines the hardware will trap and abort the program, as it won't throw a C++ exception. Other machines will store some variety of NaN in the result

c) if it doesn't trap we're just massaging undefined values.

d) we're getting some useless result but don't know it until maybe later or maybe never.

How can any "real program" find this useful? How can such a program not be "broken"? I suppose there might be some case where it's OK but they would have to be special in some way.

...

...
I say that C++ should be changed so that the floats and operators which apply to them should implement what people expect from arithmetic operators.

...

Yes, but realistically that is never going to happen.

LOL - it seems you're right there. But that is not my point here.

...

Wouldn't it be better to support the use-cases that people have in real code?

Well, that's what we have to do. But it raises the issue of whether it's worth spending time to support programs that are most likely broken in any case. Robert Ramey

John Maddock

9:47 a.m.

Robert Ramey wrote:

...

...
Granted. It isn't the standard arithmetic.

...
is the source of all these problems.

What problems? (Apart from the extra work involved for authors of libraries like serialization I mean!)

Suppose that z = x * y generates a NaN or +Inf or whatever on some machine for some x and y. Now z contains an undefined value which is used in some other operations which presumably result in other types of NaNs. This behavior has the following problems:

a) it's undefined

b) it varies from machine to machine. On some machines the hardware will trap and abort the program, as it won't throw a C++ exception. Other machines will store some variety of NaN in the result

c) if it doesn't trap we're just massaging undefined values.

d) we're getting some useless result but don't know it until maybe later or maybe never.

How can any "real program" find this useful? How can such a program not be "broken"? I suppose there might be some case where it's OK but they would have to be special in some way.

The usual use case is this: suppose you have an algorithm with a "fast-but-possibly-fragile" implementation, and a "slow-but-never-breaks" implementation. It's reasonably common to try the "fast" version first and then revert to the slow-and-possibly-inaccurate-but-stable version only if you need to: because you get an infinity or NaN from the "fast" version for example. Sometimes the difference in performance between the two is *huge*, and if the fallback version has to resort to using logs for calculation (just as one example), then it may be much less accurate, as well as slower.

Now I admit that personally I've never *quite* needed to serialise intermediate results of calculations, but I've come close a few times, and for very long running algorithms I can certainly see the need for checkpointing / save and restore, so that programs can be stopped and started without returning to the starting post every time. Think large scale Monte Carlo simulations for example, or even distributed applications like SETI@home and its drug-discovery copycats. Currently the problem with serialising NaNs and infinities, not to mention the regular stream IO bug that's been discussed around here, makes this far more error prone than it should be.

In other words there is a problem looking for a solution here: that doesn't mean that you should be expected to just go off and solve it based on someone's whim, but the problem does need looking at and addressing by someone nonetheless.

John.
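The fast-then-fallback pattern described in this message can be illustrated with a two-dimensional norm, chosen here only as a stand-in example. The fast version can overflow to infinity in the intermediate squares; the fallback rescales first. All three function names are hypothetical, and the sketch does not try to handle inputs that are themselves infinite or NaN.

```cpp
#include <algorithm>
#include <cmath>

// Fast path: may overflow to infinity even when the true result is finite.
double norm_fast(double x, double y) { return std::sqrt(x * x + y * y); }

// Slow path: rescales first so the intermediate values stay in range.
double norm_stable(double x, double y) {
    const double a = std::fabs(x), b = std::fabs(y);
    const double hi = std::max(a, b), lo = std::min(a, b);
    if (hi == 0.0) return 0.0;
    const double r = lo / hi;
    return hi * std::sqrt(1.0 + r * r);
}

// Try the fast version and fall back only when it yields infinity or NaN.
double norm(double x, double y) {
    const double fast = norm_fast(x, y);
    return std::isfinite(fast) ? fast : norm_stable(x, y);
}
```

The infinity or NaN from the fast path is used purely as a signal here, which is the kind of legitimate use the message has in mind.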

Robert Ramey

3:40 p.m.

John Maddock wrote:

...

The usual use case is this: suppose you have an algorithm with a "fast-but-possibly-fragile" implementation, and a "slow-but-never-breaks" implementation. It's reasonably common to try the "fast" version first and then revert to the slow-and-possibly-inaccurate-but-stable version only if you need to: because you get an infinity or NaN from the "fast" version for example. Sometimes the difference in performance between the two is *huge*, and if the fallback version has to resort to using logs for calculation (just as one example), then it may be much less accurate, as well as slower.

Actually I can sympathize with this scenario - that was my motivation for the "safe_float" example.

...

Currently the problem with serialising NaN's and Infinities, not to mention the regular stream IO bug that's been discussed around here, makes this far more error prone than it should be.

Technically, this isn't a serialization issue. It only comes up there because text based archives use stream IO where the issue crops up. That is, the problem pops up when one tries to load a NaN of some sort from a text stream. Stream I/O fails. The serialization system just recognizes that fact and punts with an exception. So if one wants to address this the best way would be to fix the underlying stream I/O.

...

In other words there is a problem looking for a solution here: that doesn't mean that you should be expected to just go off and solve it based on someone's whim, but the problem does need looking at and addressing by someone nonetheless.

No dispute there - the question is what is the correct solution. Robert Ramey

John Maddock

9:27 a.m.

Robert Ramey wrote:

...

Peter Dimov wrote:

...
It depends. Where do you draw the line? Is inf a number? Is -0.0 a number? You have to have NaN if you want to be able to represent x/y as a float.

That's the problem. x/y is not a valid operation if y is equal to 0. So it can't be represented as a number.

The fact that C++ permits such an operation makes C++ different than arithmetic. The fact that C++ uses operators like "/" and defines them similar to - but not identical to - the way they are defined by standard arithmetic is the source of all these problems. I say that C++ should be changed so that the floats and operators which apply to them should implement what people expect from arithmetic operators.

Hold on a second, C++ implements what the IEEE-754 standard requires FP arithmetic to do, and infinities and NaN are definitely part of that standard. I assure you that they do have legitimate uses, but more to the point, it's not only divide by zero that generates infinities, heck even addition (or subtraction) can generate infinities if push comes to shove. John.
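That addition alone can produce an infinity is easy to demonstrate, assuming IEEE-754 double arithmetic with the default rounding mode. The function name below is hypothetical.

```cpp
#include <limits>

// Adds the largest finite double to itself; under IEEE-754 the sum
// overflows and the result is positive infinity.
double overflow_by_addition() {
    volatile double big = std::numeric_limits<double>::max();
    volatile double sum = big + big;  // volatile: keep the compiler from folding it
    return sum;
}
```

No division is involved, so a rule that only guards against a zero divisor would not catch this case.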

Robert Ramey

4:09 p.m.

John Maddock wrote:

...

Hold on a second, C++ implements what the IEEE-754 standard requires FP arithmetic to do, and infinities and NaN are definitely part of that standard.

Hmmm - I want to clarify this. (The quoted text is from http://docs.sun.com/source/806-3568/ncg_goldberg.html)

a) IEEE-754 does specify infinities and NaN. C++ doesn't - C++ leaves it undefined and up to the implementation.

b) IEEE-754 specifies flags to inquire as to the result of the last floating point operation. C++ doesn't specify anything about this.

c) "The IEEE standard strongly recommends that implementations allow trap handlers to be installed." C++ doesn't permit this. On the other hand, the C++ standard library does support throwing exceptions from functions sine(x) for domain errors and such. So there's a mismatch here.

d) "Another ambiguity in most language definitions concerns what happens on overflow, underflow and other exceptions. The IEEE standard precisely specifies the behavior of exceptions, and so languages that use the standard as a model can avoid any ambiguity on this point." But C++ doesn't permit exceptions to be thrown in these instances.

...

I assure you that they do have legitimate uses,

Actually, the paper cited above gives a good example of such a legitimate use. I believe such uses are far less common than people seem to think. Even the example cited in the paper wouldn't require saving and recovering from a text stream.

...

but more to the point, it's not only divide by zero that generates infinities, heck even addition (or subtraction) can generate infinities if push comes to shove.

Of course, I'm just using divide by zero as one of the most common cases. But it occurs in other cases. So C++ is out of sync with IEEE-754. I see two ways to put it into sync.

a) define what C++ should do for currently undefined operations like divide by zero and the others.

b) require that C++ implementations throw exceptions when undefined operations are invoked.

Until one of the above (or maybe something else) is done, there can really be no unambiguous resolution to the problem of passing results from undefined operations from one machine to another. Obviously, I believe that the adoption of b) above would result in fewer programs with hidden bugs.

Robert Ramey

David Abrahams

17 May

3:42 a.m.

"Robert Ramey" <ramey@rrsd.com> writes:

...

John Maddock wrote:

...
Hold on a second, C++ implements what the IEEE-754 standard requires FP arithmetic to do,

Well, not quite. Most C++ implementations do implement IEEE-754.

...

...
and infinities and NaN are definitely part of that standard.

Hmmm - I want to clarify this. (The quoted text is from http://docs.sun.com/source/806-3568/ncg_goldberg.html)

Well, no offense, but you did a very bad job of clarifying. It seems like you went to an authoritative document and tried to use it to support a number of incorrect conclusions. A casual reader might be fooled into thinking that this document backs up your argument, but it doesn't.

...

a) IEEE-754 does specify infinities and NaN

Correct.

...

C++ doesn't - C++ leaves it undefined and up to the implementation.

No and yes. C++ does not leave NaN and infinity undefined, but whether to supply them is up to the implementation... just like whether to supply int with value > 32767 is up to the implementation.
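The "up to the implementation" part is something a program can query through std::numeric_limits, in the same way it can query the range of int. A minimal sketch, with a hypothetical function name:

```cpp
#include <limits>

// True when type T can hold both a quiet NaN and an infinity on this
// implementation. For types without these values the answer is false.
template <class T>
bool can_hold_specials() {
    return std::numeric_limits<T>::has_quiet_NaN &&
           std::numeric_limits<T>::has_infinity;
}

// The int analogy from the message: only values up to 32767 are guaranteed.
static_assert(std::numeric_limits<int>::max() >= 32767,
              "guaranteed minimum range of int");
```

Code that serializes special values could use such a check, together with numeric_limits<T>::is_iec559, to decide at compile time or run time which cases it needs to handle.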

...

b) IEEE-754 specifies flags to inquire as to the result of the last floating point operation. C++ doesn't specify anything about this.

Correct, it does not specify.

...

c) "The IEEE standard strongly recommends that implementations allow trap handlers to be installed." C++ doesn't permit this.

Incorrect. C++ absolutely does permit implementations to allow trap handlers to be installed. C++ simply does not require it.

...

On the other hand, the C++ standard library does support throwing exceptions from functions sine(x) for domain errors and such.

That's not an "on the other hand," it's a case of identical treatment! C++ supports the installation of trap handlers to precisely the same degree that it supports throwing exceptions from sin(x): both are 100% up to the implementation.

...

So there's a mismatch here.

Incorrect. There's no mismatch; there's perfect consistency, as noted above.

...

d) "Another ambiguity in most language definitions concerns what happens on overflow, underflow and other exceptions. The IEEE standard precisely specifies the behavior of exceptions, and so languages that use the standard as a model can avoid any ambiguity on this point. " But C++ doesn't permit exceptions to be thrown in these instances.

Incorrect. Exceptions can be thrown anywhere that undefined behavior is specified. Overflow, underflow, and divide-by-zero all induce undefined behavior.

...

...
I assure you that they do have legitimate uses,

Actually, the paper cited above gives a good example of such a legitimate use. I believe such uses are far less common than people seem to think. Even the example cited in the paper wouldn't require saving and recovering from a text stream.

...
but more to the point, it's not only divide by zero that generates infinities, heck even addition (or subtraction) can generate infinities if push comes to shove.

Of course, I'm just using divide by zero as one of the most common cases. But it occurs in other cases.

So C++ is out of sync with IEEE-754.

It's not out of sync, and IEEE-754 is much better supported than any number of other optional features of the language. Explicit provisions are made for it in the standard.

...

I see two ways to put it into sync.

a) define what C++ should do for currently undefined operations like divide by zero and the others.

b) require that C++ implementations throw exceptions when undefined operations are invoked.

b) is just a special case of a). I will agree that eliminating undefined floating point behaviors will make C++ more predictable.

...

Until one of the above (or maybe something else) is done, there can really be no unambiguous resolution to the problem of passing results from undefined operations from one machine to another.

Of course there can be. All you need to do is write a specification for it that describes what happens in all cases, and it will be unambiguous. If you can do this for ints that have nonportable values greater than 32767, you can do it for floats and doubles, too.

...

Obviously, I believe that the adoption of b) above would result in fewer programs with hidden bugs.

That's almost certainly wrong. Floating point divide-by-zero is almost never due to a program bug. And you can get the same effects when dividing by a nonzero number if the result can't be represented. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Robert Ramey

5:13 a.m.

David Abrahams wrote:

...

"Robert Ramey" <ramey@rrsd.com> writes:

...

...
c) "The IEEE standard strongly recommends that implementations allow trap handlers to be installed." C++ doesn't permit this.

Incorrect. C++ absolutely does permit implementations to allow trap handlers to be installed. C++ simply does not require it.

My basis for citing this is page 122 of "The C++ Programming Language" by Stroustrup, copyright 2000, reprinted May 2003 with corrections. It says, "In particular, underflow, overflow and division by zero do not throw standard exceptions". If that's wrong, incomplete or out of date, I would be curious to know about it. It seems to comport with my personal experience with C++ numeric operations. Feel free to expand upon this.

...

...
d) "Another ambiguity in most language definitions concerns what happens on overflow, underflow and other exceptions. The IEEE standard precisely specifies the behavior of exceptions, and so languages that use the standard as a model can avoid any ambiguity on this point. " But C++ doesn't permit exceptions to be thrown in these instances.

Incorrect. Exceptions can be thrown anywhere that undefined behavior is specified. Overflow, underflow, and divide-by-zero all induce undefined behavior.

Hmmm, I suppose that anything can happen when undefined behavior is specified. So writing a program that depends upon an undefined operation yielding a NaN would be a bad idea - wouldn't it?

...

b) is just a special case of a). I will agree that eliminating undefined floating point behaviors will make C++ more predictable.

...

...
Until one of the above (or maybe something else) is done, there can really be no unambiguous resolution to the problem of passing results from undefined operations from one machine to another.

Of course there can be. All you need to do is write a specification for it that describes what happens in all cases, and it will be unambiguous. If you can do this for ints that have nonportable values greater than 32767, you can do it for floats and doubles, too.

Besides writing such a specification, wouldn't C++ vendors have to agree to implement it?

...

...
Obviously, I believe that the adoption of b) above would result in fewer programs with hidden bugs.

...

That's almost certainly wrong. Floating point divide-by-zero is almost never due to a program bug. And you can get the same effects when dividing by a nonzero number if the result can't be represented.

The kind of situation I'm thinking of is more like the following. I've got a program which among its operations is a matrix inversion. The program correctly implements the chosen algorithm. Now I load a near-singular matrix and invoke the matrix inversion operation. The sequence of operations results in overflows or underflows in some intermediate results. No exception is thrown but some NaNs are propagated through the calculations. The final result matrix may or may not have one or more NaNs. So now I have a result that is wrong but do not know it and have no way of knowing it. In FORTRAN this was never a problem as the program aborts at the first overflow/underflow or whatever. What am I expected to do here? I could recode the matrix inversion to check each intermediate result to see if it's a NaN? I can't imagine that's what I'm expected to do. How do people handle this now? Robert Ramey

David Abrahams

1:54 p.m.

"Robert Ramey" <ramey@rrsd.com> writes:

...

David Abrahams wrote:

...
"Robert Ramey" <ramey@rrsd.com> writes:

...
...
c) "The IEEE standard strongly recommends that implementations allow trap handlers to be installed." C++ doesn't permit this.

Incorrect. C++ absolutely does permit implementations to allow trap handlers to be installed. C++ simply does not require it.

My basis for citing this is page 122 of "The C++ Programming Language" by Stroustrup, copyright 2000, reprinted May 2003 with corrections. It says, "In particular, underflow, overflow and division by zero do not throw standard exceptions". If that's wrong, incomplete or out of date, I would be curious to know about it.

What do you want me to tell you about it? B.S. was probably writing colloquially, as in "there is no guarantee that the implementation will throw a standard exception."

...

It seems to comport with my personal experience with C++ numeric operations.

That means nothing about what implementations are allowed to do, and I'm pretty sure I can set VC++ up to throw a C++ exception in these cases. Yep, there it is: http://tinyurl.com/9my88 (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dv_vstechar...)

...

Feel free to expand upon this.

No, I'm not gonna expand upon it. Don't make me do the legwork; get a copy of the standard and read what it says. Describing to you what's plainly written in the standard until you believe me is a big time and bandwidth waster.

...

...
...
d) "Another ambiguity in most language definitions concerns what happens on overflow, underflow and other exceptions. The IEEE standard precisely specifies the behavior of exceptions, and so languages that use the standard as a model can avoid any ambiguity on this point. " But C++ doesn't permit exceptions to be thrown in these instances.

Incorrect. Exceptions can be thrown anywhere that undefined behavior is specified. Overflow, underflow, and divide-by-zero all induce undefined behavior.

Hmmm, I suppose that anything can happen when undefined behavior is specified.

Yes, that's intentional latitude for the implementation to be able to behave gracefully or not, as the situation demands.

...

So writing a program that depends upon an undefined operation yielding a NaN would be a bad idea - wouldn't it?

No. If your implementation says it is IEEE-754 compliant, as many are, it's a perfectly good idea. Just like it's a perfectly good idea to depend on pthreads if you know your implementation is on POSIX, or to depend on the presence of 64-bit integers if your implementation tells you it supports long long, or...

...

...
b) is just a special case of a). I will agree that eliminating undefined floating point behaviors will make C++ more predictable.

...
...
Until one of the above (or maybe something else) is done, there can really be no unambiguous resolution to the problem of passing results from undefined operations from one machine to another.

Of course there can be. All you need to do is write a specification for it that describes what happens in all cases, and it will be unambiguous. If you can do this for ints that have nonportable values greater than 32767, you can do it for floats and doubles, too.

Besides writing such a specification, wouldn't C++ vendors have to agree to implement it?

No, we're talking about a specification for serialization. That's not the job of the C++ vendor. The C++ vendor already tells you everything you need to know, e.g. "I support quiet NaN (or not)" and "here's how to produce a quiet NaN if I support them," etc.
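A serialization rule of the kind discussed here could be sketched roughly as follows, in the spirit of the "NaN"/"Inf" proposal at the top of the thread. The function names `write_double` and `read_double`, the exact token spellings, and the "-Inf" form for negative infinity are assumptions made for illustration, not the proposal's wording.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical writer: every NaN becomes "NaN", infinities "Inf" or "-Inf".
std::string write_double(double x) {
    if (std::isnan(x)) return "NaN";
    if (std::isinf(x)) return x < 0 ? "-Inf" : "Inf";
    std::ostringstream os;
    os.precision(std::numeric_limits<double>::max_digits10);  // round-trip digits
    os << x;
    return os.str();
}

// Hypothetical reader: returns false if the token cannot be parsed or if the
// reading implementation cannot represent the requested special value.
bool read_double(const std::string& s, double& out) {
    typedef std::numeric_limits<double> lim;
    if (s == "NaN") {
        if (!lim::has_quiet_NaN) return false;
        out = lim::quiet_NaN();
        return true;
    }
    if (s == "Inf" || s == "-Inf") {
        if (!lim::has_infinity) return false;
        out = (s[0] == '-') ? -lim::infinity() : lim::infinity();
        return true;
    }
    std::istringstream is(s);
    return static_cast<bool>(is >> out) && is.eof();
}

// Self-check: a finite value survives a write/read round trip.
inline void check_round_trip() {
    double back = 0.0;
    assert(read_double(write_double(0.1), back) && back == 0.1);
}
```

The reader relies only on what the implementation reports through `numeric_limits`, which is the point being made above: the vendor does not need to implement anything new.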

...

...
...
Obviously, I believe that the adoption of b) above would result in fewer programs with hidden bugs.

...
That's almost certainly wrong. Floating point divide-by-zero is almost never due to a program bug. And you can get the same effects when dividing by a nonzero number if the result can't be represented.

The kind of situation I'm thinking of is more like the following. I've got a program which among its operations is a matrix inversion. The program correctly implements the chosen algorithm. Now I load a near-singular matrix and invoke the matrix inversion operation. The sequence of operations results in overflows or underflows in some intermediate results. No exception is thrown but some NaNs are propagated through the calculations. The final result matrix may or may not have one or more NaNs. So now I have a result that is wrong but do not know it and have no way of knowing it.

If the implementation supports NaNs, of course you do. Check to see if there are NaNs in the matrix. This is no different from a calculation on ints that may have produced intermediate values greater than 32767, except that the condition is easier to detect because NaNs are sticky. In this regard, the usual implementation of floating point math is much less error-prone than the usual implementation of integer math.
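The check described above might look like the following sketch, which scans a result once instead of testing every intermediate value. The `matrix` typedef and the function names are hypothetical stand-ins for whatever matrix type the program actually uses; `std::isfinite` and `std::nan` assume C++11.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical dense matrix for illustration: a vector of rows.
typedef std::vector<std::vector<double> > matrix;

// Counts the elements that are NaN or infinite.
std::size_t count_non_finite(const matrix& m) {
    std::size_t n = 0;
    for (const std::vector<double>& row : m)
        for (double x : row)
            if (!std::isfinite(x)) ++n;
    return n;
}

// Hypothetical post-condition check to run once after an inversion.
bool result_is_usable(const matrix& m) {
    return count_non_finite(m) == 0;
}

// Self-check: NaNs are sticky, so further arithmetic keeps yielding NaN.
inline void check_nan_is_sticky() {
    double x = std::nan("");
    assert(std::isnan(x * 0.0 + 1.0));
}
```

Because NaNs propagate, a single pass over the output is enough to learn that something went wrong somewhere in the computation.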

...

In FORTRAN this was never a problem as the program aborts at the first overflow/underflow or whatever.

I guarantee you the FORTRAN spec doesn't say that the program will abort upon "overflow/underflow or whatever." FORTRAN was probably the first language to ever implement IEEE-754. Just google for "fortran nan" and you'll see what I mean.

...

What am I expected to do here? I could recode the matrix inversion to check each intermediate result to see if its a NaN?

No, NaNs propagate into every calculation they touch, so if you got them in the matrix, you'll see them in the output. And if you multiply that matrix by a vector, the resulting non-NaN elements will still be meaningful (provided the original matrix was well-conditioned, which is a whole other matter).

...

I can't imagine that's what I'm expected to do. How do people handle this now?

I am not a numerics expert, but I know enough to understand that they do handle it in predictable ways, and that doing so is important to them. Why don't you do a little research yourself? I'm sure a few well-aimed web searches will yield a wealth of information. -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams

15 May 15 May

7:01 p.m.

"Robert Ramey" <ramey@rrsd.com> writes:

...

David Abrahams wrote:

...
"Robert Ramey" <ramey@rrsd.com> writes:

...
I think we should go in the opposite direction. A float is a legitimate floating point value. A union of float and some other special non-floating point values is something else.

Notwithstanding the fact that NaN is "not a number," it is a legitimate floating-point value, i.e. a legitimate value for the type float.

I guess it depends what one means by "legitimate". It is certainly "legal" in C++ - no question about that fact. It's certainly not a number - no question about that either. The problem is that C++ makes float a union of two different things

Not any more than a pointer is a "union of two different things." There are quite a few operations on pointers that suddenly become verboten when the pointer is null, and even more such operations when the pointer is dangling.

...

and that is what creates all the problems we have with it. C++ should correct this by making the result of arithmetic operations which result in NaNs either undefined or throw exceptions. Oh I know that this would break a lot of existing code.

Not only break, but break unfixably. There are important numerical applications where large collections of individual floating values (e.g. vectors and matrices) have to be computed in parallel, and one or more NaNs in the result do not render the whole calculation useless. Besides all that, eliminating NaNs from the serialization problem space is really impossible. Even if we struck them from the C++ language, someone would effectively recreate them using optional<float> or the equivalent.

...

I would argue that such code is already broken anyway - it just looks like it's working.

Really, you still think so? If so, I'd like to know how you measure "brokenness." -- Dave Abrahams Boost Consulting www.boost-consulting.com

Robert Ramey

16 May 16 May

3:10 a.m.

David Abrahams wrote:

...

"Robert Ramey" <ramey@rrsd.com> writes:

...
David Abrahams wrote:

...
"Robert Ramey" <ramey@rrsd.com> writes:

...
I think we should go in the opposite direction. A float is a legitimate floating point value. A union of float and some other special non-floating point values is something else.

Notwithstanding the fact that NaN is "not a number," it is a legitimate floating-point value, i.e. a legitimate value for the type float.

I guess it depends what one means by "legitimate". It is certainly "legal" in C++ - no question about that fact. It's certainly not a number - no question about that either. The problem is that C++ makes float a union of two different things

Not any more than a pointer is a "union of two different things." There are quite a few operations on pointers that suddenly become verboten when the pointer is null, and even more such operations when the pointer is dangling.

Pointers make no pretense to implement the operations of another system - in this case arithmetic. The concept of a pointer being the "union of two different things" is intuitively acceptable while it's not for something meant to model a real number.

...

...
and that is what creates all the problems we have with it. C++ should correct this by making the result of arithmetic operations which result in NaNs either undefined or throw exceptions. Oh I know that this would break a lot of existing code.

...

Not only break, but break unfixably. There are important numerical applications where large collections of individual floating values (e.g. vectors and matrices) have to be computed in parallel, and one or more NaNs in the result do not render the whole calculation useless.

I suppose that's possible. I'm not familiar with such applications. It sounds to me that they would be implemented in special hardware and sort of out of the C++ mainstream.

...

Besides all that, eliminating NaNs from the serialization problem space is really impossible. Even if we struck them from the C++ language, someone would effectively recreate them using optional<float> or the equivalent.

Actually that would be an improvement in that the "union" would be explicit, visible and modifiable. In fact I could see the utility right now of a "safe_float" which would look something like

    #ifndef NDEBUG
    BOOST_STRONG_TYPE(float, safe_float)
    safe_float operator/(safe_float x, safe_float y){
        if(y < epsilon) // machine dependent epsilon
            throw overflow_exception
        return x / y;
    }
    ...
    #else
    #define safe_float float
    #endif
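A compilable variant of the safe_float idea could be sketched as below. This is not the code from the post: instead of comparing against a machine-dependent epsilon it rejects a zero divisor and any non-finite result, and the class layout and exception types are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <stdexcept>

// Hypothetical checked wrapper: throws instead of yielding Inf or NaN.
class safe_float {
public:
    explicit safe_float(float v) : v_(checked(v)) {}
    float value() const { return v_; }

    friend safe_float operator/(safe_float x, safe_float y) {
        if (y.v_ == 0.0f)
            throw std::domain_error("safe_float: division by zero");
        return safe_float(x.v_ / y.v_);  // constructor rejects overflow to Inf
    }

private:
    // Rejects NaN and infinities so that every safe_float holds a finite value.
    static float checked(float v) {
        if (!std::isfinite(v))
            throw std::range_error("safe_float: non-finite value");
        return v;
    }
    float v_;
};

// Self-check doubling as a usage example.
inline void check_safe_float() {
    assert((safe_float(6.0f) / safe_float(3.0f)).value() == 2.0f);
    bool threw = false;
    try {
        (void)(safe_float(1.0f) / safe_float(0.0f));
    } catch (const std::domain_error&) {
        threw = true;
    }
    assert(threw);
}
```

Making the check part of the type keeps it explicit and visible, at the cost of a branch on every operation.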

...

...
I would argue that such code is already broken anyway - it just looks like it's working.

...

Really, you still think so?

My view on this is in another post.

...

If so, I'd like to know how you measure "brokenness."

A program which produces an undefined result is "broke". BTW - as far as the serialization system is concerned I never had a problem with the idea of recovering the exact kind of "undefined" data. It's just that there's no way to do it with archives which might be ported from one machine to another, as there is no guarantee that the reading machine has the same set of undefined results as the source machine. It's even worse: there is no way to write portable code which will generate any specific one of the "undefined" types. One might hack something together that would recover some undefined type but it wouldn't be guaranteed to be the same as the original kind of undefined type. So the whole effort would be for naught. There was a discussion of this on the list a while ago and this was the conclusion. It was in the course of this discussion that I reached the conclusions I've stated here. Robert Ramey

Jeff Garland

5:21 a.m.

Robert Ramey wrote:

...

...
Besides all that, eliminating NaNs from the serialization problem space is really impossible. Even if we struck them from the C++ language, someone would effectively recreate them using optional<float> or the equivalent.

Actually that would be an improvement in that the "union" would be explicit, visible and modifiable. In fact I could see the utility right now of a "safe_float" which would look something like

    #ifndef NDEBUG
    BOOST_STRONG_TYPE(float, safe_float)
    safe_float operator/(safe_float x, safe_float y){
        if(y < epsilon) // machine dependent epsilon
            throw overflow_exception
        return x / y;
    }
    ...
    #else
    #define safe_float float
    #endif

Robert -- I believe that thinking of floats as a "union" is the wrong analogy. In date_time there's -inf, +inf, and not_a_date_time. There's no union internally used to represent them. These values are extremely useful for writing real programs and have obvious mappings into the real world -- trust me, I've used them in a real world scheduling system. In fact, since date_time uses integers internally the special values are simply implemented as a reserved number value. nadt is essentially max_int, +inf == max_int - 1 and -inf == min_int. I believe you can think of floating point number special values in the same way -- although the internal implementation may be in hardware ultimately, the special values are really just certain bit patterns that are defined to represent these logical states.
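The reserved-value scheme described here can be sketched as follows. The `ticks` class is hypothetical and is not date_time's actual implementation; it only mirrors the mapping given above (nadt as the maximum integer, +inf as maximum minus one, -inf as the minimum).

```cpp
#include <cassert>
#include <limits>

// Hypothetical tick counter, not date_time's real class: three integer
// values are reserved to mean not-a-date-time, +infinity and -infinity.
class ticks {
public:
    typedef long rep;
    explicit ticks(rep v) : v_(v) {}

    static ticks not_a_date_time() { return ticks(std::numeric_limits<rep>::max()); }
    static ticks pos_infinity()    { return ticks(std::numeric_limits<rep>::max() - 1); }
    static ticks neg_infinity()    { return ticks(std::numeric_limits<rep>::min()); }

    bool is_nadt() const    { return v_ == std::numeric_limits<rep>::max(); }
    bool is_pos_inf() const { return v_ == std::numeric_limits<rep>::max() - 1; }
    bool is_neg_inf() const { return v_ == std::numeric_limits<rep>::min(); }
    bool is_special() const { return is_nadt() || is_pos_inf() || is_neg_inf(); }
    rep  value() const      { return v_; }

    // Addition keeps special values sticky, much as IEEE arithmetic does.
    // Overflow of ordinary values is not handled in this sketch.
    friend ticks operator+(ticks a, ticks b) {
        if (a.is_nadt() || b.is_nadt()) return not_a_date_time();
        if ((a.is_pos_inf() && b.is_neg_inf()) || (a.is_neg_inf() && b.is_pos_inf()))
            return not_a_date_time();  // +inf plus -inf has no meaningful answer
        if (a.is_special()) return a;
        if (b.is_special()) return b;
        return ticks(a.v_ + b.v_);
    }

private:
    rep v_;
};

// Self-check doubling as a usage example.
inline void check_ticks() {
    assert((ticks(2) + ticks(3)).value() == 5);
    assert((ticks(5) + ticks::pos_infinity()).is_pos_inf());
    assert((ticks::pos_infinity() + ticks::neg_infinity()).is_nadt());
}
```

No union or tag field is involved: the special states are ordinary bit patterns that the operations agree to interpret specially.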

...

A program which produces an undefined result is "broke"

I guess all of 'math' is broken then too? The only way to deal with singularities in math is to effectively 'bail-out' -- say, well don't do that. If you try to do a divide by zero in a program it's the same -- some external logic needs to deal with that case. The fact that the floating point type tells you that you divided by zero by giving an infinite result is actually a well defined and correct interface.

...

BTW - as far as the serialization system is concerned I never had a problem with the idea of recovering the exact kind of "undefined" data. It's just that there's no way to do it with archives which might be ported from one machine to another, as there is no guarantee that the reading machine has the same set of undefined results as the source machine. It's even worse: there is no way to write portable code which will generate any specific one of the "undefined" types. One might hack something together that would recover some undefined type but it wouldn't be guaranteed to be the same as the original kind of undefined type. So the whole effort would be for naught.

I think the whole thing that started this thread is to fix the standard so that these values can be correctly and portably serialized. But it seems that we are now distracted from that purpose. In my view, it's terribly flawed when a language like JavaScript can correctly serialize NaNs and Infinities and C++ can't. We need to help Paul fix the standard, it just isn't that difficult a problem... Jeff

Guillaume Melquiond

5:54 a.m.

Le lundi 15 mai 2006 à 22:21 -0700, Jeff Garland a écrit :

...

In date_time there's -inf, +inf, and not_a_date_time. There's no union internally used to represent them. These values are extremely useful for writing real programs and have obvious mappings into the real world -- trust me, I've used them in a real world scheduling system. In fact, since date_time uses integers internally the special values are simply implemented as a reserved number value. nadt is essentially max_int, +inf == max_int - 1 and -inf == min_int.

For symmetry purposes, I would have chosen nadt == min_int, +inf == max_int, -inf == min_int + 1. That way, in the usual two's-complement representations, there are as many positive as negative finite values, and the infinities have the same magnitude. Just pointing it out in case nobody did before. Best regards, Guillaume

Robert Ramey

6:18 a.m.

Jeff Garland wrote:

...

Robert Ramey wrote:

...
...
Besides all that, eliminating NaNs from the serialization problem space is really impossible. Even if we struck them from the C++ language, someone would effectively recreate them using optional<float> or the equivalent.

Actually that would be an improvement in that the "union" would be explicit, visible and modifiable. In fact I could see the utility right now of a "safe_float" which would look something like

    #ifndef NDEBUG
    BOOST_STRONG_TYPE(float, safe_float)
    safe_float operator/(safe_float x, safe_float y){
        if(y < epsilon) // machine dependent epsilon
            throw overflow_exception
        return x / y;
    }
    ...
    #else
    #define safe_float float
    #endif

Robert -- I believe that thinking of floats as a "union" is the wrong analogy. In date_time there's -inf, +inf, and not_a_date_time. There's no union internally used to represent them. These values are extremely useful for writing real programs and have obvious mappings into the real world -- trust me, I've used them in a real world scheduling system. In fact, since date_time uses integers internally the special values are simply implemented as a reserved number value. nadt is essentially max_int, +inf == max_int - 1 and -inf == min_int. I believe you can think of floating point number special values in the same way -- although the internal implementation may be in hardware ultimately, the special values are really just certain bit patterns that are defined to represent these logical states.

In my view, your usage above is logically and functionally equivalent to something like

    class dt {
        enum { a, b, c } state;
        int value;
    };

Since value can be of different "types" it's pretty much equivalent to a C union. The fact that it's not a C union is really just an implementation detail. I'm using the word union as a shorthand for all these types of constructions. Used in this way the current float behavior is a union. Perhaps a better word would have been "multi-type" of which the C union would be an example.

...

...
A program which produces an undefined result is "broke"

I guess all of 'math' is broken then too? The only way to deal with singularities in math is to effectively 'bail-out' -- say, well don't do that.

That's how math works - I think C++ should work the same way.

...

If you try to do a divide by zero in a program it's the same -- some external logic needs to deal with that case. The fact that the floating point type tells you that you divided by zero by giving an infinite result is actually a well defined and correct interface.

Not in any portable way. Page 132 of my reference "The C++ Programming Language" by Stroustrup says "the effect of dividing by zero is undefined but doing so usually causes abrupt termination of the program." In fact, when running under the debugger, my VC++ IDE permits me to trap or ignore and continue when this occurs. Intel chips have a floating point unit which throws a hardware exception when this occurs. C++ is designed to not handle these cases. I don't know what other processors do - I suppose it varies. In any case - it's undefined.

...

...
BTW - as far as the serialization system is concerned ...

I think the whole thing that started this thread is to fix the standard so that these values can be correctly and portably serialized.

I only mentioned it because someone expressed the notion that this is only a problem for serialization. I don't think it's a serialization problem. I think it's a problem which manifests itself when one tries to implement serialization - among other places.

...

But it seems that we are now distracted from that purpose. In my view, it's terribly flawed when a language like JavaScript can correctly serialize NaNs and Infinities and C++ can't.

Well, of course that's true if one believes that NaNs and +/-Inf, +/-0 etc have legitimate uses. But I'm not convinced.

...

We need to help Paul fix the standard, it just isn't that difficult a problem...

I'll bet that given the variety of handling things now - it will be a lot more difficult than it first appears. Robert Ramey

Geoffrey Irving

6:45 a.m.

On Mon, May 15, 2006 at 11:18:08PM -0700, Robert Ramey wrote:

...

Jeff Garland wrote:

<snip>

...
...
A program which produces an undefined result is "broke"

I guess all of 'math' is broken then too? The only way to deal with singularities in math is to effectively 'bail-out' -- say, well don't do that.

That's how math works - I think C++ should work the same way.

Sorry for the short rant: Saying that 1/0 means bail out is "how math works" is missing the point. The real number line, plus infinities, plus NaN, is one of the simplest extensions of the real numbers which is closed under the basic arithmetic operators. The idea of adding positive and negative infinities to the real line is hundreds of years old. Arguing that 1/0 should crash because that's what happens in "math" is the same as arguing that 2*max_int should crash because that's what happens in math...unless you have modulo arithmetic. As for the original topic, I very much like the proposal to have +-inf and NaN as the only special values. Geoffrey
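The closure property mentioned here can be observed directly on an implementation that reports IEEE 754 (IEC 559) arithmetic for `double`. The following is a minimal sketch under that assumption; the function name is hypothetical, the checks are skipped on other implementations, and the division by zero is only meaningful because IEC 559 behavior is assumed.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Checks a few identities of IEEE-style extended arithmetic.
// Does nothing unless the implementation reports IEC 559 doubles.
inline void check_extended_arithmetic() {
    typedef std::numeric_limits<double> lim;
    if (!lim::is_iec559) return;

    // volatile keeps the compiler from folding the operations away
    volatile double big  = lim::max();
    volatile double inf  = lim::infinity();
    volatile double zero = 0.0;
    volatile double one  = 1.0;

    volatile double sum   = big + big;   // plain addition can overflow to +inf
    volatile double quot  = one / zero;  // division by zero gives +inf
    volatile double diff  = inf - inf;   // no meaningful answer: NaN
    volatile double recip = one / inf;   // arithmetic continues past infinity

    assert(std::isinf(sum) && sum > 0);
    assert(std::isinf(quot) && quot > 0);
    assert(std::isnan(diff));
    assert(recip == 0.0);
}
```

Every operation above produces some value of the type, which is what "closed under the basic arithmetic operators" amounts to in practice.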

Robert Ramey

2:44 p.m.

Geoffrey Irving wrote:

...

On Mon, May 15, 2006 at 11:18:08PM -0700, Robert Ramey wrote:

...
Jeff Garland wrote:

<snip>

...
...
A program which produces an undefined result is "broke"

I guess all of 'math' is broken then too? The only way to deal with singularities in math is to effectively 'bail-out' -- say, well don't do that.

That's how math works - I think C++ should work the same way.

Sorry for the short rant:

No apologies necessary - that's what we do here.

...

Saying that 1/0 means bail out is "how math works" is missing the point. The real number line, plus infinities, plus NaN, is one of the simplest extensions of the real numbers which is closed under the basic arithmetic operators. The idea of adding positive and negative infinities to the real line is hundreds of years old. Arguing that 1/0 should crash because that's what happens in "math" is the same as arguing that 2*max_int should crash because that's what happens in math...unless you have modulo arithmetic.

True - I would argue that as well

...

As for the original topic, I very much like the proposal to have +-inf and NaN as the only special values.

Geoffrey

Damien Fisher

5:21 a.m.

On 5/16/06, Robert Ramey <ramey@rrsd.com> wrote:

...

David Abrahams wrote:

...
Not only break, but break unfixably. There are important numerical applications where large collections of individual floating values (e.g. vectors and matrices) have to be computed in parallel, and one or more NaNs in the result do not render the whole calculation useless.

I suppose that's possible. I'm not familiar with such applications. It sounds to me that they would be implemented in special hardware and sort of out of the C++ mainstream.

[Sorry to jump into the thread so late...but I just can't resist.] It really seems to me that you are objecting to a numerical model which has worked well for decades in numerous domains and into which much effort has been expended. The IEEE

...

Besides all that, eliminating NaNs from the serialization problem

...
space is really impossible. Even if we struck them from the C++ language, someone would effectively recreate them using optional<float> or the equivalent.

Actually that would be an improvement in that the "union" would be explicit, visible and modifiable. In fact I could see the utility right now of a "safe_float" which would look something like

    #ifndef NDEBUG
    BOOST_STRONG_TYPE(float, safe_float)
    safe_float operator/(safe_float x, safe_float y){
        if(y < epsilon) // machine dependent epsilon
            throw overflow_exception
        return x / y;
    }
    ...
    #else
    #define safe_float float
    #endif

...
...
I would argue that such code is already broken anyway - it just looks like it's working.

...
Really, you still think so?

My view on this is in another post.

...
If so, I'd like to know how you measure "brokenness."

A program which produces an undefined result is "broke"

BTW - as far as the serialization system is concerned I never had a problem with the idea of recovering the exact kind of "undefined" data. It's just that there's no way to do it with archives which might be ported from one machine to another, as there is no guarantee that the reading machine has the same set of undefined results as the source machine. It's even worse: there is no way to write portable code which will generate any specific one of the "undefined" types. One might hack something together that would recover some undefined type but it wouldn't be guaranteed to be the same as the original kind of undefined type. So the whole effort would be for naught.

There was a discussion of this on the list a while ago and this was the conclusion. It was in the course of this discussion that I reached the conclusions I've stated here.

Robert Ramey

Robert Ramey

6:21 a.m.

Damien Fisher wrote:

...

It really seems to me that you are objecting to a numerical model which has worked well for decades in numerous domains and into which much effort has been expended.

Hmmm - If I remember correctly from loooong ago when I used FORTRAN the issue never came up as things like divide by 0, overflow etc always crashed the program. I don't remember any notion that these were legitimate operations on which further computation could be performed. Robert Ramey

Damien Fisher

5:24 a.m.

On 5/16/06, Robert Ramey <ramey@rrsd.com> wrote:

...

David Abrahams wrote:

...
Not only break, but break unfixably. There are important numerical applications where large collections of individual floating values (e.g. vectors and matrices) have to be computed in parallel, and one or more NaNs in the result do not render the whole calculation useless.

I suppose that's possible. I'm not familiar with such applications. It sounds to me that they would be implemented in special hardware and sort of out of the C++ mainstream.

[Sorry to jump into the thread so late...but I just can't resist. And apologies for the cut-off post.] Have you ever heard of SSE/MMX/...? Vectorization is a very common optimization these days, and that's on commodity intel boxes. It really seems to me that you are objecting to a numerical model which has worked well for decades in numerous domains and into which much effort has been expended. The IEEE standard wasn't cooked up at random. If you have objections to the semantics I'd suggest you try coming up with a full proposal that works as well. That's probably the best way to realize why things are the way they are ;). (FWIW, I used to have problems with NaN/Inf back in the day too...) Damien

David Abrahams

12:46 p.m.

"Robert Ramey" <ramey@rrsd.com> writes:

...

David Abrahams wrote:

...
There are important numerical applications where large collections of individual floating values (e.g. vectors and matrices) have to be computed in parallel, and one or more NaNs in the result do not render the whole calculation useless.

I suppose that's possible. I'm not familiar with such applications.

Exactly.

...

It sounds to me that they would be implemented in special hardware

No.

...

and sort of out of the C++ mainstream.

Wow, that's exactly the kind of thing politicians say about each other when they want to paint their views as radical and worthy of dismissal.

...

...
Besides all that, eliminating NaNs from the serialization problem space is really impossible. Even if we struck them from the C++ language, someone would effectively recreate them using optional<float> or the equivalent.

Actually that would be an improvement in that the "union" would be explicit, visible and modifiable.

It's currently visible, since it is possible to test for NaN. And as far as I know, it's modifiable. What, exactly, do you mean by "modifiable?" I can explicitly request a NaN from numeric_limits.

...

...
...
I would argue that such code is already broken anyway - it just looks like it's working.

...
Really, you still think so?

My view on this is in another post.

...
If so, I'd like to know how you measure "brokenness."

A program which produces an undefined result is "broke"

I think what you mean to say is that it's nonportable.

...

BTW - as far as the serialization system is concerned I never had a problem with the idea of recovering the exact kind of "undefined" data. It's just that there's no way to do it with archives which might be ported from one machine to another, as there is no guarantee that the reading machine has the same set of undefined results as the source machine.

That doesn't mean there's no way to do it. I can think of any number of ways, including, for example, having a replaceable handler for translating unrepresentable values.

...

It's even worse: there is no way to write portable code which will generate any specific one of the "undefined" types.

numeric_limits<float>::quiet_NaN() Is "reasonably portable."

...

One might hack something together that would recover some undefined type but it wouldn't be guaranteed to be the same original type of undefined type.

Type of undefined type?

-- Dave Abrahams Boost Consulting www.boost-consulting.com

Robert Ramey

3:33 p.m.

David Abrahams wrote:

...

"Robert Ramey" <ramey@rrsd.com> writes:

...

...
BTW - as far as the serialization system is concerned I never had a problem with the idea of recovering the exact kind of "undefined" data. It's just that there's no way to do it with archives which might be ported from one machine to another, as there is no guarantee that the reading machine has the same set of undefined results as the source machine.

That doesn't mean there's no way to do it. I can think of any number of ways, including, for example, having a replaceable handler for translating unrepresentable values.

OK - let me rephrase. I looked into this some time ago and could find no way to guarantee any given type of NaN. Also, not all compilers support the same set of NaNs. So I could see no way that an archive created on one machine could be loaded on another and guarantee that the NaNs would be preserved. I asked for help on this list and no one else could figure out a way either. Too bad you didn't post to that discussion.

...

...
It's even worse: there is no way to write portable code which will generate any specific one of the "undefined" types.

numeric_limits<float>::quiet_NaN()

Is "reasonably portable."

hmm - what's "reasonably portable" is hard to get a consensus on.

...

...
One might hack something together that would recover some undefined type but it wouldn't be guaranteed to be the same original type of undefined type.

Type of undefined type?

That was a joke - sorry.

Robert Ramey

David Abrahams

17 May

3:11 a.m.

"Robert Ramey" <ramey@rrsd.com> writes:

...

...
...
It's even worse: there is no way to write portable code which will generate any specific one of the "undefined" types.

numeric_limits<float>::quiet_NaN()

Is "reasonably portable."

hmm - what's "reasonably portable" is hard to get a consensus on.

Let's put it this way: if numeric_limits<float>::has_quiet_NaN is true, numeric_limits<float>::quiet_NaN() is a portable way to get a non-signaling NaN of type float. If numeric_limits<float>::has_quiet_NaN is false, there's no representation of float with a non-signaling NaN value on the implementation.

This is exactly as portable as an int with value 100,000: if numeric_limits<int>::max() >= 100000, int(100000) is a portable way to get an int with value 100000. If numeric_limits<int>::max() < 100000, there's no representation of int with value 100000 on the implementation.

I hope you'll agree that an int with value 100000 is not too exotic, and that a program that expects to be able to handle such an int ain't "broke."

-- Dave Abrahams Boost Consulting www.boost-consulting.com
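The comparison in the message above can be written down directly. This is only a sketch: the helper names (`try_quiet_nan`, `try_int_100000`, `demo_portable_values`) are hypothetical, and the only library facilities used are `std::numeric_limits` and `std::isnan`.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: stores a quiet NaN in `out` and returns true when the
// implementation has one for T; otherwise leaves `out` untouched and returns
// false.
template <typename T>
bool try_quiet_nan(T& out) {
    if (!std::numeric_limits<T>::has_quiet_NaN) return false;
    out = std::numeric_limits<T>::quiet_NaN();
    return true;
}

// The int analogy from the message: the value 100000 is available exactly
// when numeric_limits<int>::max() is large enough to hold it.
bool try_int_100000(int& out) {
    if (std::numeric_limits<int>::max() < 100000) return false;
    out = 100000;
    return true;
}

// Small self-check of both cases.
void demo_portable_values() {
    float f = 0.0f;
    if (try_quiet_nan(f)) {
        assert(std::isnan(f));
    }
    int i = 0;
    if (try_int_100000(i)) {
        assert(i == 100000);
    }
}
```

In both helpers the program asks the implementation whether the value is representable before relying on it, which is the sense of "portable" used in the message.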

Paul A Bristow

2:34 p.m.

| -----Original Message-----
| From: boost-bounces@lists.boost.org
| [mailto:boost-bounces@lists.boost.org] On Behalf Of Paul A Bristow
| Sent: 15 May 2006 10:48
| To: boost@lists.boost.org
| Subject: [boost] Stream input and output of NaN and infinity

Whooow! Steady on chaps!

I had hoped this might not trigger a concerted Boosters attempt to cure all the faults in floating-point arithmetic, IEEE754, mathematics, C++, world poverty ...

But sadly I was wrong ;-((

Please can we go back to the original problem which was highlighted by Boost.Serialization:

1 At present, if you serialize a floating-point type value which happens to be NaN or infinite, you probably don't get any warning on output (MSVC 1.#QNAN), and worse still you may get an apparently correct but misleading input on 'restoring' the archive (MSVC 1.). You CAN rely on the behaviour being seriously different on different platforms (VisualAge will 'restore' quiet_NaN). So unless you check every floating-point value for finite-ness before 'output' to an archive (and you only **ever** serialise finites), you are taking a significant risk.

2 You may wish to handle values that are 'missing' or 'defective in some way': using NaN is a convenient way to achieve that, albeit with meanings that must be private to your application. If you don't have some floating-point representation to indicate 'NotAValue', you are in danger of going back to Dark Age programs that said "enter 999999 to end input".

3 You may wish to handle + and/or - infinity.

This proposal is an attempt to:

1 Reduce the risk of not detecting not-finites (especially in round-tripping to streams).

2 Allow some application-dependent, but otherwise portable, way of handling 'missing' or 'defective in some way' values using NaN.

3 Allow some portable way of handling infinity values.

4 Make UserDefined Floating-point Types more useful.

It assumes C++/C99 Standards: isfinite, isnan, isinf and numeric_limits<FPtype>::quiet_NaN() and infinity(). (And that they work!)

It also seems reasonable and pragmatically useful to assume that all floating-point values have a sign bit, and that it can be determined by signbit. This is necessary to permit + and - infinity, and so I feel it may have some use to permit a tiny bit more information about a NaN to allow +NaN and -NaN. There is no pretence to any mathematical meaning - only that applications can make use of this bit in any way they choose, and that it will be portable between OSes (but not necessarily other applications).

For example, suppose that one wishes to distinguish between 'missing' values - no data input by the user, and values that have produced a NaN by some mal-computation. One could, for example, choose to signal 'missing' with -NaN and 'bad' with +NaN.

(If the IEEE754 revision eventually is concluded, an implementation is free to use information from the fuller NaN format, provided the OS can decipher it. My guess is that most won't and KISS is the best policy. I am not sure I understand the revision

http://754r.ucbtest.org/drafts/754r.pdf - Latest IEEE754 draft

But I don't see it prohibiting a preceding sign bit to "NAN"?)

Please can you tell me if you think my pragmatic proposal (now in version 2) meets these limited objectives.

Thanks

Paul

PS Please can anyone advise me what GCC does with output - AND INPUT - of NaN and infinity?

--- Paul A Bristow Prizet Farmhouse, Kendal, Cumbria UK LA8 8AB +44 1539561830 & SMS, Mobile +44 7714 330204 & SMS
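A rough sketch of the round trip described in the message above, not the attached proposal itself. It assumes the strings "NaN" and "Inf" with an optional leading sign, uses only `isfinite`, `isnan`, `signbit`, `copysign` and `numeric_limits`, and the function names (`to_text`, `from_text`, `demo_round_trip`) are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <sstream>
#include <string>

// Writes a double as text. Finite values go through operator<<; every
// infinity becomes "Inf" and every NaN becomes "NaN", each with a leading
// '-' when the sign bit is set.
std::string to_text(double x) {
    if (std::isfinite(x)) {
        std::ostringstream os;
        os.precision(std::numeric_limits<double>::max_digits10);
        os << x;
        return os.str();
    }
    std::string s = std::signbit(x) ? "-" : "";
    s += std::isnan(x) ? "NaN" : "Inf";
    return s;
}

// Reads the text back. "NaN" yields quiet_NaN() and "Inf" yields infinity();
// the sign bit is restored with copysign. Returns false if nothing parses.
bool from_text(const std::string& s, double& out) {
    std::string body = s;
    bool negative = false;
    if (!body.empty() && (body[0] == '+' || body[0] == '-')) {
        negative = (body[0] == '-');
        body.erase(0, 1);
    }
    if (body == "NaN" || body == "Inf") {
        const double magnitude = (body == "NaN")
            ? std::numeric_limits<double>::quiet_NaN()
            : std::numeric_limits<double>::infinity();
        out = std::copysign(magnitude, negative ? -1.0 : 1.0);
        return true;
    }
    std::istringstream is(s);
    is >> out;
    return !is.fail();
}

// Small self-check: negative infinity survives the round trip.
void demo_round_trip() {
    double back = 0.0;
    assert(from_text(to_text(-std::numeric_limits<double>::infinity()), back));
    assert(std::isinf(back) && std::signbit(back));
}
```

With this scheme an application could, for instance, write 'missing' as "-NaN" and 'bad' as "NaN" and tell them apart again with `signbit` after reading, as the message suggests.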

Robert Ramey

3:43 p.m.

Paul A Bristow wrote:

...

...
Please can you tell me if you think my pragmatic proposal (now in version 2) meets these limited objectives.

A valiant effort aimed to deal with a real problem in most (if not all) stream i/o implementations.

However, trying to get everyone in sync with this and having the changes ripple through to compilers, libraries, etc. will/would take years.

I think that efforts would be more fruitful if directed to fixing the problem given the tools we currently have.

I would like to see a couple of sets of num_put/num_get facets which implement the following:

a) mapping of NaNs to some agreed upon strings such as you use in your proposal.

b) trapping of NaNs on output.

When users create a stream, they could do one of the following:

a) nothing - use the current setup
b) use the "standardized mapping facets"
c) use the "trapping facets"

This would permit one to choose the best behavior for the current application.

It's not that I'm against making a submission to the standard, I just don't think that a "standard" solution is going to really address the problem - it's not one size fits all.

Robert Ramey
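One possible shape for the two output facets suggested above, as a sketch only. The class names (`mapping_num_put`, `trapping_num_put`) and the helper `format_with` are hypothetical; only the `double` overload of `std::num_put<char>::do_put` is overridden, so `long double` is not covered, and width and fill are ignored for non-finite values.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <ios>
#include <locale>
#include <sstream>
#include <stdexcept>
#include <string>

// Mapping facet: prints any NaN as "NaN" and infinities as "Inf" / "-Inf".
class mapping_num_put : public std::num_put<char> {
protected:
    using std::num_put<char>::do_put;  // keep the other overloads visible
    iter_type do_put(iter_type out, std::ios_base& str, char_type fill,
                     double v) const override {
        if (std::isfinite(v)) {
            return std::num_put<char>::do_put(out, str, fill, v);
        }
        const std::string s =
            std::isnan(v) ? "NaN" : (std::signbit(v) ? "-Inf" : "Inf");
        str.width(0);  // formatted output resets the field width
        return std::copy(s.begin(), s.end(), out);
    }
};

// Trapping facet: refuses to write non-finite values. The stream catches the
// exception and sets badbit; it is rethrown to the caller only if the stream
// was configured with exceptions(std::ios_base::badbit).
class trapping_num_put : public std::num_put<char> {
protected:
    using std::num_put<char>::do_put;
    iter_type do_put(iter_type out, std::ios_base& str, char_type fill,
                     double v) const override {
        if (!std::isfinite(v)) {
            throw std::domain_error("non-finite value written to stream");
        }
        return std::num_put<char>::do_put(out, str, fill, v);
    }
};

// Hypothetical helper: formats v on a stream that uses the given facet.
template <typename Facet>
std::string format_with(double v) {
    std::ostringstream os;
    os.imbue(std::locale(os.getloc(), new Facet));  // locale owns the facet
    os << v;
    return os.bad() ? "<trapped>" : os.str();
}

// Small self-check of the mapping facet on a finite value.
void demo_facets() {
    assert(format_with<mapping_num_put>(1.5) == "1.5");
}
```

A stream that is left alone keeps the platform's current behaviour, which corresponds to option a) in the message; imbuing one of the two facets selects option b) or c). Matching `num_get` facets for input would be needed as well and are not sketched here.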

Paul A Bristow

3:54 p.m.

| -----Original Message-----
| From: boost-bounces@lists.boost.org
| [mailto:boost-bounces@lists.boost.org] On Behalf Of Robert Ramey
| Sent: 17 May 2006 16:43
| To: boost@lists.boost.org
| Subject: Re: [boost] Stream input and output of NaN and infinity
|
| Paul A Bristow wrote:
|
| >> Please can you tell me if you think my pragmatic proposal (now in
| >> version 2) meets these limited objectives.
|
| A valiant effort aimed to deal with a real problem in most
| (if not all) stream i/o implementations.
|
| However, trying to get everyone in sync with this and having the
| changes ripple through to compilers, libraries, etc. will/would take
| years.
|
| I think that efforts would be more fruitful if directed to fixing
| the problem given the tools we currently have.
|
| I would like to see a couple of sets of num_put/num_get facets
| which implement the following:
|
| a) mapping of NaNs to some agreed upon strings such as you
| use in your proposal.
|
| b) trapping of NaNs on output.
|
| When users create a stream, they could do one of the following:
|
| a) nothing - use the current setup
| b) use the "standardized mapping facets"
| c) use the "trapping facets"
|
| This would permit one to choose the best behavior for the
| current application.
|
| It's not that I'm against making a submission to the standard, I just
| don't think that a "standard" solution is going to really address the
| problem - it's not one size fits all.
|
| Robert Ramey

I agree strongly - I'd be delighted to see what you suggest, asap. (An exercise for a 'student'?)

But I think my 'size' may fit many, so I think trying to get it in the Standard is also a good idea. (It should never have been left as a muddle like this to start IMO. The Standard should have said 'if you want to do this, do it like this').

A working implementation showing its usefulness would be a powerful persuader for WG21.

(VisualAge appears to come close already - is anyone using it with Boost.Serialization?)

(Preparing this was my 'reward' for attending a BSI WG21 meeting - if I'd kept my mouth shut I wouldn't have got a homework job ;-)

Paul

--- Paul A Bristow Prizet Farmhouse, Kendal, Cumbria UK LA8 8AB +44 1539561830 & SMS, Mobile +44 7714 330204 & SMS

John Maddock

3:51 p.m.

Paul A Bristow wrote:

...

...
Whooow! Steady on chaps!

I had hoped this might not trigger a concerted Boosters attempt to cure all the faults in floating-point arithmetic, IEEE754, mathematics, C++, world poverty ...

:-)

...

...
Please can you tell me if you think my pragmatic proposal (now in version 2) meets these limited objectives.

Definitely yes! John.

David Abrahams

4:13 p.m.

"Paul A Bristow" <pbristow@hetp.u-net.com> writes:

...

Please can you tell me if you think my pragmatic proposal (now in version 2) meets these limited objectives.

I'm not qualified to say, but I support your effort.

-- Dave Abrahams Boost Consulting www.boost-consulting.com

38 comments

10 participants

participants (10)

Damien Fisher
David Abrahams
Geoffrey Irving
Guillaume Melquiond
Jeff Garland
John Maddock
Martin Bonner
Paul A Bristow
Peter Dimov
Robert Ramey