
On Thu, Nov 17, 2005 at 10:00:01AM -0800, Robert Ramey wrote:
Troy S. and I have been looking at the question of implementing serialization of NaN and +/-Inf for floats and doubles in portable text or binary archives.
Turns out there's a fly in the ointment.
For determining if a given double contains a "special" value, float.h contains the following handy function
int _fpclass( double x );
which returns one of the following values:
    _FPCLASS_SNAN   Signaling NaN
    _FPCLASS_QNAN   Quiet NaN
    _FPCLASS_NINF   Negative infinity (-INF)
    _FPCLASS_NN     Negative normalized non-zero
    _FPCLASS_ND     Negative denormalized
    _FPCLASS_NZ     Negative zero (-0)
    _FPCLASS_PZ     Positive zero (+0)
    _FPCLASS_PD     Positive denormalized
    _FPCLASS_PN     Positive normalized non-zero
    _FPCLASS_PINF   Positive infinity (+INF)

So we could write a flag to the archive indicating whether it's a special value.
So far, so good.
When the archive is read back, we can read the flag and initialize the variable with the appropriate value.
BUT - I can't find any "official" way to initialize a float/double to any of these values. They seem to be the result of operations, and it's certainly not obvious that all compilers would be on the same page here.
Note that this same problem arises whenever a float/double is written/read to/from a stream in a way designed to be portable. So it must have come up before. What's the solution here?
Robert Ramey
So. The real-world use cases that brought this up are "I overload the meaning of NaN to mean uninitialized" and/or "pos/neg inf are valid values for my floats, and I want to serialize them". There is a whole spectrum of bit patterns that constitute NaN. In fact there are a whole bunch of bit patterns that can represent lots of numbers if you take into account denormalization and so forth (from some website):
The 32-bit IEEE 754 representations of these values are:
    Positive infinity:  0x7f800000
    Negative infinity:  0xff800000
    Signaling NaN:      any bit pattern between 0x7f800001 and 0x7fbfffff
                        or between 0xff800001 and 0xffbfffff
    Quiet NaN:          any bit pattern between 0x7fc00000 and 0x7fffffff
                        or between 0xffc00000 and 0xffffffff
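To make those ranges concrete, here is a minimal sketch that sorts a 32-bit pattern into the categories listed above. The names special_kind, float_bits and classify_bits are hypothetical, and the sketch assumes float is a 32-bit IEEE 754 type and that a set top mantissa bit means "quiet", which matches the ranges quoted here but is not true on every architecture.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Assumption: float is a 32-bit IEEE 754 type on this platform.
static_assert(std::numeric_limits<float>::is_iec559 && sizeof(float) == 4,
              "sketch assumes 32-bit IEEE 754 float");

// Hypothetical names, for illustration only.
enum class special_kind { none, pos_inf, neg_inf, signaling_nan, quiet_nan };

// Copy the object representation into an integer (memcpy avoids aliasing issues).
inline std::uint32_t float_bits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

inline special_kind classify_bits(std::uint32_t bits) {
    const std::uint32_t exponent = bits & 0x7f800000u;
    const std::uint32_t mantissa = bits & 0x007fffffu;
    if (exponent != 0x7f800000u)
        return special_kind::none;  // zero, denormalized or normalized
    if (mantissa == 0)              // all-ones exponent, zero mantissa: infinity
        return (bits & 0x80000000u) ? special_kind::neg_inf
                                    : special_kind::pos_inf;
    // All-ones exponent, non-zero mantissa: NaN. The top mantissa bit
    // separates quiet from signaling under the convention assumed above.
    return (mantissa & 0x00400000u) ? special_kind::quiet_nan
                                    : special_kind::signaling_nan;
}
```

Note how many distinct patterns collapse into each NaN category, which is why a text archive can reasonably promise "still a NaN" but not "the same NaN".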
I don't think XML/text archives should attempt to guarantee that floating-point values that are denormalized or inf or nan, in any form, are to-the-bit identical after a trip through one of the serialization library's text archives (and it should guarantee only that zero is still zero). It's a text archive, you get a text representation, and there is no standard for text representations of wacky floating-point types. Full stop.

One reason you would want to write an XML archive is to be able to play with it with tools independent of boost::serialization. This means you will need to be able to understand the text representations, and since no standard exists, the representation must not be too complicated, as that would be a hassle maintenance-wise.

If a user does want a bit-faithful round trip, for handling of nans, infs, denormalization or what-have-you, they could, say, wrap their floats/doubles in something that serializes them as 4/8 chars, as Pavel suggests, and if they wanted to be portable w.r.t. endianness they'd have to take that into account themselves in the conversion. But this would be a real fringe case. If you want bit-faithful, just use a binary archive. If you want portable bit-faithful, use a portable binary archive.
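For that fringe case, a wrapper's conversion might look something like the sketch below. The names encode_double and decode_double are hypothetical and not part of boost::serialization; the sketch assumes double is a 64-bit IEEE 754 type whose byte order matches the host's integer byte order, which holds on mainstream platforms but is not guaranteed.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Assumption: double is a 64-bit IEEE 754 type on this platform.
static_assert(std::numeric_limits<double>::is_iec559 && sizeof(double) == 8,
              "sketch assumes 64-bit IEEE 754 double");

// Hypothetical helper: write the 64 bits most-significant byte first,
// so the 8 chars are the same regardless of host endianness.
inline void encode_double(double d, unsigned char out[8]) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    for (int i = 0; i < 8; ++i)
        out[i] = static_cast<unsigned char>(bits >> (56 - 8 * i));
}

// Hypothetical helper: rebuild the exact bit pattern from the 8 chars.
inline double decode_double(const unsigned char in[8]) {
    std::uint64_t bits = 0;
    for (int i = 0; i < 8; ++i)
        bits = (bits << 8) | in[i];
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}
```

Because only shifts on an integer are used, the byte order in the archive is fixed by the code rather than by the machine, and NaN payloads and denormals come back bit for bit.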
Looks like John Maddock figured out the isinf()/isnan() problem in a general way:

    #include <math.h> // isnan where available
    #include <cmath>
    #include <limits> // std::numeric_limits

    namespace boost{ namespace math{ namespace detail{

    template <class T>
    inline bool test_is_nan(T t)
    {
       // Comparisons with NaNs always fail:
       return !(t <= std::numeric_limits<T>::infinity())
          || !(t >= -std::numeric_limits<T>::infinity());
    }
    #ifdef isnan
    template<> inline bool test_is_nan<float>(float t) { return isnan(t); }
    template<> inline bool test_is_nan<double>(double t) { return isnan(t); }
    template<> inline bool test_is_nan<long double>(long double t) { return isnan(t); }
    #endif

    }}} // namespace boost::math::detail

So I'd suggest the following: for some D of type T, the following expressions will evaluate the same both before and after a round trip through a text archive:

    test_is_nan(D)                            // (also isnan(D) if available)
    D <= -std::numeric_limits<T>::infinity()  // also (isinf(D) && (D < 0)), etc
    D >= std::numeric_limits<T>::infinity()

No more is guaranteed. This addresses the real-world use cases without getting too intimate with IEEE 754. So one could use that kind of thing to detect them, then just set them to the nan/inf/-inf given by

    std::numeric_limits<T>::infinity()
    -std::numeric_limits<T>::infinity()
    std::numeric_limits<T>::quiet_NaN()

Sound reasonable?

-t
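P.S. To make the proposal concrete, here is a rough sketch of what the save/load pair might look like. The function names save_text and load_text and the token spellings "nan", "inf" and "-inf" are made up for illustration and are not what any archive currently writes; it also assumes a C++11 compiler for max_digits10.

```cpp
#include <limits>
#include <sstream>
#include <string>

// Hypothetical save: special values become fixed tokens, everything else
// is printed with enough digits to read back the same finite value.
template <class T>
std::string save_text(T d) {
    const T inf = std::numeric_limits<T>::infinity();
    if (!(d <= inf) || !(d >= -inf)) return "nan";   // comparisons with NaN fail
    if (d >= inf)  return "inf";
    if (d <= -inf) return "-inf";
    std::ostringstream os;
    os.precision(std::numeric_limits<T>::max_digits10);
    os << d;
    return os.str();
}

// Hypothetical load: tokens map back onto the numeric_limits values,
// so only "is NaN", "is +inf" and "is -inf" survive, not the exact bits.
template <class T>
T load_text(const std::string& s) {
    if (s == "nan")  return std::numeric_limits<T>::quiet_NaN();
    if (s == "inf")  return std::numeric_limits<T>::infinity();
    if (s == "-inf") return -std::numeric_limits<T>::infinity();
    std::istringstream is(s);
    T d = T();
    is >> d;
    return d;
}
```

With this, the three expressions listed above evaluate the same before and after the round trip, and a signaling NaN simply comes back as a quiet one.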