Re: [boost] serialization - NaN, +/Inf and others

18 Nov 2005

On Thu, Nov 17, 2005 at 10:00:01AM -0800, Robert Ramey wrote:
...
Troy S. and I have been looking at the question of implementing
serialization of NaN and +/-Inf for floats and doubles for portable text
or binary archives.
Turns out there's a fly in the ointment.
For determining if a given double contains a "special" value, float.h
contains the following handy function
int _fpclass(
   double x
);
which returns one of the following values:
      _FPCLASS_SNAN Signaling NaN
      _FPCLASS_QNAN Quiet NaN
      _FPCLASS_NINF Negative infinity ( -INF)
      _FPCLASS_NN Negative normalized non-zero
      _FPCLASS_ND Negative denormalized
      _FPCLASS_NZ Negative zero ( - 0)
      _FPCLASS_PZ Positive 0 (+0)
      _FPCLASS_PD Positive denormalized
      _FPCLASS_PN Positive normalized non-zero
      _FPCLASS_PINF Positive infinity (+INF)
So we could write a flag to the archive indicating if it's a special value.
So far, so good.
When the archive is read back, we can read the flag and initialize
the variable with the appropriate value.
BUT - I can't find any "official" way to initialize a float/double to any
of these values.  They seem to be the result of operations, and it's
certainly not obvious that all compilers would be on the same page here.
Note that this same problem arises whenever a float/double is written/read
to/from a stream in a way designed to be portable.  So it must have come
up before.  What's the solution here?
Robert Ramey
So.  The real-world use cases that brought this up are that "I
overload the meaning of NaN to mean uninitialized, and/or pos/neg inf
are valid values for my floats, and I want to serialize them".  

There is a whole spectrum of bit patterns that constitute NaN, and
more special cases still if you take into account denormalization and
so forth:

(from some website)
...
The 32-bit IEEE 754 representations of these values are:
    Positive infinity: 0x7f800000
    Negative infinity: 0xff800000
    Signaling NaN: any bit pattern between 0x7f800001 and 0x7fbfffff 
                   or any bit pattern between 0xff800001 and 0xffbfffff
    Quiet NaN: any bit pattern between 0x7fc00000 and 0x7fffffff 
               or any bit pattern between 0xffc00000 and 0xffffffff
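
The ranges quoted above can be checked mechanically. What follows is only a sketch, assuming a 32-bit IEEE 754 float; the names FloatKind, classify_bits and float_bits are made up for illustration and are not part of any library:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical names, for illustration only.
enum class FloatKind { PosInf, NegInf, SignalingNaN, QuietNaN, Other };

// Classify a raw 32-bit pattern according to the ranges quoted above.
inline FloatKind classify_bits(std::uint32_t bits)
{
    const std::uint32_t magnitude = bits & 0x7fffffffu; // drop the sign bit
    if (magnitude == 0x7f800000u)                        // exponent all ones, mantissa zero
        return (bits & 0x80000000u) ? FloatKind::NegInf : FloatKind::PosInf;
    if (magnitude > 0x7f800000u)                         // exponent all ones, mantissa non-zero
        return (magnitude & 0x00400000u) ? FloatKind::QuietNaN
                                         : FloatKind::SignalingNaN;
    return FloatKind::Other;                             // zero, denormal or normal
}

// Copy the object representation of a float into an integer.
// Assumption: float is 32 bits wide and laid out as IEEE 754 single precision.
inline std::uint32_t float_bits(float f)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

For example, classify_bits(float_bits(x)) tells you which of the quoted ranges a given float x falls into.
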
I don't think XML/text archives should attempt to guarantee that
floating point values that are denormalized or inf or nan, in any form,
are to-the-bit identical after a trip through one of the
serialization library's text archives (and it should guarantee only
that zero is still zero).  It's a text archive, you get a text
representation, and there is no standard for text representations of
wacky floating-point values.  Full stop.

One reason you would want to write an XML archive is to be able to
play with it with tools independent of boost::serialization.  This
means you will need to be able to understand the text representations,
and since no standard exists, the representation must not be too
complicated, as that would be a hassle maintenance-wise.

If a user does want a bit-faithful round-trip, for handling of nans,
infs, denormalization or what-have-you, they could, say, wrap their
floats/doubles in something that serializes them as 4/8 chars, as
Pavel suggests, and if they wanted to be portable w.r.t endianness
they'd have to take that into account themselves in the conversion.
But this would be a real fringe case.  If you want bit-faithful, just
use a binary archive.  If you want portable bit-faithful, use a
portable binary archive.
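
A rough sketch of the "serialize as 4 chars" idea, under stated assumptions: float is a 32-bit IEEE 754 type, and the host stores floats and integers with the same byte order. The helper names float_to_bytes and bytes_to_float are hypothetical, and the big-endian output order is just one possible convention:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: write a float as 4 bytes, most significant byte
// first, whatever the byte order of the host.
inline void float_to_bytes(float f, unsigned char out[4])
{
    static_assert(sizeof(float) == 4, "expects a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);   // grab the raw bit pattern
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<unsigned char>((bits >> (24 - 8 * i)) & 0xffu);
}

// Hypothetical helper: rebuild the float from 4 bytes written as above.
// Caveat: some hardware may quiet a signaling NaN when the value is
// loaded into a floating-point register.
inline float bytes_to_float(const unsigned char in[4])
{
    std::uint32_t bits = 0;
    for (int i = 0; i < 4; ++i)
        bits = (bits << 8) | in[i];
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

The wrapper type would then hand those 4 chars to the archive instead of the float itself.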

Looks like John Maddock figured out the isinf()/isnan() problem
in a general way:

  #include <math.h> // isnan where available
  #include <cmath>
  #include <limits> // std::numeric_limits

  namespace boost{ namespace math{ namespace detail{

  template <class T>
  inline bool test_is_nan(T t)
  {
     // Any ordered comparison involving a NaN is false, so a NaN
     // fails at least one of these two tests:
     return !(t <= std::numeric_limits<T>::infinity()) 
              || !(t >= -std::numeric_limits<T>::infinity());
  }
  #ifdef isnan
  // Where isnan is available as a macro, prefer it for the built-in types:
  template<> inline bool test_is_nan<float>(float t) { return isnan(t); }
  template<> inline bool test_is_nan<double>(double t) { return isnan(t); }
  template<> inline bool test_is_nan<long double>(long double t) { return isnan(t); }
  #endif

  }}} // namespace boost::math::detail

So I'd suggest the following: for some D of type T, the following
expressions will evaluate the same both before and after a round-trip
through a text archive:

  test_is_nan(D)   // (also isnan(D) if available) 

  D <= -std::numeric_limits<T>::infinity()  // also (isinf(D) && (D < 0)), etc

  D >= std::numeric_limits<T>::infinity()  

No more is guaranteed.  This addresses the real-world use cases
without getting too intimate with IEEE 754.

So one could use that kind of thing to detect them, then just set them
to nan/inf/-inf given by 

  std::numeric_limits<T>::infinity()
  -std::numeric_limits<T>::infinity()
  std::numeric_limits<T>::quiet_NaN()
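
To make the proposal concrete, here is a minimal sketch of such a round-trip through text. The tokens "nan", "inf" and "-inf" and the function names to_text and from_text are assumptions chosen for illustration, not what the serialization library actually writes:

```cpp
#include <limits>
#include <sstream>
#include <string>

// Hypothetical writer: special values become fixed tokens,
// everything else is printed as a decimal number.
template <class T>
std::string to_text(T d)
{
    const T inf = std::numeric_limits<T>::infinity();
    if (!(d <= inf) || !(d >= -inf)) return "nan";  // comparisons with NaN fail
    if (d >= inf)  return "inf";
    if (d <= -inf) return "-inf";
    std::ostringstream os;
    os.precision(std::numeric_limits<T>::max_digits10); // enough digits to round-trip
    os << d;
    return os.str();
}

// Hypothetical reader: tokens map back to the numeric_limits values,
// so only the category (nan, +inf, -inf) survives, not the exact bits.
template <class T>
T from_text(const std::string& s)
{
    if (s == "nan")  return std::numeric_limits<T>::quiet_NaN();
    if (s == "inf")  return std::numeric_limits<T>::infinity();
    if (s == "-inf") return -std::numeric_limits<T>::infinity();
    std::istringstream is(s);
    T d = T();
    is >> d;
    return d;
}
```

Any NaN, signaling or quiet and whatever its payload, comes back as the one quiet NaN that numeric_limits provides, which is exactly the "no more is guaranteed" part.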

Sound reasonable?  

-t

troy d. straszheim