
On Tue, Feb 26, 2008 at 3:04 PM, John Maddock <john@johnmaddock.co.uk> wrote:
John Maddock wrote:
1) Floating point classification routines: these are optimised implementations of the C99 and C++ TR1 functions fpclassify, isinf, isnan, isnormal and isfinite. From Boost-1.35 onwards these are already a part of Boost.Math (see http://svn.boost.org/svn/boost/trunk/libs/math/doc/sf_and_dist/html/math_too...) so if accepted the two implementations will get merged.
The review here should focus on the implementation used, and testing on whatever platforms you have available - in particular are there any circumstances (compiler optimisation settings etc) where this implementation breaks?
These look fine to me, and as Johan mentioned, any breakages should be fixed in endian.hpp.
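For anyone not already familiar with these routines, here is a minimal usage sketch of the Boost.Math versions (this assumes the header path used on current trunk; the names under which they end up in the reviewed library could of course differ):

    #include <boost/math/special_functions/fpclassify.hpp>
    #include <iostream>
    #include <limits>

    int main()
    {
        double nan = std::numeric_limits<double>::quiet_NaN();
        double inf = std::numeric_limits<double>::infinity();

        // Classification of special and ordinary values:
        std::cout << boost::math::isnan(nan)      << '\n'; // 1
        std::cout << boost::math::isinf(inf)      << '\n'; // 1
        std::cout << boost::math::isfinite(1.0)   << '\n'; // 1
        std::cout << boost::math::isnormal(1.0)   << '\n'; // 1
        std::cout << boost::math::fpclassify(1.0) << '\n'; // FP_NORMAL
    }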
Sorry to bring this up again (!), but I've been trying to get my head around the aliasing rules and what is and isn't allowed, in particular by following the rather useful article here: http://www.cellperformance.com/mike_acton/2006/06/understanding_strict_alias...
also see: http://mail-index.netbsd.org/tech-kern/2003/08/11/0001.html
From this it appears that your current code is perfectly fine; however, I still can't help feeling that we could do better than calling an external function (memcpy) to move 4 or 8 bytes. The suggestion in the article is to use a union to move from one type to another:
union float_to_int32 { float f; boost::uint32_t i; };
boost::uint32_t get_bits(float f) { float_to_int32 u; u.f = f; return u.i; }
I can't claim to have tested exhaustively, but modifying get_bits and set_bits to use this method seems to work OK for VC8 & 9 and MinGW GCC.
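For what it's worth, the set_bits direction would presumably look like this if written with the same union (just a sketch; the actual set_bits signature in the submission may well differ):

    // Sketch only: hypothetical set_bits counterpart using the same union.
    float set_bits(boost::uint32_t i)
    {
        float_to_int32 u;
        u.i = i;
        return u.f;
    }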
GCC 4.2 under x86_64 produces better code with std::memcpy (which is treated as an intrinsic) than with the union trick (compiled with -O3):

    uint32_t get_bits(float f) { float_to_int32 u; u.f = f; return u.i; }

Generates:

    _Z8get_bitsf:
            movss   %xmm0, -4(%rsp)
            movl    -4(%rsp), %edx
            movl    %edx, %eax
            ret

Which has a useless "movl %edx, %eax". I think that using the union confuses the optimizer. This instead:

    uint32_t get_bits2(float f) { uint32_t ret; std::memcpy(&ret, &f, sizeof(f)); return ret; }

Generates:

    _Z9get_bits3f:
            movss   %xmm0, -4(%rsp)
            movl    -4(%rsp), %eax
            ret

Which should be optimal (IIRC you can't move from an xmm register to an integer register without passing through memory).

Note that the (illegal) code:

    uint32_t get_bits3(float f) { uint32_t ret = *reinterpret_cast<uint32_t*>(&f); return ret; }

generates the same code as get_bits2 if compiled with -fno-strict-aliasing. Without that flag it is (rightly) miscompiled, since it violates the aliasing rules.

I've tested the code under plain x86 and there is no difference between the three functions. So the standard-compliant code is also optimal, at least with recent GCCs.

--
gpd
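P.S. If anyone wants to reproduce the comparison, something along these lines should be enough to build with "g++ -O3 -S" and inspect; the function names here are just for illustration and are not those of the submission:

    #include <cstring>            // std::memcpy
    #include <cstdio>
    #include <boost/cstdint.hpp>

    // memcpy-based bit extraction: well-defined under the aliasing rules,
    // and GCC expands the call inline since memcpy is an intrinsic.
    boost::uint32_t get_bits_memcpy(float f)
    {
        boost::uint32_t ret;
        std::memcpy(&ret, &f, sizeof(f));
        return ret;
    }

    // The reverse direction, using the same technique.
    float set_bits_memcpy(boost::uint32_t bits)
    {
        float f;
        std::memcpy(&f, &bits, sizeof(f));
        return f;
    }

    int main()
    {
        boost::uint32_t bits = get_bits_memcpy(1.0f);
        std::printf("%08lx\n", (unsigned long)bits);  // 3f800000 on IEEE 754 targets
        std::printf("%g\n", set_bits_memcpy(bits));   // 1
    }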