Re: [boost] Ann: Floating Point Utilities Review starts today

26 Feb 2008

Giovanni Piero Deretta wrote:
...
also see: http://mail-index.netbsd.org/tech-kern/2003/08/11/0001.html
Thanks, will take a look.
...
GCC 4.2 on x86_64 produces better code with std::memcpy (which is
treated as an intrinsic) than with the union trick (compiled with -O3):
uint32_t get_bits(float f) {
    float_to_int32 u;
    u.f = f;
    return u.i;
}
Generates:
_Z8get_bitsf:
       movss   %xmm0, -4(%rsp)
       movl    -4(%rsp), %edx
       movl    %edx, %eax
       ret
Which has a useless "movl %edx, %eax". I think that using the union
confuses the optimizer.
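For reference, the post never shows the definition of float_to_int32. The sketch below assumes it is a plain two-member union; that definition, the name get_bits_union and the self-check helper are illustrative guesses, not code from the library under review. Note that standard C++ leaves reading the inactive union member undefined; GCC documents this form of type punning as supported, which is why it is called a "trick".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical definition: the original post does not show float_to_int32.
union float_to_int32 {
    float f;
    std::uint32_t i;
};

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Writes the float member, then reads the integer member.
// Relies on the compiler supporting union type punning (GCC documents it).
std::uint32_t get_bits_union(float f) {
    float_to_int32 u;
    u.f = f;
    return u.i;
}

// Self-check, assuming IEEE 754 single-precision floats.
void check_get_bits_union() {
    assert(get_bits_union(1.0f) == 0x3F800000u);   // sign 0, exponent 127, mantissa 0
    assert(get_bits_union(-2.0f) == 0xC0000000u);  // sign 1, exponent 128, mantissa 0
}
```

Calling check_get_bits_union() from main is enough to confirm the bit patterns on a given platform.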
This instead:
uint32_t get_bits2(float f) {
    uint32_t ret;
    std::memcpy(&ret, &f, sizeof(f));
    return ret;
}
Generates:
_Z9get_bits2f:
       movss   %xmm0, -4(%rsp)
       movl    -4(%rsp), %eax
       ret
Which should be optimal (IIRC you can't move from an XMM register to
an integer register without passing through memory).
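A self-contained version of the memcpy approach might look like the sketch below. The names get_bits_memcpy and from_bits_memcpy and the self-check helper are made up for illustration, and the expected bit patterns assume IEEE 754 single precision; the static_assert guards the size assumption that the snippet above leaves implicit.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Copies the object representation of f into an integer.
// std::memcpy is well defined here, so no aliasing rule is broken.
std::uint32_t get_bits_memcpy(float f) {
    std::uint32_t ret;
    std::memcpy(&ret, &f, sizeof(f));
    return ret;
}

// Inverse direction (illustrative addition): rebuilds a float from its bits.
float from_bits_memcpy(std::uint32_t bits) {
    float ret;
    std::memcpy(&ret, &bits, sizeof(bits));
    return ret;
}

// Self-check, assuming IEEE 754 single-precision floats.
void check_get_bits_memcpy() {
    assert(get_bits_memcpy(1.0f) == 0x3F800000u);
    assert(from_bits_memcpy(0x3F800000u) == 1.0f);
    assert(from_bits_memcpy(get_bits_memcpy(-2.5f)) == -2.5f);  // round trip
}
```

Whether the compiler reduces the memcpy call to a single load, as reported above for GCC 4.2 at -O3, depends on the compiler and optimization level.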
Note that the (illegal) code:
uint32_t get_bits3(float f) {
    uint32_t ret = *reinterpret_cast<uint32_t*>(&f);
    return ret;
}
Generates the same code as get_bits2 if compiled with
-fno-strict-aliasing. Without that flag GCC (rightly) miscompiles it,
because the cast violates the strict aliasing rules. I've also tested
the code on plain x86, and there all three functions generate the same
code.
So the standard-compliant code is also optimal, at least with recent
GCCs.
Very interesting!

In that case I think we should document what Johan has now, to avoid this
coming up again in the future :-)

Thanks!  John.


John Maddock