New subject: Decimal math (was: floating point FUD)

23 Jan 2010

      Hi,

Find attached a small test program that tries to highlight some of the "usual" floating point traps. The essence is the following small function that outputs the results of some "buggy" floating point logic. The rest is just setting up the environment so that the bugs actually get triggered.

void buggy_function(double eps = 1e-13, int length = 3, double delta = 1 / 7.0)
{
  using namespace std;
  double alpha = 1e-13;
  double beta = eps;
  double a = length * delta;
  double b = length * delta;
  vector<double> inverse;
  for (int i = 0; i < length; ++i)
    inverse.push_back(1.0/(2*i+1));
  vector<double> sorted;
  for (int i = 0; i < length; ++i)
    sorted.push_back(inverse[i]*delta);

  double t = -delta*sorted.size();
  double tt = -sorted.size()*delta;
  bool t_tt = (abs(t-tt) < eps);
  bool abs_t = (abs(t) == -t);
  double c = (1-alpha) * a + alpha * b;
  double d = (1-beta) * a + beta * b;
  bool a_c = (a == c);
  bool b_d = (b == d);
  vector<bool> equal;
  for (int i = 0; i < length; ++i)
    equal.push_back(inverse[i]*delta == sorted[i]);

  std::cout << "t_tt  = " << t_tt << "\nabs_t = "  << abs_t << "\na_c   = " << a_c << "\nb_d   = " << b_d << "\n";
  std::cout << "equal = [";
  for (int i = 0; i < length; ++i)
    std::cout << " " << equal[i];
  std::cout << " ]\n";
}

The hypothetical "naive" programmer might expect the output:
t_tt  = 1
abs_t = 1
a_c   = 1
b_d   = 1
equal = [ 1 1 1 ]

However, the actual output is not only different from this, but also depends on the compiler, the compiler settings, the current configuration of the floating points unit and the actual headers included directly or indirectly by the compilation unit.

g++ test_fp.cpp -g -Wall
compiles without warning and executes with output:
t_tt  = 0
abs_t = 1
a_c   = 1
b_d   = 1
equal = [ 1 1 1 ]

g++ test_fp.cpp -DNDEBUG -O2 -msse2 -Wall
compiles without warning and executes with output:
t_tt  = 0
abs_t = 1
a_c   = 1
b_d   = 1
equal = [ 1 0 0 ]

I created a new project with MSVC9.
The debug version compiles with warnings for the things that go badly wrong and executes with output:
t_tt  = 0
abs_t = 0
a_c   = 1
b_d   = 1
equal = [ 1 1 1 ]
set floating point unit to 'high' precision:
t_tt  = 0
abs_t = 0
a_c   = 1
b_d   = 1
equal = [ 1 1 1 ]

The release version compiles with warnings for the things that go badly wrong and executes with output:
t_tt  = 0
abs_t = 0
a_c   = 1
b_d   = 1
equal = [ 1 1 1 ]
set floating point unit to 'high' precision:
t_tt  = 0
abs_t = 0
a_c   = 0
b_d   = 1
equal = [ 1 0 0 ]

In my case (see the background story below), it was actually the trap represented by the output "abs_t = 0" instead of "abs_t = 1" that caught me on linux64 but worked on win32. (It turned out difficult to reproduce this with gcc, so the test program reproduces this trap with MSVC, even so MSVC warns quite explicitly about it). I think it is a funny exercise to debug the test program and find out what actually goes wrong (which is especially funny since most traps only go wrong with compiler optimization turned on).

Regards,
Thomas

------------------------

The background story:

The recent reviews of [Constrained Value], [Polygon] and [GGL] all featured interesting disagreements over floating point issues:

http://lists.boost.org/Archives/boost/2008/12/145655.php
http://lists.boost.org/Archives/boost/2008/12/145788.php
http://lists.boost.org/Archives/boost/2009/03/149567.php
http://lists.boost.org/Archives/boost/2009/08/155767.php
http://lists.boost.org/Archives/boost/2009/11/158560.php

(These were long threads with many messages and many participants, so please don't misinterpret my choice of unrepresentative messages as any form of statement.)

Last Wednesday, I had to give an internal live demo of a feature I just finished. I had done the development mostly on win32, but I wanted to run the demo on linux64, to be able to show examples of realistic size (full chip...). While preparing the presentation, I was surprised to find the accuracy of the results significantly worse than I remembered from my previous tests. I repeated my previous tests on linux64, and found out that there was indeed a problem with my code on linux64. I did the demo on win32, but was asked after the presentation to send the source code to somebody from another group. So I spent most of the night trying to find the problem in my code. When I found out that some code with floating point driven logic (I had written years ago) to analytically integrate radial symmetric functions over rectangular domains was showing unexpected behavior, I decided to postpone the floating point fights with the compiler to the next day. I told this story the next day in a group meeting, and was confronted with much floating point FUD and unrealistic suggestions, basically boiling down to either not writing floating point driven logic in the first place, or commenting it in excessive detail. So I lost my temper and explained that it was difficult enough for me to write this code with floating point driven logic, and that anybody suggesting writing it without floating point driven logic should do so please, because I'm unable to do it.

I have to admit that I read "What Every Computer Scientist Should Know About Floating-Point Arithmetic" only superficially:
http://www.engrng.pitt.edu/hunsaker/3097/floatingpoint.pdf
http://docs.sun.com/source/806-3568/ncg_goldberg.html

I also have to admit that I write code with floating point driven logic from time to time in case it seems unavoidable to me, and that this code often has bugs that are only discovered months after the code is already finished. But I never found a bug in such code that would have been a fundamental problem with floating point, but always just the "usual" floating point traps I keep overlooking by mistake.

Now somebody will probably suggest that I should at least document the "usual" floating point traps then. So I created the attached small test program showing the "usual" floating point traps I could still remember having overlooked at least once. It was actually more difficult than expected to arrange the code in a way that the bugs are really triggered for typically compiler settings of MSVC and gcc.

floating point FUD

Thomas Klimpel

Bob Walters

Rutger ter Borg

Thomas Klimpel

Rutger ter Borg

Steve M. Robbins

Thomas Klimpel

Rutger ter Borg

John Maddock

Hervé Brönnimann

John Maddock

John Maddock

Phil Endecott

Thomas Klimpel

Barend Gehrels

dherring＠ll.mit.edu

Thomas Klimpel

tags

participants (9)