Re: [boost] [boostcon][proto] Suggestion for LIAW session: fixed-pointnumbers

16 Mar 2011


      On Tuesday 15 March 2011 18:45:49 Gruenke, Matt wrote:
...
template<
          SignProperty is_signed,
          int int_bits,           //!< Number of integer bits.
          int frac_bits           //!< Number of fractional bits.
               > 
      class CFixed;
I, too, used a class like this in the past, but experience makes me wish for 
more flexibility. When modeling hardware operations implemented on 
signal/image processing algorithms, the following schemes have all seen wide 
use:
  1. signed, int_bits, frac_bits                <- your design
  2. signed, msb_power_of_2, lsb_power_of_2
  3. signed, width, int_bits
  4. signed, width, frac_bits

Of these, the only one I found to be completely intuitive was the second one. 
All of the others required non-obvious rules to represent the cases when the 
MSB is to the right of the binary point or the LSB is much to the left of the 
binary point. Subsequently, my colleagues and I found that most of the 
reasoning in choosing appropriate bit-widths is in figuring out the ULP. Our 
experience led me to propose my design.
...
The way I structured it was to discard information only on assignment
(unless the maximum internal storage had overflowed, of course).  While I
agree that it's tempting to just use int64 for storage, the size and speed
implications of creating large arrays of them provides a strong argument
against this approach.
Good idea, but this is a refinement that can wait until later.
...
Since the most common use of fixed-point arithmetic is for speed, the
priority is speed over safety.
In order of decreasing priority: correctness, speed, safety. Safety can be 
enforced in debug builds.
...
For example, never assume that overflow
checking is performed.  Furthermore, operatorns supported by the class are
primarily designed to avoid unnecessary arithmetic, while minimizing
surprises.
Agreed.
...
The general policies were:
* No virtual functions!  Fixed-point types must be as fast
      and small as hand-coded fixed-point arthmetic, using
      built-in types.
* Converting constructor is implicit, since conversions (such
      as reductions in precision) are so common, in fixed-point
      arithmetic.
Agreed, but this is not an issue with the 'assigner' template argument.
...
* add/subtract return the greater of the int_bits and
      frac_bits (independently) of the operands.
* add/subtract returns signedness of the type with the
      greater range (choosing unsigned, in the event of a tie).
Disagreed. Follow the integers - operating on a signed always returns signed.
...
* multiply returns a type in which the int and frac bits
      are the sum of the corresponding values of the operands.
      The result is signed, unless both operands are unsigned.
* The return type of divide is equivalent to multiplying
      the numerator by the inverse of the denominator.
* inverse returns a type with one fewer frac bits than the
      operand had int bits and one more int bits than the
      operand had frac bits.  This sacrifice of precision is
      necessary to avoid overflow.
This one is tricky; the behavior you propose is completely unacceptable for 
signal processing.
...
I like the idea of abstracting the assigner, but I guess I don't really see
what you'd want to vary besides perhaps rounding behavior.
Examples (not exhaustive):
Truncation vs rounding (direction - 4 possible ways)
Wrapping vs saturation vs symmetric saturation
Allowing only values of 2^{k} (used a lot in ASIC loop filters)

Regards,
Ravi