
On Aug 25, 2008, at 12:40 PM, Robert Mecklenburg wrote:
Robert Ramey writes:
This should make it apparent why I've never wanted to make a "portable binary archive" but left it as a demo or example.
I'm sure many will think this is a totally stupid suggestion, but I can't resist making a fool of myself. ;-)
Since floats are the problem with portable binary archives, why not punt on this issue and render floating point types (only) in ASCII?
For many uses floating point is not the critical path and binary archives solve many problems other than floating point: endian-ness, native integer size differences, etc. And those issues can be quite difficult to deal with otherwise.
This can be viewed as a special case of a standard technique: for highly non-standard data, use an independent format that translates easily into each proprietary format. In this case the independent format is simply ASCII. In fact, since we are only rendering the characters "[-+e.0-9]", we could use a modified BCD or other compressed format to provide the compactness that people typically expect from binary formats.
Thoughts?
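
A minimal sketch of the "modified BCD" idea, in Python for brevity. It assumes one 4-bit code per character of the 14-symbol alphabet named above, with nibble 0xF as a pad when the text has odd length. The names ALPHABET, PAD, pack_float and unpack_float are made up for illustration and are not part of any archive class, and inf/NaN are left out because their text forms fall outside the alphabet.

```python
import math

ALPHABET = "0123456789-+e."  # the 14 characters; each gets nibble code 0..13
PAD = 0xF                    # assumed filler nibble for odd-length text


def pack_float(x: float) -> bytes:
    """Render x as decimal text, then pack two characters per byte."""
    if not math.isfinite(x):
        raise ValueError("inf/nan would need a separate flag in this sketch")
    # repr() gives the shortest decimal text that reads back to the same double
    codes = [ALPHABET.index(ch) for ch in repr(x)]
    if len(codes) % 2:
        codes.append(PAD)
    return bytes((hi << 4) | lo for hi, lo in zip(codes[0::2], codes[1::2]))


def unpack_float(data: bytes) -> float:
    """Unpack the nibbles back into text and parse it on the receiving side."""
    codes = []
    for byte in data:
        codes += [byte >> 4, byte & 0xF]
    text = "".join(ALPHABET[c] for c in codes if c != PAD)
    return float(text)
```

With this encoding -2.75 takes three bytes, while a value that needs all 17 significant digits takes more than the eight bytes of a raw double, so the space saving depends on the data.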
I have an alternate suggestion: what about continued fractions? Turn the floating point value into a list of integers. This works no matter what f.p. systems the source and destination use. You just need a portable integer serialization format.

1. Serialize whether or not the float is in a NaN state as a Boolean. If it's true, then you're done. Receiving systems that don't support such states could either return a zero or throw. If it's false, keep going.

2. Serialize the sign as a Boolean. This counts even for zero values, if the f.p. uses the "negative zero" concept. Receiving systems that don't should ignore the read sign for zero values. Continue, but use the absolute value instead (even for infinity).

3. Serialize the exponent as an Integer. The exponent is the shift needed to bring the base value between one and two (including 1, excluding 2). So values of two and above get a positive shift, values under one use negative shifts, and those that are already in that range use a zero shift. If the base value is initially zero or infinite, use zero as the shift amount.

4. Serialize the base value as a list of continued fraction components. The length is variable, so make sure to serialize it too! For a zero base value, the list consists of a single element of value zero. For infinity, use an empty list. For all other values, start with the 1 as the whole part, then proceed with the rest of the components. (Since the f.p. state represents a binary fraction, I suggest not using subtraction/truncation and reciprocating in floating-point, but manipulating the virtual numerator and denominator as integers with division/modulus. If the f.p. radix isn't 2, you may want to do the exponent and continued fraction in the native radix, then convert afterwards.) Saving a serialization should start with the whole 1 and go down to smaller contributions.

Loading back a serialization should read in the entire list first, then expand starting from the smallest/last contribution up to the whole 1.

Example: -2.75 -> !is_nan, is_negative, 1.375 << 1; 1.375 is stored as [1.01100], which is 44/32, which is 11/8, which has a c.f. of [1; 2, 1, 2]; so you'll serialize {False, True, 1, {4; 1, 2, 1, 2}}, where the leading 4 is the length of the list.

Example: NaN -> is_nan -> {True}; you mustn't look for any other component.

-- 
Daryle Walker
Mac, Internet, and Video Game Junkie
darylew AT hotmail DOT com
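
The four steps and the -2.75 example above can be sketched roughly as follows. This is only an illustration in Python, not Boost.Serialization code. It assumes a radix-2 double on the saving side, the names encode and decode are made up, and the record is a plain tuple whose list carries its own length, where a real archive would write the length explicitly.

```python
import math
from fractions import Fraction


def encode(x: float):
    """Return (is_nan, is_negative, exponent, cf_terms) for x."""
    if math.isnan(x):
        return (True,)                      # step 1: nothing else is written
    negative = math.copysign(1.0, x) < 0    # step 2: also catches -0.0
    x = abs(x)
    if x == 0.0:
        return (False, negative, 0, [0])    # zero: shift 0, single term 0
    if math.isinf(x):
        return (False, negative, 0, [])     # infinity: shift 0, empty list
    m, e = math.frexp(x)                    # x == m * 2**e with 0.5 <= m < 1
    base = Fraction(m) * 2                  # exact rational, 1 <= base < 2
    num, den = base.numerator, base.denominator
    terms = []
    while den:                              # step 4: integer division/modulus
        q, num_next = divmod(num, den)
        terms.append(q)
        num, den = den, num_next
    return (False, negative, e - 1, terms)  # step 3: exponent is e - 1


def decode(record) -> float:
    """Rebuild the value, expanding from the last term up to the whole 1."""
    if record[0]:
        return math.nan
    _, negative, exponent, terms = record
    if not terms:
        value = math.inf
    else:
        frac = Fraction(terms[-1])
        for a in reversed(terms[:-1]):
            frac = a + 1 / frac
        value = math.ldexp(float(frac), exponent)
    return -value if negative else value
```

Here encode(-2.75) gives (False, True, 1, [1, 2, 1, 2]), matching the example, and decode returns -2.75 from that record. A receiver with a narrower f.p. type would presumably have to round when converting the rebuilt fraction, which this sketch does not address.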