[serialisation] reading and writing reals

Attached is a sketch of how reals can be serialised in hex notation, without data loss when round-tripping from binary-to-text-to-binary.

Note:

1) The iostream code could be improved, particularly the handling of formatting errors on reading. If any iostream experts would like to give it the once-over, they're very welcome.
2) There is no check on reading that the target type is large enough to hold all the bits that were previously written out (neither is there for decimal io, come to that).
3) I've only tested this code with VC8, so portability may vary.
4) I've made no attempt to beautify the output format; it's frankly rather ugly to read at present.
5) It's not clear whether on output the digit before the decimal point should always be a 1: at present it's a full hex digit, which means the value "1" comes out as 0x8p-3 rather than 0x1p0.
6) Zero still needs to be handled as a special case; at present it prints as 0x0p-3 rather than 0x0p0. It makes no difference on input, but it would be nice to print it prettily :-)
7) I've been unable to reproduce the original problem with the native << and >> operators, with either random value tests or sequential nextafter tests, unless I start with the specific value that caused all the fuss in the first place :-( Any ideas for better tests much appreciated.

Enjoy, John.
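For readers without the attachment, the idea can be sketched with the C99 "%a" conversion and strtod. This is not the attached code, only a minimal illustration that assumes a C99/C++11 library with hex-float support; the function names to_hex_text, from_hex_text and round_trips are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <string>

// Write a double as hexadecimal floating-point text ("%a").
// Each hex digit carries exactly four mantissa bits, so nothing is rounded.
std::string to_hex_text(double value)
{
    char buffer[64];
    int written = std::snprintf(buffer, sizeof buffer, "%a", value);
    assert(written > 0 && written < static_cast<int>(sizeof buffer));
    return std::string(buffer, static_cast<std::size_t>(written));
}

// Read the text back; strtod accepts hex floats in C99 and C++11.
double from_hex_text(const std::string& text)
{
    return std::strtod(text.c_str(), nullptr);
}

// True if binary -> text -> binary reproduces a finite value exactly.
bool round_trips(double value)
{
    return from_hex_text(to_hex_text(value)) == value;
}
```

The choice of leading digit discussed in points 5 and 6 only affects how the text looks: from_hex_text("0x8p-3") and from_hex_text("0x1p0") both yield 1.0, and "0x0p-3" reads back as zero.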

| -----Original Message-----
| From: boost-bounces@lists.boost.org
| [mailto:boost-bounces@lists.boost.org] On Behalf Of John Maddock
| Sent: 16 March 2006 19:21
| To: Boost mailing list
| Subject: [boost] [serialisation] reading and writing reals
|
| Attached is a sketch of how reals can be serialised in hex notation, without
| data loss when round-tripping from binary-to-text-to-binary.

Quick work! And quite nice too.

(Although, to be pedantic, the formula for precision is still not what I believe is correct. From Kahan it should be

precision(2 + std::numeric_limits<double>::digits * 3010/10000)

but this is not the cause of the problem.)

| 7) I've been unable to reproduce the original problem with native << and >>
| operators with either random value tests, or with sequential nextafter
| tests, unless I start with the specific value that caused all the fuss in
| the first place :-( Any ideas for better tests much appreciated.

A very brief look at this suggests that it is starting with values < 0.001 that causes trouble. So random testing may have trouble finding this.

Using nextafter always seems to produce 35% or 3 % of values wrong (by one bit).

Starting with 0.0001 ends with

Original numerical value: 0.00010000000000001348
Contents of stream: 0.00010000000000001348
Deserialised value: 0.00010000000000001349
Deserialisation error!
Original numerical value: 0.0001000000000000135
Contents of stream: 0.0001000000000000135
Deserialised value: 0.00010000000000001349
Deserialisation error!
Original numerical value: 0.00010000000000001354
Contents of stream: 0.00010000000000001354
Deserialised value: 0.00010000000000001353
failed 450, out of 1000

I really don't have time to raise a long interrupt to look at this more fully at the moment, but it smells like a bug (except that it has already been deemed a 'feature' when I raised the same problem with the float version).

So I suspect your proposal is a good workaround.

But I am concerned that we should have a really good test for it, or that it is provably correct, or better still both ;-))

With float, one can just about do a full test - takes all night.

But even with my new Dual Core AMD X2 ;-)) an exhaustive double won't finish before it is worn out :-((

Paul

--
Paul A Bristow
Prizet Farmhouse, Kendal, Cumbria UK LA8 8AB
Phone and SMS text +44 1539 561830, Mobile and SMS text +44 7714 330204
mailto: pbristow@hetp.u-net.com
http://www.hetp.u-net.com/index.html
http://www.hetp.u-net.com/Paul%20A%20Bristow%20info.html
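The kind of nextafter sweep described in this message can be sketched roughly as follows. It is a reconstruction under assumptions, not the test program that produced the figures above: the names round_trip_digits, decimal_round_trip and count_failures are invented here, and the precision is taken from the Kahan formula quoted in the message.

```cpp
#include <cassert>
#include <cmath>
#include <iomanip>
#include <limits>
#include <sstream>

// Kahan's digit count, evaluated in integer arithmetic.
// For a 53-bit double this gives 2 + 15 = 17 decimal digits.
constexpr int round_trip_digits =
    2 + std::numeric_limits<double>::digits * 3010 / 10000;

// Write a value as decimal text with the native << and read it back with >>.
double decimal_round_trip(double value)
{
    std::stringstream stream;
    stream << std::setprecision(round_trip_digits) << value;
    double result = 0.0;
    stream >> result;
    return result;
}

// Walk `count` consecutive doubles upward from `start` and
// return how many of them fail to survive the round trip.
int count_failures(double start, int count)
{
    assert(count >= 0);
    int failures = 0;
    double value = start;
    for (int i = 0; i < count; ++i) {
        if (decimal_round_trip(value) != value) {
            ++failures;
        }
        value = std::nextafter(value, std::numeric_limits<double>::max());
    }
    return failures;
}
```

Calling count_failures(0.0001, 1000) corresponds to the run reported above. A standard library whose decimal conversions are correctly rounded should report no failures at this precision, so a non-zero count points at the library rather than at the test.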

| A very brief look at this suggests that it is starting with values < 0.001
| that causes trouble. So random testing may have trouble finding this.

Confirmed, there appear to be other problems with random tests that I'll address in a separate message.

| Using nextafter always seems to produce 35% or 3 % of values wrong (by one bit).
| Starting with 0.0001

Also confirmed, and the problem areas appear to be different for floats and doubles, which makes them rather hard to find, and makes any kind of systematic testing of std libs next to impossible.

| But I am concerned that we have a really good test for it, or that it is
| provably correct, or better still both ;-))
| With float, one can just about do a full test - takes all night.
| But even with my new Dual Core AMD X2 ;-))
| an exhaustive double won't finish before it is worn out :-((

Quite. Proving the code correct is tricky though; when I've more time I'll double check with Knuth Vol 2 and with ACM and see if there's any literature that can help out.

John.
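The exhaustive float test mentioned above could be organised by enumerating bit patterns rather than values. The sketch below assumes a 32-bit IEEE 754 float and reuses the Kahan digit count (9 for float); float_from_bits and count_float_failures are hypothetical names, and nothing here comes from the original test code.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <iomanip>
#include <limits>
#include <sstream>

// Reinterpret a 32-bit pattern as a float (assumes IEEE 754 binary32).
float float_from_bits(std::uint32_t bits)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Round-trip every finite float whose bit pattern lies in [first, last]
// through decimal text and count the values that do not come back exactly.
std::uint64_t count_float_failures(std::uint32_t first, std::uint32_t last)
{
    assert(first <= last);
    // Kahan's digit count for a 24-bit float mantissa: 2 + 7 = 9.
    const int digits = 2 + std::numeric_limits<float>::digits * 3010 / 10000;
    std::uint64_t failures = 0;
    // A 64-bit counter lets the loop reach last == 0xFFFFFFFF without wrapping.
    for (std::uint64_t b = first; b <= last; ++b) {
        float original = float_from_bits(static_cast<std::uint32_t>(b));
        if (!std::isfinite(original)) {
            continue;  // skip infinities and NaNs
        }
        std::stringstream stream;
        stream << std::setprecision(digits) << original;
        float restored = 0.0f;
        stream >> restored;
        if (restored != original) {
            ++failures;
        }
    }
    return failures;
}
```

Passing 0 and 0xFFFFFFFF covers every float, which is the all-night run. Because the range is a parameter, it can also be split across cores or narrowed to a region of interest. The same approach for double would need 2^64 patterns, which is why it is out of reach.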

| -----Original Message-----
| From: boost-bounces@lists.boost.org
| [mailto:boost-bounces@lists.boost.org] On Behalf Of John Maddock
| Sent: 17 March 2006 19:32
| To: boost@lists.boost.org
| Subject: Re: [boost] [serialisation] reading and writing reals
|
| Proving the code correct is tricky though, when I've more time I'll double
| check with Knuth Vol 2 and with ACM and see if there's any literature that
| can help out.

The key reference seems to be

William D Clinger, How to read floating point numbers accurately,
http://citeseer.ist.psu.edu/clinger90how.html

I will try to re-read (and perhaps even understand it!)

Paul
participants (2)
- John Maddock
- Paul A Bristow