Re: [Boost-users] [serialization]

17 Nov 2005

On 11/13/05 1:26 PM, "Robert Ramey" <ramey@rrsd.com> wrote:
...
Daryle Walker wrote:
...
In this case, we would have a bug in the decoding and encoding
routines. The bug would be that they don't match.  If the coding
routines are calling the standard library (like I think they are for
text archives of primitive types), then the bug is from the standard
library not being symmetric.  I think the standard library is
supposed to give symmetric text I/O.
I don't know what the standard library is supposed to do.  But
the fact is that at least some implementations of the standard
library are not handling text I/O symmetrically in at least two cases:
  - uninitialized bools
  - float/double NaN, +/- inf, etc.
But it is never legal to push uninitialized variables through an output
system, text or binary.  Faulting the library for that is a severe stretch.
It's not something that can be worked on, unlike the NaN case.
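
To make the asymmetry concrete, here is a minimal sketch, not the library's actual code. `round_trips` probes whether plain iostream text I/O returns the value that was written; for NaN and infinities the answer depends on the standard library in use. `save_double` and `load_double` show one way the NaN case could be worked on, by writing fixed tokens. All three names and the token spellings are hypothetical.

```cpp
#include <cmath>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical probe: write x as text, read it back, compare.
inline bool round_trips(double x) {
    std::stringstream ss;
    ss.precision(std::numeric_limits<double>::max_digits10);
    ss << x;
    double y = 0.0;
    if (!(ss >> y)) return false;             // extraction failed
    if (std::isnan(x)) return std::isnan(y);  // NaN never compares equal
    return x == y;
}

// Hypothetical work-around: non-finite values become fixed tokens.
// Note: this changes the precision setting of the stream it is given.
inline void save_double(std::ostream& os, double x) {
    if (std::isnan(x)) {
        os << "nan";
    } else if (std::isinf(x)) {
        os << (x < 0 ? "-inf" : "+inf");
    } else {
        os.precision(std::numeric_limits<double>::max_digits10);
        os << x;
    }
}

// Reads one whitespace-delimited token written by save_double.
inline double load_double(std::istream& is) {
    std::string token;
    is >> token;
    if (token == "nan") return std::numeric_limits<double>::quiet_NaN();
    if (token == "+inf") return std::numeric_limits<double>::infinity();
    if (token == "-inf") return -std::numeric_limits<double>::infinity();
    std::istringstream num(token);
    double x = 0.0;
    num >> x;
    return x;
}
```

The token approach does not depend on how a given library prints or parses non-finite values, which is the part that varies between environments. It does not preserve NaN payloads or sign.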
...
...
so how much effort should we spend
to work around such bugs?
Actually, the effort for uninitialized bool is pretty trivial and I've
incorporated an assertion into the appropriate spot.
But it's non-portable.  Your environment lets you get away with it.  What
about the user of another environment where the uninitialized read does
cause a crash & burn?

What if the variable's bit pattern just happens to match a valid state?

Sometimes the simplest solution isn't the best.
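
A small sketch of both objections, under stated assumptions. `looks_like_bool` is a hypothetical stand-in for whatever range check an assertion might perform; it assumes `true` is stored as 1 and `false` as 0, which is itself platform-specific. It works on a plain byte, so the sketch does not read an indeterminate `bool`. The `settings` struct is likewise hypothetical and shows the fix on the user's side.

```cpp
#include <ostream>
#include <sstream>

// Hypothetical check on the raw byte behind a bool.
// Assumption: false is stored as 0 and true as 1 on this platform.
inline bool looks_like_bool(unsigned char raw) {
    return raw == 0 || raw == 1;
}

// The fix belongs at the source: give every member a value.
struct settings {
    bool verbose;
    int level;
    settings() : verbose(false), level(0) {}  // never indeterminate
};

// Writes the members as text; safe because settings is always initialized.
inline void save(std::ostream& os, const settings& s) {
    os << s.verbose << ' ' << s.level;
}
```

Leftover garbage that happens to be 0 or 1 passes the check unnoticed, so the check can only catch some cases, and only on environments where the read itself is harmless.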
...
For the others, it's a little more work.  If someone has enough
interest to actually make and test the changes, I'll be happy to
receive them, check them, and incorporate them into the code.
I would expect that only some small changes in ??text_i/oprimitive
would be necessary.  Oh it would be a bad idea to post them to
the list.  Personally, I'm of the view that trying to serialize a NaN
would probably be a bug in user code.  I'm aware that not everyone
would agree with this.  Maybe throwing an exception might
be enabled/disabled with another flag applied at archive open
time (like no_header, etc).  Anyway, it's not a big issue for me.
Presumably, if someone has interest he can submit his
improvements and we can discuss it then.
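
For illustration only, the open-time flag idea might look roughly like this. `text_writer`, `writer_flags`, and `allow_nonfinite` are hypothetical names made up for this sketch; they are not part of the serialization library, and the real change would presumably live in the text primitives mentioned above.

```cpp
#include <cmath>
#include <ostream>
#include <sstream>
#include <stdexcept>

// Hypothetical flags, loosely modeled on open-time flags such as no_header.
enum writer_flags : unsigned { default_flags = 0, allow_nonfinite = 1 };

// Hypothetical stand-in for a text output archive.
class text_writer {
public:
    explicit text_writer(std::ostream& os, unsigned flags = default_flags)
        : os_(os), flags_(flags) {}

    // Throws on NaN or infinity unless allow_nonfinite was given at open time.
    void save(double x) {
        if (!std::isfinite(x) && !(flags_ & allow_nonfinite))
            throw std::invalid_argument("non-finite value in text output");
        os_ << x << ' ';
    }

private:
    std::ostream& os_;
    unsigned flags_;
};
```

With the default flags, saving a NaN throws; a writer opened with `allow_nonfinite` passes the value to the stream as-is, and whatever text the standard library produces for it is still implementation-dependent.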
And you think serializing an uninitialized value isn't a bug?!
...
...
Reading from an uninitialized variable, like what could happen in the
original case during encoding, is not a problem any library can fix.
The programmer just has to be non-sloppy.
lol - and furthermore, if we catch him doing something like this
we should make sure we don't tell him so he gets his deserved
punishment!!!!
It isn't a matter of "should," but "could."  We cannot portably warn the
user since undefined behavior doesn't have to play along with your
"resolution" code.
...
...
...
I think in the case reported by Paul, he's not necessarily using the
uninitialised value, as it's an object that is kind of like a
discriminated union. I think this usage parallels the idea of NaNs
etc in floating point. I'd expect these to be read back in too, as
you suggested.
The problems are not in parallel.  For a discriminated union, it is
the responsibility of the coding author to determine which fields are
active and only read/write those particular fields and skip the
inactive fields.  The unusability of NaN values is from a high-level
perspective; such values are still valid objects from a low-level
view.  (And the high-level view is just an opinion; some programmers
might want to keep NaNs around as a flag.)
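
A minimal sketch of that responsibility, using a hypothetical `number` type and free `save`/`load` functions over plain iostreams rather than the serialization library's interface: the tag is written first, and only the field it selects is ever read or written.

```cpp
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical discriminated union: kind says which member is active.
struct number {
    enum kind_t { is_int, is_double } kind;
    union {
        int i;
        double d;
    };
};

// Write the tag, then only the active field; the inactive one is never read.
inline void save(std::ostream& os, const number& n) {
    os << static_cast<int>(n.kind) << ' ';
    if (n.kind == number::is_int) os << n.i;
    else os << n.d;
}

// Read the tag, then only the field it selects.
inline bool load(std::istream& is, number& n) {
    int k = 0;
    if (!(is >> k)) return false;
    if (k == number::is_int) {
        int v = 0;
        if (!(is >> v)) return false;
        n.kind = number::is_int;
        n.i = v;  // assignment makes i the active member
        return true;
    }
    if (k == number::is_double) {
        double v = 0.0;
        if (!(is >> v)) return false;
        n.kind = number::is_double;
        n.d = v;  // assignment makes d the active member
        return true;
    }
    return false;  // unknown tag
}
```

Nothing here touches an uninitialized field, so the bool problem and the NaN problem do not arise from the same cause.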
This articulates my view.  I used the term "overloading" as
in semantic overloading where we might use NaN to mean something
specific.  I can see where this might be useful in some narrow contexts
but I would generally consider it an error-prone practice.  Just
one man's opinion.
The problem is that the "invalidity" of NaNs is at the semantic level.
Objects with a NaN value are still valid objects at the base level.  You
would have to add some sort of censoring to your framework to make sure NaNs
don't get through.  And do you hard-wire this to IEEE floats, or generalize
it to any type with "invalid" values?  But the main invalidity test is that
its I/O is asymmetric.  Such a test would have to be hacked in for each
environment (compiler, library, OS, and HW combination).  Weren't we
supposed to be writing less serialization code, not more?
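
To give a sense of the extra machinery this implies, here is a rough sketch of a generalized "censor". `value_filter` and `acceptable` are hypothetical names, not part of any existing library. It covers only the semantic filter for floating-point types; the per-environment I/O asymmetry test is not attempted.

```cpp
#include <cmath>
#include <type_traits>

// Hypothetical customization point: by default every value is acceptable.
template <typename T, typename Enable = void>
struct value_filter {
    static bool acceptable(const T&) { return true; }
};

// Floating-point types: reject NaN and infinities.
template <typename T>
struct value_filter<
    T, typename std::enable_if<std::is_floating_point<T>::value>::type> {
    static bool acceptable(const T& x) { return std::isfinite(x); }
};

// Convenience wrapper that deduces T from its argument.
template <typename T>
bool acceptable(const T& x) {
    return value_filter<T>::acceptable(x);
}
```

Every other type with "invalid" values would need its own specialization, which is exactly the additional code being questioned.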

-- 
Daryle Walker
Mac, Internet, and Video Game Junkie
darylew AT hotmail DOT com