Re: [boost] [serialization docs] Ping?

22 Sep 2005

      David Abrahams <dave <at> boost-consulting.com> writes:

[...]
...
As I've said before, you will need eventually to describe the
relationship between compatible loading and saving archives.  It will
be something like
    T x, y;
    // arbitrary operations on x to set its state
    sar & x;
    lar & y;

Postcondition: y is equivalent to x
[...]
...
Joaquin's post takes an "innovative" approach to the problem of
specifying semantics but it isn't at all clear to me that it holds
water.

I can do little to argue against that criticism.
If you have specific concerns about the approach, please do bring
them here.
...
The reason that "equivalent" is a fuzzy term in C++ comes down
to the fact that two distinct objects always have detectably distinct
addresses, so no two distinct objects can _truly_ be equivalent.

As far as I know, the only definitions of (object) equivalence
in the standard are given in connection with strict weak orderings
induced by comparison functors. Besides that, I failed to find
any reference to what two objects being "equivalent" means.
...
Leaving aside that language corner, the idea of equivalence works
perfectly well.

For the sake of the discussion, let's assume that "a and b
are equivalent" is somehow defined as, or related to, "a==b". My
thesis is that there are serious objections to this
definition of equivalence in the context of serialization:

1. A serializable type need not be equality comparable.
2. "a==b" is a C++ expression, which implies that a and b are
objects living inside the same program. If I save an object a
on my PC, pass the file to you and you load it a year later as
b on your Linux box, what is "a==b" supposed to mean?
3. A serializable type can be implemented without observing
the "a==b" rule: for instance, a list-like container can
load the elements in reverse order. As I understand it, this is
a perfectly legitimate implementation that shouldn't be banned
because of the "a==b" restriction.
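
To make point 3 concrete, here is a minimal sketch. It does not use
Boost.Serialization: ReversingList, save, load and round_trip are names
invented for illustration, and plain iostreams stand in for the archive
pair. The container saves its elements front to back and loads them back
to front, so the loaded object holds the same elements yet does not
compare equal to the saved one.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical list-like container; not part of Boost.Serialization.
struct ReversingList {
    std::vector<int> items;

    // Saving writes the element count, then the elements front to back.
    void save(std::ostream& out) const {
        out << items.size();
        for (int v : items) out << ' ' << v;
    }

    // Loading reads the same sequence but stores it back to front.
    void load(std::istream& in) {
        std::size_t n = 0;
        in >> n;
        std::vector<int> tmp(n);
        for (std::size_t i = 0; i < n; ++i) in >> tmp[i];
        items.assign(tmp.rbegin(), tmp.rend());
    }
};

// Saves src into a string stream and loads a fresh object from it.
inline ReversingList round_trip(const ReversingList& src) {
    std::stringstream buffer;
    src.save(buffer);
    ReversingList dst;
    dst.load(buffer);
    return dst;
}
```

Under these assumptions, round-tripping a list holding 1, 2, 3 yields a
list holding 3, 2, 1: nothing is lost, but an "a==b" postcondition on the
element sequence would not hold.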

One can argue that (1) and (2) can be overcome with a
"fuzzier" definition of equivalence relying on the reader's
intuition about this relationship, but (3), IMHO, breaks
down any hope of attaching equivalence to serialization
semantics: ultimately, archives are not responsible for
upholding the equivalence rule, since they delegate to user-provided
serialize() functions.

So, from my point of view, the real task of an input/output
archive pair is to ensure that, when a T::serialize function is
invoked on loading, the input context (i.e., the permissible >>
operations on the input archive) is a replica of the output sequence.
This rule recursively descends to primitive (in the serialization
sense) types, where an equivalence rule can actually be provided.
My (sketchy) proposal is merely a formalization of this
idea.
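
A rough illustration of that idea, as a sketch only: ToyOutputArchive,
ToyInputArchive and Person are hypothetical names, not Boost.Serialization
classes, and only int and std::string are treated as primitive. The
archive pair guarantees one thing: each >> on the input side yields the
value written by the matching << on the output side. What the user's load
function does with those values is up to the user.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical output archive: records a sequence of primitive values.
class ToyOutputArchive {
public:
    ToyOutputArchive& operator<<(int v) {
        out_ << v << ' ';
        return *this;
    }
    ToyOutputArchive& operator<<(const std::string& s) {
        // A length prefix lets the input side recover embedded spaces.
        out_ << s.size() << ' ' << s << ' ';
        return *this;
    }
    std::string str() const { return out_.str(); }

private:
    std::ostringstream out_;
};

// Hypothetical input archive: replays the recorded sequence in order.
class ToyInputArchive {
public:
    explicit ToyInputArchive(const std::string& data) : in_(data) {}
    ToyInputArchive& operator>>(int& v) {
        in_ >> v;
        return *this;
    }
    ToyInputArchive& operator>>(std::string& s) {
        std::size_t n = 0;
        in_ >> n;
        in_.get();  // skip the single separator after the length
        s.resize(n);
        in_.read(&s[0], static_cast<std::streamsize>(n));
        return *this;
    }

private:
    std::istringstream in_;
};

// User type: load sees exactly the sequence that save produced.
struct Person {
    std::string name;
    int age = 0;
    void save(ToyOutputArchive& ar) const { ar << name << age; }
    void load(ToyInputArchive& ar) { ar >> name >> age; }
};
```

Here the equivalence guarantee lives at the level of int and std::string;
whether a loaded Person is "equivalent" to the saved one depends entirely
on how Person::load uses the replayed values.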
...
I suggest you use that, and the established
conventions from the literature, to describe semantics.  You have,
essentially, an emergency on your hands -- this is not the time to try
untested approaches.  First plug the dyke and then, if you have time,
think about a rewrite.

Without wanting to sound harsh, I think that what
you propose as established conventions for describing
serialization semantics holds little real information and,
worse yet, can mislead readers into assuming that Boost.Serialization
is constrained by the equivalence rule when it is not
(cf. point 3 above). The current docs are better in this respect,
since at least they don't assert false semantic rules.

Best regards,

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo
