Re: [boost] [serialization docs] Ping?

22 Sep 2005

      Joaquín Mª López Muñoz <joaquin@tid.es> writes:
...
David Abrahams wrote:
...
Joaquín Mª López Muñoz <joaquin@tid.es> writes:
...
In my serialization stuff for Boost.MultiIndex I actually have a
serializable type that does not conform to the equivalence rule. Its
layout kinda looks like:
template<typename Value>
struct node
{
  Value v;

  template<class Archive>
  void serialize(Archive& ar,const unsigned int)
  {
    // do nothing
  }
};
I use this weird construct to make node trackable, but no contents
information is dumped to the archive (that is taken care of somewhere
else in the program). In case you're curious, this arises in connection
with serialization of iterators.
I can't imagine why you'd need that; a hint would help me to
understand better.
It's a little hard to grasp; this particular issue took me literally
weeks of thinking, but I'll try to explain it a little more. Beware this
doesn't add much to our current discussion, you might want to skip:
Suppose we are implementing serialization for a custom container:
save and load are straightforward enough:
class container{
  save(...)
  {
    for(const_iterator it=begin(),it_end=end();it!=it_end;++it){
      ar<<*it;
    }
  }
  load(...)
  {
    clear();
    for(iterator it=begin(),it_end=end();it!=it_end;++it){
      value_type v;
      ar>>v;
      push_back(v);
      ar.reset_object_address(&v,&back());
Assuming value_type is default constructible and push_back doesn't
invalidate any addresses of other objects, I guess so.  But in that
case I'd still preallocate enough elements and deserialize them in
place.
...
}
  }
};
Now we want to add serialization for iterators. One ugly way would be
as follows:
class iterator{
  save(...){
    ar<<&(operator*()); // save pointer to element
  }
  load(...){
    value_type* pv;
    ar>>pv;
    node* pn;
    // cast from pv to pn: possibly nonportable.
    assign(pn);
Clearly.  The "right thing to do" is to serialize all the nodes as part of
serializing the container.  Then this "just works," no?
...
}
};
This is potentially nonportable and, besides, won't work for nontracked
value_types.
What we want is to archive pointers to the internal nodes, rather than the values:
Right.
...
class iterator
{
  save(...){
    ar<<node_ptr;
  }
  load(...){
    node* pn;
    ar>>pn;
    node_ptr=pn;
  }
};
But for this to work, nodes must be serialized first so that they can be tracked
later.
class container{
  save(...)
  {
    for(const_iterator it=begin(),it_end=end();it!=it_end;++it){
      ar<<*it; // save value
      ar<<*it.node_ptr; // save node
Why wouldn't your node just implement serialization that serializes
its contained value?
...
}
  }
  load(...)
  {
    clear();
    for(iterator it=begin(),it_end=end();it!=it_end;++it){
      value_type v;
      ar>>v;
      push_back(v);
      ar.reset_object_address(&v,&back());
      ar>>*(--end()).node_ptr; // "load" node
    }
  }
};
That's the purpose of the node serialization stuff. The implementation does
nothing except signal to Boost.Serialization where later node pointers must
point.
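A minimal sketch of the tracking idea described above, using toy stand-ins rather than Boost.Serialization itself: toy_oarchive, toy_iarchive, save_tracked, load_tracked and round_trip_index are hypothetical names invented for this illustration. "Saving" a node writes no contents and only registers its address, so that a pointer saved later can be resolved to the matching node on load.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Toy stand-ins, NOT Boost.Serialization's API: they model only address tracking.
struct toy_oarchive
{
  std::map<const void*, std::size_t> ids;   // saved address -> object id
  std::vector<std::size_t> ptr_records;     // ids written by pointer saves

  // "Save" a tracked object with empty contents: just register its address.
  void save_tracked(const void* p)
  {
    std::size_t id = ids.size();
    ids[p] = id;
  }

  // Saving a pointer writes the id of an object that was saved earlier.
  void save_pointer(const void* p) { ptr_records.push_back(ids.at(p)); }
};

struct toy_iarchive
{
  std::vector<void*> addresses;             // object id -> address after loading
  std::vector<std::size_t> ptr_records;
  std::size_t next;

  explicit toy_iarchive(const toy_oarchive& oa)
    : ptr_records(oa.ptr_records), next(0) {}

  // "Load" a tracked object: nothing is read, the new address is registered.
  void load_tracked(void* p) { addresses.push_back(p); }

  // Loading a pointer maps the stored id to the new address.
  void* load_pointer() { return addresses.at(ptr_records.at(next++)); }
};

struct node { int v; };

// Saves three nodes and a pointer to node number `pointed`, reloads into
// fresh nodes, and returns the index of the node the restored pointer hits.
std::size_t round_trip_index(std::size_t pointed)
{
  std::vector<node> out(3);
  toy_oarchive oa;
  for (std::size_t i = 0; i < out.size(); ++i) oa.save_tracked(&out[i]); // nodes first
  oa.save_pointer(&out[pointed]);                                        // then the "iterator"

  std::vector<node> in(3);   // values are assumed to be restored elsewhere
  toy_iarchive ia(oa);
  for (std::size_t i = 0; i < in.size(); ++i) ia.load_tracked(&in[i]);
  node* pn = static_cast<node*>(ia.load_pointer());
  return static_cast<std::size_t>(pn - &in[0]);
}
```

Calling round_trip_index(2) saves a pointer to the third node and checks that the restored pointer lands on the third reloaded node, even though no node contents ever went through the toy archive.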
Well, I could probably get this if I thought hard enough about it, but
I don't yet.  Of course I could be missing something, but it seems like
a hack to me.  Serializing and deserializing the nodes directly seems
a lot cleaner.
...
...
...
* An input archive iar is compatible with an output archive oar if
  1. iar allows a sequence of >> ops matching the corresponding << ops
  made upon oar (matching defined in terms of types involved and
  nesting depth of the call.)
Is the nesting depth of the call really relevant?
Ummm... No, we can drop that: the nesting thing is redundant with
the requirement on serializable types about matching of << and >> ops.
On the other hand, XML archives do enforce the nesting abidance,
but this is more of an implementation artifact.
Good.  It gets simpler.
...
...
...
2. For primitive serialization types, the restored copies are equivalent
  to their original (expand on this, especially with respect to pointers.)
* A type T is serializable if it is primitive serializable or else it defines
  the appropriate serialize (load/save) function such that the sequence
  of >> ops in load() matches the << ops in save().
[This is not a requirement] For each serializable type, the implementor
can define "equivalence" in terms of its constituent types. For instance,
for std::vector:
Given a std::vector<T> out, where T is serializable, and a restored copy in,
then in(i).size()==out(i).size() and each in(i)[j] is a restored copy of
out(i)[j].
I don't think this latter part is worth much.  I think it might be
worth defining an EquivalentSerializable concept, though.
One can do the following:
Let T be a serializable type and Pred an associated equality
predicate inducing an equivalence relationship on T. Then T is said
to be EquivalentSerializable (under Pred) if
p(x,y)==true
for all p of type Pred
You said Pred was a predicate; now you're saying it's a type.  I think
you were right the first time.  You'll never satisfy that for all p of
type bool(*)(int,int) for example.
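A small sketch of that objection, with hypothetical names (same_int, never, restored_copy, holds_for): two functions share the type bool(*)(int,int), yet only one of them can hold between a value and its restored copy, so quantifying over every p of the type cannot be satisfied. The round trip is modelled by a plain copy, which is an assumption standing in for a real archive.

```cpp
#include <cassert>

// Two functions of the very same type bool(*)(int,int).
bool same_int(int x, int y) { return x == y; }
bool never(int, int) { return false; }

typedef bool (*int_pred)(int, int);

// Stand-in for saving and restoring an int; assumed to reproduce the value.
int restored_copy(int x) { return x; }

// Does p(x,y) hold when y is a restored copy of x?
bool holds_for(int_pred p, int x) { return p(x, restored_copy(x)); }
```

Here holds_for(&same_int, 42) is true while holds_for(&never, 42) is false, although both arguments have the same type.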
...
and x and y of type T such that y is a restored copy of x.
This leaves to the implementor of a UDT the open task of giving the
appropriate associated equality predicate (by default we can assume
std::equal_to).
I think you mean ==
...
Then we can rewrite the postcondition on std::vector as
if T is EquivalentSerializable under Pred, std::vector<T> is
EquivalentSerializable.
Nope.  You have to say under what predicate it is
EquivalentSerializable.  And when a nonstandard predicate is used for
T there may not be any such predicate for the vector.
...
(The statement is a little more complex if we take a Pred other than
the default.)  Of course, this EquivalentSerializable concept does
not save us the task of first providing archive compatibility and
Serializable concepts the hard way
Of course not.
...
and it is only applicable intraprogram.
That's only true if you consider Pred to be a callable C++ predicate
rather than a logical one.
...
Does this sound good to you?
Yes and no.  It's crafty, but you have a pretty big gaping hole as
demonstrated by the vector example.  I would be very happy with the
good old fuzzy notion of equivalence here, but if you can close the
hole, I don't mind adding predicates to the mix.

Okay, how about this: the predicate is tightly bound to the type.  So
the predicate for vector<T> is defined to be that the two vectors have
the same length and that each corresponding element of the two vectors
satisfies the predicate that's bound to T.
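
One way this could look in C++, as a sketch only: the trait bound_equiv, the helper equivalent and the type cached_value are hypothetical names for illustration and are not part of Boost.Serialization. The predicate bound to a type defaults to ==, and the one bound to vector<T> is defined through the one bound to T.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical trait: the equivalence predicate bound to T. Default is ==.
template<typename T>
struct bound_equiv
{
  static bool test(const T& x, const T& y) { return x == y; }
};

// Predicate bound to vector<T>: same length, and corresponding elements
// satisfy the predicate bound to T.
template<typename T, typename A>
struct bound_equiv<std::vector<T, A> >
{
  static bool test(const std::vector<T, A>& x, const std::vector<T, A>& y)
  {
    if (x.size() != y.size()) return false;
    for (std::size_t i = 0; i < x.size(); ++i)
      if (!bound_equiv<T>::test(x[i], y[i])) return false;
    return true;
  }
};

// Hypothetical UDT whose equivalence ignores a field assumed not to be archived.
struct cached_value
{
  int value;
  int cache;
};

template<>
struct bound_equiv<cached_value>
{
  static bool test(const cached_value& x, const cached_value& y)
  { return x.value == y.value; }
};

// Convenience wrapper that picks the predicate bound to T.
template<typename T>
bool equivalent(const T& x, const T& y) { return bound_equiv<T>::test(x, y); }
```

With this arrangement a vector<cached_value> needs no predicate of its own: two such vectors compare as equivalent when their lengths match and each pair of elements agrees on value, whatever the cache fields hold.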

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com