[boost] Re: Using Serialization for binary marshalling

19 Apr 2004

      Robert Ramey <ramey <at> rrsd.com> writes:
...
...
I'm interested in
being able to save to proprietary formats (which often means that the 
applications involved never got around to specifying a standard 
format...).  There must be thousands of ad-hoc binary formats in use 
today.
Hmmm - I'm a little wary of this.  Though I'm not sure I know what you mean.
Yes, I had better explain here.  In the case that I'm addressing, the format
is not a means to an end, but an end in itself.  The goal of this effort is
to use the serialization library framework not to produce reversible 
transformations between arbitrary C++ and a bytestream, but to take 
targeted C++ objects and produce known binary representations.
...
Some persons interested in the library have hoped it can be used to generate
some specific format.  But that format doesn't accommodate all the
information required to rebuild an arbitrary C++ data structure, so in order
to do so one ends up coupling the serialization of classes to the archive
format - just exactly what the serialization library is designed to avoid.
I don't think it is a case of coupling, merely one of limitation. An object
that can be serialized into XDR format must use only a limited range of
C++ types in its composition - but having accepted that limitation, it can
still be serialized into a less limiting archive type with the same code.
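
A minimal sketch of this point, not taken from the code under discussion.
The names xdr_like_oarchive, text_oarchive_sketch and point are hypothetical,
and the real library's archive classes are considerably richer.  The limited
archive accepts only 32-bit unsigned integers, yet the same serialize()
member also works with a less limiting archive:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical limited archive: only 32-bit unsigned integers are
// representable, written big-endian as in XDR.
struct xdr_like_oarchive {
    std::vector<unsigned char> bytes;

    xdr_like_oarchive& operator&(std::uint32_t v) {
        for (int shift = 24; shift >= 0; shift -= 8)
            bytes.push_back(static_cast<unsigned char>((v >> shift) & 0xFF));
        return *this;
    }

    // Any other type is rejected at compile time: this is the limitation.
    template <class T>
    xdr_like_oarchive& operator&(T) = delete;
};

// Hypothetical less limiting archive: anything std::to_string accepts.
struct text_oarchive_sketch {
    std::string text;

    template <class T>
    text_oarchive_sketch& operator&(const T& v) {
        text += std::to_string(v);
        text += ' ';
        return *this;
    }
};

// A class restricted to types the limited archive can represent.
struct point {
    std::uint32_t x;
    std::uint32_t y;

    // One serialize() for every archive type; no archive-specific code.
    template <class Archive>
    void serialize(Archive& ar, const unsigned int /*version*/) {
        ar & x & y;
    }
};
```

Adding, say, a double member to point would still compile against
text_oarchive_sketch but be refused by xdr_like_oarchive, which is the
limitation without the coupling.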
...
Actually, my current thinking is to add a section to the documentation (I
love my documentation!  Many people have contributed to it by careful
reading and criticism) which suggests a transition path from a proprietary
format to usage of the serialization library.  This transition would be
basically
a) make a program which loads all your "old" data in the "old" way.
b) serialize the structures.
So I'm skeptical of trying to adjust to "old" proprietary formats with the
serialization library.
This is a transition from an old format to the brave new world, but in
the context of persistence.  It doesn't apply in the case of serialization
for marshalling.
...
...
...
1) I wonder why you derived ordered_oarchive  from
basic_binary_oprimitive<Archive, OStream>
...
I'm using the save_binary / load_binary functions from 
basic_binary_oprimitive
and basic_binary_iprimitive.
Maybe we'll consider factoring this out into a standalone function.  We'll
keep an eye on this for now.  In fact, if the salient features of XDR are
endian awareness, alignment etc., I'm wondering if some of the functionality
of your classes shouldn't be moved into your own version of load_binary,
thereby making inheriting the native one unnecessary.
I don't really see any need for this at this point.  Once the data is in the
correct binary format, one save_binary function is as good as another. I
don't think that inheriting extraneous 'save' members from the 
basic_binary_oprimitive class is a major concern.  Also, as you point out
later, I want to inherit any work regarding issues with streams, locales
and whatever else is going on way down low.
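
As a rough illustration of that split (an assumption-laden sketch, not the
posted code): the format-specific step converts a value to wire byte order,
and a generic raw-write step pushes the bytes to the stream.  The names
to_big_endian, write_raw and save_u32 are hypothetical; write_raw merely
stands in for whatever save_binary the archive inherits.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Format-specific step: lay the value out in big-endian (network) order,
// independent of the host's native byte order.
inline std::array<char, 4> to_big_endian(std::uint32_t v) {
    return {{static_cast<char>((v >> 24) & 0xFF),
             static_cast<char>((v >> 16) & 0xFF),
             static_cast<char>((v >> 8) & 0xFF),
             static_cast<char>(v & 0xFF)}};
}

// Generic step: once the bytes are in wire format, writing them is the
// same for every binary archive.
inline void write_raw(std::ostream& os, const void* address,
                      std::size_t count) {
    os.write(static_cast<const char*>(address),
             static_cast<std::streamsize>(count));
}

// The two steps combined for one primitive type.
inline void save_u32(std::ostream& os, std::uint32_t v) {
    const std::array<char, 4> wire = to_big_endian(v);
    write_raw(os, wire.data(), wire.size());
}
```

Only to_big_endian knows anything about the target format, which is why the
raw-write function can be inherited unchanged.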
...
...
2) there's some stuff in boost that addresses alignment in a guaranteed(?)
portable manner that may be relevant here. See #include
<boost/aligned_storage.hpp>. BTW - the best way to make your code portable
without cluttering it up with #ifdef etc... is to use more boost stuff - let
other people clutter up their code with #ifdef all over the place.
...
Yes, but the aligned_storage template helps with platform-specific 
alignment within the machine.  I don't see how it helps with 
platform-independent alignment within the content of the archive...
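
To make the distinction concrete, here is a small sketch of alignment within
the archive content, assuming the XDR rule that opaque data is padded with
zero bytes to a multiple of four.  The function names are hypothetical and
the code is not from the posted classes.

```cpp
#include <cstddef>
#include <vector>

// Number of zero bytes needed to bring n up to a multiple of four.
inline std::size_t xdr_padding(std::size_t n) {
    return (4 - n % 4) % 4;
}

// Append n bytes of fixed-length opaque data, then the padding.  The
// layout depends only on n, never on the host's own alignment rules.
inline void put_opaque_fixed(std::vector<unsigned char>& out,
                             const unsigned char* data, std::size_t n) {
    out.insert(out.end(), data, data + n);
    out.insert(out.end(), xdr_padding(n), static_cast<unsigned char>(0));
}
```

The padding is a property of the byte stream itself, so nothing here
touches in-memory alignment of the kind aligned_storage deals with.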
OK - I would like to see that made a little more transparent and better
explained with comments.  It's an important part of the issues being
addressed.
Ok, no problem.
...
...
3) I'm curious about the override for the saving of vector.  ...
...
I need to override the serialization of vector, because the vector must be
serialized with a known policy to yield a required layout in the archive.
The fact that this serialization happens (at this point in time) to be
exactly the same as the default implementation is not relevant - that can
 be changed at any time, but the CDR, XDR and other binary formats must
not change.
I'm still not convinced - it seems to me that it shouldn't need to be
overridden for XDR and CDR, which is what these classes do.  What about list,
deque, set, etc.?
Well, I never thought these were necessary, remembering that I am providing
for classes which are designed to be serialized into a particular binary
format.  In a binary format, all collections will boil down to a group of 
repetitions, which are either preceded by a length/count argument, or whose
length is a defined property of the format itself.

For me, vector has always sufficed, in either fixed_length<>, or 
variable_length<> guise.  Perhaps others have differing experience.
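
The two layouts can be sketched as follows.  This is only an illustration
under assumptions: put_fixed_length and put_variable_length are hypothetical
free functions, not the fixed_length<> and variable_length<> templates
mentioned above, and the element type is fixed to a 32-bit unsigned integer
written big-endian.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

using buffer = std::vector<unsigned char>;

// Write one 32-bit unsigned integer in big-endian order.
inline void put_u32(buffer& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<unsigned char>((v >> shift) & 0xFF));
}

// Variable-length layout: a count, followed by that many elements.
inline void put_variable_length(buffer& out,
                                const std::vector<std::uint32_t>& v) {
    put_u32(out, static_cast<std::uint32_t>(v.size()));
    for (std::uint32_t e : v) put_u32(out, e);
}

// Fixed-length layout: the count N is a property of the format, so only
// the elements are written and a size mismatch is an error.
template <std::size_t N>
void put_fixed_length(buffer& out, const std::vector<std::uint32_t>& v) {
    if (v.size() != N)
        throw std::length_error("element count does not match format");
    for (std::uint32_t e : v) put_u32(out, e);
}
```

Either function fixes the layout in the archive regardless of how the
library's default vector serialization may change in future.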
...
...
Yes, I see what you're saying.  I'll have a think about this - but the
term 'marshalling' is not particularly prevalent in the code.  Even if 
the term does have broad application, I think I am using it in the 
traditional sense.
Your library does marshalling (as I understand the term is usually used).
My complaint is that it's too modest.  Your library does more than that.
When I started this library there was strong usage of the term "persistence"
which led to the misconception that the library had nothing to do with
"marshalling".  I see serialization as useful in a number of things -
persistence, marshalling and who knows what else? (e.g. generating a crc on
the whole data state of the program to detect changes).  That's why I went
to much effort to avoid this characterization of the library.  Your addition
will gain strength from leveraging this and, by fitting in with the
established pattern, will be found easier to use.  This will make it more
successful.  Also, by following such a pattern it will almost entirely
eliminate the need for special documentation.
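
The checksum idea in the parenthesis can be sketched as an archive that
stores nothing and only folds the saved bytes into a hash.  This is a
hypothetical illustration, not library code: checksum_oarchive, settings and
state_checksum are made-up names, and a 32-bit FNV-1a hash is swapped in for
a real CRC to keep the example short.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical archive: every saved value is hashed, nothing is stored.
class checksum_oarchive {
public:
    template <class T>
    checksum_oarchive& operator&(const T& value) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "sketch handles plain data members only");
        const unsigned char* p =
            reinterpret_cast<const unsigned char*>(&value);
        for (std::size_t i = 0; i < sizeof(T); ++i) {
            hash_ ^= p[i];        // FNV-1a: xor the byte in,
            hash_ *= 16777619u;   // then multiply by the FNV prime
        }
        return *this;
    }

    std::uint32_t value() const { return hash_; }

private:
    std::uint32_t hash_ = 2166136261u;  // FNV-1a 32-bit offset basis
};

// Example state, serialized with the usual member template.
struct settings {
    std::int32_t volume;
    std::int32_t brightness;

    template <class Archive>
    void serialize(Archive& ar, const unsigned int /*version*/) {
        ar & volume & brightness;
    }
};

// Checksum of the current state; compare two results to detect a change.
inline std::uint32_t state_checksum(settings s) {
    checksum_oarchive ar;
    s.serialize(ar, 0);
    return ar.value();
}
```

Because the hash is taken over the native in-memory bytes, the result is
only comparable within one program on one platform, which is enough for
change detection but not for exchange between machines.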
But there are inherent limitations in marshalling (per my usage).  Pointers,
which are adeptly handled by the general library, are not valid elements
of a marshalled data set.  Not all types can be represented in all formats.
These are aspects that need documentation, because they render the archives
fit only for marshalling, not for the more general 'serialization'.  As is,
the library certainly supports the most general concept, but my ambitions
are more mundane.

Of course, higher level constructs can be implemented on marshalling
base archives.  IIRC, 'IIOP' is the layer above CDR in CORBA, which provides
for remote object references, etc.

(I realise the need for documentation, this conversation would have been 
simplified had it existed earlier.)

Matt

Matthew Vogt