
David Abrahams wrote:
"Robert Ramey" <ramey@rrsd.com> writes:
I have one question about this.
What is the ultimate purpose? That is, is it just to optimize serialization of certain types of collections of bitwise-serializable objects, or does it have some more ambitious goal?
I thought I highlighted the ultimate purpose quite clearly already:
,----
| For many archive formats and common datatypes there exist APIs
| that can quickly read or write contiguous sequences of those types
| all at once (**). Reading or writing such a sequence by
| separately reading or writing each element (as the serialization
| library currently does) can be an order of magnitude more
| expensive.
`----
We want to be able to capitalize on the existence of those APIs, and to do that we need a "hook" that will be used whenever a contiguous sequence is going to be (de)serialized. No such hook exists in Boost.Serialization.
(**) Note that this capability is not necessarily tied to bitwise serialization or the use of a binary representation.
In particular, I took special pains to clarify above (**) that this is *not* merely about "serialization of certain types of collections of bitwise-serializable objects."
If that's unclear, maybe you could ask some specific questions so that I know what needs to be clarified.
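
Very roughly, the hook being described might look like the sketch below. All of the names here (slow_oarchive, fast_oarchive, save_sequence, save_block) are invented for illustration; this is not Boost.Serialization's actual API, and the sketch assumes trivially copyable element types.

    #include <cstddef>
    #include <cstdio>

    struct slow_oarchive {
        std::FILE* f;
        template<class T>
        void save(const T& t) {                  // one call per element
            std::fwrite(&t, sizeof(T), 1, f);
        }
    };

    struct fast_oarchive : slow_oarchive {
        // the "hook": one bulk write for a whole contiguous block
        void save_block(const void* p, std::size_t bytes) {
            std::fwrite(p, 1, bytes, f);
        }
    };

    // default path: element by element, as the library does today
    template<class Archive, class T>
    void save_sequence(Archive& ar, const T* p, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            ar.save(p[i]);
    }

    // overload selected whenever the archive provides the hook
    template<class T>
    void save_sequence(fast_oarchive& ar, const T* p, std::size_t n) {
        ar.save_block(p, n * sizeof(T));
    }

The point is that save_sequence gives the library a single place to dispatch on: archives that expose a fast contiguous-write API get the bulk call, and everything else keeps the existing element-by-element behavior.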
Could you give some other examples? Other than bitwise-serializable types, which can benefit from using binary read/write, none have occurred to me.

Another thing I'm wondering about is whether any work has been done to determine the source of the "10x speed up". For arrays of primitives, it would seem that the replacement of a loop of binary reads with one binary read of a larger block might explain it. If that were the case, it might be most fruitful to invest effort in a different kind of i/o stream which supports only read/write and doesn't deal with all the operators, codecvt facets, etc. (a sketch of this idea follows below). In my personal work, I've found iostreams very convenient - but they are a performance killer for binary i/o. Another possibility is a binary archive which doesn't depend upon iostreams at all but rather on fopen, fwrite, etc. In fact, in my own work I found even that too slow, so I had to replace it with my own version one step closer to the OS, which also exploited asio.h. This in turn entailed writing an asio implementation which wraps the Windows async i/o API calls. My guess is that if I wanted to speed up serialization, this would be a more effective direction.

Another thing I'm curious about is how much compilers can really collapse inline code when it's theoretically possible. In the case of an array of primitives, things should collapse to a loop of stream read calls without even calling anything inside the compiled library. I don't have any real knowledge as to which compilers - if any - actually do that. I guess I could inspect the disassembly, and maybe it will come to that.

But for now I don't think I have all the information I need to understand this.

Robert Ramey
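
The stripped-down-stream idea and the loop-versus-bulk conjecture above are easy to state in code. Here is a minimal sketch (raw_sink is an invented name, not an existing class): a sink that supports nothing but raw writes, used once per element and then once for the whole block. Timing the two paths on a given platform would show how much of the speedup comes purely from reducing the number of calls rather than from avoiding iostream machinery.

    #include <cstddef>
    #include <cstdio>
    #include <vector>

    // supports nothing but raw writes: no operators, no locales,
    // no codecvt facets
    class raw_sink {
        std::FILE* f_;
    public:
        explicit raw_sink(const char* path) : f_(std::fopen(path, "wb")) {}
        ~raw_sink() { if (f_) std::fclose(f_); }
        void write(const void* p, std::size_t n) {
            std::fwrite(p, 1, n, f_);
        }
    };

    int main() {
        std::vector<double> v(1u << 20, 3.14);

        raw_sink loop_sink("loop.bin");
        for (double d : v)                       // one call per element
            loop_sink.write(&d, sizeof d);

        raw_sink bulk_sink("bulk.bin");
        bulk_sink.write(v.data(),                // one call for the block
                        v.size() * sizeof(double));
        return 0;
    }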