Re: [boost] [serialization] fast array serialization (10x speedup)

26 Nov 2005

      Ian McCulloch wrote:
...
Hi Robert,
I think you should check your benchmark code again.  I think it is
not doing what you think it is doing.
Whoops - of course you're correct.  Here are the corrected numbers:

for value_type set to char

Time using serialization library: 1.922
Size is 100000004
Time using direct calls to save in a loop: 1.031
Size is 100000000
Time using direct call to save_array: 0.25
Size is 100000000

for value_type set to double

Time using serialization library: 0.86
Size is 100000004
Time using direct calls to save in a loop: 0.36
Size is 100000000
Time using direct call to save_array: 0.265
Size is 100000000
...
Secondly, the buffer in the oprimitive class has much less functionality
than the vector<char> buffer, as well as the buffer I used previously
(http://lists.boost.org/Archives/boost/2005/11/97156.php).  In particular,
it does not check for buffer overflow when writing.  Thus it has no
capability for automatic resizing/flushing, and is only useful if you know
in advance what the maximum size of the serialized data is.  This
kind of buffer is of rather limited use, so I think that this is not
a fair comparison.
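To make the distinction above concrete, here is a minimal sketch of the two
kinds of buffer being compared.  The class names unchecked_buffer and
growable_buffer are hypothetical and are not taken from the benchmark or from
the serialization library; only the name save_binary comes from the thread.
The first class never checks for overflow, so the caller must know the
maximum size up front.  The second grows on demand, as a vector<char> does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a fixed "primitive" buffer: the caller must
// size it in advance, because writes are never bounds-checked.
class unchecked_buffer {
public:
    explicit unchecked_buffer(std::size_t capacity)
        : storage_(capacity), next_(storage_.data()) {}

    // Copying would leave next_ pointing into the other object's storage.
    unchecked_buffer(const unchecked_buffer&) = delete;
    unchecked_buffer& operator=(const unchecked_buffer&) = delete;

    // No overflow check: writing past the capacity is undefined behaviour.
    void save_binary(const void* src, std::size_t count) {
        std::memcpy(next_, src, count);
        next_ += count;
    }

    std::size_t size() const {
        return static_cast<std::size_t>(next_ - storage_.data());
    }

private:
    std::vector<char> storage_;  // fixed-size backing store
    char* next_;                 // next write position
};

// Hypothetical growable buffer: every write goes through vector::insert,
// which checks capacity and reallocates when needed.
class growable_buffer {
public:
    void save_binary(const void* src, std::size_t count) {
        assert(src != nullptr || count == 0);
        const char* p = static_cast<const char*>(src);
        storage_.insert(storage_.end(), p, p + count);
    }

    std::size_t size() const { return storage_.size(); }

private:
    std::vector<char> storage_;
};
```

Both classes accept the same calls, for example buf.save_binary(&x, sizeof x),
so the only difference a benchmark sees is the per-call capacity check and
any reallocation.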
I think it's much closer to the binary archive implementation than the
current binary_oarchive is. I also think its fairly close to what
an archive class would look like for a message passing application.
The real difference here is that save_binary would be implemented
in such a way that the overhead per call is pretty small.  Maybe
not quite as small as here, but much smaller than the overhead
associated with ostream::write.  So I believe that the above
results give a much more accurate picture than the previous
ones do of the effect of application of the proposed enhancement.
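As a rough illustration of where the per-call overhead comes from, the sketch
below contrasts "direct calls to save in a loop" with "a direct call to
save_array".  The class sketch_oarchive and its calls() counter are
hypothetical, introduced only for this example; this is not the real
binary_oarchive, and the save_array shown is only an assumption about what
such a function might look like.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical archive used only for illustration; none of this is the
// library's actual implementation.
class sketch_oarchive {
public:
    // Lowest-level write: append count raw bytes and record the call.
    void save_binary(const void* src, std::size_t count) {
        const char* p = static_cast<const char*>(src);
        buffer_.insert(buffer_.end(), p, p + count);
        ++calls_;
    }

    // Per-element path: one save_binary call for every value.
    template <class T>
    void save(const T& value) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "raw byte copy needs a trivially copyable type");
        save_binary(&value, sizeof(T));
    }

    // Array path: one save_binary call for the whole contiguous block.
    template <class T>
    void save_array(const T* data, std::size_t n) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "raw byte copy needs a trivially copyable type");
        assert(data != nullptr || n == 0);
        save_binary(data, n * sizeof(T));
    }

    std::size_t size() const { return buffer_.size(); }
    std::size_t calls() const { return calls_; }
    const std::vector<char>& bytes() const { return buffer_; }

private:
    std::vector<char> buffer_;
    std::size_t calls_ = 0;  // number of save_binary invocations
};
```

Saving a std::vector<double> v element by element with ar.save(v[i]) and
saving it with ar.save_array(v.data(), v.size()) produce the same bytes; the
first makes v.size() calls to save_binary and the second makes one.  The
cheaper each save_binary call is, the smaller the gap between the two paths.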
...
FWIW, I include the benchmark I just ran.  Amd64 g++ 3.4.4 on linux
2.6.10, and cheap (slow!) memory ;)
vector<char> buffer:
Time using serialization library: 3.79
Size is 100000004
Time using direct calls to save in a loop: 3.42
Size is 100000000
Time using direct call to save_array: 0.16
Size is 100000000
primitive buffer (with the save_binary() function modified to do
"buffer += count"):
Time using serialization library: 1.57
Size is 100000004
Time using direct calls to save in a loop: 1.35
Size is 100000000
Time using direct call to save_array: 0.16
Size is 100000000
Interestingly, on this platform/compiler combination, without the bug
fix in save_binary() it still takes 1.11 seconds ;)  I would guess
your Windows compiler is doing some optimization that gcc is not, in
that case.
Thanks for doing this - it is very helpful.

Sure you're compiling at maximum optimization (-O3)?  In any case, this
is not untypical of my personal experience with benchmarks.  They vary
a lot depending on extraneous variables.  Our results seem pretty comparable,
though.

Robert Ramey