
On May 5, 2004, at 4:11 PM, Ian McCulloch wrote:
Dave Harris wrote:
troyer@itp.phys.ethz.ch (Matthias Troyer) wrote (abridged):
As I see it, the current serialization library allows both options, depending on your preferences. Any archive may choose which types it views as fundamental, but both approaches have their disadvantages:
I would use variable length integers. Use as many bytes as you need for the integer's actual value. That way the archive format is independent of whether short, int, long or some other type was used by the outputting program. It can also give you byte order independence for free.
Specifically, I'd use a byte-oriented scheme where the low 7 bits of each byte contribute to the current number, and the high bit says whether there are more bytes to come. [...] Thus integers less than 128 take 1 byte, integers less than 16,384 take 2 bytes, etc. This gives a compact representation while still supporting 64-bit ints and beyond. You can use boost::numeric_cast<> or similar to bring the uintmax_t down to a smaller size.
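[For illustration, a minimal sketch of the 7-bits-per-byte scheme Dave describes; the function names encode_varint/decode_varint are invented here and are not part of Boost.Serialization:]

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Encode an unsigned integer using 7 value bits per byte; the high
    // bit of each byte signals that more bytes follow.
    std::vector<unsigned char> encode_varint(std::uintmax_t value)
    {
        std::vector<unsigned char> out;
        do {
            unsigned char byte = value & 0x7F;   // low 7 bits of the value
            value >>= 7;
            if (value != 0)
                byte |= 0x80;                    // continuation bit: more to come
            out.push_back(byte);
        } while (value != 0);
        return out;                              // < 128 -> 1 byte, < 16384 -> 2 bytes, ...
    }

    // Decode the same format; 'consumed' reports how many bytes were read.
    std::uintmax_t decode_varint(const unsigned char* in, std::size_t& consumed)
    {
        std::uintmax_t value = 0;
        unsigned int shift = 0;
        consumed = 0;
        unsigned char byte;
        do {
            byte = in[consumed++];
            value |= static_cast<std::uintmax_t>(byte & 0x7F) << shift;
            shift += 7;
        } while (byte & 0x80);
        return value;
    }

[Because only the value itself determines the byte count, the on-disk format is the same whether the writer used short, int, or long, and the byte ordering of the host never appears in the stream.]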
Nice idea. I was thinking along the lines of how to achieve high performance, with shared memory and fast network message passing in mind. I think now, though, that this is probably too specialized an application for boost::serialization.
I disagree. The above ideas would be for archive formats geared towards persistence, where speed is not the main issue. All we need to achieve high performance is an archive designed for that purpose, with specialized serialization functions for the standard containers. Portability issues can be ignored, since the ultra-fast networks (such as Infiniband, Cray Red Storm, or similar) connect homogeneous nodes with identical hardware. The same serialization code could then be used both for fast network communication and for serialization to a portable archive for persistence purposes, just by using different archive types.

Matthias
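[For illustration, a rough sketch of what such a specialized archive could look like; fast_oarchive is an invented name, not an actual Boost.Serialization class, and it assumes homogeneous hardware as described above:]

    #include <cstddef>
    #include <vector>

    // A non-portable "fast" output archive: fundamental types and
    // contiguous containers of plain data are written as raw bytes,
    // while the same serialize() functions could still be used with a
    // portable text or XML archive for persistence.
    class fast_oarchive
    {
    public:
        explicit fast_oarchive(std::vector<char>& buffer) : buf_(buffer) {}

        // Plain (POD) types: copy the raw bytes, no conversion.
        template <class T>
        fast_oarchive& operator<<(const T& t)
        {
            const char* p = reinterpret_cast<const char*>(&t);
            buf_.insert(buf_.end(), p, p + sizeof(T));
            return *this;
        }

        // Specialized overload for a standard container of doubles:
        // one size field followed by a single contiguous block copy,
        // instead of a per-element loop.
        fast_oarchive& operator<<(const std::vector<double>& v)
        {
            *this << v.size();
            const char* p = reinterpret_cast<const char*>(v.data());
            buf_.insert(buf_.end(), p, p + v.size() * sizeof(double));
            return *this;
        }

    private:
        std::vector<char>& buf_;
    };

    // Usage sketch:
    //   std::vector<char> buf;
    //   fast_oarchive ar(buf);
    //   std::vector<double> data(1000, 3.14);
    //   ar << data;   // size + one raw memory block, ready for the wire

[A matching fast_iarchive would read the block back the same way; swapping in a portable archive type would reuse the identical serialization calls but write each value through a byte-order-independent encoding instead.]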