
Kim Barrett wrote:
At 9:14 AM -0800 2/12/06, Robert Ramey wrote:
David Abrahams wrote:
A strong typedef should work, if all archives implement its serialization.
This is the objection I expected to hear from Robert much earlier in this discussion. A strong typedef for this purpose effectively widens the archive concept by adding a new thing that needs to be supported. That conceptual widening occurs even if the default archive behavior for a strong typedef is to just serialize the underlying type. It still needs to be documented as part of the archive concept, and anyone defining a new kind of archive ought to consider whether that default is the correct behavior for this new archive type, or whether something else is needed.
Even if a strong type is used, it is neither necessary nor desirable to add it to every archive. The procedure would be: create a header boost/collection_size.hpp which would contain something like

    #include <cstddef>
    #include <boost/strong_typedef.hpp>
    #include <boost/serialization/level.hpp>

    namespace boost {

    // now we have a distinct collection-count type
    BOOST_STRONG_TYPEDEF(std::size_t, collection_size_t)

    template<class Archive>
    void serialize(Archive &ar, collection_size_t &t, const unsigned int version){
        // serialize the underlying value; a plain "ar & t" would recurse
        // back into this function rather than convert to std::size_t
        ar & static_cast<std::size_t &>(t);
    }

    } // namespace boost

    // no versioning, for efficiency reasons
    BOOST_CLASS_IMPLEMENTATION(boost::collection_size_t, boost::serialization::object_serializable)
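For illustration, a hedged sketch of how the collection serializers might then record their counts through this type; the function shape loosely echoes the library's save_collection helpers, but this body is an assumption, not the actual implementation:

    // Sketch: the count passes through the archive as collection_size_t,
    // so an archive that cares can recognize and re-encode container counts.
    template<class Archive, class Container>
    void save_collection(Archive &ar, const Container &s){
        collection_size_t count(s.size());   // strong type from the header above
        ar << count;
        for(typename Container::const_iterator it = s.begin(); it != s.end(); ++it)
            ar << *it;
    }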
And I think he would have a pretty good rationale for feeling that way. Keeping the archive interface narrow and minimizing the coupling between serialization and archives is, I think, one of the strengths of the serialization library's design.
Hallelujah!!!
I would be in full agreement with Robert here, except that all of the alternatives I can think of seem worse to me.
1. std::size_t
This causes major problems for portable binary archives. I'm aware that portable binary archives are tricky (and perhaps not truly possible in the most general sense of "portable"). In particular, they require that users be very careful about the types they include in such archives, avoiding all explicit use of primitive types with implementation-defined representations in favor of types with a fixed representation. So no int or long, only int32_t and the like. Floating point types add their own complexity.
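To make the restriction concrete, a minimal sketch of a type that obeys it (the type and field names here are hypothetical):

    #include <cstdint>

    // Only fixed-representation integer types appear in the serialized
    // state: no int, no long. Floating point would need separate treatment.
    struct record {
        std::int32_t id;
        std::int64_t timestamp;

        template<class Archive>
        void serialize(Archive &ar, const unsigned int /*version*/){
            ar & id;
            ar & timestamp;
        }
    };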
A portable binary archive comes down to serializing primitives in a portable way. This is what the example included with the serialization library does. The example isn't complete but it does illustrate this point.
Some (potential) users of the serialization library (such as us) are already doing that, and have been working under such restrictions for a long time (long before we ever heard of boost.serialization), because cross-platform serialization is important to us.
Hmm - well, maybe you just want to finish the example in the package by adding floats and doubles - and you're done!!
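For concreteness, a rough sketch of the primitive-level handling such an archive comes down to; this shows the general technique (a fixed byte order on the wire) and is an assumption about the approach, not the demo_portable_binary code itself:

    #include <cstddef>
    #include <ostream>
    #include <type_traits>

    // Sketch: write an integer least-significant byte first, so the stored
    // form is identical regardless of the host's byte order.
    template<class T>
    void save_portable(std::ostream &os, T value){
        static_assert(std::is_integral<T>::value, "integral types only in this sketch");
        typename std::make_unsigned<T>::type u =
            static_cast<typename std::make_unsigned<T>::type>(value);
        for(std::size_t i = 0; i < sizeof(T); ++i){
            os.put(static_cast<char>(u & 0xFFu));
            u >>= 8;
        }
    }

Floats and doubles are the remaining work Robert alludes to: they need an agreed bit-level representation (IEEE 754 in practice) written out the same way.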
The problem that using std::size_t as the container count causes for portable binary archives is that the count is buried inside the container serializer, where the library client has no control over it. All the archive gets out of the serializer is the underlying primitive type, with which it does whatever it does on the given platform. The semantic information that this is a container count is lost by the time the archive sees the value, so there is nothing a "portable" archive can do to address the loss. And this occurs no matter how carefully clients adhere to fixed-representation value types in their own code.
This is all true. But I'm not convinced that it's necessary to know where a primitive came from in order to handle it. But I don't really need to be convinced. I would be happy to go along with it if someone who does think this is necessary is willing to address all the minor little things that will add up to kind of a pain. This includes:
a) selecting a type that will please everyone;
b) carefully setting up the appropriate serialization traits for such a type;
c) tweaking the collection serialization to use the new type;
d) making sure that existing archives can still be read - this entails a little bit of conditional code in the collection loading functions (see the sketch after this paragraph).
I believe that a BOOST_STRONG_TYPEDEF is a very good candidate for this - but that would suggest it might be a good idea to take a critical look at BOOST_STRONG_TYPEDEF. So, if it's done correctly, it's more than a trivial "bug fix".
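A minimal sketch of the conditional code item (d) calls for, assuming the collection_size_t header sketched earlier; the cutover version number and the function shape are hypothetical:

    // Sketch: pick the count's representation based on the file's version.
    template<class Archive, class Container>
    void load_collection(Archive &ar, Container &s, const unsigned int version){
        std::size_t count;
        if(version < 1){              // hypothetical cutover version
            unsigned int old_count;   // what older archives recorded
            ar >> old_count;
            count = old_count;
        }
        else{
            collection_size_t c;      // the new strong type
            ar >> c;
            count = c;                // implicit conversion to std::size_t
        }
        // ... clear s and read 'count' elements into it ...
    }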
This leaves a client needing a portable binary archive with several unappealing options (in no particular order):
- Modify the library to use one of the other options.
- Override all of the needed container serializers specifically for the portable archive.
- Don't attempt to serialize any standard container types.
As I said - I don't agree at all here. To illustrate, I point to the example in the documentation and the demo_portable_binary code.
2. standard fixed-size type
We already know that uint32_t is inadequate; there are users with very large containers. Maybe uint64_t is big enough, though of course predictions of that sort have a history of proving false, sometimes in a surprisingly short time. And uint64_t isn't truly portable anyway, since an implementation might not have that type at all. Also, some might object to always requiring 8 bytes of count information, even when the actual size will never be anything like that large. This was my preferred approach before I learned about the strong typedef approach, in spite of the stated problems.
3. some self-contained variable-size type
This might be possible, but the additional complexity is questionable. Also, all of the ideas I've thought of along this line make the various text-based archives less human readable, which some might reasonably find objectionable.
Text archives present no problem: numbers coded as strings of decimal characters have no finite limit on the values they can represent. A "portable binary" archive must likewise have some way to code numbers in a variable-length format. The only problem arises with the native binary archive - and it is explicitly exempt from any portability requirement.
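For the variable-length coding Robert mentions, one common scheme is a base-128 ("varint") encoding: seven payload bits per byte, with the high bit set on every byte except the last. A sketch of the idea, not what any Boost archive actually does:

    #include <cstdint>
    #include <ostream>

    // Sketch: small counts cost one byte, and there is no fixed upper
    // bound on the values that can be represented.
    void save_varint(std::ostream &os, std::uint64_t value){
        while(value >= 0x80){
            os.put(static_cast<char>((value & 0x7F) | 0x80));  // more bytes follow
            value >>= 7;
        }
        os.put(static_cast<char>(value));                      // final byte, high bit clear
    }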
So it appears to me that all of the available options have downsides. While my initial reaction to the strong typedef approach was rather ambivalent because of the associated expansion of the archive concept, it seems to me to be the best of the available options.
Noooooo - and you were on a roll.

Robert Ramey