
This is in regard to the discussion of the "equivalence" of a serialized object and its deserialized counterpart. It also touches on the Serializable concept and some recent discussion of how classes without default constructors can be handled, and a few other things besides. (Sorry about the entanglement, but I'm not sure how to separate some of these issues.) (Note that I also haven't read *all* of the mail on this subject yet.)

The Common Lisp committee (X3J13) needed to deal with essentially the same problem. First, a bit of background. Because CL supports programmatic construction of code which can then be passed to the compiler, and the code "syntax" supports a quoting form for referring to a literal constant, there was a need to address what it meant for various kinds of objects to appear in such a context. And the resulting objects may be referred to in code that will be compiled to a file for later loading into the same or some completely different runtime environment.

Earlier versions of the language simply listed all the cases for the "built-in" types, and provided a simple mechanism for the simple record type supplied by the language (things defined with defstruct, if you care). But with the addition of OO concepts in various implementations that eventually led to CLOS (the CL Object System), it was realized that this wasn't sufficient.

The term that was eventually adopted was "similar", or "similar as constants" where further disambiguation was needed. I think that would be a good term for the serialization library to adopt, as it avoids any implications related to operator== and the ambiguity around the word "equivalent".

A protocol was designed to permit instances of arbitrary user-defined classes to be saved and loaded. (Sound familiar?
Note that support for different data formats for that saving and restoring was only addressed to the extent that different CL implementations likely had different compiled file formats and were not required to be compatible; something like the idea of binary / text / XML / whatever archives was not addressed, and never even came up, so far as I can recall. A missed opportunity there.)

The relevant part of the specification is Section 3.2.4, "Literal Objects in Compiled Files", which can be found at:

http://www.lisp.org/HyperSpec/Body/sec_3-2-4.html

and in the definition of make-load-form, found here:

http://www.lisp.org/HyperSpec/Body/stagenfun_make-load-form.html#make-load-f...

(Let me know if / where translation between CL terminology and C++ terminology would be helpful and I'll give it a shot.)

The CL term "externalizable object" corresponds to the Serializable concept for the serialization library. Corresponding to the Save / Load Archive compatibility concept, CL says:

"The file compiler must cooperate with the loader in order to assure that in each case where an externalizable object is processed as a literal object, the loader will construct a similar object."

Substituting serialization library terminology into that quote:

The saving archive must cooperate with the loading archive in order to assure that in each case where a serializable object is saved, the loading archive will construct a similar object.

I think it should be reasonably straightforward to massage this into a statement about whether a saving archive and a loading archive are compatible.

The CL protocol for loading involves a two-step process. First, a constructor is called with some arguments. Then, optionally, an initialization form is called, which contains references to the constructed object in order to perform additional modifications to it.
I *think* this protocol is strictly more powerful than that presently specified by the serialization library. An example of a class that I don't know how to "serialize" is a "symbol" lazily constructed on named lookup; the save/load_construct_data mechanism is inadequate for this.

(There are also object graphs containing reference cycles that might not be serializable but are CL externalizable, because the reference cycle can be broken by using the two-stage protocol. I haven't looked to see whether the serialization library installs an object in the "pointer table" (whatever it is called) at allocation time or only after it has been initialized via deserialization.)

Attempting to translate the CL protocol into C++ terminology, I think it would consist of first calling a static factory function associated with the type, passing it the archive as an argument, and then calling a member function on the object, again passing the archive as an argument.
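
For concreteness, here is a rough sketch of that two-stage translation, showing how it can break a reference cycle for a class with no default constructor. Everything in it is made up for illustration: ToyInputArchive, Node, load_construct, load_initialize, load_all and the whitespace-separated format are hypothetical names of my own and are not part of the serialization library. It also assumes that objects are entered in the table at construction time, before they are initialized, and it simplifies things by storing all of the construction records ahead of all of the initialization records.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

struct Node;

// Hypothetical toy input archive (NOT the serialization library's API):
// a token stream plus a table of already-constructed objects, keyed by id.
struct ToyInputArchive {
    std::istringstream in;
    std::map<int, std::shared_ptr<Node>> table;
    explicit ToyInputArchive(const std::string& text) : in(text) {}
};

// A class with no default constructor whose instances may form a cycle.
struct Node {
    std::string name;
    Node* next = nullptr;  // non-owning link; may point back at an earlier node

    explicit Node(std::string n) : name(std::move(n)) {}

    // Stage 1: static factory. Reads only what is needed to construct.
    static std::shared_ptr<Node> load_construct(ToyInputArchive& ar) {
        std::string n;
        ar.in >> n;
        return std::make_shared<Node>(n);
    }

    // Stage 2: member function. May refer to any object already in the
    // table, including this one, which is what breaks the cycle.
    void load_initialize(ToyInputArchive& ar) {
        int next_id = -1;
        ar.in >> next_id;
        next = ar.table.at(next_id).get();  // throws if the id is unknown
    }
};

// Toy format (an assumption of this sketch): count, then `count` names
// (construction records), then `count` ids (initialization records).
std::vector<std::shared_ptr<Node>> load_all(ToyInputArchive& ar) {
    std::size_t count = 0;
    ar.in >> count;
    assert(ar.in && "malformed toy archive");

    std::vector<std::shared_ptr<Node>> nodes;
    for (std::size_t id = 0; id < count; ++id) {
        std::shared_ptr<Node> n = Node::load_construct(ar);
        ar.table[static_cast<int>(id)] = n;  // registered before initialization
        nodes.push_back(n);
    }
    for (std::shared_ptr<Node>& n : nodes) {
        n->load_initialize(ar);
    }
    return nodes;
}
```

With this, loading the text "2 a b 1 0" yields two nodes whose next pointers refer to each other. A single constructor-style step could not produce that, since each node would need the other to exist first.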