
troy d. straszheim wrote:
Here's a use case that has been discussed before, but for which I couldn't find any solid resolution in the list archives:
class A; some_oarchive oa;
void mutate(shared_ptr<A>& aptr_) { aptr_ = shared_ptr<A>(new A); }
shared_ptr<A> aptr;
for (int i=0; i<LARGE_NUMBER; i++) { mutate(aptr); oa << aptr; }
The problem is that your oarchive ends up with somewhere between two and LARGE_NUMBER distinct A's in it. Usually two. Presumably the "two" is because in this simple example only A's are getting allocated and deallocated, and therefore there are often free A-sized blocks conveniently lying around for reuse. aptr gets assigned A's at two different alternating addresses.
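To make the failure mode concrete, here is a minimal sketch of address-based tracking. It is not Boost.Serialization code: toy_oarchive, save() and the record strings are made up for illustration, and the sketch assumes only that the archive identifies already-saved objects by their address, as described above.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for an output archive that tracks objects by address.
// Hypothetical names throughout; this is not the library's implementation.
class toy_oarchive {
public:
    // Writes the value the first time an address is seen; after that,
    // only a back-reference to the earlier object id is written.
    void save(const int* p) {
        auto it = ids_.find(p);
        if (it == ids_.end()) {
            std::size_t id = ids_.size();
            ids_.emplace(p, id);
            records_.push_back("obj " + std::to_string(id) + " = " +
                               std::to_string(*p));
        } else {
            records_.push_back("ref " + std::to_string(it->second));
        }
    }

    const std::vector<std::string>& records() const { return records_; }
    std::size_t distinct_objects() const { return ids_.size(); }

private:
    std::map<const void*, std::size_t> ids_;  // address -> object id
    std::vector<std::string> records_;        // what "went into" the archive
};
```

Saving an int, changing it, and saving it again through the same address yields one object record followed by a back-reference, so the second value never reaches the archive. The same thing happens when a freshly allocated A lands at an address the archive has already seen.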
For what it's worth, I believe there is a warning in the documentation against this kind of thing. In fact, the documentation says that this will invoke a compile-time error. I just checked: it doesn't produce a compile-time error as I expected. I found the code that does this commented out, and now I don't remember why I commented it out. That is, the following are not recommended:
a) changing the state of data while it's in the process of being saved.
b) serializing data on the stack. This will break the tracking, as different objects will have the same address and be misidentified as being the same object.
If I had nothing else to do and could figure out how to do it, I would like to implement a warning so that if one used the << operator with a non-const argument one would get a warning or maybe ...
I am in the process of adding two more flags to archives:
a) no_tracking - which will suppress tracking for all objects regardless of the setting of their serialization traits. My motivation for doing this was to permit the usage of serialization for things such as debug and transaction logs, which would generate cases such as yours above.
b) no_object_creation - which will simply reload pointers rather than re-create them. My motivation is to permit serialization to be used to implement the memento pattern as described in the GoF Patterns book.
I currently have these changes in my local code base, and I've run all my old tests and they still work. I'm still struggling with some small issues regarding loading into STL collections with no_object_creation. I'm also struggling with some issues related to these flags being runtime rather than compile-time (i.e. template instantiation) options. I still need to write tests, demos, a tutorial, and documentation. I'm not sure, but I think these new facilities may address the use cases raised here.
I looked through the archives a bunch and didn't come across anything conclusive. It seemed that some thought this kind of use case was pathological, but I'm not sure why.
My view has been that changing the state of an object while it is in the process of being serialized will inevitably lead to programs that are not provably/demonstrably correct. The same goes for archive classes whose behavior can be changed during the course of serialization. Now, by supporting the usage of serialization for logging, this concept will be broken. I'm still struggling with this.
a) The idea of serializing mutable objects does have application in logging-type applications. It's appealing to use it for this purpose.
b) It will break the original concept and lead to cases where errors are introduced which are almost impossible to track down without tracing into the implementation of the serialization library itself. This defeats the whole purpose of having a library in the first place.
What I'd like to be able to do is to tell the archive, "The previous calls to operator<<() represent a 'snapshot' of the state of some group of objects, and now I want you to forget about existing objects because I am going to rearrange them all. Continue to track object types, but forget about the addresses." I realize that this creates the possibility of memory leaks, but if the serialization is done through one toplevel call to operator<< on a shared_ptr whose pointee contains pointers to a whole universe of home-cooked pointer spaghetti, I don't see a better way to do this, and I don't see how to clearly express what I intend via the export and tracking macro mechanisms.
You can't close and reopen the archive in the top loop; you get duplicate headers.
What about "no_header"? And what would be wrong with duplicate headers anyway? The stream is still open and could just as well contain multiple archives.
The list archives mention the use case of serializing the state of some memory pool that is very likely to get reused: I think the little example above is probably the simplest case of this.
I'm not sure what this means.
So without asking for a sanity-check, I implemented basic_oarchive::flush(), and some tests. The changes to basic_oarchive
Sanity is sometimes overrated.
and basic_oarchive_impl are very small. basic_oarchive has an internal object_set, which tracks object_ids and addresses. I add a num_flushed_objects member, and flush() clears out the set and adds the number of objects flushed to this counter. New tracked objects are assigned object_ids starting at num_flushed_objects + object_set.size(). In this way the class_ids are reused post-flush, but object_ids are not. The interface is simply this:
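The interface itself is not shown in this message. Purely as an illustration of the numbering scheme described above, here is a minimal sketch. It is not the actual patch: toy_object_tracker and track() are hypothetical names, and only num_flushed_objects, object_set and flush() are taken from the description.

```cpp
#include <cstddef>
#include <map>

// Hypothetical model of the described id scheme; not the real basic_oarchive.
class toy_object_tracker {
public:
    // Returns the object id for address p, assigning a fresh id if p is
    // not currently tracked. Fresh ids continue after all flushed ids.
    std::size_t track(const void* p) {
        auto it = object_set_.find(p);
        if (it != object_set_.end()) {
            return it->second;
        }
        std::size_t id = num_flushed_objects_ + object_set_.size();
        object_set_.emplace(p, id);
        return id;
    }

    // Forgets every tracked address but remembers how many ids were
    // handed out, so later ids never collide with earlier ones.
    void flush() {
        num_flushed_objects_ += object_set_.size();
        object_set_.clear();
    }

private:
    std::map<const void*, std::size_t> object_set_;  // address -> object id
    std::size_t num_flushed_objects_ = 0;
};
```

In this model, an address seen before a flush is treated as a brand-new object afterwards and receives a new, larger id, which is the "forget about the addresses" behavior asked for earlier.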
My comments above should make it clear I wouldn't be enthusiastic about this approach. Having said that, I find it personally gratifying that some people are so enamored with this library as to spend this kind of effort. Certain people have taken the library in "experimental" directions and I have worked their results into the library. Persons who have made significant contributions are:
Pavel Vozenilek - Borland compilers and documentation
Martin Ecker - DLL versions of Serialization and serialization of classes implemented in DLLs (plug-ins)
troy d. straszheim (you) - serialization of variant
At the same time I endeavor to keep it from breaking under the weight of its own success. It's a fine line. Key "fixed points" in my requirements were and still are:
a) Boost acceptance - I need this for my resume as I'm currently looking for work.
b) support of all compilers on which tests are run.
c) idiot-proof user interface and documentation. Ah, maybe I had better say: user interface and documentation such that one can use the library without having to delve into its implementation.
Also, it's important to me that one be able to use the library with a very short learning curve, say 1 hour to get started. Personally, I don't have much more patience than that.
Robert Ramey