shared_ptr and serialization

Robert Ramey

17 Nov 2004

6:34 p.m.

Would it be possible to add something like friend template<class Archive> boost::serialization::serialize(Archive &ar, T &t); to boost::shared_ptr? This would let me remove a problematic hack in boost/serialization/shared_ptr which is creating a headache when shared_ptr serialization is used with some other headers. The same may be necessary for some other headers - e.g. weak_ptr.

Another alternative would be to change the private members to protected or maybe even public.

I realize that this creates a dependency between boost::serialization and boost::shared_ptr that you have hoped to avoid. But until this is resolved at another level, it would be better than the current situation.

Alternatively, you might want to consider expanding the shared_ptr interface with enough information to permit serialization to be implemented via the public interface. This has worked well for the STL collections. I don't know if this was an intentional design decision or just an accident.

We really need a real solution now for boost::shared_ptr / boost::serialization.

Robert Ramey
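For readers unfamiliar with the mechanism being requested, here is a minimal sketch of a friend function template granting a free serialize() access to private state. The names toy_ptr and recording_archive are hypothetical stand-ins, not part of Boost, and in valid C++ the template header precedes the friend keyword:

```cpp
#include <cassert>

// Hypothetical stand-in for a smart pointer with private state.
template<class T>
class toy_ptr {
public:
    explicit toy_ptr(T* p = nullptr) : px_(p) {}
    T* get() const { return px_; }

private:
    T* px_;  // private: normally unreachable from a free function

    // Grants the free function template below access to px_.
    template<class Archive, class U>
    friend void serialize(Archive& ar, toy_ptr<U>& t);
};

// Hypothetical archive that only records the last raw pointer it was shown.
struct recording_archive {
    const void* last = nullptr;
};

template<class Archive, class U>
void serialize(Archive& ar, toy_ptr<U>& t) {
    ar.last = t.px_;  // compiles only because of the friend declaration
}
```

The cost of this approach, which the thread goes on to debate, is that the serializer now depends on the private layout of the class it befriends.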


Peter Dimov

17 Nov

6:47 p.m.

Robert Ramey wrote:

...

Alternatively, you might want to consider expanding the shared_ptr interface with enough information to permit serialization to be implemented via the public interface.

How, exactly? The current public interface is good enough for my serialization/deserialization needs, but I'm willing to consider your suggestions.

Gennadiy Rozental

6:56 p.m.

"Peter Dimov" <pdimov@mmltd.net> wrote in message news:000a01c4ccd5$f1906910$6501a8c0@pdimov2...

...

Robert Ramey wrote:

...
Alternatively, you might want to consider expanding the shared_ptr interface with enough information to permit serialization to be implemented via the public interface.

How, exactly? The current public interface is good enough for my serialization/deserialization needs, but I'm willing to consider your suggestions.

Do you remember our discussion on the subject?

http://lists.boost.org/MailArchives/boost/msg04813.php

I still believe we don't need any modification in shared_ptr to make it serializable (even more, I believe it's evil to request one). I don't remember ever seeing your answer to my last post.

Gennadiy.

Peter Dimov

7:08 p.m.

Gennadiy Rozental wrote:

...

"Peter Dimov" <pdimov@mmltd.net> wrote in message news:000a01c4ccd5$f1906910$6501a8c0@pdimov2...

...
Robert Ramey wrote:

...
Alternatively, you might want to consider expanding the shared_ptr interface with enough information to permit serialization to be implemented via the public interface.

How, exactly? The current public interface is good enough for my serialization/deserialization needs, but I'm willing to consider your suggestions.

Do you remember our discussion on the subject?

http://lists.boost.org/MailArchives/boost/msg04813.php

I still believe we don't need any modification in shared_ptr to make it serializable (even more, I believe it's evil to request one). I don't remember ever seeing your answer to my last post.

Can you please restate your questions, or point me to your last post directly? I don't seem to be able to follow the thread to it. Is there something I need to add to: http://lists.boost.org/MailArchives/boost/msg74455.php

Gennadiy Rozental

7:45 p.m.

...

Can you please restate your questions, or point me to your last post directly? I don't seem to be able to follow the thread to it.

Is there something I need to add to:

http://lists.boost.org/MailArchives/boost/msg74455.php

Sorry, Peter, it was actually directed to Robert. Your link points exactly to the implementation I meant here:

http://lists.boost.org/MailArchives/boost/msg04810.php

Gennadiy

Robert Ramey

7:19 p.m.

How about this for a solution: I can just remove my shared_ptr serialization implementation from the serialization package and you can make yours part of the shared_ptr header files. That would be fine from my point of view.

Robert Ramey

"Gennadiy Rozental" <gennadiy.rozental@thomson.com> wrote in message news:cng6t4$jkk$1@sea.gmane.org...

...

"Peter Dimov" <pdimov@mmltd.net> wrote in message news:000a01c4ccd5$f1906910$6501a8c0@pdimov2...

...
Robert Ramey wrote:

...
Alternatively, you might want to consider expanding the shared_ptr interface with enough information to permit serialization to be implemented via the public interface.

How, exactly? The current public interface is good enough for my serialization/deserialization needs, but I'm willing to consider your suggestions.

I have no idea - shared_ptr implementation code is opaque to me.

...

Do you remember our discussion on the subject?

http://lists.boost.org/MailArchives/boost/msg04813.php

I still believe we don't need any modification in shared_ptr to make it serializable (even more, I believe it's evil to request one). I don't remember ever seeing your answer to my last post.

Hmm - I believe I concluded that the proposed solutions were inefficient compared to my implementation and required essentially replicating functionality already in the library. Honestly, I don't remember now. Essentially, I saw my implementation as very straightforward in that it was identical to the way one does it for other user types. It was identical to the way it's done for collections, and this has worked out very well.

...

Gennadiy.


Peter Dimov

8:11 p.m.

Robert Ramey wrote:

...

How about this for a solution:

I can just remove my shared_ptr serialization implementation from the serialization package and you can make yours part of the shared_ptr header files. That would be fine from my point of view.

The problem with that is (a) I don't know how to add the necessary state to the archive and (b) I'm not familiar with boost::serialization and I don't know how to reuse some of the functionality that is already present WRT saving/loading polymorphic types.

Robert Ramey

18 Nov

5:06 a.m.

"Gennadiy Rozental" <gennadiy.rozental@thomson.com> wrote in message news:cng6t4$jkk$1@sea.gmane.org...

...

"Peter Dimov" <pdimov@mmltd.net> wrote in message news:000a01c4ccd5$f1906910$6501a8c0@pdimov2...

...
Robert Ramey wrote:

...
Alternatively, you might want to consider expanding the shared_ptr interface with enough information to permit serialization to be implemented via the public interface.

How, exactly? The current public interface is good enough for my serialization/deserialization needs, but I'm willing to consider your suggestions.

Do you remember our discussion on the subject?

http://lists.boost.org/MailArchives/boost/msg04813.php

I still believe we don't need any modification in shared_ptr to make it serializable (even more, I believe it's evil to request one). I don't remember ever seeing your answer to my last post.

evil to REQUEST a modification - wow. Don't even look at the hack I had to do - I might get the death penalty!

I thought about it a little more. If I remember correctly, the essence of the problem during de-serialization was:

    template<class Archive, class T>
    void serialize(Archive &ar, shared_ptr<T> & t){
        T * raw_ptr;
        ar >> raw_ptr;
        t = shared_ptr<T>(raw_ptr); // problem is here - not matched with other shared pointers that might point to raw_ptr
        // current shared_ptr implementation depends upon an internal pointer to a shared count.
    }

Attempts to resolve this all ended up re-implementing some aspect of the serialization library. I didn't really consider modifying the library to expose the relevant aspects of its implementation. That would entail much more than changing a private to public or adding a friend. As far as I was concerned, implementing the library once was enough. The implementation I made is straightforward and is compatible with all aspects of the serialization library, including exported pointers, pointers to polymorphic base classes, non-default and private constructors - everything.

So maybe we can consider a practical compromise.

a) Grant me access to the internals of shared_ptr for now.

b) When someone comes up with a more satisfactory implementation of serialization of boost::shared_ptr and it passes a mini-review, the current one can be deprecated and the new one can be used.

Robert Ramey
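The problem described above can be reproduced in isolation. The sketch below uses std::shared_ptr as a stand-in for boost::shared_ptr (an assumption made so the example is self-contained) and a no-op deleter, introduced only so that the two unrelated owners do not delete the object twice:

```cpp
#include <cassert>
#include <memory>

// No-op deleter so that two independent owners of one object do not
// double-delete it in this demonstration.
struct no_delete {
    template<class T> void operator()(T*) const {}
};

// Returns true when two shared_ptrs built separately from one raw pointer
// fail to share a reference count.
inline bool independent_counts() {
    int object = 42;
    int* raw_ptr = &object;

    std::shared_ptr<int> a(raw_ptr, no_delete());
    std::shared_ptr<int> b(raw_ptr, no_delete());  // second, unrelated count

    std::shared_ptr<int> c = a;  // copying is what actually shares ownership

    return a.get() == b.get()   // same object...
        && a.use_count() == 2   // a and c share one count
        && b.use_count() == 1;  // ...but b has a count of its own
}
```

With the default deleter, a and b would each delete the object when their counts reached zero, which is why a loader has to find the existing owner instead of constructing a fresh shared_ptr from the raw pointer.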

Gennadiy Rozental

6:49 a.m.

...

...
I still believe we don't need any modification in shared_ptr to make it serializable (even more, I believe it's evil to request one). I don't remember ever seeing your answer to my last post.

evil to REQUEST a modification - wow. Don't even look at the hack I had to do - I might get the death penalty!

...

By evil I meant that I do not see any need for direct access to the shared_ptr counters, either for serialization or for deserialization, and the fact that you request it seems suspicious.

...

I thought about it a little more. If I remember correctly, the essence of the problem during de-serialization was:

    template<class Archive, class T>
    void serialize(Archive &ar, shared_ptr<T> & t){
        T * raw_ptr;
        ar >> raw_ptr;
        t = shared_ptr<T>(raw_ptr); // problem is here - not matched with other shared pointers that might point to raw_ptr

Here we do a very simple (or complex, depending on your point of view) trick instead of the above. Let's say we have somewhere

    map<T*,shared_ptr<T>*> registry;

    if( registry[raw_ptr] == 0 ) {
        t = shared_ptr<T>( raw_ptr );
        registry[raw_ptr] = &t;
    }
    else
        t = *registry[raw_ptr];
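A runnable variant of the registry idea might look like the sketch below. It is an illustration under assumptions, not Gennadiy's implementation: it uses std::shared_ptr and std::map, stores shared_ptr values where the post stores pointers to shared_ptr, and the class name load_registry is hypothetical.

```cpp
#include <cassert>
#include <map>
#include <memory>

// Hypothetical per-load registry: raw pointer -> the shared_ptr created for it.
template<class T>
class load_registry {
public:
    // Returns a shared_ptr that shares ownership with every earlier
    // shared_ptr handed out for the same raw pointer.
    std::shared_ptr<T> adopt(T* raw_ptr) {
        auto it = seen_.find(raw_ptr);
        if (it != seen_.end()) return it->second;
        std::shared_ptr<T> sp(raw_ptr);
        seen_.emplace(raw_ptr, sp);
        return sp;
    }

private:
    std::map<T*, std::shared_ptr<T>> seen_;
};

inline bool registry_shares_ownership() {
    load_registry<int> registry;
    int* raw_ptr = new int(7);
    std::shared_ptr<int> first = registry.adopt(raw_ptr);
    std::shared_ptr<int> second = registry.adopt(raw_ptr);
    // first, second and the registry's own copy: three owners, one count.
    return first.get() == second.get() && first.use_count() == 3;
}
```

Because this registry holds an owning copy, every registered object stays alive for as long as the registry does, so the registry's lifetime becomes part of the design.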

...

// current shared_ptr implementation depends upon an internal pointer to a shared count. }

Do I miss something important?

Gennadiy

Peter Petrov

9:27 a.m.

Gennadiy Rozental <gennadiy.rozental <at> thomson.com> writes:

...

Here we do a very simple (or complex, depending on your point of view) trick instead of the above. Let's say we have somewhere

    map<T*,shared_ptr<T>*> registry;

    if( registry[raw_ptr] == 0 ) {
        t = shared_ptr<T>( raw_ptr );
        registry[raw_ptr] = &t;
    }
    else
        t = *registry[raw_ptr];

...
// current shared_ptr implementation depends upon an internal pointer to a shared count. }

Do I miss something important?

What happens when some time after deserialization, one of the deserialized shared_ptr's is destroyed? Your "registry" has no way to know that and will still hold a shared_ptr referencing the same object, which is not correct.

Peter Dimov

12:22 p.m.

Peter Petrov wrote:

...

Gennadiy Rozental <gennadiy.rozental <at> thomson.com> writes:

...
Here we do a very simple (or complex, depending on your point of view) trick instead of the above. Let's say we have somewhere

    map<T*,shared_ptr<T>*> registry;

    if( registry[raw_ptr] == 0 ) {
        t = shared_ptr<T>( raw_ptr );
        registry[raw_ptr] = &t;
    }
    else
        t = *registry[raw_ptr];

...
// current shared_ptr implementation depends upon an internal pointer to a shared count. }

Do I miss something important?

What happens when some time after deserialization, one of the deserialized shared_ptr's is destroyed? Your "registry" has no way to know that and will still hold a shared_ptr referencing the same object, which is not correct.

The registry is not meant to outlive the deserialization process.

Peter Petrov

1:19 p.m.

Peter Dimov <pdimov <at> mmltd.net> writes:

...

Peter Petrov wrote:

...
Gennadiy Rozental <gennadiy.rozental <at> thomson.com> writes:

...
Here we do a very simple (or complex, depending on your point of view) trick instead of the above. Let's say we have somewhere

    map<T*,shared_ptr<T>*> registry;

    if( registry[raw_ptr] == 0 ) {
        t = shared_ptr<T>( raw_ptr );
        registry[raw_ptr] = &t;
    }
    else
        t = *registry[raw_ptr];

...
// current shared_ptr implementation depends upon an internal pointer to a shared count. }

Do I miss something important?

What happens when some time after deserialization, one of the deserialized shared_ptr's is destroyed? Your "registry" has no way to know that and will still hold a shared_ptr referencing the same object, which is not correct.

The registry is not meant to outlive the deserialization process.

How can you define the lifetime of the deserialization process? IIUC, the registry ought to live at least for as long as the archive from which we deserialize. The archive, in turn, may well live during the entire lifetime of the application (for example, if it is bound to a pipe).

Peter Dimov

1:53 p.m.

Peter Petrov wrote:

...

Peter Dimov <pdimov <at> mmltd.net> writes:

...
The registry is not meant to outlive the deserialization process.

How can you define the lifetime of the deserialization process?

IIUC, the registry ought to live at least for as long as the archive from which we deserialize. The archive, in turn, may well live during the entire lifetime of the application (for example, if it is bound to a pipe).

This has never been a problem for me, because I haven't encountered the need for an infinite archive. An infinite archive that contains shared_ptr instances should, in fact, keep the objects alive forever:

- read shared_ptr #28, along with the object
- reset() the shared_ptr read above
- read shared_ptr sharing ownership with #28

Joe Gottman

19 Nov

12:57 a.m.

"Peter Petrov" <ppetrov@ppetrov.com> wrote in message news:loom.20041118T102318-791@post.gmane.org...

...

Gennadiy Rozental <gennadiy.rozental <at> thomson.com> writes:

...
Here we do a very simple (or complex, depending on your point of view) trick instead of the above. Let's say we have somewhere

    map<T*,shared_ptr<T>*> registry;

    if( registry[raw_ptr] == 0 ) {
        t = shared_ptr<T>( raw_ptr );
        registry[raw_ptr] = &t;
    }
    else
        t = *registry[raw_ptr];

...
// current shared_ptr implementation depends upon an internal pointer to a shared count. }

Do I miss something important?

What happens when some time after deserialization, one of the deserialized shared_ptr's is destroyed? Your "registry" has no way to know that and will still hold a shared_ptr referencing the same object, which is not correct.

Maybe we can do this instead:

    map<T*,weak_ptr<T> > registry;

    weak_ptr<T> &registered = registry[raw_ptr];
    t = registered.lock();
    if (!t) {
        t = shared_ptr<T>(raw_ptr);
        registered = t;
    }

After all, one of the main uses for weak_ptr is to implement a cache of shared_ptr's.

Joe Gottman
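The weak_ptr variant can be exercised on its own. The sketch below is an illustration with std::weak_ptr and a hypothetical wrapper class weak_registry. It assumes the raw pointer is still valid whenever it is looked up; once every owner has gone, the object is deleted and the key dangles, so a real loader would need to guard against that.

```cpp
#include <cassert>
#include <map>
#include <memory>

// Hypothetical registry that observes, but does not own, the loaded objects.
template<class T>
class weak_registry {
public:
    std::shared_ptr<T> adopt(T* raw_ptr) {
        std::weak_ptr<T>& registered = seen_[raw_ptr];
        std::shared_ptr<T> t = registered.lock();
        if (!t) {
            t.reset(raw_ptr);  // first sighting: create the owning pointer
            registered = t;    // remember it without extending its lifetime
        }
        return t;
    }

private:
    std::map<T*, std::weak_ptr<T>> seen_;
};

inline bool weak_registry_does_not_own() {
    weak_registry<int> registry;
    int* raw_ptr = new int(7);
    std::shared_ptr<int> first = registry.adopt(raw_ptr);
    std::shared_ptr<int> second = registry.adopt(raw_ptr);
    // Only first and second own the object; the registry adds nothing.
    return first.get() == second.get() && first.use_count() == 2;
}
```

Unlike a registry of shared_ptr values, this one does not keep objects alive after the caller releases them.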

Peter Dimov

18 Nov

12:24 p.m.

Gennadiy Rozental wrote:

...

Let's say we have somewhere

    map<T*,shared_ptr<T>*> registry;

    if( registry[raw_ptr] == 0 ) {
        t = shared_ptr<T>( raw_ptr );
        registry[raw_ptr] = &t;
    }
    else
        t = *registry[raw_ptr];

...
}

Do I miss something important?

The problem is that a shared_ptr<T> and a shared_ptr<U> can share ownership, and that they both can actually contain a V, derived from T and U.
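This objection can be made concrete. In the sketch below (std::shared_ptr used as a stand-in, with the type names T, U and V taken from the sentence above), the two pointers share one owner yet hold different addresses, so a registry keyed on the raw T* or U* would not recognize them as the same object:

```cpp
#include <cassert>
#include <memory>

struct T { int t_data = 1; virtual ~T() = default; };
struct U { int u_data = 2; virtual ~U() = default; };
struct V : T, U {};

inline bool shared_owner_distinct_addresses() {
    std::shared_ptr<V> v = std::make_shared<V>();
    std::shared_ptr<T> t = v;  // points at the T subobject of the V
    std::shared_ptr<U> u = v;  // points at the U subobject of the V

    const void* t_addr = t.get();
    const void* u_addr = u.get();

    // owner_before orders by owner, not by address: when neither pointer
    // orders before the other, they share ownership.
    bool same_owner = !t.owner_before(u) && !u.owner_before(t);

    return same_owner && t_addr != u_addr && t.use_count() == 3;
}
```

Keying on something that identifies the owner, such as an object id written to the archive, sidesteps the address mismatch.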

Peter Dimov

12:21 p.m.

Robert Ramey wrote:

...

I thought about a little more. If I rememeber correctly the essense of the problem during de-serialization was:

template<class Archive, class T> void serialize(Archive &ar, shared_ptr<T> & t){ T * raw_ptr; ar >> raw_ptr; t = shared_ptr(raw_ptr); // problem is here - not matched with other shared pointers that might point to raw_ptr // current shared_ptr implementation depends upon an internal pointer to a shared count. }

No, the problems lie elsewhere. The pseudocode of the deserializer is as follows:

    typedef std::map< int, shared_ptr<void> > load_sp_map;

    template<class Archive, class T>
    void load(Archive & ar, shared_ptr<T> & t)
    {
        int pid;
        ar >> pid;

        if( pid == 0 )
        {
            t.reset();
        }
        else
        {
            load_sp_map & map = get_load_sp_map( ar ); // #1

            if( map.count(pid) )
            {
                t = convert_from_derived_to_T( map[pid] ); // #2
            }
            else
            {
                Derived * p; // #3
                ar >> p;
                shared_ptr<T> tmp( p );
                t = tmp;
                map[pid] = tmp;
            }
        }
    }

The line marked #1 is a general problem with the library: there is no way to associate user data with an archive, and some types require it.

The lines marked #2 and #3 come into play when the actual pointer that was serialized was

    shared_ptr<T> t( new Derived );

#3 is not doable without help from the library, because - if I understand correctly - only the library knows how the type of a polymorphic object is stored in the archive. IOW the external representation is opaque, an "implementation detail".

#2 is a problem because map[pid] was read in a previous call to load by #3 and points to some type that is not necessarily T.
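To make the pid scheme concrete, here is a self-contained toy version. Everything in it is an assumption made for illustration: toy_iarchive, its token queue and its wire format are hypothetical, std::shared_ptr stands in for boost::shared_ptr, and the map lives directly inside the archive, which is exactly the facility that #1 says is missing. It also sidesteps #2 and #3 by only ever storing objects whose type is exactly T.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <memory>

// Hypothetical archive: a queue of ints plus a slot for per-archive state.
struct toy_iarchive {
    std::deque<int> tokens;
    std::map<int, std::shared_ptr<void>> sp_map;  // stands in for "#1"

    int next() { int v = tokens.front(); tokens.pop_front(); return v; }
};

// Assumed wire format: a pid, followed by the payload only the first time
// that pid appears. A pid of 0 means "empty pointer".
template<class T>
void load(toy_iarchive& ar, std::shared_ptr<T>& t) {
    int pid = ar.next();
    if (pid == 0) { t.reset(); return; }

    auto it = ar.sp_map.find(pid);
    if (it != ar.sp_map.end()) {
        // Valid only because this sketch always stores exactly T ("#2").
        t = std::static_pointer_cast<T>(it->second);
    } else {
        std::shared_ptr<T> tmp(new T(ar.next()));  // no polymorphism ("#3")
        t = tmp;
        ar.sp_map[pid] = tmp;
    }
}

inline bool toy_load_shares_ownership() {
    toy_iarchive ar;
    ar.tokens = {1, 42, 1, 0};  // pid 1 with payload 42, pid 1 again, null
    std::shared_ptr<int> a, b, c;
    load(ar, a);
    load(ar, b);
    load(ar, c);
    // a, b and the archive's map entry share one count; c is empty.
    return *a == 42 && a.get() == b.get() && !c && a.use_count() == 3;
}
```

The serialized form here mentions only pids and payloads, so nothing in it depends on how the smart pointer stores its count.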

...

Attempts to resolve this all ended up re-implementing some aspect of the serialization library.

Indeed.

...

I didn't really consider modifying the library to expose the relevant aspects IT'S implementation. That would entail much more than changing a private to public or adding a friend.

Maybe, but perhaps this would help others that are faced with a similar problem?

...

As far as I was concerned - implementing the the library once was enough. The implementation I made is straight forward and is compatible with all aspects of the serialization library including exported pointers, pointers to polymorphic base classes, non-default and private constructors - everything.

So maybe we can consider a practical compromise.

a) Grant me access to the internals of shared_ptr for now

a) You aren't going to get access to the internals of std::tr1::shared_ptr.

b) Your implementation doesn't work when a weak_ptr is read before the corresponding shared_ptr.

c) I "reserve the right to" change the implementation of boost::shared_ptr. IIUC with this implementation this will break every data file that contains a boost::shared_ptr. Correct?

Russell Hind

12:47 p.m.

Peter Dimov wrote:

...

...
So maybe we can consider a practical compromise.

a) Grant me access to the internals of shared_ptr for now

a) You aren't going to get access to the internals of std::tr1::shared_ptr.

b) Your implementation doesn't work when a weak_ptr is read before the corresponding shared_ptr.

But in this case, the weak_ptr was also written before the shared_ptr so when the weak_ptr was written, if it had a valid point, it should have written the shared_ptr's object out to the archive. What is then needed is a way to store the re-constructed weak_ptr on reading so that it isn't automatically destroyed before a valid shared_ptr of the same object is read back in.

...

c) I "reserve the right to" change the implementation of boost::shared_ptr. IIUC with this implementation this will break every data file that contains a boost::shared_ptr. Correct?

That's fair enough, but as Neil pointed out, serialization is now an accepted part of boost, and both shared_ptr and serialization are important parts of our applications, so shouldn't we make every effort to make the two play nicely together? If we can't do this for std::tr1::shared_ptr, we could still do this for boost::shared_ptr.

Cheers

Russell

Peter Dimov

1:45 p.m.

Russell Hind wrote:

...

Peter Dimov wrote:

...
c) I "reserve the right to" change the implementation of boost::shared_ptr. IIUC with this implementation this will break every data file that contains a boost::shared_ptr. Correct?

Thats fair enough but as Neil pointed out, serialization is now an accepted part of boost and both shared_ptr and serialization are important part of our applications, so shouldn't we make every effort to make the two play nicely together?

Yes, we should. We differ in our views as to what constitutes "every effort".

...

If we can't do this for std::tr1::shared_ptr, we could still do this for boost::shared_ptr.

When developing my serialization library, I have made "every effort" to support boost::shared_ptr without needing to access its implementation details, even though - being the author of shared_ptr - I can do that safely. I could have even changed shared_ptr's interface if I wanted to. But I didn't. And as a result, I can now serialize std::tr1::shared_ptr, and the serialized form does not depend on implementation details, so that I (or anyone else, for that matter, since the format is well-specified) can deserialize a boost::shared_ptr from the same archive. So "we can do this", but apparently we don't want to. If this is the case, and the current scheme is acceptable, it's easy for me to just add a friend declaration to shared_ptr and be done with it.

Andreas Huber

6:18 p.m.

Peter Dimov wrote:

...

So "we can do this", but apparently we don't want to. If this is the case, and the current scheme is acceptable, it's easy for me to just add a friend declaration to shared_ptr and be done with it.

IIRC, this would not solve one problem you described in an earlier post, namely how a weak_ptr can be loaded before the first shared_ptr referencing the same object is loaded. As you stated, it seems that solving this problem *requires* that serialization users can associate arbitrary data with an archive during load.

Regards,

-- Andreas Huber

When replying by private email, please remove the words spam and trap from the address shown in the header.

Robert Ramey

19 Nov

6:56 a.m.

"Peter Dimov" <pdimov@mmltd.net> wrote in message news:009001c4cd69$3bb90d60$6501a8c0@pdimov2...

...

Robert Ramey wrote:

...
So maybe we can consider a practical compromise.

a) Grant me access to the internals of shared_ptr for now

a) You aren't going to get access to the internals of std::tr1::shared_ptr.

Honestly, I don't need to serialize std::tr1::shared_ptr - why would any of us ever use it as long as we have boost::shared_ptr, which (with serialization) is a superset of std::tr1::shared_ptr?

...

b) Your implementation doesn't work when a weak_ptr is read before the corresponding shared_ptr.

It would seem to me that serializing a weak pointer before serializing its corresponding shared pointer would be a user error. It is easily detectable at load time (alas, perhaps too late). In general this couldn't be detected at save time. The problem is that one might not be serializing *all* the shared pointers to a particular object. This shouldn't prevent the de-serialization from restoring a consistent subset of the original set of shared_ptrs. This is not an implementation issue. It's what has to happen to maintain consistency across persistence/marshalling.

I would expect a simple and correct serialization of weak_ptr to look something like:

    template<class Archive, class T>
    void save(Archive &ar, const weak_ptr<T> &wp){
        shared_ptr<T> sp = wp.lock(); // empty if wp has expired
        ar << sp;
    }

    template<class Archive, class T>
    void load(Archive &ar, weak_ptr<T> &wp){
        shared_ptr<T> sp;
        ar >> sp;
        wp = sp;
        // note exception on exit if weak pointer loaded before corresponding shared pointer
    }
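The lifetime hazard in a load() of this shape can be shown without any archive. The sketch below is an illustration with std::weak_ptr and hypothetical function names: a weak_ptr built from a local shared_ptr expires as soon as that local goes away, unless some other owner of the same object exists.

```cpp
#include <cassert>
#include <memory>

// Mimics a load() that builds the object through a local shared_ptr and
// hands back only a weak_ptr.
inline std::weak_ptr<int> load_weak_only() {
    std::shared_ptr<int> sp = std::make_shared<int>(5);
    std::weak_ptr<int> wp = sp;
    return wp;  // sp is destroyed here, taking the object with it
}

// Same, but something else (here the caller's keep_alive) retains ownership.
inline std::weak_ptr<int> load_weak_kept(std::shared_ptr<int>& keep_alive) {
    keep_alive = std::make_shared<int>(5);
    return std::weak_ptr<int>(keep_alive);
}

inline bool weak_ptr_needs_an_owner() {
    std::weak_ptr<int> lost = load_weak_only();
    std::shared_ptr<int> owner;
    std::weak_ptr<int> kept = load_weak_kept(owner);
    return lost.expired() && !kept.expired() && *kept.lock() == 5;
}
```

This is why loading a weak_ptr ahead of its shared_ptr needs some holder for the object in the meantime, for example state associated with the archive.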

...

c) I "reserve the right to" change the implementation of boost::shared_ptr. IIUC with this implementation this will break every data file that contains a boost::shared_ptr. Correct?

Currently, shared_ptr<T> has a default class version of 0. This version number is in the archive and is available when the archive is loaded. Should the implementation change in such a way that a different de-serialization algorithm is required, the de-serialization method can be conditioned on the version number. This is described in the serialization documentation. It is possible that the change in implementation would be so drastic that this mechanism couldn't deal with it.

In any case, would not such a change be subject to a mini-review?

Robert Ramey

Andreas Huber

10:11 a.m.

Robert Ramey <ramey <at> rrsd.com> writes:

...

...
b) Your implementation doesn't work when a weak_ptr is read before the corresponding shared_ptr.

It would seem to me that serialization a weak pointer before serializing its corresponding shared pointer would be a user error.

Why make something a user error that can easily be made to work? I see that it is slightly more difficult to implement, but it does not seem to be that complicated? Regards, -- Andreas Huber When replying by private email, please remove the words spam and trap from the address shown in the header.

Jonathan Wakely

11:09 a.m.

On Thu, Nov 18, 2004 at 10:56:24PM -0800, Robert Ramey wrote:

...

"Peter Dimov" <pdimov@mmltd.net> wrote in message news:009001c4cd69$3bb90d60$6501a8c0@pdimov2...

...
Robert Ramey wrote:

...
So maybe we can consider a practical compromise.

a) Grant me access to the internals of shared_ptr for now

a) You aren't going to get access to the internals of std::tr1::shared_ptr.

Honestly, I don't need to serialize std::tr1::shared_ptr - why would any of us ever use it as long as we have boost::shared_ptr which (with serialization is a superset of std::tr1::shared_ptr.)

By "us" I assume you mean the Boost developers. Do you suppose that the Boost developers are the only, or even typical, users of Boost? One day std::tr1 will come provided with most standard libraries, and there might be reasons it is preferred in some projects to the boost versions. If one of those projects wanted to use the Boost serialization library would they have to switch to boost::shared_ptr ? Even if you can't think of a good reason, do you suppose that means there isn't one? How about a large system that must have no external requirements in some components, so Boost cannot be used, but std::tr1::shared_ptr is used. One isolated part of this system wants to use Boost.Serialization, but finds that to do so would require that the entire system be switched to use boost::shared_ptr. I don't have any other opinion on the subject, I just wanted to point out what I thought was rather presumptuous. jon -- "Because the only people for me are the mad ones, the ones who are mad to live, mad to talk, mad to be saved, desirous of everything at the same time, the ones who never yawn or say a commonplace thing, but burn, burn, burn like fabulous yellow roman candles exploding like spiders across the stars and in the middle you see the blue centerlight pop and everybody goes "Awww!" - Jack Kerouac

Robert Ramey

4:49 p.m.

"Jonathan Wakely" <cow@compsoc.man.ac.uk> wrote in message news:20041119110911.GG9648@compsoc.man.ac.uk...

...

On Thu, Nov 18, 2004 at 10:56:24PM -0800, Robert Ramey wrote:

...
"Peter Dimov" <pdimov@mmltd.net> wrote in message news:009001c4cd69$3bb90d60$6501a8c0@pdimov2...

...
Robert Ramey wrote:

...
So maybe we can consider a practical compromise.

a) Grant me access to the internals of shared_ptr for now

a) You aren't going to get access to the internals of std::tr1::shared_ptr.

Honestly, I don't need to serialize std::tr1::shared_ptr - why would any of us ever use it as long as we have boost::shared_ptr which (with serialization is a superset of std::tr1::shared_ptr.)

By "us" I assume you mean the Boost developers. Do you suppose that the Boost developers are the only, or even typical, users of Boost?

By "us" I meant a user of boost libraries. A user of boost libraries has access to boost::shared_ptr as well as boost::serialization .

...

One day std::tr1 will come provided with most standard libraries, and there might be reasons it is preferred in some projects to the boost versions.

If one of those projects wanted to use the Boost serialization library would they have to switch to boost::shared_ptr ?

At this point yes, since std::tr1::shared_ptr doesn't make any provision for its serialization, and no one has figured out how to do it based solely on its public interface. So far it's the only standard library component that anyone has wanted to serialize that hasn't been possible from its public interface. Maybe the advocates of std::tr1::shared_ptr have to spend some more time thinking about this.

As an aside - this reinforces my personal reservations that standardization of library components that are solely dependent on the language is not a great idea. Personally, I think we would be better served to limit these efforts to those library facilities meant to abstract away platform differences.

...

Even if you can't think of a good reason, do you suppose that means there isn't one?

I didn't suppose anything - I just asked the question. If someone knows a good reason I would like to hear it.

...

How about a large system that must have no external requirements in some components, so Boost cannot be used, but std::tr1::shared_ptr is used.

If boost cannot be used - boost serialization cannot be used so there is no issue here as far as boost serialization is concerned.

...

One isolated part of this system wants to use Boost.Serialization, but finds that to do so would require that the entire system be switched to use boost::shared_ptr.

It's hard to know without a specific case. It's very common for me to have boost in some modules and not in others. (Actually, this is the usual case for me.)

...

I don't have any other opinion on the subject, I just wanted to point out what I thought was rather presumptuous.

Opinion noted. Robert Ramey

Peter Dimov

1:46 p.m.

Robert Ramey wrote:

...

"Peter Dimov" <pdimov@mmltd.net> wrote in message news:009001c4cd69$3bb90d60$6501a8c0@pdimov2...

...
Robert Ramey wrote:

...
So maybe we can consider a practical compromise.

a) Grant me access to the internals of shared_ptr for now

a) You aren't going to get access to the internals of std::tr1::shared_ptr.

Honestly, I don't need to serialize std::tr1::shared_ptr - why would any of us ever use it as long as we have boost::shared_ptr which (with serialization is a superset of std::tr1::shared_ptr.)

I see your point of view, but I'm not sure that you see mine. At issue is which is more fundamental, serialization or shared_ptr. Your desire to treat shared_ptr as yet another user-defined type is understandable, but if you substitute std::map for shared_ptr, you'll see that this stance is not sustainable in general. Even if we had a boost::map. There will always exist types that must be serialized non-intrusively.

...

...
b) Your implementation doesn't work when a weak_ptr is read before the corresponding shared_ptr.

It would seem to me that serialization a weak pointer before serializing its corresponding shared pointer would be a user error.

No, it's not a user error at all, it's a perfectly reasonable scenario. I have, at the moment, three such structures. Here's an example:

    vector< shared_ptr<X> > v;

where the relevant part of X is:

    struct X { weak_ptr<X> target_; };

v[0]'s target_ can easily be v[1].
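A compilable version of this structure shows the ordering issue. This is a sketch using the std:: equivalents of the boost types, and the function name is hypothetical:

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct X {
    std::weak_ptr<X> target_;
};

inline bool forward_weak_reference() {
    std::vector<std::shared_ptr<X>> v;
    v.push_back(std::make_shared<X>());
    v.push_back(std::make_shared<X>());
    v[0]->target_ = v[1];  // element 0 refers forward to element 1

    // Walking v in order, as an element-by-element save would, reaches
    // v[0]->target_ before the shared_ptr v[1] that owns its target.
    return v[0]->target_.lock() == v[1] && v[1]->target_.expired();
}
```

Nothing about this layout is unusual in user code, yet the weak_ptr is written, and later read, before the shared_ptr that owns its target.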

...

...
c) I "reserve the right to" change the implementation of boost::shared_ptr. IIUC with this implementation this will break every data file that contains a boost::shared_ptr. Correct?

Currently. shared_ptr<T> has a default class version of 0. This version number is in the archive and is available when the archive is loaded. Should the implemenation change in such a way that a different de-serializaton algorithm is required, the de-serialization method can be conditioned on the version number. This is described in the serialization documentation. It is possible that the change in implementation would be so drastic that this mechanism couldn't deal with it.

In any case, would not such a change be subject to a mini-review?

A mini review for each change of undocumented implementation details? Sign me up.

Robert Ramey

5:24 p.m.

"Peter Dimov" <pdimov@mmltd.net> wrote in message news:008d01c4ce3e$2b37d4f0$0600a8c0@pdimov...

...

Robert Ramey wrote:

...
"Peter Dimov" <pdimov@mmltd.net> wrote in message news:009001c4cd69$3bb90d60$6501a8c0@pdimov2...

...
Robert Ramey wrote:

...
So maybe we can consider a practical compromise.

a) Grant me access to the internals of shared_ptr for now

a) You aren't going to get access to the internals of std::tr1::shared_ptr.

Honestly, I don't need to serialize std::tr1::shared_ptr - why would any of us ever use it as long as we have boost::shared_ptr which (with serialization is a superset of std::tr1::shared_ptr.)

I see your point of view, but I'm not sure that you see mine.

At issue is which is more fundamental, serialization or shared_ptr. Your desire to treat shared_ptr as yet another user-defined type is understandable, but if you substitute std::map for shared_ptr, you'll see that this stance is not sustainable in general. Even if we had a boost::map. There will always exist types that must be serialized non-intrusively.

I do see your point. I just don't see a way to address it. So far shared_ptr is the only standard library component which can't be serialized from its public interface. I came upon this issue after doing pretty much everything else and needing a non-trivial use case. I concede it's hard to create a public interface for shared_ptr to support serialization (it's probably not even possible). I know it's hard (if it's possible at all) to create a public interface in the serialization library to permit serialization of something like a shared_ptr. If it were possible, it would end up adding lots of complexity to the serialization library to handle one case (so far).

...

...
...
b) Your implementation doesn't work when a weak_ptr is read before the corresponding shared_ptr.

It would seem to me that serializing a weak pointer before serializing its corresponding shared pointer would be a user error.

No, it's not a user error at all, it's a perfectly reasonable scenario. I have, at the moment, three such structures. Here's an example:

vector< shared_ptr<X> > v;

where the relevant part of X is:

struct X { weak_ptr<X> target_; };

v[0]'s target_ can easily be v[1].

the de-serialization of a vector is something like:

template<class Archive, class T>
void load(Archive & ar, vector<T> & v){
    unsigned int count;
    ar >> count;
    v.clear();
    while(count-- > 0){ // read exactly count elements
        T t;
        ar >> t;
        v.push_back(t);
    }
}

...

From my reading of the weak_ptr document, a weak_ptr cannot point to anything if there is no existing corresponding shared pointer. So the above would necessarily fail unless a shared_ptr was previously serialized in the same archive - which is not guaranteed. In order to use such a data structure, the default implementation of the serialization of vector would have to be overridden.

It is possible to conceive of data structures that are not serializable by this library. I haven't had such cases reported to me, so I presume they are infrequently, if ever, encountered by users of the boost serialization library. So perhaps rather than the term "user error", it might be more palatable to describe it as "unsupported by the current default serialization library implementation".
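To make the disputed structure concrete, here is a minimal sketch. It assumes std::shared_ptr and std::weak_ptr as stand-ins for the Boost types, and the helper names make_example and weak_seen_before_owner are hypothetical. It shows that an element-by-element traversal of the vector reaches the weak_ptr in v[0] before the shared_ptr in v[1] that owns its target.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// std:: smart pointers stand in for boost::shared_ptr / boost::weak_ptr.
struct X {
    std::weak_ptr<X> target_;
};

// Builds a two-element vector in which v[0]'s target_ refers to v[1].
std::vector<std::shared_ptr<X>> make_example() {
    std::vector<std::shared_ptr<X>> v;
    v.push_back(std::make_shared<X>());
    v.push_back(std::make_shared<X>());
    v[0]->target_ = v[1];
    return v;
}

// Returns true if walking the vector front to back meets a weak_ptr
// whose owning shared_ptr is stored at a later index.
bool weak_seen_before_owner(const std::vector<std::shared_ptr<X>>& v) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (!v[i]) continue;
        std::shared_ptr<X> target = v[i]->target_.lock();
        if (!target) continue;
        for (std::size_t j = i + 1; j < v.size(); ++j) {
            if (v[j] == target) return true;
        }
    }
    return false;
}
```

Any loader that processes elements in order would therefore have to create or defer the target of v[0]->target_ before it has read v[1].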

...

...
...
c) I "reserve the right to" change the implementation of boost::shared_ptr. IIUC with this implementation this will break every data file that contains a boost::shared_ptr. Correct?

Currently, shared_ptr<T> has a default class version of 0. This version number is in the archive and is available when the archive is loaded. Should the implementation change in such a way that a different de-serialization algorithm is required, the de-serialization method can be conditioned on the version number. This is described in the serialization documentation. It is possible that the change in implementation would be so drastic that this mechanism couldn't deal with it.

In any case, would not such a change be subject to a mini-review?

A mini review for each change of undocumented implementation details? Sign me up.

I understand your objection to boost::serialization depending upon the implementation details of boost::shared_ptr. I just don't see any way to address this other than making big changes to either the serialization library or shared_ptr. Even then, it's not obvious to me that it's even possible. So the question is what do we provide for users now?

Robert Ramey
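The version-conditioned loading described in this message can be illustrated with a toy example. This is only a sketch under stated assumptions: it uses plain iostreams rather than the serialization library, and Point, kCurrentVersion, save and load are hypothetical names. The idea is that the version written into the data selects the reading algorithm.

```cpp
#include <sstream>
#include <string>

// Hypothetical class whose layout changed: z was added in version 1.
struct Point {
    int x = 0;
    int y = 0;
    int z = 0;
};

constexpr unsigned int kCurrentVersion = 1;

// Writes the class version first, then the current set of fields.
void save(std::ostream& os, const Point& p) {
    os << kCurrentVersion << ' ' << p.x << ' ' << p.y << ' ' << p.z << ' ';
}

// Reads the stored version and conditions the rest of the load on it.
// Returns false for unknown (newer) versions or malformed input.
bool load(std::istream& is, Point& p) {
    unsigned int version = 0;
    if (!(is >> version) || version > kCurrentVersion) return false;
    is >> p.x >> p.y;
    if (version >= 1) {
        is >> p.z;   // field exists only in version 1 and later
    } else {
        p.z = 0;     // old data: fall back to a default
    }
    return static_cast<bool>(is);
}
```

A change so drastic that old data cannot be mapped onto the new structure at all is the case that this mechanism would not cover.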

Russell Hind

20 Nov

6:05 a.m.

Robert Ramey wrote:

...

the de-serialization of a vector is something like:

template<class Archive, class T>
void load(Archive & ar, vector<T> & v){
    unsigned int count;
    ar >> count;
    v.clear();
    while(count-- > 0){ // read exactly count elements
        T t;
        ar >> t;
        v.push_back(t);
    }
}

...
From my reading of the weak_ptr document, a weak_ptr cannot point to anything if there is no existing corresponding shared pointer. So the above would necessarily fail unless a shared_ptr was previously serialized in the same archive - which is not guaranteed. In order to use such a data structure, the default implementation of the serialization of vector would have to be overridden.

From a user's point of view, how I'd expect/like this to work is that if a weak pointer is serialized first, then the actual shared object is serialized. On the way back in (loading) the weak pointer can be read and will point to a valid shared object *until* the archive is closed. At that point, if the user doesn't have a shared_ptr to the object, then the object is deleted but up until then, the weak_ptr is valid.

Yes it is ultimately a user error to serialize *only* a weak pointer, but I don't believe it should be a user error to serialize a weak_ptr before a shared_ptr. IMHO, the library should keep the object alive (i.e. hold a shared_ptr to any de-serialized pointers, shared or weak) until the archive is closed.

But I haven't really looked at the implementation of shared_ptr and serialization, so this is just how I'd 'expect' it to work.

Cheers

Russell
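The keep-alive behaviour proposed here can be sketched as follows. This is a toy under stated assumptions, not the serialization library's design: KeepAliveArchive, load_weak and Node are hypothetical names, std:: smart pointers stand in for the Boost ones, and the "archive" simply fabricates the object instead of reading it from a stream.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Node {
    int value = 0;
};

// Hypothetical loading archive that owns every object it creates
// until the archive itself is destroyed ("closed").
class KeepAliveArchive {
public:
    // Pretends to de-serialize a weak_ptr whose owning shared_ptr
    // has not been read yet.
    std::weak_ptr<Node> load_weak(int value) {
        auto obj = std::make_shared<Node>();
        obj->value = value;
        keep_alive_.push_back(obj);  // the archive holds the only strong reference
        return obj;                  // converts to weak_ptr<Node>
    }

private:
    std::vector<std::shared_ptr<void>> keep_alive_;
};

// Returns true if the weak_ptr is usable while the archive is open
// and expires once the archive is closed without any other owner.
bool weak_valid_until_archive_closes() {
    std::weak_ptr<Node> w;
    {
        KeepAliveArchive ar;
        w = ar.load_weak(42);
        if (w.expired()) return false;
        if (w.lock()->value != 42) return false;
    }
    return w.expired();
}
```

If a shared_ptr to the same object is loaded later, it would take over ownership and the object would outlive the archive.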

Robert Ramey

7:05 a.m.

...

Robert Ramey wrote:

...
the de-serialization of a vector is something like:

template<class Archive, class T>
void load(Archive & ar, vector<T> & v){
    unsigned int count;
    ar >> count;
    v.clear();
    while(count-- > 0){ // read exactly count elements
        T t;
        ar >> t;
        v.push_back(t);
    }
}

...
From my reading of the weak_ptr document, a weak_ptr cannot point to anything if there is no existing corresponding shared pointer. So the above would necessarily fail unless a shared_ptr was previously serialized in the same archive - which is not guaranteed. In order to use such a data structure, the default implementation of the serialization of vector would have to be overridden.

Truth is, the scenario you describe never occurred to me. It might be addressable with a more elaborate implementation of weak_ptr serialization than I proposed - but maybe not. I'll have to think about it some more.

Robert Ramey

"Russell Hind" <rh_gmane@mac.com> wrote in message news:cnmmt0$7jk$1@sea.gmane.org...

From a user's point of view, how I'd expect/like this to work is that if a weak pointer is serialized first, then the actual shared object is serialized. On the way back in (loading) the weak pointer can be read and will point to a valid shared object *until* the archive is closed. At that point, if the user doesn't have a shared_ptr to the object, then the object is deleted but up until then, the weak_ptr is valid.

Yes it is ultimately a user error to serialize *only* a weak pointer, but I don't believe it should be a user error to serialize a weak_ptr before a shared_ptr. IMHO, the library should keep the object alive (i.e. hold a shared_ptr to any de-serialized pointers, shared or weak) until the archive is closed.

But I haven't really looked at the implementation of shared_ptr and serialization, so this is just how I'd 'expect' it to work

Cheers

Russell


Robert Ramey

21 Nov

5:34 a.m.

I've concluded that serializing shared pointer only depending on the std::tr1::shared_ptr interface isn't going to be possible within the current serialization library. This will have to be handled via some separate module.

My aspiration is to be able to serialize boost::shared_ptr. In order to do this, I would like to see access of the currently private members of boost::shared_ptr changed to protected. This will allow me access through static downcasting. Feel free to include a warning/disclaimer in the source code against doing this. This should be sufficient to prevent users of shared_ptr from accidentally coming to depend on the boost::shared_ptr implementation.

I will include the following in the boost serialization documentation and source code:

a) a disclaimer that serialization of boost::shared_ptr does not imply that serialization of std::tr1::shared_ptr is implemented.

b) a warning that there is no guarantee that boost::shared_ptr will not change so much in the future as to break automatic compatibility with previously written archives.

"Peter Dimov" <pdimov@mmltd.net> wrote in message news:009001c4cd69$3bb90d60$6501a8c0@pdimov2...

...

Robert Ramey wrote:

...

...
So maybe we can consider a practical compromise.

a) Grant me access to the internals of shared_ptr for now

a) You aren't going to get access to the internals of std::tr1::shared_ptr.

I realize that - I aspire only to serialize boost::shared_ptr

...

b) Your implementation doesn't work when a weak_ptr is read before the corresponding shared_ptr.

Correct, that will have to be a limitation for the time being.

...

c) I "reserve the right to" change the implementation of boost::shared_ptr.

...

IIUC with this implementation this will break every data file that contains a boost::shared_ptr. Correct?

Correct - if the change is such that the current versioning mechanism can't be used to maintain compatibility with this version. From what I've seen, this would seem unlikely. Robert Ramey

Peter Dimov

3:55 p.m.

Robert Ramey wrote:

...

I've concluded that serializing shared pointer only depending on the std::tr1::shared_ptr interface isn't going to be possible within the current serialization library. This will have to be handled via some separate module.

I'll try to come up with something. No timelines, though. There are some places where the library can be of assistance, but this can be done later in a series of incremental improvements.

I looked at the implementation of the serialization library. Quite complicated and remarkable in its own way. ;-) The root cause is, in my opinion, that you started off with something and then added features as requested. Had you started with the _requested features_ you might not have needed to add the rest. Anyway, to get back to shared_ptr.

The key points are:

1. The ability to inject a data member (the pointer map) into an archive.

I intend to handle this by creating custom archive types that hold the required pointer map.

2. The ability to upcast a shared_ptr<void>/type_info to another shared_ptr<void>/type_info.

The library has void_cast, but it supports raw pointers, not shared_ptr<void>. A separate upcast registry will be required, unless the void_cast mechanism is expanded to also provide shared_ptr<void> to shared_ptr<void> conversions.

3. The ability to serialize a polymorphic object.

I'll just serialize a raw pointer; the library takes care of the rest.

4. The ability to deserialize a polymorphic object.

Initially I'll deserialize a raw pointer.

The problem with this approach is that the deserialized shared_ptr<T> will contain a deleter that will attempt to destroy a T, whereas the original may have had a deleter destroying a Derived.

This can affect users if ~T is inaccessible or non-virtual or if T is incomplete.

It is not possible to solve this without help from the library; basically, the T* deserializer (which internally operates with void*/type_info pairs) will need to be duplicated to return shared_ptr<T> (internally shared_ptr<void>/type_info).

5. The ability to downcast the deserialized shared_ptr<T> to shared_ptr<void> pointing to the most derived object.

This can be done with dynamic_pointer_cast<void>.

The external representation of shared_ptr / weak_ptr is as demonstrated by the earlier example:

- int pid;
- (opt) T * px;

where pid == 0 denotes an empty pointer, a newly seen pid denotes a new object and is followed by px, and an already seen pid denotes a reference to an existing object.

shared_ptr probably needs to be marked as "never track, unversioned" to not clutter the archive, but I'm not sure what is the official serialization library policy regarding std:: types.
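The pid-based external representation described above can be sketched with a toy archive. Everything here is an assumption made for illustration: the archive is a flat vector of ints, the pointee is a plain int whose value stands in for px, objects are identified by their raw address, and the names Tokens, ToySaver and ToyLoader are hypothetical. The loader's map from pid to shared_ptr plays the role of the pointer map held by the archive.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

using Tokens = std::vector<int>;  // toy archive: a flat list of ints

// Writes pid, followed by the payload only the first time an object is seen.
class ToySaver {
public:
    void save(Tokens& out, const std::shared_ptr<int>& p) {
        if (!p) { out.push_back(0); return; }           // pid 0: empty pointer
        auto it = ids_.find(p.get());
        if (it != ids_.end()) {                         // already seen: pid only
            out.push_back(it->second);
            return;
        }
        int pid = static_cast<int>(ids_.size()) + 1;    // newly seen: fresh pid
        ids_[p.get()] = pid;
        out.push_back(pid);
        out.push_back(*p);                              // payload standing in for px
    }

private:
    std::map<const int*, int> ids_;
};

// Reads pid and either creates the object or returns the one already created.
class ToyLoader {
public:
    std::shared_ptr<int> load(const Tokens& in, std::size_t& pos) {
        int pid = in.at(pos++);
        if (pid == 0) return nullptr;
        auto it = seen_.find(pid);
        if (it != seen_.end()) return it->second;       // shares ownership
        auto p = std::make_shared<int>(in.at(pos++));
        seen_[pid] = p;
        return p;
    }

private:
    std::map<int, std::shared_ptr<int>> seen_;          // the pointer map
};
```

Because the loader hands out copies of one shared_ptr per pid, all pointers loaded for the same pid share a single reference count. A weak_ptr could be loaded the same way and constructed from the mapped shared_ptr.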

Robert Ramey

5:32 p.m.

I've also been thinking about this. My very first attempt a long time ago was something like:

template<class Archive, class T>
void serialize(Archive &ar, shared_ptr<T> & t){
    T * raw_ptr;
    ar >> raw_ptr;
    t = shared_ptr<T>(raw_ptr);
    // problem is here - not matched with other shared pointers that might point to raw_ptr
    // current shared_ptr implementation depends upon an internal pointer to a shared count.
}

Adding a map in the archive instance to permit shared_ptrs to be "matched up" would seem to me to solve the problem. So a custom archive which included such a map would be all that was needed, I believe. My main objection to this is coupling the archives to specific data types. I haven't seen a way to do this in a generic way so that it could handle the "next" shared_ptr.
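The mismatch noted in the code comment above can be demonstrated with a short sketch. It assumes std::shared_ptr as a stand-in for boost::shared_ptr, and the names share_ownership and naive_load_breaks_sharing are hypothetical. The second pointer is given a no-op deleter purely so that this demonstration does not delete the object twice.

```cpp
#include <cassert>
#include <memory>

// Returns true if a and b use the same reference count (control block).
template <class T>
bool share_ownership(const std::shared_ptr<T>& a, const std::shared_ptr<T>& b) {
    return !a.owner_before(b) && !b.owner_before(a);
}

// Mimics two naive loads that each wrap the same raw pointer.
bool naive_load_breaks_sharing() {
    int* raw = new int(5);
    std::shared_ptr<int> first(raw);
    // A real naive load would use the default deleter here and delete twice;
    // the no-op deleter keeps this demonstration well defined.
    std::shared_ptr<int> second(raw, [](int*) {});
    return first.get() == second.get()        // same object...
        && !share_ownership(first, second)    // ...but separate counts
        && first.use_count() == 1
        && second.use_count() == 1;
}
```

A map from the stored pointer to an already created shared_ptr is one way to make later loads return a copy of the first pointer rather than a new, unrelated owner.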

...

I looked at the implementation of the serialization library. Quite complicated and remarkable in its own way. ;-)

I'll choose to take that as a compliment.

...

The root cause is, in my opinion, that you started off with something and then added features as requested.

...

Had you started with the _requested features_ you might not have needed to add the rest. Anyway, to get back to shared_ptr.

Actually I started with the features I felt were missing in all the other libraries I looked at. Then there were the requested features. They all fit in a more or less natural spot. There is no feature in there that someone has not considered really, really indispensable. There is only one feature that I don't think has been used - that is the extended type info which doesn't rely on std::type_info. I do get regular feedback. The draft package on my personal website has been downloaded 3000 times. I don't know how many times it was downloaded from the yahoo file section. Every couple of weeks I get feedback from someone congratulating me on getting it into boost - and giving me an explanation of how their own system works. I also get a lot of short explanations on how simple it would be to ... An incredible amount of time is spent addressing the vagaries of different C++ implementations. I don't believe that anyone who hasn't done something like this appreciates the amount of effort this aspect consumes.

...

4. The ability to deserialize a polymorphic object.

Initially I'll deserialize a raw pointer.

The problem with this approach is that the deserialized shared_ptr<T> will contain a deleter that will attempt to destroy a T, whereas the original may have had a deleter destroying a Derived.

This can affect users if ~T is inaccessible or non-virtual or if T is incomplete.

It is not possible to solve this without help from the library; basically, the T* deserializer (which internally operates with void*/type_info pairs) will need to be duplicated to return shared_ptr<T> (internally shared_ptr<void>/type_info).

That's quite a mouthful - I'll have to think about that.

...

5. The ability to downcast the deserialized shared_ptr<T> to shared_ptr<void> pointing to the most derived object.

This can be done with dynamic_pointer_cast<void>.

The external representation of shared_ptr / weak_ptr is as demonstrated by the earlier example:

- int pid; - (opt) T * px;

where pid == 0 denotes an empty pointer, a newly seen pid denotes a new object and is followed by px, and an already seen pid denotes a reference to an existing object.

shared_ptr probably needs to be marked as "never track, unversioned" to not clutter the archive, but I'm not sure what is the official serialization library policy regarding std:: types.

I would have to think about that as well. I'm not convinced that you couldn't just cast to void * and use that to look up in the map. I think marking it unversioned would be a mistake as this won't be trivial in any case. Marking never track is probably a good idea - though in practice, the default is that things are tracked only if used as pointers. I doubt anyone will want to serialize a shared_ptr through a raw pointer. (LOL - I'm being presumptuous again - I can't help myself)

Robert Ramey

Robert Ramey

22 Nov

4:37 a.m.

"Peter Dimov" <pdimov@mmltd.net> wrote in message news:002d01c4cfe2$96d92bf0$6501a8c0@pdimov2...

...

Robert Ramey wrote:

...
I've concluded that serializing shared pointer only depending on the std::tr1::shared_ptr interface isn't going to be possible within the current serialization library. This will have to be handled via some separate module.

I'll try to come up with something. No timelines, though. There are some places where the library can be of assistance, but this can be done later in a series of incremental improvements.

I looked at the implementation of the serialization library. Quite complicated and remarkable in its own way. ;-) The root cause is, in my opinion, that you started off with something and then added features as requested. Had you started with the _requested features_ you might not have needed to add the rest. Anyway, to get back to shared_ptr.

The key points are:

...

2. The ability to upcast a shared_ptr<void>/type_info to another shared_ptr<void>/type_info.

I don't see why this is necessary

...

The problem with this approach is that the deserialized shared_ptr<T> will contain a deleter that will attempt to destroy a T, whereas the original may have had a deleter destroying a Derived.

...

This can affect users if ~T is inaccessible or non-virtual or if T is incomplete.

It is not possible to solve this without help from the library; basically, the T* deserializer (which internally operates with void*/type_info pairs) will need to be duplicated to return shared_ptr<T> (internally shared_ptr<void>/type_info).

I don't see why this is necessary. During deserialization the shared_ptr already exists - usually as a member variable of some data structure. It's already been created with the appropriate deleter. If the current deleter is different than the original, then it's an issue to be addressed with class versioning. In general, the serialization library presumes that the structure being recovered is the same as that originally saved - any differences are addressed through versioning.
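For context on the deleter concern quoted above, here is a small sketch of the behaviour in question. It assumes std::shared_ptr as a stand-in for boost::shared_ptr, and Base, Derived and destroy_through_base_pointer are hypothetical names. A shared_ptr<Base> constructed directly from a Derived* records how to delete a Derived, so the right destructor runs even though ~Base is non-virtual.

```cpp
#include <cassert>
#include <memory>

int derived_destroyed = 0;  // counts ~Derived calls for the demonstration

struct Base {
    ~Base() {}              // deliberately non-virtual
};

struct Derived : Base {
    ~Derived() { ++derived_destroyed; }
};

// Creates a shared_ptr<Base> from a Derived* and lets it go out of scope.
// Returns how many times ~Derived ran.
int destroy_through_base_pointer() {
    derived_destroyed = 0;
    {
        // The constructor is a template on the pointer type, so the stored
        // deleter deletes a Derived*, not a Base*.
        std::shared_ptr<Base> p(new Derived);
    }
    return derived_destroyed;
}
```

A shared_ptr<Base> rebuilt from a plain Base* would not have this information, which is the difference between the original and the deserialized pointer that the quoted paragraph describes.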

...

5. The ability to downcast the deserialized shared_ptr<T> to shared_ptr<void> pointing to the most derived object.

This can be done with dynamic_pointer_cast<void>.

I don't see why this is necessary. The library does the appropriate downcasting of raw pointers. As far as I know there's no reason why a shared_ptr<T> can't have a raw_ptr which, though it points to a T, actually corresponds to a derivation of T.
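To illustrate what "pointing to the most derived object" means in the quoted point 5, here is a sketch under stated assumptions. It uses std::shared_ptr, and instead of dynamic_pointer_cast<void> it combines dynamic_cast<void*> with the aliasing constructor of std::shared_ptr, which is a swapped-in technique for the standard type. The names A, B, C and to_most_derived are hypothetical.

```cpp
#include <cassert>
#include <memory>

struct A { virtual ~A() {} int a = 0; };
struct B { virtual ~B() {} int b = 0; };
struct C : A, B {};  // a B subobject need not start at the address of the C

// Returns a shared_ptr<void> that shares ownership with p but stores the
// address of the most derived object that *p belongs to.
std::shared_ptr<void> to_most_derived(const std::shared_ptr<B>& p) {
    // dynamic_cast<void*> on a polymorphic type yields the address of the
    // most derived object; the aliasing constructor keeps p's ownership.
    return std::shared_ptr<void>(p, dynamic_cast<void*>(p.get()));
}
```

The resulting address is the same no matter which base-class pointer the object was reached through, which is what makes it usable as a key when matching up pointers to the same object.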

...

The external representation of shared_ptr / weak_ptr is as demonstrated by the earlier example:

- int pid; - (opt) T * px;

where pid == 0 denotes an empty pointer, a newly seen pid denotes a new object and is followed by px, and an already seen pid denotes a reference to an existing object.

I don't see why this is necessary. The library already correctly handles null raw pointers.

...

shared_ptr probably needs to be marked as "never track, unversioned" to not clutter the archive, but I'm not sure what is the official serialization library policy regarding std:: types.

For std collections, the policy I've used has been:

a) unversioned - on the idea that std collections are "cast in stone". Given the complexity of serialization of shared_ptr, and the possibility that its serialization may depend upon its implementation (at least for boost::shared_ptr, given that you've reserved the right to change the implementation), I think it would be prudent to leave it as the default - i.e. versioned. This only adds 1 integer per template instantiation per archive. A small price to pay to maintain future readability.

b) tracking - default. This means that instances are tracked if and only if anywhere in the program the type is serialized through a pointer. I wouldn't expect that to happen so I expect the default would be just fine.

The only real issue as I see it is that of pointer mapping. Making an archive derivation that includes the map would work fine - but I would hope we could find something that doesn't require a special archive type to serialize a specific data type. Perhaps we can permit one to register an exit routine with an archive. Then when a map is created a deleter is registered with the archive. I wouldn't be in love with this but I could probably live with it.

The other thing I was thinking about was the possibility of adding access to the tracking map for matching up de-serialized shared_ptrs. So far, I haven't resolved this in my own mind - I'm still thinking about it. It's probably a dead end.

Robert Ramey

Robert Ramey

20 Nov

6:22 p.m.

Postings on this thread have motivated me to think some more on this topic. "Peter Dimov" <pdimov@mmltd.net> wrote in message news:000a01c4ccd5$f1906910$6501a8c0@pdimov2...

...

Robert Ramey wrote:

...
Alternatively, you might want to consider expanding the shared_ptr interface with enough information to permit serialization to be implemented via the public interface.

...

How, exactly? The current public interface is good enough for my serialization/deserialization needs,

I don't know that your needs are the same as everyone elses. The list of everyone else's needs is very long and varied and was the key obstacle to getting any serialization library even considered - let alone accepted.

...

but I'm willing to consider your suggestions.

Here's another idea:

require that std::tr1::shared_ptr support the following interface as a member template function:

template<class Archive, class T> void serialize(Archive &ar, shared_ptr<T> &t);

where Archive is a class that is required to provide the following member functions

template<class T> Archive & operator<<(const T &t);
template<class T> Archive & operator>>(T &t);
template<class T> Archive & operator&(T &t);

for any type T

I realize that such a suggestion may seem outlandish, premature and lots of other things. But it is not fundamentally any more arbitrary than the requirement that a serialization library include the interfaces/functionality to support serialization of a shared_ptr through a specific interface (std::tr1::shared_ptr).

I believe that in the longer run, something of this nature will be the only viable solution to this and other similar problems.

Robert Ramey
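The Archive interface proposed above can be sketched with two toy archives. This is an illustration under stated assumptions, not the serialization library's classes: ToyOutputArchive, ToyInputArchive and Pair are hypothetical names, the archives are text streams, and each one provides only the operators that make sense for its direction. The point is that operator& lets one serialize function serve both saving and loading.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Saving archive: operator& forwards to operator<<.
class ToyOutputArchive {
public:
    template <class T>
    ToyOutputArchive& operator<<(const T& t) { os_ << t << ' '; return *this; }

    template <class T>
    ToyOutputArchive& operator&(T& t) { return *this << t; }

    std::string str() const { return os_.str(); }

private:
    std::ostringstream os_;
};

// Loading archive: operator& forwards to operator>>.
class ToyInputArchive {
public:
    explicit ToyInputArchive(const std::string& s) : is_(s) {}

    template <class T>
    ToyInputArchive& operator>>(T& t) { is_ >> t; return *this; }

    template <class T>
    ToyInputArchive& operator&(T& t) { return *this >> t; }

private:
    std::istringstream is_;
};

struct Pair {
    int first = 0;
    double second = 0.0;
};

// One function describes the layout for both directions.
template <class Archive>
void serialize(Archive& ar, Pair& p) {
    ar & p.first & p.second;
}
```

A type that depends only on these three operators stays independent of any particular archive class, although it says nothing about the archive format itself.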

Robert Ramey

21 Nov

5:04 a.m.

Since posting this I realized that although it would address the issue of code portability, it wouldn't guarantee archive portability. Even though the idea of some standard archive format is waaaaaaaaaaaaay off into the future, this idea would conflict with it. I presume someone would find this to be an issue.

Robert Ramey

"Robert Ramey" <ramey@rrsd.com> wrote in message news:cno217$o2s$1@sea.gmane.org...

...

Postings on this thread have motivated me to think some more on this topic.

...

"Peter Dimov" <pdimov@mmltd.net> wrote in message news:000a01c4ccd5$f1906910$6501a8c0@pdimov2...

...
Robert Ramey wrote:

...
Alternatively, you might want to consider expanding the shared_ptr interface with enough information to permit serialization to be implemented via the public interface.

...
How, exactly? The current public interface is good enough for my serialization/deserialization needs,

I don't know that your needs are the same as everyone elses. The list of everyone else's needs is very long and varied and was the key obstacle to getting any serialization library even considered - let alone accepted.

...
but I'm willing to consider your suggestions.

Here's another idea:

require that std::tr1::shared_ptr support the following interface as a member template function:

template<class Archive, class T> void serialize(Archive &ar, shared_ptr<T> &t);

where Archive is a class that is required to provide the following member functions

template<class T> Archive & operator<<(const T &t); template<class T> Archive & operator>>(T &t); template<class T> Archive & operator&(T &t);

for any type T

I realize that such a suggestion may seem outlandish, premature and lots of other things. But it is not fundamentally any more arbitrary than the requirement that a serialization library include the interfaces/functionality to support serialization of a shared_ptr through a specific interface (std::tr1::shared_ptr).

I believe that in the longer run, something of this nature will be the only viable solution to this and other similar problems.

Robert Ramey

Neal D. Becker

17 Nov

7:02 p.m.

Robert Ramey wrote:

...

Would it be possible to add something like

friend template<class Archive> boost::serialization::serialize(Archive &ar, T &t);

to boost::shared_ptr

?

This would let me remove a problematic hack in boost/serialization/shared_ptr which is creating a headache when shared_ptr serialization is used with some other headers.

The same may be necessary for some other headers - e.g. weak ptr.

Another alternative would be to change the private members to protected or maybe even public.

I realize that this creates a dependency between boost::serialization and boost::shared_ptr that you have hoped to avoid. But until this is resolved at another level, it would be better than the current situation.

Alternatively, you might want to consider expanding the shared_ptr interface with enough information to permit serialization to be implemented via the public interface. This has worked well for the stl collections. I don't know if this was an intentional design decision or just an accident.

We really need a real solution now for boost::shared_ptr / boost::serialization .

This is true of some other boost libs, such as mersenne_twister. We really should have one coherent policy IMO. I propose:

1) We accepted serialization. Let's show that we really support it by using it.

2) Any place serialization is supported, let's do it in an obvious direct way. Saying you can find some workaround via various public interfaces is not acceptable. It should be simple and obvious. If a class is supposed to support boost::serialization, add the standard serialization interface.

3) Add BOOST_NO_SERIALIZATION or some such to bypass the code when not wanted.

