
----- Original message ----- From: Robert Ramey <ramey@rrsd.com> Date: Friday, September 28, 2007 6:06 pm Subject: Re: [boost] [serialization] Proposal for an extension API to the Archive concept To: boost@lists.boost.org
I've tried to describe the options as dispassionately as I could, so as to lay a common ground for further discussion. Do you see any error in the description? Are you satisfied with this rendering of the available options?
A very good summary and explanation.
Great, I'm glad we've got the discussion grounds agreed upon.
But an important one (and more are coming, e.g. std::tr1::shared_ptr). You are of course free to avoid expanding the Archive concept as in E) and F), but the logical implication of this is that shared_ptr is technically not serializable. Nothing wrong about that, if that's your declared intention, but users should know.
OK - now I see better your concern. I would say:
shared_ptr as conceived and implemented by boost does not provide sufficient exposure to support the concept of Serializable Type as defined by Boost Serialization: [...] shared_ptr is the only type which has come up in several years which has this problem.
Some application specific types might have this issue, but in those cases, I would expect the ability to attach an application specific helper to the archive, thereby expanding the Archive Concept for just that application would be acceptable.
OK, I understand your position on the grounds that you deem shared_ptr (and other potential types in the same vein) as pathological cases. I will try in the lines below to convince you (or at least instill some drops of doubt in you) that this is not necessarily the case. [...]
The power of any software (or mathematical/logical) module resides in its logical coherence. Arbitrarily extending it in tangent directions requires that a user of the module or concept consider a bunch of special cases every time he uses it, even for simple cases. Such types of extensions result in a net reduction in the utility of the library.
Some remarks on this: I cannot but wholeheartedly agree with you that logical coherence must drive the extension of any firmly grounded design. But from my point of view, the helper API is a perfectly sound addition to B.S, because it fulfills a very general need in a very general fashion: the helper API is just about keeping state information associated with the serialization process of objects of a given type. Think about it: as it's currently modeled by B.S, serialization is essentially a stateless process: when an object t of type T is about to be serialized, the assumption is that it won't rely on other T instances which have been previously serialized. Is this assumption reasonable? Well, I contend that in some non-pathological situations, stateful serialization is needed. An almost insultingly obvious case: object tracking. If you give me the helper API and an archive without object tracking, I can implement object tracking myself without relying on the archive implementation or B.S facilities --and it's very easy to do, if you think about it. How's that for applicability of the helper API? Further down I also explain the type of mine which originated the discussion, as another example of use of the helper API. [...]
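To make the object tracking remark concrete, here is a minimal sketch under stated assumptions. toy_oarchive, get_helper, tracking_helper and save_tracked are hypothetical names invented for this illustration and are not Boost.Serialization API. The archive knows nothing about tracking; it only lets serialization code attach one state object per helper type:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <typeindex>
#include <typeinfo>

// Hypothetical archive WITHOUT built-in object tracking: it writes
// space-separated tokens and stores per-type "helper" state on request.
class toy_oarchive {
public:
    explicit toy_oarchive(std::ostream& os) : os_(os) {}

    template <typename T>
    toy_oarchive& operator<<(const T& t) { os_ << t << ' '; return *this; }

    // Returns the one Helper instance owned by this archive,
    // default-constructing it on first use.
    template <typename Helper>
    Helper& get_helper() {
        std::shared_ptr<void>& slot = helpers_[std::type_index(typeid(Helper))];
        if (!slot) slot = std::make_shared<Helper>();
        return *static_cast<Helper*>(slot.get());
    }

private:
    std::ostream& os_;
    std::map<std::type_index, std::shared_ptr<void>> helpers_;
};

// State kept across save calls for objects of type T: address -> id.
template <typename T>
struct tracking_helper {
    std::map<const T*, std::size_t> ids;
};

// User-level object tracking built only on get_helper():
// the first visit writes "new <id> <value>", later visits write "ref <id>".
template <typename T>
void save_tracked(toy_oarchive& ar, const T* p) {
    tracking_helper<T>& h = ar.get_helper<tracking_helper<T>>();
    auto it = h.ids.find(p);
    if (it != h.ids.end()) {
        ar << "ref" << it->second;
    } else {
        std::size_t id = h.ids.size();
        h.ids.emplace(p, id);
        ar << "new" << id << *p;
    }
}

// Saves a, b, and a again; the second visit to a becomes a back-reference.
std::string tracking_demo() {
    std::ostringstream os;
    toy_oarchive ar(os);
    int a = 7, b = 9;
    save_tracked(ar, &a);
    save_tracked(ar, &b);
    save_tracked(ar, &a);
    return os.str();
}
```

The loading side would mirror this with a helper holding a table from id to the already loaded object. The point is only that the state lives in the helper, not in the archive implementation.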
Remember, this whole issue has come about because shared_ptr (unlike any other type so far) has been written in a manner that it is effectively closed for extension. Maybe you want to direct some observations in that direction.
That's fair criticism. But consider that there are cases where the type to be serialized is closed beyond the user's control. And my example below is open, yet not serializable without the helper API.
I've just come across a very relevant instance of this. the 1.35 version of the binary
In order to support optimization of a couple of collection types, extra machinery was added to basic_binary_?archive (in 1.35). This was the subject of a fairly acrimonious discussion last year.
Can you provide me with a link to that thread? Thanks! [...]
In my particular case, I can assure you that the need for a helper API is a genuine one: I cannot serialize my type efficiently without that API, no matter how much I want to expose the guts of my type for serialization purposes. Please believe me on that, this is not a case of me wanting to keep my type encapsulated, "pure", or anything. It's sheer impossibility to do it otherwise.
Hmm - on one hand, I don't doubt your sincerity nor your assessment. On the other hand, I've yet to see such a type myself. The original serialization of shared_ptr was implemented by exposing more of the shared_ptr internals, so in that one case it is possible. Of course, whether that solution is desirable would be a matter for discussion.
I wanted to keep this discussion untied from my particular problem, but you're entitled to see the case and evaluate it yourself, so here it goes. The following is a simplified description of the type, to keep things short. My type implements a sort of flyweight idiom, by which objects with the same value internally keep a pointer to the same representation, so as to avoid duplication of data and excessive memory consumption:

  flyweight<string> fw("hello");
  flyweight<string> fw2("hello");
  // fw and fw2 internally have pointers to the same string object.

A crude approximation to the implementation of flyweight is:

  template<typename T, template <typename> class Container=...>
  class flyweight
  {
  private:
    typedef Container<T> factory_type; // used to keep value objects
    static factory_type factory;       // global value factory

    // A flyweight maintains an iterator to the associated value
    typedef typename factory_type::iterator handle_type;
    handle_type h;

  public:
    flyweight(const T& t) // ctor #1
    {
      // retrieve an iterator to an equivalent value or else
      // insert a new one if no equivalent value is found
      h=factory.insert(t).first;
    }

    flyweight(const flyweight& x) // ctor #2
    {
      // point to the same value as x
      h=x.h;
    }
    ...
  };

Now, I want to serialize flyweight *efficiently*. The first naive approach is this one:

  template<class Archive,class T,...>
  void save(Archive& ar,const flyweight<T,...> & fw,const unsigned int)
  {
    ar<<*(fw.h); // serialize the associated value
  }

  template<class Archive,class T,...>
  void load(Archive& ar,flyweight<T,...> & fw,const unsigned int)
  {
    T t;
    ar>>t;
    fw=flyweight<T,...>(t);
  }

but this is not efficient because, at loading time, duplicate values are created through ctor #1, which incurs a factory lookup, when I'd want to use ctor #2 (direct copy from a previously loaded equivalent flyweight object). So that's the problem.
In the particular case where handle_type is a pointer, the thing can be done by B.S object tracking, but as the type is an unspecified iterator I cannot do that. To me, this is a clear example of serialization needing state info. With the helper API, serializing flyweight<T> is implemented so efficiently and beautifully that it almost hurts :) [...]
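For illustration, this is roughly how a helper removes the repeated factory lookups. It is a minimal sketch under stated assumptions, not the actual flyweight or Boost.Serialization code: toy_archive, get_helper, save_helper, load_helper, save_flyweight and load_flyweight are hypothetical names, the factory is fixed to std::set, and the save side assumes that equal handles refer to the same value object, so the value's address can act as a key.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <set>
#include <sstream>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <vector>

// Hypothetical archive: a token stream plus per-type helper state.
class toy_archive {
public:
    std::stringstream stream;

    template <typename Helper>
    Helper& get_helper() {
        std::shared_ptr<void>& slot = helpers_[std::type_index(typeid(Helper))];
        if (!slot) slot = std::make_shared<Helper>();
        return *static_cast<Helper*>(slot.get());
    }

private:
    std::map<std::type_index, std::shared_ptr<void>> helpers_;
};

// Simplified flyweight: the handle is an iterator into a shared std::set.
template <typename T>
class flyweight {
public:
    // ctor #1: factory lookup/insert (counted so the demo can observe it)
    flyweight(const T& t) : h(factory().insert(t).first) { ++lookups(); }
    // ctor #2 is the implicit copy constructor: it just copies the handle.

    const T& get() const { return *h; }
    static std::size_t& lookups() { static std::size_t n = 0; return n; }

private:
    static std::set<T>& factory() { static std::set<T> f; return f; }
    typename std::set<T>::const_iterator h;
};

template <typename T> struct save_helper { std::map<const T*, std::size_t> ids; };
template <typename T> struct load_helper { std::vector<flyweight<T>> seen; };

// Writes "<id> <value>" the first time a value is met, "<id>" afterwards.
template <typename T>
void save_flyweight(toy_archive& ar, const flyweight<T>& fw) {
    save_helper<T>& h = ar.get_helper<save_helper<T>>();
    auto it = h.ids.find(&fw.get());
    if (it != h.ids.end()) { ar.stream << it->second << ' '; return; }
    std::size_t id = h.ids.size();
    h.ids.emplace(&fw.get(), id);
    ar.stream << id << ' ' << fw.get() << ' ';
}

// An unseen id means a value follows (ctor #1, once per distinct value);
// a known id is answered by copying an already loaded flyweight (ctor #2).
template <typename T>
flyweight<T> load_flyweight(toy_archive& ar) {
    load_helper<T>& h = ar.get_helper<load_helper<T>>();
    std::size_t id = 0;
    ar.stream >> id;
    if (id == h.seen.size()) {
        T t;
        ar.stream >> t;
        h.seen.push_back(flyweight<T>(t));
    }
    return h.seen[id];
}

// Saves four flyweights holding two distinct values, reloads them, and
// returns how many factory lookups the load phase needed.
std::size_t lookups_during_load() {
    toy_archive out;
    flyweight<std::string> a("hello"), b("hello"), c("world");
    save_flyweight(out, a);
    save_flyweight(out, b);
    save_flyweight(out, c);
    save_flyweight(out, a);

    toy_archive in;
    in.stream.str(out.stream.str());
    std::size_t before = flyweight<std::string>::lookups();
    std::string joined;
    for (int i = 0; i < 4; ++i)
        joined += load_flyweight<std::string>(in).get() + ",";
    assert(joined == "hello,hello,world,hello,");
    return flyweight<std::string>::lookups() - before;
}
```

In this sketch, loading four flyweights costs two factory lookups, one per distinct value, whereas the naive load would perform four.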
So the current situation is:

a) We have an Archive Concept and a Serializable Concept which are fairly coherent. They are currently only broken by basic_binary_?archive.
b) The classes in the library, common_?archive and basic_?archive, are implementation features. It is not required to derive from any of these classes to implement the concepts. Of course it's convenient to do so, as they implement common aspects of the Archive Concept, but it is not required.
c) Some types - so far only shared_ptr and your ? - do not conform to the concept of a serializable type as they stand.

My position is:

i) They are very infrequent.
ii) The Archive Concept can be extended for these special cases through composition (inheritance) to provide ad hoc solutions.
I hope I've been able to cast some doubt on your commitment to i). Also, I'd like to add that, from my experience as a lib maintainer, functionality usually predates usage: it is not until you provide some new stuff that people begin seeing application scenarios for it, not the other way around. Much more so if you're maintaining an advanced lib, as Boost libs are held to be, where contributors of new ideas are scarcer.
That's what 1.34 did for shared_ptr and I'm willing to continue doing that. The only thing I want to do is to move the helper API out of the basic_?archive, where it pollutes the Archive concept (which is why I didn't document it), and package it as a mix-in which is used with naked_?archive to produce the "shared_ptr" friendly archive classes.
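One possible reading of this mix-in packaging, as a minimal sketch under stated assumptions: naked_oarchive, with_helpers, helper_oarchive, call_counter and save_numbered are hypothetical names made up for this illustration and do not come from the library. The naked archive offers only the plain output operation, and the helper registry is layered on top by inheritance:

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <typeindex>
#include <typeinfo>

// Hypothetical "naked" archive: plain output only, no helper API.
class naked_oarchive {
public:
    template <typename T>
    naked_oarchive& operator<<(const T& t) { out_ << t << ' '; return *this; }
    std::string str() const { return out_.str(); }

private:
    std::ostringstream out_;
};

// Mix-in that adds a per-archive helper registry to any archive type.
template <class Base>
class with_helpers : public Base {
public:
    template <typename Helper>
    Helper& get_helper() {
        std::shared_ptr<void>& slot = helpers_[std::type_index(typeid(Helper))];
        if (!slot) slot = std::make_shared<Helper>();
        return *static_cast<Helper*>(slot.get());
    }

private:
    std::map<std::type_index, std::shared_ptr<void>> helpers_;
};

// The helper-capable archive is produced by composition, not by
// changing the naked archive itself.
typedef with_helpers<naked_oarchive> helper_oarchive;

// Example state: how many times save_numbered ran on a given archive.
struct call_counter { int calls = 0; };

// Stateful serialization code: it compiles only for archives that
// provide get_helper(), i.e. those built with the mix-in.
template <class Archive>
void save_numbered(Archive& ar, const std::string& s) {
    int n = ++ar.template get_helper<call_counter>().calls;
    ar << n << s;
}

std::string mixin_demo() {
    helper_oarchive ar;
    save_numbered(ar, "a");
    save_numbered(ar, "b");
    return ar.str();
}
```

Passing a plain naked_oarchive to save_numbered fails to compile in this sketch, which is the trade-off under discussion: types that need state become tied to the extended archives.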
What do you see wrong with that?
About this last point of yours: according to your proposal, which archive types would provide the helper API? You say basic_?archive won't provide it? Was it this way in 1.34? If not, won't you be breaking shared_ptr serialization code then?

I look forward to your opinions about my position on the generality of the helper API and about the flyweight case. In general, I understand your position and I think it's a reasonable one, given the particular weights you assign to the factors involved, which differ from mine. At this point the matter is no longer about hard facts but about opinions, but I hope I'll be able to pile some more arguments on my side of the balance.

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo