[serialization] skipping data

Joel de Guzman

10 Jul 2006

1:13 a.m.

Hi,

Here's a use case that I can't seem to find a way to do using boost.serialization.

Say with version 1, I want to serialize X Y and Z. These are hierarchical and expensive data structures. Now, with version 2, I do not need to create Y objects anymore. Yet, I still have to open version 1 files for backward compatibility. So, when loading version 1 files, I want to skip reading Y objects. How do I do that without having to create a temporary Y object that will be discarded later?

(I'm particularly interested with XML archives. If I had control over the xml parser, that would simply mean ignoring everything from the current tag until the next matching tag is found).

Regards,
--
Joel de Guzman
http://www.boost-consulting.com
http://spirit.sf.net


Kim Barrett

10 Jul

2:29 a.m.

At 9:13 AM +0800 7/10/06, Joel de Guzman wrote:

...

Say with version 1, I want to serialize X Y and Z. These are hierarchical and expensive data structures. Now, with version 2, I do not need to create Y objects anymore. Yet, I still have to open version 1 files for backward compatibility. So, when loading version 1 files, I want to skip reading Y objects. How do I do that without having to create a temporary Y object that will be discarded later?

(I'm particularly interested with XML archives. If I had control over the xml parser, that would simply mean ignoring everything from the current tag until the next matching tag is found).

I think such a simple approach will not work in general. Consider the case where

1. your Y contains a pointer to P
2. pointer tracking is enabled for P's type
3. the occurrence of P in Y is the first occurrence within the archive
4. there are later occurrences of P in the archive after Y

If you simply discard everything involved in constructing Y, you will discard the information needed to construct and record P, and those later occurrences will be left dangling.
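A toy illustration of this failure mode follows. It is only a sketch under stated assumptions: the `new <id> <value>` / `ref <id>` token format, `TrackingReader` and `skip_tokens` are all hypothetical and are not the Boost.Serialization wire format or API. The point is just that a later back-reference cannot be resolved once the first occurrence has been thrown away unread.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical toy format, NOT the Boost.Serialization wire format:
//   "new <id> <value>"  first occurrence of a tracked object
//   "ref <id>"          any later occurrence of the same object
struct TrackingReader {
    std::istringstream in;
    std::map<int, std::shared_ptr<int>> seen;  // id -> object already loaded

    explicit TrackingReader(const std::string& s) : in(s) {}

    // Load one tracked pointer; throws if a "ref" names an id never loaded.
    std::shared_ptr<int> load_pointer() {
        std::string kind;
        int id = 0;
        in >> kind >> id;
        if (kind == "new") {
            int value = 0;
            in >> value;
            return seen[id] = std::make_shared<int>(value);
        }
        auto it = seen.find(id);
        if (it == seen.end()) throw std::runtime_error("dangling reference");
        return it->second;
    }

    // Naive skip: drop the next n whitespace-separated tokens unread.
    void skip_tokens(int n) {
        std::string tok;
        while (n-- > 0) in >> tok;
    }
};

// True if loading the second occurrence fails after the first was skipped.
bool skipping_first_occurrence_dangles() {
    TrackingReader r("new 7 42 ref 7");
    r.skip_tokens(3);  // pretend "new 7 42" lived inside the skipped Y
    try {
        r.load_pointer();
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

Read in order, both occurrences resolve to the same object; with the first one skipped, the `ref` has nothing to point at.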

Joel de Guzman

5:55 a.m.

Kim Barrett wrote:

...

At 9:13 AM +0800 7/10/06, Joel de Guzman wrote:

...
Say with version 1, I want to serialize X Y and Z. These are hierarchical and expensive data structures. Now, with version 2, I do not need to create Y objects anymore. Yet, I still have to open version 1 files for backward compatibility. So, when loading version 1 files, I want to skip reading Y objects. How do I do that without having to create a temporary Y object that will be discarded later?

(I'm particularly interested with XML archives. If I had control over the xml parser, that would simply mean ignoring everything from the current tag until the next matching tag is found).

I think such a simple approach will not work in general. Consider the case where

1. your Y contains a pointer to P
2. pointer tracking is enabled for P's type
3. the occurrence of P in Y is the first occurrence within the archive
4. there are later occurrences of P in the archive after Y

If you simply discard everything involved in constructing Y, you will discard the information needed to construct and record P, and those later occurrences will be left dangling.

Don't worry. I know what I am doing :) Trust the programmer ;) Regards, -- Joel de Guzman http://www.boost-consulting.com http://spirit.sf.net

Robert Ramey

4:43 a.m.

Joel de Guzman wrote:

...

Hi,

Here's a use case that I can't seem to find a way to do using boost.serialization.

Say with version 1, I want to serialize X Y and Z. These are hierarchical and expensive data structures. Now, with version 2, I do not need to create Y objects anymore. Yet, I still have to open version 1 files for backward compatibility. So, when loading version 1 files, I want to skip reading Y objects. How do I do that without having to create a temporary Y object that will be discarded later?

Of course this is the easiest way. Remember that after your user loads version 1 if he saves it again it will be saved in version 2 without the extraneous Y. So it will be just a temporary problem except in some unusual cases. For the case whereby a version 1 archive is read again and again you would have to make some special provision to update it.

...

(I'm particularly interested with XML archives. If I had control over the xml parser, that would simply mean ignoring everything from the current tag until the next matching tag is found).

I believe you do have the control you desire. Here's a sketch:

suppose object A includes member variables x, y and z

specialize template for xml input archives and type A

void load(boost::archive::xml_iarchive & ia, A & a, const unsigned version){
    ia & boost::serialization::make_nvp("x", a.x);
    if(version == 1){
        // the next data should look like <y ...>...</y>
        // skip over this data using your favorite xml parser - or use the one
        // that the serialization library uses - built with spirit - note that this
        // is not exposed by the library and is hidden from users as an
        // implementation detail
    }
    ia & boost::serialization::make_nvp("z", a.z);
}

Robert Ramey
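For the "skip from the current tag to its matching end tag" step, a minimal sketch working on an in-memory string might look like the following. `skip_element` is a hypothetical helper, not part of Boost.Serialization, and it assumes well-formed input that starts at a start tag, with no comments, CDATA sections, or `>` characters inside attribute values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Return the index just past the element whose start tag begins at pos.
// Nesting is tracked by counting start and end tags; names are not compared,
// which is enough only for well-formed input (an assumption of this sketch).
// Returns std::string::npos if the element is never closed.
std::size_t skip_element(const std::string& text, std::size_t pos) {
    int depth = 0;
    while (pos < text.size()) {
        std::size_t open = text.find('<', pos);
        if (open == std::string::npos) return std::string::npos;
        std::size_t close = text.find('>', open);
        if (close == std::string::npos) return std::string::npos;
        bool is_end = text[open + 1] == '/';     // </tag>
        bool is_empty = text[close - 1] == '/';  // <tag/>
        if (is_end) --depth;
        else if (!is_empty) ++depth;
        pos = close + 1;
        if (depth == 0) return pos;  // matching end tag (or empty element) consumed
    }
    return std::string::npos;
}
```

Given `<x>1</x><y><p>2</p><q/></y><z>3</z>` and the position of `<y>`, the function returns the position of `<z>`. Wiring something like this into a real XML archive would still require access to the archive's underlying stream, which is the part the library does not expose.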

Joel de Guzman

5:52 a.m.

Robert Ramey wrote:

...

...
(I'm particularly interested with XML archives. If I had control over the xml parser, that would simply mean ignoring everything from the current tag until the next matching tag is found).

I believe you do have the control you desire. Here's a sketch:

suppose object A includes member variables x, y and z

specialize template for xml input archives and type A

void load(boost::archive::xml_iarchive & ia, A & a, const unsigned version){
    ia & boost::serialization::make_nvp("x", a.x);
    if(version == 1){
        // the next data should look like <y ...>...</y>
        // skip over this data using your favorite xml parser - or use the one
        // that the serialization library uses - built with spirit - note that this
        // is not exposed by the library and is hidden from users as an
        // implementation detail
    }
    ia & boost::serialization::make_nvp("z", a.z);
}

You didn't answer my question. What I am asking is what do I write in the commented out portion of the code using only Boost.Serialization interface. I do not want to do any undocumented back-door tricks. I want a way to skip reading some data regardless of what the archive type is. Regards, -- Joel de Guzman http://www.boost-consulting.com http://spirit.sf.net

Robert Ramey

6:03 a.m.

Joel de Guzman wrote:

...

You didn't answer my question. What I am asking is what do I write in the commented out portion of the code using only Boost.Serialization interface. I do not want to do any undocumented back-door tricks. I want a way to skip reading some data regardless of what the archive type is.

I don't see a way to do this except in some special cases like xml. Even then it would only work for some cases like tracked pointer data, etc. The basic problem is that Y serializes itself and only Y "knows" its size. So really only Y has the information to know how much to skip - but the problem presupposes that we're not going to have a Y. So I don't think there is any way to do this.

Robert Ramey

Peter Dimov

1:15 p.m.

Joel de Guzman wrote:

...

Hi,

Here's a use case that I can't seem to find a way to do using boost.serialization.

Say with version 1, I want to serialize X Y and Z. These are hierarchical and expensive data structures. Now, with version 2, I do not need to create Y objects anymore. Yet, I still have to open version 1 files for backward compatibility. So, when loading version 1 files, I want to skip reading Y objects. How do I do that without having to create a temporary Y object that will be discarded later?

You need to define a proxy Y object that can deserialize itself into thin air.

Joel de Guzman

2:07 p.m.

Peter Dimov wrote:

...

Joel de Guzman wrote:

...
Hi,

Here's a use case that I can't seem to find a way to do using boost.serialization.

Say with version 1, I want to serialize X Y and Z. These are hierarchical and expensive data structures. Now, with version 2, I do not need to create Y objects anymore. Yet, I still have to open version 1 files for backward compatibility. So, when loading version 1 files, I want to skip reading Y objects. How do I do that without having to create a temporary Y object that will be discarded later?

You need to define a proxy Y object that can deserialize itself into thin air.

I thought about that. But that would mean that blank_proxy<Y> knows all about Y -- its composition down to the leaves -- and wrap them all as blank_proxy<T>s. A hairy proposition, IMO. Regards, -- Joel de Guzman http://www.boost-consulting.com http://spirit.sf.net

Peter Dimov

2:32 p.m.

Joel de Guzman wrote:

...

Peter Dimov wrote:

...
Joel de Guzman wrote:

...
Hi,

Here's a use case that I can't seem to find a way to do using boost.serialization.

Say with version 1, I want to serialize X Y and Z. These are hierarchical and expensive data structures. Now, with version 2, I do not need to create Y objects anymore. Yet, I still have to open version 1 files for backward compatibility. So, when loading version 1 files, I want to skip reading Y objects. How do I do that without having to create a temporary Y object that will be discarded later?

You need to define a proxy Y object that can deserialize itself into thin air.

I thought about that. But that would mean that blank_proxy<Y> knows all about Y -- its composition down to the leaves -- and wrap them all as blank_proxy<T>s. A hairy proposition, IMO.

This is the only way that works reliably with any archive. :-) (And, since the serialization library does not document its external format, the only way that works reliably with it.) The option of actually reading a temporary Y and destroying it afterwards is also feasible. You don't necessarily have to optimize the new version so that it reads v1 files faster than v1 itself. :-)
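A minimal sketch of the proxy idea, using a toy whitespace-separated text archive rather than a real Boost archive. `ToyIArchive`, `YProxy`, and the assumed version 1 layout of Y (`<name> <count> <count doubles>`) are all hypothetical; the sketch only shows that the proxy must mirror Y's layout exactly while keeping nothing it reads.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Toy text input archive standing in for a real one (hypothetical).
struct ToyIArchive {
    std::istringstream in;
    explicit ToyIArchive(const std::string& s) : in(s) {}
    template <class T>
    ToyIArchive& operator>>(T& t) { in >> t; return *this; }
};

// Assumed layout that version 1 wrote for Y: <name> <count> <count doubles>.
// YProxy walks that layout but discards every value.
struct YProxy {
    void load(ToyIArchive& ar) {
        std::string name;
        std::size_t count = 0;
        ar >> name >> count;  // count still steers the loop below
        for (std::size_t i = 0; i < count; ++i) {
            double discarded = 0.0;
            ar >> discarded;
        }
    }
};

struct A {
    int x = 0;
    int z = 0;
    void load(ToyIArchive& ar, unsigned version) {
        ar >> x;
        if (version == 1) {
            YProxy skip;  // consume Y's data without building a Y
            skip.load(ar);
        }
        ar >> z;
    }
};
```

The cost Joel points out is visible even here: `YProxy` has to be kept in step with everything Y ever wrote, down to the leaves.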

Rene Rivera

2:45 p.m.

Peter Dimov wrote:

...

Joel de Guzman wrote:

...
Peter Dimov wrote:

...
...
Hi,

Here's a use case that I can't seem to find a way to do using boost.serialization.

Say with version 1, I want to serialize X Y and Z. These are hierarchical and expensive data structures. Now, with version 2, I do not need to create Y objects anymore. Yet, I still have to open version 1 files for backward compatibility. So, when loading version 1 files, I want to skip reading Y objects. How do I do that without having to create a temporary Y object that will be discarded later?

You need to define a proxy Y object that can deserialize itself into thin air.

Joel de Guzman wrote:

I thought about that. But that would mean that blank_proxy<Y> knows all about Y -- its composition down to the leaves -- and wrap them all as blank_proxy<T>s. A hairy proposition, IMO.

This is the only way that works reliably with any archive. :-) (And, since the serialization library does not document its external format, the only way that works reliably with it.)

The option of actually reading a temporary Y and destroying it afterwards is also feasible. You don't necessarily have to optimize the new version so that it reads v1 files faster than v1 itself. :-)

It's not a matter of optimization. It could be that the v1 Y object is just not present at all in the v2 application. And for that matter that if you could have it present loading it would cause unwanted side effects. -- -- Grafik - Don't Assume Anything -- Redshift Software, Inc. - http://redshift-software.com -- rrivera/acm.org - grafik/redshift-software.com -- 102708583/icq - grafikrobot/aim - grafikrobot/yahoo

Joel de Guzman

2:58 p.m.

Peter Dimov wrote:

...

Joel de Guzman wrote:

...
Peter Dimov wrote:

...
...
Hi,

Here's a use case that I can't seem to find a way to do using boost.serialization.

Say with version 1, I want to serialize X Y and Z. These are hierarchical and expensive data structures. Now, with version 2, I do not need to create Y objects anymore. Yet, I still have to open version 1 files for backward compatibility. So, when loading version 1 files, I want to skip reading Y objects. How do I do that without having to create a temporary Y object that will be discarded later?

You need to define a proxy Y object that can deserialize itself into thin air.

Joel de Guzman wrote:

I thought about that. But that would mean that blank_proxy<Y> knows all about Y -- its composition down to the leaves -- and wrap them all as blank_proxy<T>s. A hairy proposition, IMO.

This is the only way that works reliably with any archive. :-) (And, since the serialization library does not document its external format, the only way that works reliably with it.)

The option of actually reading a temporary Y and destroying it afterwards is also feasible. You don't necessarily have to optimize the new version so that it reads v1 files faster than v1 itself. :-)

There might be a possibility that Y is already obsolete at version 2. Sure you can "emulate" it through the proxy thing, but then again that's less than ideal. I simply want to skip some data and I can't do it with Boost.Serialization. I can do it with simple streams, for example, by prepending the data with a length (in bytes). I can also do it with XML by ignoring everything in between the current tag and its matching end tag. Rene (Rivera) noted that some archive types also allow it. For example, IFF (http://en.wikipedia.org/wiki/Interchange_File_Format) allows it through chunking. "Because the spec includes explicit lengths for each chunk, it is possible for a parser to skip over chunks which it either can't or doesn't care to process." IMO, this is a valid use case that Boost.Serialization can address, at least, for archive forms that allow it. Perhaps a SkippableArchive concept? IMO, the ability to skip is a prerequisite for transparent versioning. Regards, -- Joel de Guzman http://www.boost-consulting.com http://spirit.sf.net
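The "prepend the data with a length" approach for plain streams can be sketched as follows. The function names and the 4-byte little-endian length prefix are assumptions made for illustration; error handling for short reads is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Write a chunk as a 4-byte little-endian length followed by the payload.
void write_chunk(std::ostream& out, const std::string& payload) {
    std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    for (int i = 0; i < 4; ++i)
        out.put(static_cast<char>((n >> (8 * i)) & 0xFF));
    out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
}

// Read the 4-byte length that precedes every chunk.
std::uint32_t read_length(std::istream& in) {
    std::uint32_t n = 0;
    for (int i = 0; i < 4; ++i)
        n |= static_cast<std::uint32_t>(static_cast<unsigned char>(in.get())) << (8 * i);
    return n;
}

// Read a chunk's payload.
std::string read_chunk(std::istream& in) {
    std::string payload(read_length(in), '\0');
    in.read(&payload[0], static_cast<std::streamsize>(payload.size()));
    return payload;
}

// Skip a chunk without interpreting its payload.
void skip_chunk(std::istream& in) {
    in.ignore(static_cast<std::streamsize>(read_length(in)));
}
```

A reader that no longer cares about Y calls `read_chunk` for X, `skip_chunk` for Y, and `read_chunk` for Z, and never needs to know what Y contained.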

Robert Ramey

3:18 p.m.

Joel de Guzman wrote:

...

Peter Dimov wrote:

...
Joel de Guzman wrote:

...
Peter Dimov wrote:

...

There might be a possibility that Y is already obsolete at version 2. Sure you can "emulate" it through the proxy thing, but then again that's less than ideal. I simply want to skip some data and I can't do it with Boost.Serialization. I can do it with simple streams, for example, by prepending the data with a length (in bytes). I can also do it with XML by ignoring everything in between the current tag and its matching end tag. Rene (Rivera) noted that some archive types also allow it. For example, IFF (http://en.wikipedia.org/wiki/Interchange_File_Format) allows it through chunking. "Because the spec includes explicit lengths for each chunk, it is possible for a parser to skip over chunks which it either can't or doesn't care to process."

...

IMO, this is a valid use case that Boost.Serialization can address, at least, for archive forms that allow it. Perhaps a SkippableArchive concept? IMO, the ability to skip is a prerequisite for transparent versioning.

It would be feasible to make an archive type which would support this. (I believe xml archives could do it now.) But that would require extra information in the archive that would otherwise be unnecessary. Naturally this would provoke howls of protest were it to be included in every archive. So we're back to a "special purpose" archive or archive adaptor which adds the extra information to every archive. Of course this wouldn't help with version 1 archives which presumably have already been written.

...

...
This is the only way that works reliably with any archive. :-) (And, since the serialization library does not document its external format, the only way that works reliably with it.)

The internal format of an archive is not documented - in fact it's not even defined by the archive concept. This is not an oversight - it was a deliberate decision on my part specifically to permit extension to cases like this one. Here - less IS more.

Robert Ramey

Joel de Guzman

3:32 p.m.

Robert Ramey wrote:

...

Joel de Guzman wrote:

...
Peter Dimov wrote:

...
Joel de Guzman wrote:

...
Peter Dimov wrote:

...
There might be a possibility that Y is already obsolete at version 2. Sure you can "emulate" it through the proxy thing, but then again that's less than ideal. I simply want to skip some data and I can't do it with Boost.Serialization. I can do it with simple streams, for example, by prepending the data with a length (in bytes). I can also do it with XML by ignoring everything in between the current tag and its matching end tag. Rene (Rivera) noted that some archive types also allow it. For example, IFF (http://en.wikipedia.org/wiki/Interchange_File_Format) allows it through chunking. "Because the spec includes explicit lengths for each chunk, it is possible for a parser to skip over chunks which it either can't or doesn't care to process."

...
IMO, this is a valid use case that Boost.Serialization can address, at least, for archive forms that allow it. Perhaps a SkippableArchive concept? IMO, the ability to skip is a prerequisite for transparent versioning.

It would be feasible to make an archive type which would support this. (I believe xml archives could do it now.) But that would require extra information in the archive that would otherwise be unnecessary. Naturally this would provoke howls of protest were it to be included in every archive. So we're back to a "special purpose" archive or archive adaptor which adds the extra information to every archive. Of course this wouldn't help with version 1 archives which presumably have already been written.

No, the "imaginary" version 1 file has not been written yet. I am programming in the future tense :) I also believe that the XML archive format *can* do it now, which is why I am asking ;) And, as I mentioned, I'm particularly interested with XML; but I do not want to subvert the Boost.Serialization interface. Regards, -- Joel de Guzman http://www.boost-consulting.com http://spirit.sf.net

Peter Dimov

3:58 p.m.

Joel de Guzman wrote:

...

Peter Dimov wrote:

...

...
This is the only way that works reliably with any archive. :-) (And, since the serialization library does not document its external format, the only way that works reliably with it.)

The option of actually reading a temporary Y and destroying it afterwards is also feasible. You don't necessarily have to optimize the new version so that it reads v1 files faster than v1 itself. :-)

There might be a possibility that Y is already obsolete at version 2.

By basing an external file format on Y's internal structure, you've ensured that Y never becomes obsolete. :-)

...

Sure you can "emulate" it through the proxy thing, but then again that's less than ideal. I simply want to skip some data and I can't do it with Boost.Serialization. I can do it with simple streams, for example, by prepending the data with a length (in bytes). I can also do it with XML by ignoring everything in between the current tag and its matching end tag. Rene (Rivera) noted that some archive types also allow it. For example, IFF (http://en.wikipedia.org/wiki/Interchange_File_Format) allows it through chunking. "Because the spec includes explicit lengths for each chunk, it is possible for a parser to skip over chunks which it either can't or doesn't care to process."

You are right that when the file format has a "shell" structure that encompasses the serialization format, it is possible to skip over parts of the stream. To write IFF, however, you either need a seekable stream, or you need to serialize the chunk contents into a temporary archive in order to obtain its size. I'm not sure whether the current serialization framework can support something like that.
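The two options for learning a chunk's size at write time can be sketched like this. `Writer`, `put_u32`, and both `write_chunk_*` functions are hypothetical names, and the 4-byte little-endian length is an assumption of the sketch rather than the actual IFF encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <ostream>
#include <sstream>
#include <string>

using Writer = std::function<void(std::ostream&)>;

void put_u32(std::ostream& out, std::uint32_t n) {
    for (int i = 0; i < 4; ++i)
        out.put(static_cast<char>((n >> (8 * i)) & 0xFF));
}

// Option 1: buffer the body in a temporary stream to learn its size.
void write_chunk_buffered(std::ostream& out, const Writer& body) {
    std::ostringstream tmp;
    body(tmp);
    const std::string bytes = tmp.str();
    put_u32(out, static_cast<std::uint32_t>(bytes.size()));
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
}

// Option 2: requires a seekable stream. Write a placeholder length,
// write the body, then seek back and patch the real length in.
void write_chunk_seekable(std::ostream& out, const Writer& body) {
    const std::ostream::pos_type len_pos = out.tellp();
    put_u32(out, 0);
    const std::ostream::pos_type start = out.tellp();
    body(out);
    const std::ostream::pos_type end = out.tellp();
    out.seekp(len_pos);
    put_u32(out, static_cast<std::uint32_t>(end - start));
    out.seekp(end);
}
```

Both produce the same bytes; the first costs a temporary copy of the body, the second only works when the output can seek.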

...

IMO, this is a valid use case that Boost.Serialization can address, at least, for archive forms that allow it. Perhaps a SkippableArchive concept?

This could be possible if you don't mind constraining your class to only support SkippableArchives.

...

IMO, the ability to skip is a prerequisite for transparent versioning.

Transparent versioning via skipping? I can't help but wonder: if some parts of your v1 format are so completely redundant as to allow you to reconstruct the original data even if you skip them, what was the point of writing them in the first place?

Joel de Guzman

11:02 p.m.

Peter Dimov wrote:

...

...
...
The option of actually reading a temporary Y and destroying it afterwards is also feasible. You don't necessarily have to optimize the new version so that it reads v1 files faster than v1 itself. :-)

There might be a possibility that Y is already obsolete at version 2.

By basing an external file format on Y's internal structure, you've ensured that Y never becomes obsolete. :-)

Unless, you allow for skippable archives, like IFF does. In effect, what you do is provide a nullable abstraction to your data structure. Skippable archives allow data to be obsolete.

...

...
Sure you can "emulate" it through the proxy thing, but then again that's less than ideal. I simply want to skip some data and I can't do it with Boost.Serialization. I can do it with simple streams, for example, by prepending the data with a length (in bytes). I can also do it with XML by ignoring everything in between the current tag and its matching end tag. Rene (Rivera) noted that some archive types also allow it. For example, IFF (http://en.wikipedia.org/wiki/Interchange_File_Format) allows it through chunking. "Because the spec includes explicit lengths for each chunk, it is possible for a parser to skip over chunks which it either can't or doesn't care to process."

You are right that when the file format has a "shell" structure that encompasses the serialization format, it is possible to skip over parts of the stream. To write IFF, however, you either need a seekable stream, or you need to serialize the chunk contents into a temporary archive in order to obtain its size. I'm not sure whether the current serialization framework can support something like that.

Yeah, that's the problem.

...

I can't help but wonder: if some parts of your v1 format are so completely redundant as to allow you to reconstruct the original data even if you skip them, what was the point of writing them in the first place?

Here's one scenario (I'm sure there are others): Your class uses a 3rd party library called Y. Later, you decide to replace it with a better engine using a library called Z. Regards, -- Joel de Guzman http://www.boost-consulting.com http://spirit.sf.net

Peter Dimov

11:27 p.m.

Joel de Guzman wrote:

...

Peter Dimov wrote:

...
I can't help but wonder: if some parts of your v1 format are so completely redundant as to allow you to reconstruct the original data even if you skip them, what was the point of writing them in the first place?

Here's one scenario (I'm sure there are others): Your class uses a 3rd party library called Y. Later, you decide to replace it with a better engine using a library called Z.

Still not getting it. Do you have an example?

Rene Rivera

11:36 p.m.

Peter Dimov wrote:

...

Joel de Guzman wrote:

...
Peter Dimov wrote:

...
I can't help but wonder: if some parts of your v1 format are so completely redundant as to allow you to reconstruct the original data even if you skip them, what was the point of writing them in the first place?

Here's one scenario (I'm sure there are others): Your class uses a 3rd party library called Y. Later, you decide to replace it with a better engine using a library called Z.

Still not getting it. Do you have an example?

Another use case... If you make two versions of a product, one a demo/trial and another the full release. One person makes a data file in the release version and sends it to a friend who has the demo. There happens to be a feature not available in the demo which has an impact on the data format. -- -- Grafik - Don't Assume Anything -- Redshift Software, Inc. - http://redshift-software.com -- rrivera/acm.org - grafik/redshift-software.com -- 102708583/icq - grafikrobot/aim - grafikrobot/yahoo

Peter Dimov

11:52 p.m.

Rene Rivera wrote:

...

Peter Dimov wrote:

...
Joel de Guzman wrote:

...
Peter Dimov wrote:

...
I can't help but wonder: if some parts of your v1 format are so completely redundant as to allow you to reconstruct the original data even if you skip them, what was the point of writing them in the first place?

Here's one scenario (I'm sure there are others): Your class uses a 3rd party library called Y. Later, you decide to replace it with a better engine using a library called Z.

Still not getting it. Do you have an example?

Another use case... If you make two versions of a product, one a demo/trial and another the full release. One person makes a data file in the release version and sends it to a friend who has the demo. There happens to be a feature not available in the demo which has an impact on the data format.

Yeah, this is actually an example of forward compatibility, a well-known use case for skippable chunks. In your case the demo version is v1 (it only understands text) and the release version is v2 (which also understands embedded images, say). My question was about backward compatibility, since this is what the serialization library versioning supports. It doesn't make much sense for v2 to suddenly start ignoring the embedded images saved with v1. It may start writing them in a PNG format instead of RLE, but when importing v1 it would still need to read the RLE image.

Joel de Guzman

11 Jul

1:42 p.m.

Peter Dimov wrote:

...

Rene Rivera wrote:

...
Peter Dimov wrote:

...
Joel de Guzman wrote:

...
Peter Dimov wrote:

...
I can't help but wonder: if some parts of your v1 format are so completely redundant as to allow you to reconstruct the original data even if you skip them, what was the point of writing them in the first place?

Here's one scenario (I'm sure there are others): Your class uses a 3rd party library called Y. Later, you decide to replace it with a better engine using a library called Z.

Still not getting it. Do you have an example?

Another use case... If you make two versions of a product, one a demo/trial and another the full release. One person makes a data file in the release version and sends it to a friend who has the demo. There happens to be a feature not available in the demo which has an impact on the data format.

Yeah, this is actually an example of forward compatibility, a well-known use case for skippable chunks. In your case the demo version is v1 (it only understands text) and the release version is v2 (which also understands embedded images, say). My question was about backward compatibility, since this is what the serialization library versioning supports. It doesn't make much sense for v2 to suddenly start ignoring the embedded images saved with v1. It may start writing them in a PNG format instead of RLE, but when importing v1 it would still need to read the RLE image.

Right. Well, I'm glad Rene came out with another use case. Thinking back now, it's not really about obsolescence. It's more about allowing for optional data. Thanks for the insights, Peter!

Regards,
--
Joel de Guzman
http://www.boost-consulting.com
http://spirit.sf.net

Joel de Guzman

10 Jul

11:49 p.m.

Peter Dimov wrote:

...

Joel de Guzman wrote:

...
Peter Dimov wrote:

...
I can't help but wonder: if some parts of your v1 format are so completely redundant as to allow you to reconstruct the original data even if you skip them, what was the point of writing them in the first place?

Here's one scenario (I'm sure there are others): Your class uses a 3rd party library called Y. Later, you decide to replace it with a better engine using a library called Z.

Still not getting it. Do you have an example?

That was the example ;)... Ok let me see if I can clarify... With version 1 you have a class A with this structure:

class A { W w; X x; Y rep; };

Now with version 2, you want to use a new engine to replace Y:

class A { W w; X x; Z rep; };

Z is so different from Y that it does not need any of its data. It is also plausible that W,X,Y and Z get some data from other external sources. So, in effect, Z starts with a default data and gets filled as the app runs.

Regards,
--
Joel de Guzman
http://www.boost-consulting.com
http://spirit.sf.net

Peter Dimov

11 Jul

midnight

Joel de Guzman wrote:

...

Peter Dimov wrote:

...
Joel de Guzman wrote:

...
Peter Dimov wrote:

...
I can't help but wonder: if some parts of your v1 format are so completely redundant as to allow you to reconstruct the original data even if you skip them, what was the point of writing them in the first place?

Here's one scenario (I'm sure there are others): Your class uses a 3rd party library called Y. Later, you decide to replace it with a better engine using a library called Z.

Still not getting it. Do you have an example?

That was the example ;)... Ok let me see if I can clarify... With version 1 you have a class A with this structure:

class A { W w; X x; Y rep; };

Now with version 2, you want to use a new engine to replace Y:

class A { W w; X x; Z rep; };

Z is so different from Y that it does not need any of its data. It is also plausible that W,X,Y and Z get some data from other external sources. So, in effect, Z starts with a default data and gets filled as the app runs.

Right. What I was saying is that if the pair (w, x) is all you need to reconstruct an A, then there is no need to store the Y part (except short-term convenience).

Joel de Guzman

1:38 p.m.

Peter Dimov wrote:

...

Joel de Guzman wrote:

...
Peter Dimov wrote:

...
Joel de Guzman wrote:

...
Peter Dimov wrote:

...
I can't help but wonder: if some parts of your v1 format are so completely redundant as to allow you to reconstruct the original data even if you skip them, what was the point of writing them in the first place?

Here's one scenario (I'm sure there are others): Your class uses a 3rd party library called Y. Later, you decide to replace it with a better engine using a library called Z.

Still not getting it. Do you have an example?

That was the example ;)... Ok let me see if I can clarify... With version 1 you have a class A with this structure:

class A { W w; X x; Y rep; };

Now with version 2, you want to use a new engine to replace Y:

class A { W w; X x; Z rep; };

Z is so different from Y that it does not need any of its data. It is also plausible that W,X,Y and Z get some data from other external sources. So, in effect, Z starts with a default data and gets filled as the app runs.

Right. What I was saying is that if the pair (w, x) is all you need to reconstruct an A, then there is no need to store the Y part (except short-term convenience).

Good point! Maybe I ought to review the situation. It might be that what's actually needed in this case is a converter from Y to Z. Regards, -- Joel de Guzman http://www.boost-consulting.com http://spirit.sf.net

Robert Ramey

10 Jul

3:08 p.m.

Joel de Guzman wrote:

...

Peter Dimov wrote:

...

...
You need to define a proxy Y object that can deserialize itself into thin air.

Very clever - I'm embarrassed I didn't think about that.

...

I thought about that. But that would mean that blank_proxy<Y> knows all about Y -- its composition down to the leaves -- and wrap them all as blank_proxy<T>s. A hairy proposition, IMO.

Hmm - sounds like one needs proxy objects for all of Y's members and so on down to the primitive leaves.

And here's another idea.

create an archive adaptor which takes any archive and replaces its loading of primitives with something that just throws them away.

so at some point one would try

template<class Archive>
void load(Archive & ar, A & a, const unsigned version){
    ar >> a.x;
    if(version == 1){
        dummy_archive<Archive> da(ar);
        Y y; // scratch object: da throws away every primitive loaded into it
        da >> y;
    }
    ar >> a.z;
}

I haven't written documentation on archive adaptors but polymorphic_?archives are examples of such a thing.

Robert Ramey
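The adaptor idea can be sketched with toy types. `ToyIArchive` and `DummyArchive` are hypothetical stand-ins, not Boost classes, and unlike a real archive the toy does not dispatch `da >> y` to the class's own loading code, so `y.load(da)` is called directly.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Toy input archive (hypothetical stand-in for a real archive class).
struct ToyIArchive {
    std::istringstream in;
    explicit ToyIArchive(const std::string& s) : in(s) {}
    template <class T>
    ToyIArchive& operator>>(T& t) { in >> t; return *this; }
};

// Adaptor: same interface, but every primitive is read into a scratch
// value and dropped, so the target object is never modified.
template <class Archive>
struct DummyArchive {
    Archive& inner;
    explicit DummyArchive(Archive& ar) : inner(ar) {}
    template <class T>
    DummyArchive& operator>>(T&) {
        T scratch{};
        inner >> scratch;
        return *this;
    }
};

struct Y {
    int id = -1;
    double weight = -1.0;
    template <class Archive>
    void load(Archive& ar) { ar >> id >> weight; }
};

struct A {
    int x = 0;
    int z = 0;
    void load(ToyIArchive& ar, unsigned version) {
        ar >> x;
        if (version == 1) {
            DummyArchive<ToyIArchive> da(ar);
            Y y;         // still needed to drive the traversal
            y.load(da);  // consumes Y's data; y keeps its defaults
        }
        ar >> z;
    }
};
```

One caveat of discarding every primitive: values that steer loading, such as an element count that bounds a loop, would presumably still have to reach the loading code, so a real adaptor would need to treat them differently.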

Joel de Guzman

3:20 p.m.

Robert Ramey wrote:

...

And here's another idea.

create an archive adaptor which takes any archive and replaces its loading of primitives to something that just throws them away.

so at some point one would try

template<class Archive>
void load(Archive & ar, A & a, const unsigned version){
    ar >> a.x;
    if(version == 1){
        dummy_archive<Archive> da(ar);
        Y y; // scratch object: da throws away every primitive loaded into it
        da >> y;
    }
    ar >> a.z;
}

I haven't written documentation on archive adaptors but polymorphic_?archives are examples of such a thing.

Clever! But again, what if Y is already obsolete at version 2? Regards, -- Joel de Guzman http://www.boost-consulting.com http://spirit.sf.net

Robert Ramey

5:30 p.m.

Joel de Guzman wrote:

...

Robert Ramey wrote:

...
And here's another idea.

create an archive adaptor which takes any archive and replaces its loading of primitives to something that just throws them away.

so at some point one would try

template<class Archive>
void load(Archive & ar, A & a, const unsigned version){
    ar >> a.x;
    if(version == 1){
        dummy_archive<Archive> da(ar);
        Y y; // scratch object: da throws away every primitive loaded into it
        da >> y;
    }
    ar >> a.z;
}

I haven't written documentation on archive adaptors but polymorphic_?archives are examples of such a thing.

Clever! But again, what if Y is already obsolete at version 2?

LOL - once an object of class Y is stored in an archive and kept around - by definition - it can't be obsolete. Obsolete suggests it will never be used - archiving suggests it will be.

One might work around this with something like:

void save(Archive & ar, const A & a, const unsigned version){
    assert(1 == version);
    ar << a.x;
    ar << sizeof(a.y);
    ar << a.y;
    ar << a.z;
}

void load(Archive & ar, A & a, const unsigned version){
    std::size_t s;
    ar >> a.x;
    ar >> s;
    if(1 == version){
        // skip over stream s bytes
    }
    ar >> a.z;
}

Of course the current archive concept doesn't include the stream and archives aren't guaranteed to have them. So the skip operation would have to be added either to the archive itself or by making an archive adaptor that could be applied to any archive with a stream.

Robert Ramey

Joel de Guzman

11:24 p.m.

Robert Ramey wrote:

...

...
Clever! But again, what if Y is already obsolete at version 2?

LOL - once an object of class Y is stored in an archive and kept around - by definition - it can't be obsolete. Obsolete suggests it will never be used - archiving suggests it will be.

That's one way to look at it. Your archive just adds and adds and accumulates data over time. Yet, if you think outside the box, a skippable archive allows you to subtract data and flag some data as obsolete. No one is ever so brilliant as to plan from version 1 what happens with version 100 :) If adding is a possibility, I do not see why subtracting is not. Regards, -- Joel de Guzman http://www.boost-consulting.com http://spirit.sf.net

participants (5)

Joel de Guzman
Kim Barrett
Peter Dimov
Rene Rivera
Robert Ramey