[serialization] Regressions?

Vladimir Prus

6 May 2005

9:54 a.m.

Hi Robert,

I've just updated my Boost tree and run into a number of compile errors.

One is pretty easy -- I've committed the attached trivial patch to split_member.hpp.

Another is this:

    ....boost/boost/archive/detail/oserializer.hpp:551: error: incomplete type `boost::STATIC_ASSERTION_FAILURE<false>' does not have member `value'

instantiated from this code of mine:

    ofstream ofs(av[2]);
    boost::archive::binary_oarchive oa(ofs);
    oa << *data_builder.finish();

where 'finish' returns auto_ptr<Data>. It looks like serialization checks whether the serialized type is 'const' and, if not, complains. Basically, it's some heuristic to prevent saving an on-stack object (though I don't understand why it would work at all).

I find this a bit too much. I have no reason whatsoever to make 'finish' return auto_ptr<const Data>. And writing

    oa << const_cast<const Data&>(*data_builder.finish());

looks very strange. Modifying all the places where I get this error is not nice either. So, can this static assert be removed?

Then, I get an ambiguity in iserializer.hpp, in this code:

    #ifndef BOOST_NO_FUNCTION_TEMPLATE_ORDERING
    template<class Archive, class T>
    inline void load(Archive &ar, const serialization::nvp<T> &t){
        load(ar, const_cast<serialization::nvp<T> &>(t));
    }

For some reason, both boost::archive::load and some other 'load' in the 'boost' namespace (part of an earlier serialization lib that I still use) are considered overload candidates. Adding an explicit boost::archive:: fixes this. See attached patch.

Then I get an error at this code:

    ar << boost::lexical_cast<std::string>(*this);

The error message is:

    error: no match for 'operator<<' in 'ar << boost::lexical_cast(Source) [with Target = std::basic_string<char, std::char_traits<char>, std::allocator<char> >, Source = lvk::nm_model::BlockFormula]()'
    /space/NM/boost/boost/archive/detail/interface_oarchive.hpp:75: error: candidates are: Archive& boost::archive::detail::interface_oarchive<Archive>::operator<<(T&) [with T = std::basic_string<char, std::char_traits<char>, std::allocator<char> >, Archive = boost::archive::binary_oarchive]
    ...unrelated operator<< definitions snipped...

Apparently, passing an rvalue to "T&" does not work. Yet another attached patch fixes this issue.

- Volodya

Attachments:

split_member.diff (text/x-diff — 607 bytes)
iserializer.diff (text/x-diff — 874 bytes)
interface_oarchive.diff (text/x-diff — 557 bytes)


Robert Ramey

6 May

4:35 p.m.

Vladimir Prus wrote:

...

...
Hi Robert, I've just updated by Boost tree and run in a number of compile errors.

One is pretty easy -- I've committed the attached trivial patch to split_member.hpp.

Another is this:

....boost/boost/archive/detail/oserializer.hpp:551: error: incomplete type `boost::STATIC_ASSERTION_FAILURE<false>' does not have member `value'

instantiated from this code of mine:

ofstream ofs(av[2]); boost::archive::binary_oarchive oa(ofs); oa << *data_builder.finish();

This code should be considered erroneous. The documentation in 1.32 addressed this but unfortunately the enforcement was lost in the shuffle.

The intention is to trap the saving of tracked non-const objects. This is to prevent users from doing something like

    for (...) {
        A a;
        ...
        ar << a; // will save the address of a - which is on the stack
    }

If a is tracked here, instances of a after the first will be stored only as reference ids. When the data is restored, all the a's will be the same. Not what the programmer intended - and a bear to find. This is really the save counterpart to the load situation which required the implementation of reset_object_address. Note that the documentation suggests that the above be reformulated as

    for (...)
        ar << a[i];

Enforcing const-ness also has the effect of preventing serialization from altering the state of the object being serialized - another almost impossible to find bug. Remember that when a tracked object is saved more than once, the data is saved only the first time. If the object can be changed during serialization, we have a problem.

Having said that - the & operator doesn't do the const checking. Doing so would inhibit its usage. Also, in spite of much effort, I was unable to make the const checking function to my taste when objects are wrapped in an nvp wrapper.

Also, I had to tweak a number of my tests and demos to make them work with this new rule. However, the tweaking was not difficult. If altering one's code to conform to this rule isn't easy, one should make an effort to understand why. It's possible that a very subtle and hard to understand bug is being introduced.

I hope this explains my reasoning.
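The failure mode described above can be sketched with a toy archive. This is only an illustration under stated assumptions: `toy_oarchive`, `save_via_local_copy` and `save_in_place` are hypothetical names, and the address-keyed set is a simplified model of tracking, not the library's implementation.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Toy model of address-based tracking; an assumption for illustration,
// not Boost.Serialization's actual implementation.
struct toy_oarchive {
    std::set<const void*> seen;   // addresses already written
    std::vector<int> payloads;    // values actually written in full

    void save(const int& value) {
        // A tracked object is written in full only the first time
        // its address is seen; later saves would emit a reference id.
        if (seen.insert(&value).second)
            payloads.push_back(value);
    }
};

// Copies each element into one reused stack variable before saving it.
std::size_t save_via_local_copy(const std::vector<int>& data) {
    toy_oarchive ar;
    int a = 0;                    // a single stack slot shared by all iterations
    for (std::size_t i = 0; i < data.size(); ++i) {
        a = data[i];
        ar.save(a);               // same address every time
    }
    return ar.payloads.size();
}

// Saves the elements in place, so each one has its own address.
std::size_t save_in_place(const std::vector<int>& data) {
    toy_oarchive ar;
    for (std::size_t i = 0; i < data.size(); ++i)
        ar.save(data[i]);
    return ar.payloads.size();
}
```

For a three-element vector, `save_via_local_copy` stores one full payload while `save_in_place` stores three. In this toy model, that is the difference between the two loop formulations.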

...

...
Where 'finish' returns auto_ptr<Data>. It's looks like serialization checks if the serialized type is 'const' and if not, complains. Basically, it's some heuritic to prevent saving on-stack object (though I don't understand why it would work at al).

I find this a bit too much . I have no reason whatsoever to make 'finish' return auto_ptr<const Data>. And writing

oa << const_cast<const Data&>(*data_builder.finish());

This would "fix" it but might not be necessary. I envisioned that the operator << would usually be used in a function template of the following signature save(Archive & ar, const T &t ,... so that normally the issue wouldn't arise. If it does arise, it's sort of a warning to really think about what the code is doing. If the return value of data_builder.finish() can be changed during the serialization process, one is going to have a problem. That is why the STATIC_ASSERT only trips when tracking is enabled for the corresponding datatype.

...

...
looks very strange. And modifying all places where I get this error is not nice too. So, can this static assert be removed?

As I said, if it's in a lot of places one should think about this. If this is truly intolerable and you don't mind driving without seat belts, use the & operator instead.

...

...
Then, I get ambiguity in iserializer.hpp, in this code:

#ifndef BOOST_NO_FUNCTION_TEMPLATE_ORDERING template<class Archive, class T> inline void load(Archive &ar, const serialization::nvp<T> &t){ load(ar, const_cast<serialization::nvp<T> &>(t)); }

For some reason, both boost::archive::load and some other 'load' in 'boost' namespace (part of earlier serialization lib) that I still use are considered overload candidates. Adding explicit boost::archive:: fixes this. See attached patch.

Then I get error at this code:

ar << boost::lexical_cast<std::string>(*this);

The error message is:

error: no match for 'operator<<' in 'ar << boost::lexical_cast(Source) [with Target = std::basic_string<char, std::char_traits<char>, std::allocator<char> >, Source = lvk::nm_model::BlockFormula]()' /space/NM/boost/boost/archive/detail/interface_oarchive.hpp:75: error: candidates are: Archive&

boost::archive::detail::interface_oarchive<Archive>::operator<<(T&) [with T = std::basic_string<char, std::char_traits<char>, std::allocator<char> >, Archive = boost::archive::binary_oarchive] ...unrelated operator<< definitions snipped...

Apparently, passing rvalue to "T&" does not work. Yes another attached patch fixes this issue.

This is the same issue as before: passing a non-const tracked type to the archive << operator.

Robert Ramey

Vladimir Prus

10 May

7:36 a.m.

Robert Ramey wrote:

...

...
...
instantiated from this code of mine:

ofstream ofs(av[2]); boost::archive::binary_oarchive oa(ofs); oa << *data_builder.finish();

This code should be considered erroneous. The documentation in 1.32 addressed this but unfortunately the enforcement was lost in the shuffle.

Excuse me, but I refuse to accept that this code is erroneous. As I've said, this is my existing design -- the 'finish' function returns non-const pointer. The fact that somewhere else, I serialize the result of 'finish' has nothing to do with 'finish' return type, and I should not be forced to change my existing interface just because I use serialization somewhere.

...

The intention is to trap the saving of tracked non-const objects. This is to prevent users from doing something like For(... A a; ... ar << a; // will save the address of a - which is on the stack

If a is tracked here, instances of a after the first will be stored only as reference ids. When the data is restored, all the as will be the same. Not what the programmer intended - and a bear to find. This is really the save counterpart to the load situation which required the implementation of reset_object_address.

So, you're saying that "const" == "not allocated on stack"? I don't see why this statement is true. I can just as well do this:

    void foo(const A& a) { ar << a; }

and circumvent your protection. Further, how often is a non-pointer object tracked? I think it's a rare case, while saving a pointer is a common case, and making the common case inconvenient for the sake of the uncommon one does not seem right.
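The circumvention mentioned here can be sketched as follows. `checked_archive` is a hypothetical stand-in that rejects non-const arguments through `std::enable_if` rather than through the static assert discussed in the thread. It only models the idea of a const check.

```cpp
#include <type_traits>

// Hypothetical archive that accepts only const-qualified objects.
// The real library uses a static assert; enable_if is a substitute here.
struct checked_archive {
    int saved = 0;

    template <class T,
              class = typename std::enable_if<std::is_const<T>::value>::type>
    checked_archive& operator<<(T& t) {
        (void)t;
        ++saved;
        return *this;
    }
};

struct A { int value; };

// The parameter is a const reference, so the check passes
// regardless of where the referenced object lives.
void foo(checked_archive& ar, const A& a) { ar << a; }

int save_stack_objects() {
    checked_archive ar;
    for (int i = 0; i < 3; ++i) {
        A a = { i };     // non-const object on the stack
        // ar << a;      // would be rejected: T deduces to non-const A
        foo(ar, a);      // accepted: the object is seen as const A
    }
    return ar.saved;
}
```

`save_stack_objects()` returns 3: all three stack objects get past the check once they are viewed through a const reference.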

...

Note that the documentation suggests that he above be reformulated as For(... ar << a[i]; //

Enforcing const-ness also has the effect of preventing serialization from altering the state of the object being serialized - another almost impossible to find bug.

Can you explain:

1. How can this bug happen?
2. Why is the "watch" command in gdb not a reliable way to catch such bugs?

...

Remember that when a tracked object is saved more than once, only the first time is the data saved. If the object can be changed during serialization, we have a problem.

Having said that - the & operator doesn't do the const checking. Doing so inhibits its usage. Also, in spite of much effort, I was unable to make the const checking function to my taste when objects are wrapped in an nvp wrapper.

Also, I had to tweak a number of my tests and demos to make them work with this new rule. However, the tweaking was not difficult. If altering one's code to conform this rule isn't easy, one should make an effort to understand why. Its possible that a very subtle and hard to understand bug might be being introduced.

Are you saying that my code is buggy?

...

...
...
Where 'finish' returns auto_ptr<Data>. It's looks like serialization checks if the serialized type is 'const' and if not, complains. Basically, it's some heuritic to prevent saving on-stack object (though I don't understand why it would work at al).

I find this a bit too much . I have no reason whatsoever to make 'finish' return auto_ptr<const Data>. And writing

oa << const_cast<const Data&>(*data_builder.finish());

This would "fix" it but might not be necessary. I envisioned that the operator << would usually be used in a function templateof the following signature

save(Archive & ar, const T &t ,...

so that normally the issue wouldn't arise.

In the above case, I'm saving a specific object with a specific type. Using any function template is not an option.

...

If it does arise, its sort a warning to really think about what the code is doing. If the return value of data_builde.finish() can be changed during the serialization process, one is going to have a problem.

How can it be changed? Due to a bug in 'save' for my class type? How often does that happen?

...

That is why the STATIC_ASSERT only trips when tracking is enabled for the corresponding datatype.

...
...
looks very strange. And modifying all places where I get this error is not nice too. So, can this static assert be removed?

As I said, if its in a lot of places one should think about this. If this is truely intolerable and you don't mind driving without seat belts, use the & operator instead.

This is an inferior solution, because operator<< is more logical.

...

...
...
Then, I get ambiguity in iserializer.hpp, in this code:

#ifndef BOOST_NO_FUNCTION_TEMPLATE_ORDERING template<class Archive, class T> inline void load(Archive &ar, const serialization::nvp<T> &t){ load(ar, const_cast<serialization::nvp<T> &>(t)); }

For some reason, both boost::archive::load and some other 'load' in 'boost' namespace (part of earlier serialization lib) that I still use are considered overload candidates. Adding explicit boost::archive:: fixes this. See attached patch.

Then I get error at this code:

ar << boost::lexical_cast<std::string>(*this);

The error message is:

error: no match for 'operator<<' in 'ar << boost::lexical_cast(Source) [with Target = std::basic_string<char, std::char_traits<char>, std::allocator<char> >, Source = lvk::nm_model::BlockFormula]()' /space/NM/boost/boost/archive/detail/interface_oarchive.hpp:75: error: candidates are: Archive&

boost::archive::detail::interface_oarchive<Archive>::operator<<(T&) [with T = std::basic_string<char, std::char_traits<char>, std::allocator<char> >, Archive = boost::archive::binary_oarchive] ...unrelated operator<< definitions snipped...

Apparently, passing rvalue to "T&" does not work. Yes another attached patch fixes this issue.

This is the same issue as before. Passing a non-const tracked type to the archive << operator.

Ok,

1. Why is std::string a 'tracked type'?
2. How do you suggest I fix the above?

- Volodya

Robert Ramey

11 May

6:06 a.m.

Vladimir Prus wrote:

...

Robert Ramey wrote:

...
...
...
instantiated from this code of mine:

ofstream ofs(av[2]); boost::archive::binary_oarchive oa(ofs); oa << *data_builder.finish();

This code should be considered erroneous. The documentation in 1.32 addressed this but unfortunately the enforcement was lost in the shuffle.

Excuse me, but I refuse to accept that this code is erroneous. As I've said, this is my existing design -- the 'finish' function returns non-const pointer. The fact that somewhere else, I serialize the result of 'finish' has nothing to do with 'finish' return type, and I should not be forced to change my existing interface just because I use serialization somewhere.

OK, let me rephrase. This code conflicts with the description of the functioning of the operator << in the documentation.

...

...
The intention is to trap the saving of tracked non-const objects. This is to prevent users from doing something like For(... A a; ... ar << a; // will save the address of a - which is on the stack

If a is tracked here, instances of a after the first will be stored only as reference ids. When the data is restored, all the as will be the same. Not what the programmer intended - and a bear to find. This is really the save counterpart to the load situation which required the implementation of reset_object_address.

So, you're saing that "const" == "no allocated on stack"? I don't see why this statement is true. I can just as well do this:

void foo(const A& a) { ar << a; }

and circumvent your protection.

The intention is that const indicates that the process of serialization will not change the object being serialized. Serializing an object that cannot easily be passed as a const is quite possibly an error. Of course, the contrary is not necessarily true. That isan object can be passed as const reference and still be modified during the process of serialization. So this is not bullet proof - but I believe it is very helpful.

...

Further, how often is it that non-pointer object is tracked? I think it's rare case, while saving a pointer is a common case, and making common case inconvenient for the sake of non-common case does not seem right.

It can happen more often than you might think. It's easy to serialize an object and sometime later serialize a pointer to the same object. In order to avoid creating duplicates, all the instances of an object - not just the pointers - have to be tracked if a pointer to that class of object is used even once. (As an aside, the serialization library detects the situation where an object is never serialized through a pointer and suppresses tracking in this case.)

...

...
Note that the documentation suggests that he above be reformulated as For(... ar << a[i]; //

Enforcing const-ness also has the effect of preventing serialization from altering the state of the object being serialized - another almost impossible to find bug.

...

Can you explain: 1. How this bug can happen

It's very easy to write

    for (...) {
        X x = *it; // create a copy
        ar << x;
    }

All the x's are at the same address, so if they happen to be tracked because a pointer to some X is serialized somewhere in the program, then the subsequent copies would be suppressed by tracking. With the above formulation this cannot happen. This error is detected at compile time - very convenient. Of course the following would work as well

    for (...) {
        const X & x = *it; // save a const reference to the original
        ar << x;
    }

while the following would create an error that would go undetected.

    for (...) {
        const X x = *it; // create a copy
        ar << x;
    }
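The three loop bodies can be compared in a small sketch. `accepted_by_const_check` and `variant_report` are hypothetical names. The check is a stand-in that models only "accept const-qualified objects", not the library's actual assertion.

```cpp
#include <type_traits>

struct X { int value; };

// Hypothetical stand-in for the const check: accept only const-qualified objects.
template <class T>
constexpr bool accepted_by_const_check(T&) {
    return std::is_const<T>::value;
}

struct variant_report {
    bool copy_accepted;        // X x = *it;
    bool const_ref_accepted;   // const X & x = *it;
    bool const_ref_aliases;    // does x share the original's address?
    bool const_copy_accepted;  // const X x = *it;
    bool const_copy_aliases;   // does x share the original's address?
};

variant_report examine(const X& original) {
    X copy = original;              // non-const copy
    const X& const_ref = original;  // alias of the original object
    const X const_copy = original;  // const, but still a separate object

    variant_report r;
    r.copy_accepted       = accepted_by_const_check(copy);
    r.const_ref_accepted  = accepted_by_const_check(const_ref);
    r.const_ref_aliases   = (&const_ref == &original);
    r.const_copy_accepted = accepted_by_const_check(const_copy);
    r.const_copy_aliases  = (&const_copy == &original);
    return r;
}
```

In this sketch the plain copy is rejected, the const reference is accepted and has the original's address, and the const copy is accepted although it is a different object from the original. The last case is the one that goes undetected.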

...

2. Why the "watch" command on gdb is not a reliable way to catch such bugs?

I'm not sure what the gdb watch command does but I'm sure it doesn't detect compile time errors.

...

...
Also, I had to tweak a number of my tests and demos to make them work with this new rule. However, the tweaking was not difficult. If altering one's code to conform this rule isn't easy, one should make an effort to understand why. Its possible that a very subtle and hard to understand bug might be being introduced.

Are you saying that my code is buggy?

I'm saying you've passed up an opportunity to permit the compiler to flag something that could be an error.

...

...
...
...
ofstream ofs(av[2]); boost::archive::binary_oarchive oa(ofs); oa << *data_builder.finish();

Looking at the above, I have to concede I never envisioned it being used this way. I would need more context to really comment intelligently on it. I will say that I always envisioned the interface as being used in a declarative style. That is, I envisioned the serialization declarations ar & x or ar << x as a shorthand for "this member is persistent". I'm not saying that you're doing anything wrong, it's just not what I expected.

...

...
...
...
Where 'finish' returns auto_ptr<Data>. It's looks like serialization checks if the serialized type is 'const' and if not, complains.

correct

...

...
...
...
Basically, it's some heuritic to prevent saving on-stack object

That's one case I would like to trap.

...

...
...
...
(though I don't understand why it would work at al).

I don't understand this.

...

...
...
...
I find this a bit too much . I have no reason whatsoever to make 'finish' return auto_ptr<const Data>. And writing

oa << const_cast<const Data&>(*data_builder.finish());

When finish returns something that is not a const it suggests that it's returning a pointer to a mutable object. The serialization library presumes that objects don't change in the course of serialization. So passing a non-const conflicts with one of the assumptions made in the implementation of the library.

...

...
That is why the STATIC_ASSERT only trips when tracking is enabled for the corresponding datatype.

...
...
looks very strange. And modifying all places where I get this error is not nice too. So, can this static assert be removed?

It's there for a purpose. In how many places do you get this error? Returning an auto_ptr to the top of the stack is interesting to me. Doesn't that destroy the one in the original location? If it's a temporary you could just as well return an auto_ptr to a const T. But really, without more context it's hard for me to comment.

...

...
As I said, if its in a lot of places one should think about this. If this is truely intolerable and you don't mind driving without seat belts, use the & operator instead.

This is inferiour solution, because operator<< is more logical.

They're both arbitrary.

...

Ok, 1. Why std::string is 'tracked type'?

so that storage space isn't wasted storing repeated strings

...

2. How do you suggest me to fix the above?

I did suggest using & instead. But I'm curious to see more context.

Personally, I believe const-ness is under-appreciated and is helpful in catching bugs at compile time. In the future, as more multi-threading is used, I think it will be even more important. Much effort was invested in using const (and asserts) to detect bugs at compile time. I can't catch them all, but I catch what I can, and I believe that it has made the library much easier to use.

Robert Ramey

Vladimir Prus

7:01 a.m.

Robert Ramey wrote:

...

...
...
...
...
ofstream ofs(av[2]); boost::archive::binary_oarchive oa(ofs); oa << *data_builder.finish();

This code should be considered erroneous. The documentation in 1.32 addressed this but unfortunately the enforcement was lost in the shuffle.

Excuse me, but I refuse to accept that this code is erroneous. As I've said, this is my existing design -- the 'finish' function returns non-const pointer. The fact that somewhere else, I serialize the result of 'finish' has nothing to do with 'finish' return type, and I should not be forced to change my existing interface just because I use serialization somewhere.

OK let me rephrase. This code conflicts with the description of the description of the functioning of the operator << as described in the documentation.

Can you give me specific doc URL? I can't find the relevant part in TOC.

...

...
So, you're saing that "const" == "no allocated on stack"? I don't see why this statement is true. I can just as well do this:

void foo(const A& a) { ar << a; }

and circumvent your protection.

The intention is that const indicates that the process of serialization will not change the object being serialized. Serializing and object that cannot easily be passed as a const is quite possibly an error. Of course, the contrary is not necessarily true. That isan object

What's "isan"?

...

can be passed as const referene and still be modified during the process of serialization. So this is not bullet proof - but I believe it is very helpful.

Let me draw an analogy: there's std::ostream. For years, we have saved objects to ostreams without being required to save a "const" object. The fact that serialization acts in a different way is inconsistent and confusing. I think you'll agree that users should not have to read documentation just to save an object. And let me clarify again -- is this intended to catch only stack-allocated objects?

...

...
...
Note that the documentation suggests that he above be reformulated as For(... ar << a[i]; //

Enforcing const-ness also has the effect of preventing serialization from altering the state of the object being serialized - another almost impossible to find bug.

...
Can you explain: 1. How this bug can happen

Its very easy to write for(...{ X x = *it; // create a copy of ar << x }

all the x's are at the same address so if they happen to be tracked because a pointer to some X in serialized somewhere in the program, then the subsequent copies would be supprsed by tracking.

So, you're saying that the behaviour of the above code will be different, depending on whether some pointer to X is saved somewhere else? This is very strange -- here, saving is not done via pointer, so I expect that no tracking is ever done. I'd argue this is a bug in serialization. The only problematic case is

    for (...) {
        X x = *it; // create a copy
        ar << x;
    }
    X* x = new X;
    ar << x;

where the address of the newed 'x' is the same as the address of the saved 'x'. But this can never happen, because heap memory and stack memory are distinct.

...

With the above formulation this cannot happen. This error is detected a compile time - very convenient. Of course the following would work as well

for(...{ const & X x = *it; // save a const reference to the orgiinal ar << x }

Will that indeed save a const reference? How will you read const reference from an archive?

...

while the following would create an error that would go undetected.

for(...{ const X x = *it; // create a copy of ar << x }

Probably the right way to fix this is just don't track 'x' here. Again, no heap allocated object will have the same address as 'x'.

...

...
2. Why the "watch" command on gdb is not a reliable way to catch such bugs?

I'm not sure what the gdb watch command does but I'm sure it doesn't detect compile time errors.

It lets you find all places where a specific memory location is modified. So, if you have a bug where the value of an object is modified by serialization, it's easy to track this down to a specific line of code.

...

...
...
Also, I had to tweak a number of my tests and demos to make them work with this new rule. However, the tweaking was not difficult. If altering one's code to conform this rule isn't easy, one should make an effort to understand why. Its possible that a very subtle and hard to understand bug might be being introduced.

Are you saying that my code is buggy?

I'm saying you've passed up an opportunity to permit the compiler to flag someting that could be an error.

...
...
...
...
ofstream ofs(av[2]); boost::archive::binary_oarchive oa(ofs); oa << *data_builder.finish();

Looking at the above, I have to concede I never envisioned it being used this way. I would need more context to really comment intelligently on it. I will say that I alwas envisioned the interface as being used in a declarative style. That is, I envisioned the the serialization declarations ar & x or ar << x as a shorthand for "this member is persistent". I'm not saying that you're doing anything wrong, its just not what I expected.

What kind of context should I provide? This is top-level code, it reads some files, passes them to 'data_builder' and then calls 'finish' that builds the proper data which is then saved.

...

...
...
...
...
Where 'finish' returns auto_ptr<Data>. It's looks like serialization checks if the serialized type is 'const' and if not, complains.

correct

...
...
...
...
Basically, it's some heuritic to prevent saving on-stack object

That's one case I would like trap.

Any others?

...

...
...
...
...
(though I don't understand why it would work at al).

I don't understand this.

...
...
...
...
I find this a bit too much . I have no reason whatsoever to make 'finish' return auto_ptr<const Data>. And writing

oa << const_cast<const Data&>(*data_builder.finish());

When finish returns something that is not a const it suggests that its returning a pointer to a mutable object. The serialization library presumes that objects don't change in the course of serialization. So passing a non-const conflicts with one of the assumptions made in the implementation of the library.

Ehm... the standard way to solve this is to declare "operator<<" with the "const T&" type, just like iostreams do.
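The difference between the two signatures can be sketched with toy archives. `ref_archive`, `cref_archive`, `can_save`, `make_text` and `save_named_copy` are hypothetical names used only for illustration. Neither archive is the library's `interface_oarchive`.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Toy stand-ins for illustration only; not the Boost.Serialization classes.
struct ref_archive {             // mirrors the reported signature operator<<(T&)
    int saved = 0;
    template <class T>
    ref_archive& operator<<(T& t) { (void)t; ++saved; return *this; }
};

struct cref_archive {            // iostream-style signature operator<<(const T&)
    int saved = 0;
    template <class T>
    cref_archive& operator<<(const T& t) { (void)t; ++saved; return *this; }
};

// Detects whether `ar << value` compiles for an archive and an argument type.
template <class Archive, class Arg, class = void>
struct can_save : std::false_type {};

template <class Archive, class Arg>
struct can_save<Archive, Arg,
    decltype(void(std::declval<Archive&>() << std::declval<Arg>()))>
    : std::true_type {};

std::string make_text() { return "temporary"; }

// Workaround sketch: give the temporary a name so that it becomes an lvalue.
int save_named_copy() {
    ref_archive ar;
    const std::string text = make_text();  // T deduces to const std::string
    ar << text;
    return ar.saved;
}
```

Here `can_save<ref_archive, std::string>::value` is false, because a non-const rvalue cannot bind to `T&`, while `can_save<cref_archive, std::string>::value` is true. Naming the temporary as a const local, as in `save_named_copy`, also works with the `T&` signature.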

...

...
...
That is why the STATIC_ASSERT only trips when tracking is enabled for the corresponding datatype.

...
...
looks very strange. And modifying all places where I get this error is not nice too. So, can this static assert be removed?

Its there for a purpose. How many places do you get this error? Returning an auto_ptr to the top of the stack is interesting to me. doesn't that destroy the one in the original location?

No, returning auto_ptr from a function is one of its intended usages.

...

If its a temporary you could just as well return an auto_ptr to a const T. But really without more context its hard for me to commment.

1. I don't want to change my design due to serialization library. It's intrusive. 2. Inside 'finish', I need to work with auto_ptr<T> -- because I modify the data. And conversion from auto_ptr<T> to auto_ptr<const T> does not work for me on gcc.

...

...
Ok, 1. Why std::string is 'tracked type'?

so that storage space isn't wasted storing repeated strings

...
2. How do you suggest me to fix the above?

I did suggest using & instead. But I'm curious to see more context. I'm curious to see more.

Here's the enclosing method:

    template<class Archive>
    void save(Archive & ar, unsigned int version) const
    {
        // Serialization has problems with shared_ptr, so use strings.
        ar << boost::lexical_cast<std::string>(*this);
    }

- Volodya

Robert Ramey

5:33 p.m.

...

...
OK let me rephrase. This code conflicts with the description of the description of the functioning of the operator << as described in the documentation.

Can you give me specific doc URL? I can't find the relevant part in TOC.

Serialization/Reference/Special Considerations/Object tracking explains this.

...

That isan object

What's "isan"?

That is, an

...

...
can be passed as const referene and still be modified during the process of serialization. So this is not bullet proof - but I believe it is very helpful.

...

Let me draw an analogy: there's std::ostream. For years, we save objects to ostreams and we're not required to save "const" object. The fact that serialization acts in a different way is inconsistent and confusing. I think you'll agree that users should not read documentation just to save an object.

std::ostream doesn't need to track the objects saved. So passing objects which change in the course of creating an output file doesn't create a problem, while in the serialization system it DOES create an error.

...

And let me clarify again -- is this indended to stack only stack allocated objects?

No. It's for any object whose contents might change in the course of creating an archive. Example:

    X * x;
    ...
    ar << x;
    ...
    *x = y;
    ...
    ar << x; // uh-oh: bug introduced, since x is being tracked

The static assert will flag this as an error at compile time. If this is what one really wants to do, then X should be marked untracked, in which case no error will be issued.
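The effect of saving, modifying and saving again can be sketched with a toy pointer-tracking archive. `toy_ptr_archive` and its record layout are assumptions made for illustration and do not reflect the library's archive format.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

struct X { int value; };

// Toy pointer-tracking archive (an illustrative assumption):
// the first save of an address stores the object, later saves store only its id.
struct toy_ptr_archive {
    std::map<const X*, std::size_t> ids;  // address -> object id
    std::vector<int> objects;             // full payloads, indexed by id
    std::vector<std::size_t> records;     // one id per save call

    void save(const X* p) {
        std::map<const X*, std::size_t>::iterator it = ids.find(p);
        if (it == ids.end()) {
            // First time this address is seen: store the payload.
            it = ids.insert(std::make_pair(p, objects.size())).first;
            objects.push_back(p->value);
        }
        records.push_back(it->second);
    }
};

// Saves *x, modifies it, saves it again, and returns the value
// that the second record resolves to.
int value_seen_for_second_save() {
    X obj = { 1 };
    X* x = &obj;
    toy_ptr_archive ar;
    ar.save(x);
    x->value = 2;   // changed between the two saves
    ar.save(x);     // only a reference to the first payload is recorded
    return ar.objects[ar.records[1]];
}
```

`value_seen_for_second_save()` returns 1, not 2: in this model the modification made between the two saves never reaches the archive.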

...

...
...
...
Note that the documentation suggests that he above be reformulated as For(... ar << a[i]; //

Enforcing const-ness also has the effect of preventing serialization from altering the state of the object being serialized - another almost impossible to find bug.

...
Can you explain: 1. How this bug can happen

Its very easy to write for(...{ X x = *it; // create a copy of ar << x }

all the x's are at the same address so if they happen to be tracked because a pointer to some X in serialized somewhere in the program, then the subsequent copies would be supprsed by tracking.

...

So, you're saying that the behaviour of the above code will be different, depending on whether some pointer to X is saved somewhere else? This is very strange -- here, saving is not done via pointer, so I expect that no tracking is ever done. I'd argue this is a bug in serialization.

Hey - that's not a bug - it's a feature! Tracking is only done if it's necessary to guarantee that the objects can be loaded correctly. If you believe this is a bug, you can use serialization traits to mark the particular class "track_always".

...

The only problematic case is

for(...{ X x = *it; // create a copy of ar << x }

X* x = new X: ar << x;

where address of newed 'x' is the same as address of saved 'x'. But this can never happen, because heap memory and stack memory are distinct.

In the loop, a new value for x is set every iteration through the loop. But the address of x (on the stack) is the same every time. If X is tracked, only the first one will be saved. When the archive is loaded, all the x values will be the same - not what is probably intended. So, the question is what is really intended here and trapping as an error requires that this question be considered.

...

...
With the above formulation this cannot happen. This error is detected a compile time - very convenient. Of course the following would work as well

for(...{ const & X x = *it; // save a const reference to the orgiinal ar << x }

Will that indeed save a const reference? How will you read const reference from an archive?

This works as one would expect - that is, all the objects are saved and tracked separately. The const reference doesn't have its own address - taking the address of it returns the address of the object being referred to - just what we want for tracking.

...

...
while the following would create an error that would go undetected.

for(...{ const X x = *it; // create a copy of ar << x }

Probably the right way to fix this is just don't track 'x' here. Again, no heap allocated object will have the same address as 'x'.

in this case x is being allocated on the stack - NOT on the heap. And all the x's have the same stack address.

...

...
...
2. Why the "watch" command on gdb is not a reliable way to catch such bugs?

I'm not sure what the gdb watch command does but I'm sure it doesn't detect compile time errors.

It allows to find all places where a specific memory location is modified. So, if you have a bug where a value of an object is modified by serialization, it's easy to track this down to specific code line.

Well, that would be helpful in finding such an error. But it's not as good as making the error impossible to occur in the first place.

...

...
...
...
...
...
ofstream ofs(av[2]); boost::archive::binary_oarchive oa(ofs); oa << *data_builder.finish();

...

What kind of context should I provide? This is top-level code, it reads some files, passes them to 'data_builder' and then calls 'finish' that builds the proper data which is then saved.

...

...
...
...
...
...
Basically, it's some heuristic to prevent saving on-stack object

That's one case I would like to trap.

Any others?

Well since you bring it up - the & operator totally blows away this const checking. It has to do this in order to effectively provide the functionality for which it's intended. So the usage of & would fail to trap the bugs that << might detect. On the other hand this is not such a problem as the & operator lends itself mostly to situations where the save/load are exactly symmetric and it's much harder (maybe impossible) to create the kind of situation we have here. Also there are situations in which I could not really enforce the const-ness as I wanted. Most notable is the case where one uses ar << BOOST_SERIALIZATION_NVP(x). So had you wrapped your x above in this macro to support xml archives, you would not have noticed the issue. So I can't always catch it, but I catch it where I can.

...

...
When finish returns something that is not a const it suggests that it's returning a pointer to a mutable object. The serialization library presumes that objects don't change in the course of serialization. So passing a non-const conflicts with one of the assumptions made in the implementation of the library.

...

Ehm... the standard way to solve this is to declare "operator<<" with the "const T&" type, just like iostreams do.

We're doing this now. That is, Archive::operator<<(const T & ...). But using const indicates that the callee won't mutate the object. It doesn't require that the object being passed be immutable - which is what we're trying to check for here. In other words even though a parameter is declared const - that doesn't require that the object passed be const. In fact, the compiler is permitted to create a copy of a non-const and pass it to a const parameter. I haven't seen a compiler actually do this - but it is permitted. It would create havoc with the tracking system which is essential to re-creating archived pointers. The new code follows compiler rules more carefully and should work even if some compiler makes a copy when converting a non-const parameter to a const.

...

...
If it's a temporary you could just as well return an auto_ptr to a const T. But really without more context it's hard for me to comment.

1. I don't want to change my design due to serialization library. It's intrusive.

This trap is telling you that maybe something in your design conflicts with serialization in a fundamental way and this should be checked.

...

2. Inside 'finish', I need to work with auto_ptr<T> -- because I modify the data.

That's it. When you call finish from within an archive save, you're telling me that the process of saving is changing the data. The serialization library implementation (tracking) assumes that won't happen. So if your design is such that this is what you really want to do, then T should be marked track_never. This will inhibit tracking and suppress the error message. Without this checking, this issue would go undetected until runtime - and during a load at that. And be very difficult to find.

Of course, in the course of reviewing the question, you might conclude that rather than marking the type "track_never" you might want to make a different kind of change. Which is fine of course. My intention is to make it easier to write code that works correctly with the serialization library. The interface of the serialization library is very simple - deceptively so. The intention of including all this checking is to make it harder to introduce hard to find bugs. This is just one of many examples.

...

And conversion from auto_ptr<T> to auto_ptr<const T> does not work for me on gcc.

As noted above, there's really more to it than changing a data type. It raises the question of whether what one thinks one wants conflicts with the serialization implementation. That's the case whenever we use a library. Sometimes it has requirements which are too onerous and we don't use it. Other times we make compromises to be able to make use of it.

...

...
...
1. Why std::string is 'tracked type'?

so that storage space isn't wasted storing repeated strings

Also so that serialization of a pointer to a std::string would function as expected.

...

...
...
2. How do you suggest me to fix the above?

I did suggest using & instead. But I'm curious to see more context.

Here's the enclosing method:

    template<class Archive>
    void save(Archive & ar, unsigned int version) const {
        // Serialization has problems with shared_ptr, so use strings.
        ar << boost::lexical_cast<std::string>(*this);
    }

LOL - well I would disagree that serialization has problems with shared_ptr - my view is that shared_ptr has problems with serialization. The next version will reconcile differing points of view on this subject.

That aside I would expect to use something like the following

    template<class Archive>
    void save(Archive & ar, unsigned int version) const {
        // Serialization has problems with shared_ptr, so use strings.
        const std::string s = boost::lexical_cast<std::string>(*this);
        ar << s;
    }

Maybe the above might be

    ar << boost::lexical_cast<const std::string>(*this);

    template<class Archive>
    void load(Archive & ar, unsigned int version) // note no "const" here
    {
        // Serialization has problems with shared_ptr, so use strings.
        std::string s;
        ar >> s;
        *this = s; // whatever means
    }

Now this could create a tracking problem. So if this is really what needs to be done, I would create a wrapper:

    struct untracked_string : public std::string { ... };

and set untracked_string to "track_never".

BTW in the case above we're not serializing a shared_ptr as far as I can tell so I would expect to just have

    serialize(Archive &ar, unsigned int version){ ar & ... members }

It's really odd to me to see something like ar << *this or ar << *ptr which means we're circumventing and maybe re-implementing part of the serialization library.

Robert Ramey

Vladimir Prus

12 May

8:57 a.m.

Robert Ramey wrote:

...

...
...
OK let me rephrase. This code conflicts with the description of the functioning of the operator << as given in the documentation.

Can you give me specific doc URL? I can't find the relevant part in TOC.

Serialization/Reference/Special Considerations/Object tracking

explains this.

Ok, I found this.

...

...
...
reference and still be modified during the process of serialization. So this is not bullet proof - but I believe it is very helpful.

...
Let me draw an analogy: there's std::ostream. For years, we save objects to ostreams and we're not required to save "const" object. The fact that serialization acts in a different way is inconsistent and confusing. I think you'll agree that users should not read documentation just to save an object.

std::ostream doesn't need to track the objects saved. So passing objects which change while in the course of creating an output file doesn't create a problem, while in the serialization system it DOES create an error.

I'll comment on the error later. But the important point is that the difference from iostream, no matter how much explanation you put in docs, will be confusing. Moreover, as soon as users get into the habit of const_casting serialized things, your protection no longer works.

    for(;;) {
        A x = a[i];
        ar << x;
    }

User gets error above, changes this to

    for(;;) {
        const A x = a[i];
        ar << x; // Oops, error here, change this to
    }

and circumvents your protection. If there's no way to serialize non-const object, users will just start using const.

...

...
And let me clarify again -- is this intended to trap only stack allocated objects?

No. It's for any object whose contents might change while in the course of creating an archive. Example.

    X * x;
    ..
    ar << x;
    ....
    *x = y;
    ...
    ar << x; // uh - oh bug introduced since x is being tracked.

The static assert will flag this as an error at compile time.
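To make the failure mode concrete, here is a toy model of pointer tracking. It is not the library's implementation: `Record`, `ToyPointerArchive` and `save_modify_save` are hypothetical names, and the archive simply records either the pointee's value (first sighting of an address) or a back-reference id (any later sighting).

```cpp
#include <map>
#include <vector>

// One archive entry: either a full value or a back-reference to an earlier object.
struct Record {
    bool is_reference;
    int value_or_id;
};

// Hypothetical archive that tracks saved pointers by address.
struct ToyPointerArchive {
    std::map<const void*, int> ids;   // address -> object id
    std::vector<Record> records;

    void save(const int* p) {
        std::map<const void*, int>::iterator it = ids.find(p);
        if (it != ids.end()) {
            records.push_back({true, it->second});   // already tracked: id only
            return;
        }
        int id = static_cast<int>(ids.size());
        ids[p] = id;
        records.push_back({false, *p});               // first sighting: full value
    }
};

// Save, modify through the pointer, save again.
std::vector<Record> save_modify_save() {
    int object = 1;
    int* x = &object;
    ToyPointerArchive ar;
    ar.save(x);
    *x = 2;        // change made between the two saves
    ar.save(x);    // recorded as a reference; the value 2 is never written
    return ar.records;
}
```

On load, both entries would resolve to the object as it was at the first save, so the modification is lost without any runtime diagnostic.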

Ok, at least I understand your intentions now. But again note that if this static check triggers in situations users consider safe, they'll quickly learn to use casts. Everywhere.

...

...
...
It's very easy to write

    for(...) {
        X x = *it; // create a copy
        ar << x;
    }

all the x's are at the same address so if they happen to be tracked because a pointer to some X is serialized somewhere in the program, then the subsequent copies would be suppressed by tracking.

...
So, you're saying that the behaviour of the above code will be different, depending on whether some pointer to X is saved somewhere else? This is very strange -- here, saving is not done via pointer, so I expect that no tracking is ever done. I'd argue this is a bug in serialization.

Hey - that's not a bug - it's a feature!

Tracking is only done if it's necessary to guarantee that the objects can be loaded correctly.

I understand the need for tracking. What I don't understand is why tracking is enabled when I'm saving *non pointer*. Say I'm saving object of class X: X x; ar << x; Then, for reasons I've explained in the previous email, no heap allocated object will have the same address as 'x', so no tracking is needed at all.

...

...
The only problematic case is

    for(...) {
        X x = *it; // create a copy
        ar << x;
    }

    X* x = new X;
    ar << x;

where address of newed 'x' is the same as address of saved 'x'. But this can never happen, because heap memory and stack memory are distinct.

In the loop, a new value for x is set every iteration through the loop. But the address of x (on the stack) is the same every time. If X is tracked, only the first one will be saved. When the archive is loaded, all the x values will be the same - not what is probably intended. So, the question is what is really intended here and trapping as an error requires that this question be considered.

Exactly. As I've said above, I believe saves of 'x' inside the loop should not do tracking since we're not saving via a pointer.

...

...
...
With the above formulation this cannot happen. This error is detected at compile time - very convenient. Of course the following would work as well

    for(...) {
        const X & x = *it; // save a const reference to the original
        ar << x;
    }

Will that indeed save a const reference? How will you read const reference from an archive?

This works as one would expect - that is, all the objects are saved and tracked separately. The const reference doesn't have its own address - taking the address of it returns the address of the object being referred to - just what we want for tracking.

This behaviour is strange. In C++ a reference almost always acts like a non-reference type, and it's common to use a reference to create a convenient local alias. I'd expect saving of "const &X" to work exactly as saving of "const X".

...

...
...
while the following would create an error that would go undetected.

    for(...) {
        const X x = *it; // create a copy
        ar << x;
    }

Probably the right way to fix this is just don't track 'x' here. Again, no heap allocated object will have the same address as 'x'.

in this case x is being allocated on the stack - NOT on the heap. And all the x's have the same stack address.

Yes, they have the same stack address, but it does not matter, because we're not saving them via pointer. If we wrote something like: ar << &x; then the object would have to be tracked. But saving pointer to stack object is problematic on its own.

...

...
...
When finish returns something that is not a const it suggests that it's returning a pointer to a mutable object. The serialization library presumes that objects don't change in the course of serialization. So passing a non-const conflicts with one of the assumptions made in the implementation of the library.

...
Ehm... the standard way to solve this is to declare "operator<<" with the "const T&" type, just like iostreams do.

We're doing this now. That is

    Archive::operator<<(const T & ...)

But using const indicates that the callee won't mutate the object. It doesn't require that the object being passed be immutable - which is what we're trying to check for here. In other words even though a parameter is declared const - that doesn't require that the object passed be const. In fact, the compiler is permitted to create a copy of a non-const and pass it to a const parameter. I haven't seen a compiler actually do this - but it is permitted.

The only case I know about is binding rvalue to const reference.
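The distinction being argued here can be observed directly in standard C++. In the sketch below (all names are illustrative assumptions), a `const long&` parameter binds directly to a `long` lvalue, but an `int` argument needs a conversion, so the reference binds to a temporary that lives at a different address from the caller's object.

```cpp
// Returns true when `ref` refers to the very object that `original` points at.
bool same_object(const long& ref, const void* original) {
    return static_cast<const void*>(&ref) == original;
}

// A long lvalue binds directly to const long&, so no copy is made.
bool lvalue_of_exact_type_is_aliased() {
    long value = 7;
    return same_object(value, &value);
}

// An int must be converted to long, so the reference binds to a temporary.
bool converted_argument_is_aliased() {
    int value = 7;
    return same_object(value, &value);
}
```

An address-based tracker receiving the second kind of argument would record the temporary's address, not the caller's object, which is why the by-reference signature alone does not guarantee stable addresses.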

...

It would create havoc with the tracking system which is essential to re-creating archived pointers. The new code follows compiler rules more carefully and should work even if some compiler makes a copy when converting a non-const parameter to a const.

Again, when saving non-pointer no tracking should be necessary.

...

...
...
If it's a temporary you could just as well return an auto_ptr to a const T. But really without more context it's hard for me to comment.

1. I don't want to change my design due to serialization library. It's intrusive.

This trap is telling you that maybe something in your design conflicts with serialization in a fundamental way and this should be checked.

But I've checked and nothing's wrong. So I either have to modify my design -- which I don't want, or add very strange-looking cast.

...

...
2. Inside 'finish', I need to work with auto_ptr<T> -- because I modify the data.

That's it. When you call finish from within an archive save, you're telling me that the process of saving is changing the data. The serialization library implementation (tracking) assumes that won't happen. So if your design is such that this is what you really want to do, then T should be marked track_never. This will inhibit tracking and suppress the error message.

Without this checking, this issue would go undetected until runtime - and during a load at that. And be very difficult to find.

Sorry, probably I did not elaborate enough. The finish code looks like:

    auto_ptr<Data> finish() {
        auto_ptr<Data> result;
        // modify result
        return result;
    }

I modify 'result' inside 'finish', not inside any serialization code. And I can't change return type to auto_ptr<const Data> because the code won't compile. And

    return auto_ptr<const Data>(result.release())

is scary. So, there's no bug in my code yet ;-)

...

...
...
...
2. How do you suggest me to fix the above?

I did suggest using & instead. But I'm curious to see more context.

Here's the enclosing method:

    template<class Archive>
    void save(Archive & ar, unsigned int version) const {
        // Serialization has problems with shared_ptr, so use strings.
        ar << boost::lexical_cast<std::string>(*this);
    }

LOL - well I would disagree that serialization has problems with shared_ptr - My view is that shared_ptr has problems with serialization. The next version will reconcile differing points of view on this subject.

That would be much welcome, no matter how it's done.

...

That aside I would expect to use something like the following

template<class Archive> void save(Archive & ar, unsigned int version) const { // Serialization has problems with shared_ptr, so use strings. const std::string s = boost::lexical_cast<std::string>(*this);

I don't like the copy.

...

ar << s }

Maybe the above might be ar << boost::lexical_cast<const std::string>(*this);

I think this won't work because loading from stream to "const std::string" won't compile.

...

    template<class Archive>
    void load(Archive & ar, unsigned int version) // note no "const" here
    {
        // Serialization has problems with shared_ptr, so use strings.
        std::string s;
        ar >> s;
        *this = s; // whatever means
    }

Now this could create a tracking problem.

Tracking problem here? I'm just saving a string!

...

So if this is really what needs to be done, I would create a wrapper:

struct untracked_string : public std::string { ... };

and set untracked_string to "track_never"

I hope you'll agree that this solution is rather inconvenient.

...

BTW in the case above we're not serializing a shared_ptr as far as I can tell so I would expect to just have

serialize(Archive &ar, unsigned int version){ ar & ... members }

There's shared_ptr somewhere inside 'members'.

...

It's really odd to me to see something like

ar << *this

or

ar << *ptr

which means we're circumventing and maybe re-implementing part of the serialization library.

I think that for a split 'save/load' you'll always have to use operator<< and operator>> of the archive. - Volodya

Robert Ramey

4:19 p.m.

Vladimir Prus wrote:

...

I'll comment on the error later. But the important point is that difference from iostream, no matter how much explanation you put in docs, will be confusing. Moreover, as soon as users get into habit of const_casting serialized things, your protection no longer works.

for(;;) { A x = a[i]; ar << x; }

User gets error above changes this to

for(;;) { const A x = a[i]; ar << x; // Oops, error here, change this to }

and circumvents your protection. If there's no way to serialize non-const object, users will just start using const.

I do mention specifically somewhere in the docs that Archives are NOT streams. However, I do concede that it's natural to make the analogy. In my view, that supports my idea that the compiler should be used if possible to detect those cases where the difference between Archives and streams is important. This is one of those cases. Without this check, a user who identifies Archives and streams will never, ever find his bug because he will be looking in the wrong place.

I'm aware that many programmers avoid using "const" because it seems to slow down development. Using const on a regular basis generates compiler complaints similar to this one on a regular basis and many just minimize use of const to avoid this annoyance. In my view this is misguided. By using const everywhere one can, one is communicating to the compiler one's view that a particular object is not expected to be changed by a particular operation. When the compiler traps an error there are two reactions

a) The compiler is being overly picky - I'll just remove the const.

b) Hmmm - what's going on here - I'm trying to modify something that I wouldn't expect to be modified.

Picking a) is easier because addressing b) often starts a chain reaction of "const" problems. I've come to the view that choosing b) makes for better code with fewer bugs. I believe also that in the future this will come to be even more important when we start to use multi-threading more. Also, it's a fact that some of the STL algorithms and collections are a little "const" unfriendly so this means sometimes I have to use a const_cast. But all in all I've come to strongly believe that picking b) is a much better choice and my handling of this issue in the serialization library reflects that.

...

Ok, at least I understand your intentions now. But again note that if this static check triggers in situations users consider safe, they'll quickly learn to use casts. Everywhere.

Of course you're correct on this. But lots of people drive without seat belts too. That doesn't mean that the rest of us should be prohibited from using them.

...

I understand the need for tracking. What I don't understand is why tracking is enabled when I'm saving *non pointer*.

Say I'm saving object of class X:

X x; ar << x;

Then, for reasons I've explained in the previous email, no heap allocated object will have the same address as 'x', so no tracking is needed at all.

The default trait for non-primitive objects is "track_selectively". This means that objects will be tracked if and only if, anywhere in the source, some object of this class is serialized through a pointer. So when I'm compiling one module above and checking at compile time I really don't know whether the object will in fact be tracked. So I trap unless the serialization trait is set to "track_never". To reiterate, in this case the object won't be tracked unless somewhere else it's serialized as a pointer.

As an aside one might want to track objects never serialized as pointers. That's why there is a serialization trait "track_always". This might occur where objects might be the targets of references from several instances of another class:

    class a { X & m_x; .... };

Tracking would guarantee that only one copy of the same X would be written to the archive - thus saving space. This would be an unusual case but it's supported if necessary.
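The rule "trap unless the object is const or its type is never tracked" can be sketched as a small compile-time predicate. This is an illustration only, not the code in oserializer.hpp: `ToyTracking`, `toy_tracking_level` and `would_trap` are hypothetical names, and `int` stands in for a primitive type.

```cpp
#include <type_traits>

// Hypothetical tracking levels, loosely mirroring the three traits named in the thread.
enum class ToyTracking { never, selectively, always };

// Default: class types are tracked selectively.
template <class T>
struct toy_tracking_level {
    static constexpr ToyTracking value = ToyTracking::selectively;
};

// Primitives such as int are never tracked.
template <>
struct toy_tracking_level<int> {
    static constexpr ToyTracking value = ToyTracking::never;
};

// True when saving this argument would trip the check:
// the deduced type is non-const and the type may end up tracked.
template <class T>
constexpr bool would_trap(T&) {
    return !std::is_const<T>::value &&
           toy_tracking_level<typename std::remove_const<T>::type>::value
               != ToyTracking::never;
}

struct X { int member; };
```

Because the parameter is `T&`, a const lvalue deduces `T` as `const X` and passes, while a mutable `X` is flagged; specializing the trait to `never` for a type silences the check for it, which matches the "track_never" escape hatch described above.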

...

...
...
The only problematic case is

    for(...) {
        X x = *it; // create a copy
        ar << x;
    }

    X* x = new X;
    ar << x;

where address of newed 'x' is the same as address of saved 'x'. But this can never happen, because heap memory and stack memory are distinct.

In the loop, a new value for x is set every iteration through the loop. But the address of x (on the stack) is the same every time. If X is tracked, only the first one will be saved. When the archive is loaded, all the x values will be the same - not what is probably intended. So, the question is what is really intended here and trapping as an error requires that this question be considered.

...

Exactly. As I've said above, I believe saves of 'x' inside the loop should not do tracking since we're not saving via a pointer.

How do we know that one of the x's saved in the loop is not serialized as a pointer somewhere else? We have to track ALL x's because we don't know which ones if any are being tracked somewhere else. It could even be in a different module.

...

This behaviour is strange. In C++ a reference almost always acts like a non-reference type, and it's common to use a reference to create a convenient local alias.

...

I'd expect saving of "const &X" to work exactly as saving of "const X".

Saving it does. The problem is that

    const X x = some_x;

is quite different than

    const X & x = some_x;

That is, creating a reference is altogether different from making a copy of an object. The strong analogy between these operations and the automatic invocation of the copy operation constitutes one of the main features of C++ which can be considered both a blessing and a curse. On one hand it makes the language expressive by hiding the natural copies while on the other hand these hidden copies are a major source of hard to find bugs. In any case, it's out of my hands.

...

Yes, they have the same stack address, but it does not matter, because we're not saving them via pointer. If we wrote something like:

X x; // ramey

...

ar << &x;

then the object would have to be tracked. But saving pointer to stack object is problematic on its own.

This would definitely be an error. I did spend a significant amount of time tweaking the code to detect this but I don't think I was successful. (I don't remember at this point.) That is, in my world I want to trap

    X * x_ptr;
    ar << x_ptr

while permitting

    const X * x_ptr;

to pass unmolested. I don't exactly remember right now but I think I was inhibited from implementing this at least in all cases.

...

...
Archive::operator<<(const T & ...)

But using const indicates that the callee won't mutate the object. It doesn't require that the object being passed be immutable - which is what we're trying to check for here. In other words even though a parameter is declared const - that doesn't require that the object passed be const. In fact, the compiler is permitted to create a copy of a non-const and pass it to a const parameter. I haven't seen a compiler actually do this - but it is permitted.

The only case I know about is binding rvalue to const reference.

which is what we're doing here.

...

...
It would create havoc with the tracking system which is essential to re-creating archived pointers. The new code follows compiler rules more carefully and should work even if some compiler makes a copy when converting a non-const parameter to a const.

Again, when saving non-pointer no tracking should be necessary.

see above

...

...
...
...
If it's a temporary you could just as well return an auto_ptr to a const T. But really without more context it's hard for me to comment.

1. I don't want to change my design due to serialization library. It's intrusive.

This trap is telling you that maybe something in your design conflicts with serialization in a fundamental way and this should be checked.

But I've checked and nothing's wrong. So I either have to modify my design -- which I don't want, or add very strange-looking cast.

You have three other options:

a) use & operator instead of <<

b) set the tracking trait to "track_never"

c) tweak your code so the trap is never invoked. (hypothetical only)

By the way the const_cast is a good choice for another reason. It specifically flags a case which should be checked if the program has surprising behavior. Suppose you've checked everything and it's what you want to do so you put in a const_cast to avoid the trap. Then months later you add a module to your program which serializes a pointer to X. Now your code is broken in a surprising way. When you start debugging you'll see the "const_cast" and it might draw your attention to something that should be checked. Of course if you had tweaked your code without using the const_cast, then adding a module wouldn't break the code in any case - but it's your decision.

...

...
...
2. Inside 'finish', I need to work with auto_ptr<T> -- because I modify the data.

That's it. When you call finish from within an archive save, you're telling me that the process of saving is changing the data. The serialization library implementation (tracking) assumes that won't happen. So if your design is such that this is what you really want to do, then T should be marked track_never. This will inhibit tracking and suppress the error message.

Without this checking, this issue would go undetected until runtime - and during a load at that. And be very difficult to find.

Sorry, probably I did not elaborate enough. The finish code looks like:

    auto_ptr<Data> finish() {
        auto_ptr<Data> result;
        // modify result
        return result;
    }

I modify 'result' inside 'finish', not inside any serialization code. And I can't change return type to auto_ptr<const Data> because the code won't compile. And

return auto_ptr<const Data>(result.release())

is scary.

Is this any more scary than

    const Data * result;

which we are quite comfortable with? In fact I believe

    auto_ptr<const Data> finish() {
        auto_ptr<Data> result;
        // modify result
        return result;
    }

expresses your intention quite well. That you're returning an auto_ptr to an object that you don't expect should be changed by anyone who gets the pointer this way. But it also makes me question: What is // modify result doing here? It seems that result might change every time we call it. This would violate the underlying assumptions of the serialization library. So I would say that we should be using:

    auto_ptr<const Data> some_class::finish() const {
        auto_ptr<Data> result;
        // modify result
        return result;
    }

I'm still suspicious of the // modify result. But at least I know that the class being serialized isn't being modified by the save operation.
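For readers on current compilers, the same "build mutable, hand out const" idea can be sketched with std::unique_ptr, which is swapped in here for auto_ptr (auto_ptr was removed from the standard in C++17). The `Data` layout and the value 42 are placeholders invented for the example; the conversion is spelled out explicitly rather than relying on an implicit conversion in the return statement.

```cpp
#include <memory>
#include <utility>

// Placeholder payload type; the real Data is not shown in the thread.
struct Data {
    int value = 0;
};

// Builds the object through a mutable pointer, then hands out a pointer-to-const.
std::unique_ptr<const Data> finish() {
    std::unique_ptr<Data> result(new Data());
    result->value = 42;   // "modify result" while it is still mutable
    // Ownership moves into a unique_ptr<const Data>; no release() needed.
    return std::unique_ptr<const Data>(std::move(result));
}
```

The caller receives an object it cannot modify through the returned pointer, which is the intent Robert describes, without the manual release() that Volodya found scary.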

...

So, there's no bug in my code yet ;-)

That we know about.

...

...
...
...
...
2. How do you suggest me to fix the above?

That aside I would expect to use something like the following

template<class Archive> void save(Archive & ar, unsigned int version) const { // Serialization has problems with shared_ptr, so use strings. const std::string s = boost::lexical_cast<std::string>(*this);

I don't like the copy.

...
ar << s }

Maybe the above might be ar << boost::lexical_cast<const std::string>(*this);

I think this won't work because loading from stream to "const std::string" won't compile.

I believe we're doing the opposite here: creating a const std::string (on the stack) from a non-const one. This is exactly equivalent to the above

    const std::string s = boost::lexical_cast<std::string>(*this);
    ar << s;

it's just that the copy is hidden.

...

...
    template<class Archive>
    void load(Archive & ar, unsigned int version) // note no "const" here
    {
        // Serialization has problems with shared_ptr, so use strings.
        std::string s;
        ar >> s;
        *this = s; // whatever means
    }

Now this could create a tracking problem.

Tracking problem here? I'm just saving a string!

note that std::string is unusual in that it is classified as a "primitive" like int so it is by default "track_never". So the only tracking problem would be serialization of pointers to strings won't be loaded correctly and the error will be undetected. So it won't trap as an error - which could be a problem.

...

...
So if this is really what needs to be done, I would create a wrapper:

struct untracked_string : public std::string { ... };

and set untracked_string to "track_never"

I hope you'll agree that this solution is rather inconvenient.

I do. That's why I prefer one of the other ones above. But it does point to an interesting issue. The serialization traits for primitives are set to "track_never". It's sort of an easily remembered heuristic in that usually we wouldn't want to track all ints even if someone wants to serialize a pointer to an int somewhere in the program. However, if someone does serialize a pointer to an int, then it won't be tracked and that could create an error. One option would be to set tracking for int as "track_selectively". However this would track ALL the serialized ints which for sure are all over the place - potentially a big efficiency issue. A better option would be to make a wrapper around int, a class trackable_int, which one can set as "track_selectively" and which would be in all ways the same as int except that it would be tracked. This might be automated along the lines of BOOST_STRONG_TYPEDEF. In practice this hasn't come up as an issue though.

...

...
It's really odd to me to see something like

ar << *this

or

ar << *ptr

which means we're circumventing and maybe re-implementing part of the serialization library.

I think that for a split 'save/load' you'll always have to use operator<< and operator>> of the archive.

Not true - inside save/load one can just as well use &. What is odd about the above in my mind is the ptr dereferencing. If we're already inside the object, why not just serialize the members? Doesn't ar << *this just make a recursive call to itself?

I'm curious if anyone else is following this thread. It's getting pretty deep in the details of the serialization library.

Robert Ramey

Jeff Flinn

8:39 p.m.

"Robert Ramey" <ramey@rrsd.com> wrote in message news:d5vv7c$jg9$1@sea.gmane.org... [..]

...

I'm curious if anyone else is following this thread. It's getting pretty deep in the details of the serialization library.

Just lurking about. I've been bitten/confused by this topic in the past, although the workarounds were minimal in my case. This discussion is starting to help lift the fog. It was more of a case of not finding/remembering the documentation on this topic. It would be nice if these issues were a little more prominently described. Jeff Flinn

Vladimir Prus

23 Jun

2:13 p.m.

On Thursday 12 May 2005 20:19, Robert Ramey wrote:

...

I'm aware that many programmers avoid using "const" because it seems to [snip explanations why 'const' is good] But all in all I've come to strongly believe that picking b) is a much better choice and my handling of this issue in the serialization library reflects that.

Each check is only reasonable if it finds more bugs than it causes problems. We seem to disagree about the proportion for the *specific case* of the check in serialization library.

...

...
Ok, at least I understand your intentions now. But again note that if this static check triggers in situations users consider safe, they'll quickly learn to use casts. Everywhere.

Of course you're correct on this. But lots of people drive without seat belts too. That doesn't mean that the rest of us should be prohibited from using them.

I don't think the analogy is correct. I guess if you were required to refasten the belt each time you change gear, you won't be using it.

...

...
I understand the need for tracking. What I don't understand is why tracking is enabled when I'm saving a *non-pointer*.

Say I'm saving an object of class X:

X x; ar << x;

Then, for reasons I've explained in the previous email, no heap allocated object will have the same address as 'x', so no tracking is needed at all.

The default trait for non-primitive objects is "track_selectively". This means that objects will be tracked if and only if, anywhere in the source, some object of this class is serialized through a pointer. So when I'm compiling one module above and checking at compile time, I really don't know that the object will in fact be tracked. So I trap unless the serialization trait is set to "track_never". To reiterate, in this case the object won't be tracked unless somewhere else it's serialized as a pointer.

I'm not sure this behaviour is right. It certainly matters if I save the *same* object as pointer or not. Why does it matter if I have *another* object by pointer?

Suppose you're saving an object with the same address twice. The possible situations are:

1. Both saves are via pointers. You enable tracking for this address; only one object is actually saved. User is responsible for making sure that the object does not change between saves.

2. First save is via pointer, the second is not by pointer. You throw pointer_conflict.

3. First save is not by pointer, second is by pointer. You enable tracking for this address.

4. Both saves are not via pointer. You don't track anything.

Is there anything wrong with the above behaviour?
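The four cases can be written down as a small per-address policy. This is only a sketch of the proposal as worded here, not of what the library does: `save_policy`, `save_action` and this definition of `pointer_conflict` are hypothetical, and treating case 3 as "emit a back-reference to the object already written" is an assumption, since the message only says tracking is enabled.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>

// Stand-in for the exception named in case 2 (hypothetical definition).
struct pointer_conflict : std::runtime_error {
    pointer_conflict() : std::runtime_error("pointer_conflict") {}
};

// What a toy archive would do for one save request.
enum save_action {
    write_untracked,  // write the object, no tracking record
    write_tracked,    // write the object and start tracking its address
    write_reference   // already written: emit only a back-reference
};

class save_policy {
    // address -> true once that address has been saved through a pointer
    std::map<const void*, bool> seen_;
public:
    save_action save_by_pointer(const void* p) {
        std::map<const void*, bool>::iterator it = seen_.find(p);
        if (it == seen_.end()) {
            seen_[p] = true;
            return write_tracked;      // first save of this address
        }
        it->second = true;             // cases 1 and 3: tracking enabled
        return write_reference;        // assumption for case 3, see lead-in
    }
    save_action save_by_value(const void* p) {
        std::map<const void*, bool>::iterator it = seen_.find(p);
        if (it != seen_.end() && it->second)
            throw pointer_conflict();  // case 2
        seen_[p] = false;
        return write_untracked;        // case 4, or a first save by value
    }
};
```

Note that the policy needs the address of every saved object at run time, which is the "set of addresses of all saved objects" mentioned later in the thread.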

...

As an aside, one might want to track objects never serialized as pointers. That's why there is a serialization trait "track_always". This might occur where objects might be the objects of references from several instances of another class:

class a { X & m_x; .... };

Tracking would guarantee that only one copy of the same X would be written to the archive - thus saving space. This would be an unusual case but it's supported if necessary.

And how would you deserialize this, given that references are not rebindable?

...

...
...
...
for(...{
    X x = *it; // create a copy of
    ar << x
}

X* x = new X; ar << x;

where address of newed 'x' is the same as address of saved 'x'. But this can never happen, because heap memory and stack memory are distinct.

In the loop, a new value for x is set every iteration through the loop. But the address of x (on the stack) is the same every time. If X is tracked, only the first one will be saved. When the archive is loaded, all the x values will be the same - not what is probably intended. So, the question is what is really intended here and trapping as an error requires that this question be considered.
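The effect described above can be reproduced with a toy archive that tracks by address. The names `toy_oarchive` and `save_all` are made up for this sketch, and the example reuses one variable declared outside the loop so that the repeated address is guaranteed rather than merely typical, as it is for a loop-local object.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

struct X { int value; };

// Toy archive: with tracking on, an object whose address was already seen
// is not written again (illustrative only, not the library's logic).
class toy_oarchive {
    bool tracking_;
    std::set<const void*> seen_;
    std::vector<int> written_;
public:
    explicit toy_oarchive(bool tracking) : tracking_(tracking) {}
    void save(const X& x) {
        const void* addr = static_cast<const void*>(&x);
        if (tracking_ && !seen_.insert(addr).second)
            return;                  // same address: treated as a duplicate
        written_.push_back(x.value);
    }
    const std::vector<int>& written() const { return written_; }
};

// Saves every element through one reused variable, so each save sees the
// same address holding a different value.
inline std::vector<int> save_all(const std::vector<X>& items, bool tracking) {
    toy_oarchive ar(tracking);
    X x = {0};
    for (std::size_t i = 0; i < items.size(); ++i) {
        x = items[i];
        ar.save(x);
    }
    return ar.written();
}
```

With tracking enabled only the first value reaches the archive; with tracking disabled all of them do.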

Exactly. As I've said above, I believe saves of 'x' inside the loop should not do tracking since we're not saving via a pointer.

How do we know that one of the x's saved in the loop is not serialized as a pointer somewhere else?

You keep a set of addresses of all saved objects.

...

We have to track ALL x's because we don't know which ones if any are being tracked somewhere else. It could even be in a different module.

Right, you need to track all addresses while saving, but in the archive the saves from the above loop need not be marked as tracked.

...

...
This behaviour is strange. In C++ a reference almost always acts like a non-reference type, and it's common to use a reference to create a convenient local alias.

I'd expect saving of "const X&" to work exactly as saving of "const X".

Saving it does. The problem is that

const X x = some_x;

is quite different than

const X & x = some_x

That is, creating a reference is altogether different from making a copy of an object. The strong analogy between these operations and the automatic invocation of the copy operation constitutes one of the main features of C++, which can be considered both a blessing and a curse. On one hand it makes the language expressive by hiding the natural copies, while on the other hand these hidden copies are a major source of hard-to-find bugs. In any case, it's out of my hands.

I don't understand anything of the above. To give another example:

const X x; const X& x2 = x;

are you saying that saving them works differently?

...

...
But I've checked and nothing's wrong. So I either have to modify my design -- which I don't want, or add very strange-looking cast.

You have three other options: a) use & operator instead of << b) set the tracking trait to "track_never" c) tweak your code so the trap is never invoked. (hypothetical only)

By the way, the const_cast is a good choice for another reason. It specifically flags a case which should be checked if the program has surprising behavior. Suppose you've checked everything and it's what you want to do, so you put in a const_cast to avoid the trap. Then months later you add a module to your program which serializes a pointer to X. Now your code is broken in a surprising way. When you start debugging you'll see the "const_cast" and it might draw your attention to something that should be checked.

If saving unrelated pointers does not magically change save behaviour of all other 'X' instances, then adding another module won't break my program in the first place.

...

which we are quite comfortable with. In fact I believe

auto_ptr<const Data> finish()
{
    auto_ptr<Data> result;
    // modify result
    return result;
}

expresses your intention quite well: that you're returning an auto_ptr to an object that you don't expect to be changed by anyone who gets the pointer this way.

Except that it does not compile.

...

...
...
...
...
...
2. How do you suggest I fix the above?

That aside I would expect to use something like the following

template<class Archive>
void save(Archive & ar, unsigned int version) const
{
    // Serialization has problems with shared_ptr, so use strings.
    const std::string s = boost::lexical_cast<std::string>(*this);

I don't like the copy.

...
ar << s }

Maybe the above might be ar << boost::lexical_cast<const std::string>(*this);

I think this won't work because loading from stream to "const std::string" won't compile.

I believe we're doing the opposite here: creating a const std::string (on the stack) from a non-const one. This is exactly equivalent to the above

const std::string s = boost::lexical_cast<std::string>(*this); ar << s;

it's just that the copy is hidden.

Did you check lexical_cast.hpp? It will try to change 'const std::string'.

...

...
...
So if this is really what needs to be done, I would create a wrapper:

struct untracked_string : public std::string { ... };

and set untracked_string to "track_never"

I hope you'll agree that this solution is rather inconvenient.

I do. That's why I prefer one of the other ones above. But it does point to an interesting issue. The serialization traits for primitives are set to "track_never". It's sort of an easily remembered heuristic, in that usually we wouldn't want to track all ints even if someone wants to serialize a pointer to an int somewhere in the program.

However, if someone does serialize a pointer to an int, then it won't be tracked and that could create an error.

I recall that in earlier versions users simply *could not* serialize a pointer to int. Did that change, or am I wrong?

...

...
I think that for a split 'save/load' you'll always have to use operator<< and operator>> of the archive.

Not true - inside save/load one can just as well use &

That would be a bit non-intuitive.

...

What is odd about the above in my mind is the ptr dereferencing. If we're already inside the object, why not just serialize the members? Doesn't ar << *this just make a recursive call to itself?

I'm curious if anyone else is following this thread. It's getting pretty deep in the details of the serialization library.

Yea, looks like nobody cares much. - Volodya

David Abrahams

3:48 p.m.

Vladimir Prus <ghost@cs.msu.su> writes:

...

On Thursday 12 May 2005 20:19, Robert Ramey wrote:

I'm not sure this behaviour is right. It certainly matters if I save the *same* object as pointer or not. Why does it matter if I have *another* object by pointer.

Suppose you're saving an object with the same address twice. The possible situations are:

1. Both saves are via pointers. You enable tracking for this address; only one object is actually saved. User is responsible for making sure that the object does not change between saves.

2. First save is via pointer, the second is not by pointer. You throw pointer_conflict.

3. First save is not by pointer, second is by pointer. You enable tracking for this address.

4. Both saves are not via pointer. You don't track anything.

Is there anything wrong with the above behaviour?

That looks perfect to me.

...

...
...
Exactly. As I've said above, I believe saves of 'x' inside the loop should not do tracking since we're not saving via a pointer.

How do we know that one of the x's saved in the loop is not serialized as a pointer somewhere else?

You keep a set of addresses of all saved objects.

I know I'm coming in late here, but I have always expected that the library worked this way (actually keeping a map of saved object address to archive identifier is a usual arrangement). If it doesn't do that, I'm very surprised. How else do you do tracking? Also, if tracking a stack object is being made illegal unless it's const, I find that highly surprising, and I see no relationship between its constness and safety due to lifetime issues in this case. I could easily build a small self-referential structure on the stack that I'd like to send to an archive, and non-constness would be essential in such a scenario. Of course I could create const references to each object and serialize those but why make me jump through hoops?
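A minimal picture of the kind of structure meant here, as a sketch with made-up names (`node`, `link_cycle`): two objects with automatic storage that point at each other. They cannot be declared const, because the links can only be filled in after both objects exist.

```cpp
#include <cassert>

struct node {
    int id;
    node* next;  // non-const link to another node
};

// Builds a two-node cycle in caller-provided storage, e.g. stack objects.
// The nodes have to be non-const, since their links are assigned after
// construction.
inline void link_cycle(node& a, node& b) {
    a.next = &b;
    b.next = &a;
}
```

Binding `const node&` to one of these nodes afterwards gives a const view of that node only; the nodes reachable through `next` are still seen as non-const.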

...

I don't understand anything of the above. To give another example:

const X x; const X& x2 = x;

are you saying that saving them works differently?

I don't think that's possible; the expressions 'x' and 'x2' are identical in all contexts.
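A quick check of this point, using two hypothetical helpers: a function that receives either name sees the same object, and a function template taking `T&` deduces the same const-qualified type for both.

```cpp
#include <cassert>
#include <type_traits>

struct X { int value; };

// Returns the address of whatever object the argument expression names.
inline const void* address_seen_by_callee(const X& arg) { return &arg; }

// Reports whether the deduced parameter type is const, the way a function
// template taking T& would see it.
template <class T>
bool deduced_const(T&) { return std::is_const<T>::value; }
```

So a save operator declared as taking `T&` has no way to tell `x` from `x2` at the call site.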

...

...
...
But I've checked and nothing's wrong. So I either have to modify my design -- which I don't want, or add very strange-looking cast.

You have three other options: a) use & operator instead of << b) set the tracking trait to "track_never" c) tweak your code so the trap is never invoked. (hypothetical only)

By the way the const_cast is a good choice for another reason.

const_cast is almost never a good choice for _adding_ constness. Use implicit_cast.
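For readers who have not met it, an implicit_cast can be written in a few lines. The version here is a self-contained sketch in the spirit of boost::implicit_cast rather than a copy of it; `identity` and `Data` are names chosen for the example.

```cpp
#include <cassert>
#include <type_traits>

// Helper that blocks template argument deduction for the parameter, so the
// target type must be spelled out at the call site.
template <class T>
struct identity { typedef T type; };

// Minimal implicit_cast: it only compiles when the conversion is one the
// compiler would perform implicitly anyway.
template <class T>
T implicit_cast(typename identity<T>::type x) { return x; }

struct Data { int value; };
```

Given a non-const `Data d`, `implicit_cast<const Data&>(d)` adds constness, while an attempt to go the other way, such as `implicit_cast<Data&>` applied to a const object, is rejected by the compiler. A const_cast would accept both directions, which is why it is the weaker signal. Since C++17, std::as_const covers the add-const case as well.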

...

...
I'm curious if anyone else is following this thread. It's getting pretty deep in the details of the serialization library.

Yea, looks like nobody cares much.

I'm starting to care. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Robert Ramey

24 Jun

4:30 a.m.

David Abrahams wrote:

...

Vladimir Prus <ghost@cs.msu.su> writes:

...

I know I'm coming in late here,

Yes you are. The original post was some time ago and many of the issues on this subject were discussed.

...

but I have always expected that the library worked this way (actually keeping a map of saved object address to archive identifier is a usual arrangement). If it doesn't do that, I'm very surprised. How else do you do tracking?

Of course it works that way. It turns out that the library is sufficiently smart to skip tracking (and skip instantiating the code for tracking the indicated type) for types which are never serialized through pointers. The mechanism for implementing this is somewhat unusual and it seems may have caused a misapprehension as to what is going on here.

...

Also, if tracking a stack object is being made illegal unless it's const, I find that highly surprising, and I see no relationship between its constness and safety due to lifetime issues in this case.

Tracking a stack object makes no sense. Doing so will result in an archive that cannot be loaded. Tracking an object whose value can change during the lifetime of an archive will also fail when the archive is loaded. By requiring the << operator to take a const argument when the object is of a type that is being tracked, violations of the above rules can often be detected at compile time - thereby saving the programmer days of work looking for a mistake that will only be detected after a failed attempt to load a corrupted archive. (BTW, I'm the one that gets the email when this occurs.)

It's conceivable that there might be a legitimate case where one wants to track an object that is not const during the course of serialization - but no one has presented a credible use case so far. There are cases where the STATIC_ASSERT trips when it's not strictly necessary, but in all of my cases I found it easy to slightly restructure the code to avoid the problem. See Joaquin's previous post on this subject.

In the rare case where one needs to do ar << t where t is not a const and it is inconvenient to alter the code to make it so, one has two options: use a const_cast or use the & operator instead of the << operator. Is this such a price to pay to get traps in usage of the library that is very likely an error? Is it fair that the rest of us have to forego this facility just so one programmer doesn't have to write ar & t instead of ar << t?
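The check being defended can be imitated in a few lines. This is a toy reconstruction from the description in this thread, not the library's code: `toy_oarchive`, `save_allowed` and this `tracking_level` are hypothetical, and the rule encoded is simply "const argument, or a type that is never tracked".

```cpp
#include <cassert>
#include <type_traits>

enum tracking_type { track_never, track_selectively, track_always };

// Toy trait: class types default to track_selectively in this sketch.
template <class T>
struct tracking_level
    : std::integral_constant<tracking_type, track_selectively> {};
template <>
struct tracking_level<int>
    : std::integral_constant<tracking_type, track_never> {};

// The check: saving through << is accepted if the deduced type is const,
// or if the type is never tracked.
template <class T>
struct save_allowed : std::integral_constant<bool,
    (std::is_const<T>::value ||
     tracking_level<typename std::remove_const<T>::type>::value
         == track_never)> {};

struct toy_oarchive {
    int saves;
    toy_oarchive() : saves(0) {}

    // T is deduced as "const X" for const arguments and as "X" otherwise.
    template <class T>
    toy_oarchive& operator<<(T&) {
        static_assert(save_allowed<T>::value,
                      "saving a non-const object of a tracked type");
        ++saves;
        return *this;
    }

    // operator& performs no constness check in this sketch.
    template <class T>
    toy_oarchive& operator&(T&) { ++saves; return *this; }
};

struct X { int value; };
```

With these definitions `ar << x` for a non-const `X x` fails to compile, while `ar << cx` for a const object, `ar << i` for an int, and `ar & x` are all accepted.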

...

I could easily build a small self-referential structure on the stack that I'd like to send to an archive, and non-constness would be essential in such a scenario. Of course I could create const references to each object and serialize those but why make me jump through hoops?

That's not the case here. You might build it on the stack but it would have a type. Normally that type would be serialized by something like save(Archive &ar, const T & t). From then on T is a const and there is no problem and no trap. The problem comes when you do something like

for(... X x = ....() ar << x

where X is a type that may be tracked. I'm sure you can see the problem here if you're trying to track either to recover pointers or eliminate duplication.

...

I'm starting to care.

The whole thing has been blown waaay out of proportion. One little observation. It has come to my attention that const is avoided by some programmers due to its tendency to ripple its effect throughout the code. Personally I think this is a great mistake and the extra pain in the neck caused by this ripple effect is more than compensated by the detection of bugs resulting from side effects. This feature reflects my views on this subject. Robert Ramey

Vladimir Prus

7:37 a.m.

Robert Ramey wrote:

...

Tracking a stack object makes no sense. Doing so will result in an archive that cannot be loaded.

How does it correspond to your previous statement:

...

It's very easy to write

for(...{
    X x = *it; // create a copy of
    ar << x
}

all the x's are at the same address, so if they happen to be tracked because a pointer to some X is serialized somewhere in the program, then the subsequent copies would be suppressed by tracking.

In the example you say that x's can happen to be tracked. In your last email you say that tracking a stack object makes no sense. And all x's are stack objects. - Volodya

David Abrahams

1:19 p.m.

"Robert Ramey" <ramey@rrsd.com> writes:

...

David Abrahams wrote:

...
Vladimir Prus <ghost@cs.msu.su> writes:

...
I know I'm coming in late here,

Yes you are. The original post was some time ago and many of the issues on this subject were discussed.

I'm sure. I tried to go backward in time, but got lost.

...

...
but I have always expected that the library worked this way (actually keeping a map of saved object address to archive identifier is a usual arrangement). If it doesn't do that, I'm very surprised. How else do you do tracking?

Of course it works that way. It turns out that the library is sufficiently smart to skip tracking (and skip instantiating the code for tracking the indicated type) for types which are never serialized through pointers. The mechanism for implementing this is somewhat unusual

I'm pretty sure I do something similar in Boost.Python, so I can imagine it. If what you're doing is legal, it has to involve setting up a function pointer that can be called at runtime to do the tracking... or else my years of banging my head against similar problems were in vain...

...

and it seems may have caused a misapprehension as to what is going on here.

...but the means for doing it is completely independent from the question of whether the behavior is a good idea. :)

...

...
Also, if tracking a stack object is being made illegal unless it's const, I find that highly surprising, and I see no relationship between its constness and safety due to lifetime issues in this case.

Tracking a stack object makes no sense.

Au contraire; it does. It's easy enough to set up a little graph of objects on the stack. I'd like to be able to load that back in and ("obviously") get dynamically allocated objects. Why is that nonsense? I know a computational physicist who wants to serialize very large matrices, which are invariably going to be objects on the stack. Why is that nonsense?

...

doing so will result in an archive that cannot be loaded.

I don't know why that should be true.

...

Tracking an object whose value can change during the lifetime of an archive will also fail when the archive is loaded.

I don't know why that should be true. I can imagine wanting to keep a "was_serialized" mark on some object. There is often data in an object that's part of its real logical state w.r.t. the program (and thus shouldn't be marked mutable), but that shouldn't be serialized. Why does it matter if that data changes? I don't understand why the object's value -- other than its internal pointers and references -- is important to the success of loading an archive.

...

By requiring the << operator to take a const argument when the object is of a type that is being tracked, violations of the above rules can often be detected at compile time - thereby saving the programmer days of work looking for a mistake that will only be detected after a failed attempt to load a corrupted archive. (BTW, I'm the one that gets the email when this occurs.)

Understood. Now you're getting email about the consequences of trying to prevent it. You can't win ;-)

...

It's conceivable that there might be a legitimate case where one wants to track an object that is not const during the course of serialization

I'm pretty sure you mean "is not constant." The problem here, or at least one big problem, is that non-const does not imply non-constant.

...

- but no one has presented a credible use case so far.

On the other hand, credible use cases for serializing non-const objects are common.

...

There are cases where the STATIC_ASSERT trips when it's not strictly necessary, but in all of my cases I found it easy to slightly restructure the code to avoid the problem. See Joaquin's previous post on this subject.

Still looking for the pointer to it.

...

In the rare case where one needs to do ar << t where t is not a const and it is inconvenient to alter the code to make it so, one has two options: use a const_cast or use the & operator instead of the << operator. Is this such a price to pay to get traps in usage of the library that is very likely an error? Is it fair that the rest of us have to forego this facility just so one programmer doesn't have to write ar & t instead of ar << t?

You mean ar & my_non_const_object works? If so, I'm less worried. However, the non-uniformity seems a bit gratuitous, and I think you're setting a bad precedent by equating non-const with "will change," even if that interpretation is overridable.

...

...
I could easily build a small self-referential structure on the stack that I'd like to send to an archive, and non-constness would be essential in such a scenario. Of course I could create const references to each object and serialize those but why make me jump through hoops?

That's not the case here. You might build it on the stack but it would have a type.

? Everything has a type.

...

Normally that type would be serialized by something like save(Archive &ar, const T & t). From then on T is a const and there is no problem and no trap.

No, the problem is that there are other nodes in the structure that t refers to via non-const reference or pointer.

...

The problem comes when you do something like

for(... X x = ....() ar << x

where X is a type that may be tracked. I'm sure you can see the problem here if you're trying to track either to recover pointers or eliminate duplication.

Yes, I see the problem. But

for(... X const x = ....() ar << x

is no less problematic from that point of view.

...

...
I'm starting to care.

The whole thing has been blown waaay out of proportion.

Maybe. These days, I am putting a lot more attention on small details of libraries that I hadn't seen much of before. It isn't personal; I am just trying to keep the overall quality high.

...

One little observation. It has come to my attention that const is avoided by some programmers due to its tendency to ripple its effect throughout the code. Personally I think this is a great mistake and the extra pain in the neck caused by this ripple effect is more than compensated by the detection of bugs resulting from side effects.

I agree, but AFAICT you're giving const (or lack thereof) a meaning for which there is no precedent in C++. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Jeff Garland

25 Jun

1:13 a.m.

On Fri, 24 Jun 2005 09:19:05 -0400, David Abrahams wrote

...

"Robert Ramey" <ramey@rrsd.com> writes:

...
David Abrahams wrote:

...
Vladimir Prus <ghost@cs.msu.su> writes: Also, if tracking a stack object is being made illegal unless it's const, I find that highly surprising, and I see no relationship between its constness and safety due to lifetime issues in this case.

Tracking a stack object makes no sense.

Au contraire; it does. It's easy enough to set up a little graph of objects on the stack. I'd like to be able to load that back in and ("obviously") get dynamically allocated objects. Why is that nonsense?

I know a computational physicist who wants to serialize very large matrices, which are invariably going to be objects on the stack. Why is that nonsense?

It's a valid use case -- the need for stack-based serialization happens all the time with value-based objects. There are all sorts of places in a program where objects are constructed on the stack, initialized, changed, and then perhaps serialized. I might even serialize the same object more than once with modified contents. They can't be const because they are changed or assigned after construction. I really don't want a different syntax to remember. operator<< evokes a good analogy in my brain to remember -- '&' doesn't.

...

...
In the rare case where one needs to do ar << t where t is not a const and it is inconvenient to alter the code to make it so, one has two options: use a const_cast or use the & operator instead of the << operator. Is this such a price to pay to get traps in usage of the library that is very likely an error? Is it fair that the rest of us have to forego this facility just so one programmer doesn't have to write ar & t instead of ar << t?

You mean

ar & my_non_const_object

works? If so, I'm less worried. However, the non-uniformity seems a bit gratuitous, and I think you're setting a bad precedent by equating non-const with "will change," even if that interpretation is overridable.

I'm worried people will get in the 'habit' of casting to use serialization. And in the real world that won't be using fancy C++ casts -- they'll get out the big bad c-cast hammer. And IME once the casting starts it has a way of growing -- programmers see the casts and 'learn from them'.

...

...
...
I'm starting to care.

The whole thing has been blown waaay out of proportion.

Maybe. These days, I am putting a lot more attention on small details of libraries that I hadn't seen much of before. It isn't personal; I am just trying to keep the overall quality high.

I never stopped caring, I just got tired. Since the moment this change went into the library and blew up date-time tests and some of my other programs that were working perfectly I was unhappy. But neither Vladimir nor myself have been able to convince Robert that this change is ill advised -- so I just stopped and modified my stuff. I ended up adding something to the date-time docs so that we can demonstrate stack-based serialization:

NOTE: due to a change in the serialization library interface, it is now required that all streamable objects be const prior to writing to the archive. The following template function will allow for this (and is used in the date_time tests). At this time no special steps are necessary to read from an archive.

template<class archive_type, class temporal_type>
void save_to(archive_type& ar, const temporal_type& tt)
{
    ar << tt;
}

Feels/looks like an ugly workaround to me...

mr-blow-it-all-out-of-proportion-yours ;-)

Jeff

David Abrahams

3:57 a.m.

"Jeff Garland" <jeff@crystalclearsoftware.com> writes:

...

On Fri, 24 Jun 2005 09:19:05 -0400, David Abrahams wrote

...
You mean

ar & my_non_const_object

works? If so, I'm less worried. However, the non-uniformity seems a bit gratuitous, and I think you're setting a bad precedent by equating non-const with "will change," even if that interpretation is overridable.

I'm worried people will get in the 'habit' of casting to use serialization. And in the real world that won't be using fancy C++ casts -- they'll get out the big bad c-cast hammer. And IME once the casting starts it has a way of growing -- programmers see the casts and 'learn from them'.

Yeah, with all due respect to the author -- who has designed a library that's by all accounts very satisfying -- this design choice is just all wrong. It doesn't detect what it purports to, and gives plenty of false positives. Because it's a compile-time check people will get used to doing what is required to subvert it. It's well known that error reports that are commonly wrong are worse than no report at all. The hashing idea is a lot closer to the mark.

...

...
...
...
I'm starting to care.

The whole thing has been blown waaay out of proportion.

Maybe. These days, I am putting a lot more attention on small details of libraries that I hadn't seen much of before. It isn't personal; I am just trying to keep the overall quality high.

I never stopped caring, I just got tired. Since the moment this change went into the library and blew up date-time tests and some of my other programs that were working perfectly I was unhappy. But neither Vladimir nor myself have been able to convince Robert that this change is ill advised

I've noticed a similar dynamic with a few other Boost libraries recently. It would be so much easier if everyone would just do it my way :^)

...

-- so I just stopped and modified my stuff. I ended up adding something to the date-time docs so that we can demonstrate stack-based serialization:

NOTE: due to a change in the serialization library interface, it is now required that all streamable objects be const prior to writing to the archive.

Actually that's not even accurate. Forming a const reference to an object doesn't make the object const. The fact that this design makes you write something hard to explain might be a clue that it isn't helping.
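The distinction can be shown in isolation. In this sketch `temporal` and `read_ticks` are placeholder names; the function takes its argument by const reference in the same way as the save_to helper quoted in this thread.

```cpp
#include <cassert>

struct temporal { int ticks; };

// Takes its argument by const reference. That promises only that this
// function will not modify tt through the reference; the object it refers
// to may still be non-const and may change elsewhere.
inline int read_ticks(const temporal& tt) { return tt.ticks; }
```

If a non-const `temporal t` is modified after a `const temporal&` has been bound to it, the reference observes the new value. Passing through a const reference therefore satisfies a compile-time constness check without guaranteeing that the object stays unchanged.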

...

The following template function will allow for this (and is used in the date_time tests). At this time no special steps are necessary to read from an archive.

template<class archive_type, class temporal_type> void save_to(archive_type& ar, const temporal_type& tt) { ar << tt; }

Feels/looks like an ugly workaround to me...

Yah. Just another pointless hoop to jump through. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Robert Ramey

6:11 a.m.

David Abrahams wrote:

...

"Jeff Garland" <jeff@crystalclearsoftware.com> writes:

...
On Fri, 24 Jun 2005 09:19:05 -0400, David Abrahams wrote

...
You mean

ar & my_non_const_object

works? If so, I'm less worried. However, the non-uniformity seems a bit gratuitous, and I think you're setting a bad precedent by equating non-const with "will change," even if that interpretation is overridable.

The non-uniformity is really a feature of the & operator itself. It's used for both loading and saving. The operations are not quite as symmetric as they appear at first glance.

...

...
I'm worried people will get in the 'habit' of casting to use serialization. And in the real world that won't be using fancy C++ casts -- they'll get out the big bad c-cast hammer. And IME once the casting starts it has a way of growing -- programmers see the casts and 'learn from them'.

...

Yeah, with all due respect to the author -- who has designed a library that's by all accounts very satisfying -- this design choice is just all wrong. It doesn't detect what it purports to, and gives plenty of false positives. Because it's a compile-time check people will get used to doing what is required to subvert it.

This is very speculative. Some people may realize the error of their ways and correct their practices - who can say?

...

It's well known that error reports that are commonly wrong are worse than no report at all.

First I don't think the error report is commonly wrong. Second, it will be almost impossible for the person who commits such an error to find it if/when it occurs.

...

The hashing idea is a lot closer to the mark.

Not in my view

...

...
...
...
...
I'm starting to care.

The whole thing has been blown waaay out of proportion.

Maybe. These days, I am putting a lot more attention on small details of libraries that I hadn't seen much of before. It isn't personal; I am just trying to keep the overall quality high.

I never stopped caring, I just got tired. Since the moment this change went into the library and blew up date-time tests and some of my other programs that were working perfectly I was unhappy. But neither Vladimir nor myself have been able to convince Robert that this change is ill advised

I've noticed a similar dynamic with a few other Boost libraries recently.

There might be an explanation for that. I get feedback from people with questions about using the library. Much of it is via private email. This is quite a different group than those that inhabit this list. On the list I get a lot of concerns about usages of the library in ways I never imagined. Also many of the commenters on the list have strong opinions driven by their particular application. I feel I have to be concerned about correctness, transparency, efficiency, portability, long compile times, etc. Many times someone is disappointed I don't include this or that change because it conflicts with one of these aspects which is unimportant to them. So everyone is disappointed at least to some extent. This is the root cause that larger libraries that lots of people have an interest in (e.g. units/dimensions) can't make progress in boost. (Serialization would fall into the category except for my particular personality features.)

...

...
-- so I just stopped and modified my stuff. I ended up adding something to the date-time docs so that we can demonstrate stack-based serialization:

Hmm I'll take a look.

...

...
NOTE: due to a change in the serialization library interface, it is now required that all streamable objects be const prior to writing to the archive.

I'm not sure what the above means. What is true is that the state of an object should not be changed during the process of serialization. That is the statement made by save(Archive &ar, const T & t).

...

Actually that's not even accurate. Forming a const reference to an object doesn't make the object const. The fact that this design makes you write something hard to explain might be a clue that it isn't helping.

The problem you had was the same as mine when writing the tests. The typical code was

A a, a1;

text_oarchive oa(..); oa << a; ... text_iarchive ia(..); ia >> a1;

BOOST_CHECK(a == a1)

Of course this was easy to fix (just make it const A a) but other cases required a little more work - but never very much.

But I argue, as Joaquin did, that the above situation is not at all typical and that typically the situation almost never comes up. Of course that is speculative on my part. But so far I've only gotten objections from library testers.

Vladimir came up with a bunch of cases but they seemed very atypical to me. They were at best usages of the library in ways that have never occurred to me so it was hard for me to be convinced.

That's why I say I believe it's been blown waaaay out of proportion.

...

...
The following template function will allow for this (and is used in the date_time tests). At this time no special steps are necessary to read from an archive.

template<class archive_type, class temporal_type> void save_to(archive_type& ar, const temporal_type& tt) { ar << tt; }

Feels/looks like an ugly workaround to me...

typically that's the way it gets used anyway. Which is exactly my point.

...

Yah. Just another pointless hoop to jump through.

Yeah - just like the whole "const" business in the first place?

For what it's worth there are a couple of incidental aspects of this situation that might be interesting.

a) This whole thing is implemented by one STATIC_ASSERT in oserializer.hpp which can be commented out

b) I considered using STATIC_WARNING but I could never really quite get STATIC_WARNING to function totally to my taste.

c) After I ran this with my own tests, and being forced to think about each instance, I concluded that my code was more fragile than I had thought and became more convinced that it was a good idea.

d) Also having to find my own errors in serialization of stl containers convinced me it was a good idea.

I would like to wait and see how this works for a while. So far I've heard from

myself - library writer
Jeff - wrote serialization test for date/time
Joaquin - wrote serialization for multi-index - a tricky job
Vladimir - using it in real code.
Dave - not used the library in any way

As far as I'm concerned we really have only one data point (Vladimir) which really addresses what the discussion is about. So let's relax a bit and let some more experience trickle in for a while and see what happens.

Robert Ramey

Peter Dimov

11:16 a.m.

Robert Ramey wrote:

...

David Abrahams wrote:

...
"Jeff Garland" <jeff@crystalclearsoftware.com> writes:

[...]

...

...
The hashing idea is a lot closer to the mark.

Not in my view

I find that very odd, since you later go on to say that this statement:

...

...
...
NOTE: due to a change in the serialization library interface, it is now required that all streamable objects be const prior to writing to the archive.

is inaccurate (which it isn't), and explain that:

...

What is true is that the state of an object should not be changed during the process of serialization. That is the statement made by save(Archive &ar, const T & t).

The hash check does exactly that, verify that the state of the object hasn't changed. Anyway; have you considered making the default tracking level to only track pointer saves but not value saves? This would instantly marginalize the problem. Mixing object saves and pointer saves is fairly unreliable since it's order dependent (and will create problems later on if the object is versioned), but if one wants to use the library in such a way, the option to enable full tracking would still be there (and get the full const treatment).

David Abrahams

12:23 p.m.

"Peter Dimov" <pdimov@mmltd.net> writes:

...

Robert Ramey wrote:

...
David Abrahams wrote:

...
"Jeff Garland" <jeff@crystalclearsoftware.com> writes:

[...]

...
...
The hashing idea is a lot closer to the mark.

Not in my view

I find that very odd, since you later go on to say that this statement:

...
...
...
NOTE: due to a change in the serialization library interface, it is now required that all streamable objects be const prior to writing to the archive.

is inaccurate (which it isn't)

Actually, _I_ said it was inaccurate. You can stream a non-const object just as long as you do it through a const pointer or reference. How weird is that? What other operations do we have in C++ that work on const lvalues and *not* on non-const ones? This is utterly unprecedented and it subverts the fundamental C++ relationship that a non-const object is-a const object.

...

and explain that:

...
What is true is that the state of an object should not be changed during the process of serialization. That is the statement made by save(Archive &ar, const T & t).

The hash check does exactly that, verify that the state of the object hasn't changed.

Exactly, and the const check does not check for that at all. If you extend the logic used to support the const check, a std::list would have to be const in order to call .sort() on it. That's right: it's an error for the comparison operator used to change any of the list's elements in a way that would affect the sort, just like it's an error for any serialization operator to change the objects being serialized in a way that would affect serialization. Can you imagine a sort() member function being const?

...

Anyway; have you considered making the default tracking level to only track pointer saves but not value saves? This would instantly marginalize the problem.

Yeah, that was another option I was going to mention, but I didn't want to further confuse things.

...

Mixing object saves and pointer saves is fairly unreliable since it's order dependent (and will create problems later on if the object is versioned), but if one wants to use the library in such a way, the option to enable full tracking would still be there (and get the full const treatment).

-- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams

12:46 p.m.

"Robert Ramey" <ramey@rrsd.com> writes:

...

David Abrahams wrote:

...
"Jeff Garland" <jeff@crystalclearsoftware.com> writes:

...
On Fri, 24 Jun 2005 09:19:05 -0400, David Abrahams wrote

Yeah, with all due respect to the author -- who has designed a library that's by all accounts very satisfying -- this design choice is just all wrong. It doesn't detect what it purports to, and gives plenty of false positives. Because it's a compile-time check people will get used to doing what is required to subvert it.

This is very speculative.

No, it is based on well-earned experience and observation. This is what happened in Java with exception specifications and it's what happens when people get compiler warnings that are often wrong.

...

Some people may realize the error of their ways and correct their practices - who can say?

Most people with experience in writing C++ (and probably a few others) will realize that their ways are not fundamentally erroneous, as several of the more experienced people on this list have. It's normal that anything you can do with a const lvalue can also be done with a non-const one.

...

...
It's well known that error reports that are commonly wrong are worse than no report at all.

First I don't think the error report is commonly wrong. Second, it will be almost impossible for the person who commits such an error to find it if/when it occurs.

Great, then use a detection mechanism that will actually catch the error. Hashing does that. In fact it will catch instances that refusing to serialize non-const lvalues will not. The for loop you showed in an earlier post is a prime example.
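A minimal sketch of the hashing idea, for illustration only. The `hashing_tracker` class, its policy, and the helper names are assumptions made up for this example; nothing here is taken from the serialization library. It records a hash the first time an address is saved and complains if the same address later turns up with different contents, which is the case a const-ness check cannot see once the object is viewed through a const reference.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <stdexcept>

// Hypothetical tracker: remembers a hash of each object by address.
class hashing_tracker {
public:
    // Returns true if the object must be written, false if this address was
    // already saved with the same contents. Throws if the contents changed.
    template <class T>
    bool on_save(const T& t) {
        const void* addr = &t;
        const std::size_t h = std::hash<T>()(t);
        auto it = seen_.find(addr);
        if (it == seen_.end()) {
            seen_.emplace(addr, h);
            return true;
        }
        if (it->second != h)
            throw std::logic_error("object changed after it was first saved");
        return false;
    }

private:
    std::map<const void*, std::size_t> seen_;
};

// Mimics a wrapper such as save_to: the object is const in here, so a
// const-ness check is satisfied whatever the caller does with it.
template <class T>
bool save_through_const_ref(hashing_tracker& tracker, const T& t) {
    return tracker.on_save(t);
}

// One variable, reused for every value, saved once per loop pass.
bool loop_is_caught() {
    hashing_tracker tracker;
    int slot = 0;
    try {
        for (int value = 1; value <= 2; ++value) {
            slot = value;
            const bool written = save_through_const_ref(tracker, slot);
            assert(written);  // only reached on the first pass
        }
    } catch (const std::logic_error&) {
        return true;  // second pass: same address, new contents
    }
    return false;
}
```

The cost is one hash per save, and a hash collision could in principle hide a change, so this is a debugging aid rather than a proof.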

...

...
...
I never stopped caring, I just got tired. Since the moment this change went into the library and blew up date-time tests and some of my other programs that were working perfectly I was unhappy. But neither Vladimir nor myself have been able to convince Robert that this change is ill advised.

I've noticed a similar dynamic with a few other Boost libraries recently.

There might be an explanation for that. I get feedback from people with questions about using the library. Much of it is via private email. This is quite a different group than those that inhabit this list. On the list I get a lot of concerns about usages of the library in ways I never imagined. Also many of the commenters on the list have strong opinions driven by their particular application. I feel I have to be concerned about correctness, transparency, efficiency, portability, long compile times, etc. Many times someone is disappointed I don't include this or that change because it conflicts with one of these aspects which is unimportant to them. So everyone is disappointed at least to some extent. This is the root cause that larger libraries that lots of people have an interest in (e.g. units/dimensions) can't make progress in boost. (Serialization would fall into the category except for my particular personality features.)

Please don't trivialize the concerns of those experienced library developers who are upset about this. This is not about personal disappointment and desires; it's a fundamentally deeper issue about consistency with the way users expect C++ to work.

...

What is true is that the state of an object should not be changed during the process of serialization. That is the statement made by save(Archive &ar, const T & t).

No, the statement made is that *save* itself promises not to change t.

...

The problem you had was the same as mine when writing the tests. The typical code was

A a, a1;

text_oarchive oa(..); oa << a; ... text_iarchive ia(..); ia >> a1;

BOOST_CHECK(a == a1)

Of course this was easy to fix (just make it const A a) but other cases required a little more work - but never very much.

But I argue, as Joaquin did, that the above situation is not at all typical and that typically the situation almost never comes up. Of course that is speculative on my part. But so far I've only gotten objections from library testers.

I am not even testing a library. I am just looking at this from the point of view of language and library consistency. I think this particular design choice sets a very bad precedent, and people look to Boost as an example of good C++ coding style.

...

Vladimir came up with a bunch of cases but they seemed very atypical to me. They were at best usages of the library in ways that have never occurred to me so it was hard for me to be convinced.

That's why I say I believe it's been blown waaaay out of proportion.

I think I care about this for reasons you haven't considered. Yes, it's "easy to work around," but it doesn't make sense. That's more important to me than the fact that it's a little inconvenient.

...

...
...
The following template function will allow for this (and is used in the date_time tests). At this time no special steps are necessary to read from an archive.

template<class archive_type, class temporal_type> void save_to(archive_type& ar, const temporal_type& tt) { ar << tt; }

Feels/looks like an ugly workaround to me...

typically that's the way it gets used anyway. Which is exactly my point.

...
Yah. Just another pointless hoop to jump through.

Yeah - just like the whole "const" business in the first place?

No. The "const" business catches real problems and provides a framework for reasoning about code... as long as it's used consistently. Making an operation legal only for const objects is inconsistent with the way "const" works in C++.

...

For what it's worth there are a couple of incidental aspects of this situation that might be interesting.

a) This whole thing is implemented by one STATIC_ASSERT in oserializer.hpp which can be commented out

b) I considered using STATIC_WARNING but I could never really quite get STATIC_WARNING to function totally to my taste.

c) After I ran this with my own tests, and being forced to think about each instance, I concluded that my code was more fragile than I had thought and became more convinced that it was a good idea.

If you think some checks are needed, hashing provides the right ones.

...

d) Also having to find my own errors in serialization of stl containers convinced me it was a good idea.

Can you show all the code you wrote that stopped compiling after the introduction of this check? Can you also describe the thought process you used to decide which examples were erroneous and which should be silenced by the addition of a const?

...

I would like to wait and see how this works for a while. So far I've heard from

myself - library writer
Jeff - wrote serialization test for date/time
Joaquin - wrote serialization for multi-index - a tricky job
Vladimir - using it in real code.
Dave - not used the library in any way

As far as I'm concerned we really have only one data point (Vladimir) which really addresses what the discussion is about. So let's relax a bit and let some more experience trickle in for a while and see what happens.

I'd like to convince you not to let this one out into the world, though :) -- Dave Abrahams Boost Consulting www.boost-consulting.com

Jeff Garland

4:21 p.m.

On Fri, 24 Jun 2005 23:11:31 -0700, Robert Ramey wrote

...

Hmm I'll take a look.

...
...
NOTE: due to a change in the serialization library interface, it is now required that all streamable objects be const prior to writing to the archive.

I'm not sure what the above means.

The object has to be const in the current context to stream it.

...

What is true is that the state of an object should not be changed during the process of serialization. That is the statement made by save(Archive &ar, const T & t).

It's more than that. You can always pass a non-const to a const method and expect it to return unchanged. You can't currently serialize an object in a non-const context -- hence the workaround.
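To make the "non-const context" point concrete, here is a small sketch under stated assumptions. `toy_oarchive` is a made-up stand-in that imitates the compile-time check with a plain `static_assert`; it is not how oserializer.hpp is written. `save_to` is the wrapper from the date_time docs quoted in this thread.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical stand-in for an output archive; not Boost's implementation.
class toy_oarchive {
public:
    explicit toy_oarchive(std::ostream& os) : os_(os) {}

    // T is deduced as "const X" for const lvalues and as "X" for non-const
    // ones, so the assertion rejects exactly the non-const case.
    template <class T>
    toy_oarchive& operator<<(T& t) {
        static_assert(std::is_const<T>::value,
                      "toy_oarchive: object must be const in this context");
        os_ << t << ' ';
        return *this;
    }

private:
    std::ostream& os_;
};

// The wrapper quoted in the thread: binding to "const T&" makes the object
// const inside save_to, whatever it was at the call site.
template <class archive_type, class temporal_type>
void save_to(archive_type& ar, const temporal_type& tt) {
    ar << tt;
}

std::string demo() {
    std::ostringstream os;
    toy_oarchive oa(os);

    int built_on_stack = 41;
    built_on_stack += 1;          // the object is modified while being built
    assert(built_on_stack == 42);

    // oa << built_on_stack;      // would trip the static_assert
    save_to(oa, built_on_stack);  // compiles: seen through a const reference

    const std::string label = "done";
    oa << label;                  // compiles: already const here
    return os.str();
}
```

Note that the wrapper changes nothing about the object itself; it only changes how the archive sees it, which is the inconsistency being argued about.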

...

...
Actually that's not even accurate. Forming a const reference to an object doesn't make the object const. The fact that this design makes you write something hard to explain might be a clue that it isn't helping.

The problem you had was the same as mine when writing the tests. The typical code was

A a, a1;

text_oarchive oa(..); oa << a; ... text_iarchive ia(..); ia >> a1;

BOOST_CHECK(a == a1)

Of course this was easy to fix (just make it const A a) but other cases required a little more work - but never very much.

Again -- we're arguing over a trivial testcase where you might be able to easily create a const object. Just because this little testcase can easily make A const doesn't equate to whether I want to do that in a 'real program'. I've written lots of programs that build up objects on the stack and then write them out to a file. Now I'm forced to put a function wrapper around things to use serialization. Sorry if this isn't a 'real enough' use case for you, but I thought that was one of Vladimir's points as well.

...

But I argue, as Joaquin did, that the above situation is not at all typical and that typically the situation almost never comes up. Of course that is speculative on my part. But so far I've only gotten objections from library testers.

The way you wrote this it appears that you value testers' opinions less -- I think you should reconsider because a bunch of the library testers are also boost 'power users'.

...

Vladimir came up with a bunch of cases but they seemed very atypical to me. They were at best usages of the library in ways that have never occurred to me so it was hard for me to be convinced.

I think his cases are valid and I said so.

...

That's why I say I believe it's been blown waaaay out of proportion.

That's your view and you're sticking to it. I had accepted it and moved on figuring that at some point after the 1.33 release the wave of user complaints would finally force you to reconsider.

...

Yeah - just like the whole "const" business in the first place?

For what it's worth there are a couple of incidental aspects of this situation that might be interesting.

a) This whole thing is implemented by one STATIC_ASSERT in oserializer.hpp which can be commented out

I'd put that in the FAQ right now and make a compile option to turn it off -- that would allay my concerns and will make it trivial for all of us to answer the user questions. You're not the only one that's going to have to deal with this.

...

...snip...

I would like to wait and see how this works for a while. So far I've heard from

myself - library writer Jeff - wrote serialization test for date/time

You assume a lot about things you don't know. I have other programs that have nothing to do with date-time that depend on serialization that were broken by this change. Perhaps that wasn't clear in my other correspondence.

...

Joaquin - wrote serialization for multi-index - a tricky job
Vladimir - using it in real code.
Dave - not used the library in any way

As far as I'm concerned we really have only one data point (Vladimir) which really addresses what the discussion is about. So let's relax a bit and let some more experience trickle in for a while and see what happens.

I'm sorry you've dismissed the rest of us as heretics :-( I'm done with this thread now for good -- you've known my view for a few months and chosen to dismiss it -- that's your right. As for the experience part, only a very small number of people use CVS -- most use the releases. This much discussion of a library change prior to its release is highly unusual -- I expect more of a flood than a trickle once the release occurs, people start upgrading, and programs start breaking. And that's why I think this IS a big deal... Jeff

Robert Ramey

5:54 p.m.

...

I'm sorry you've dismissed the rest of us as heretics :-( I'm done with this thread now for good -- you've known my view for a few months and chosen to dismiss it -- that's your right.

I think that's a little unfair. I may disagree with your view, but I never dismissed it. I've spent a lot of time addressing it - even though it's been raised as an issue only by two people who have actually used the library.

...

As for the experience part, only a very small number of people use CVS -- most use the releases. This much discussion of a library change prior to its release is highly unusual -- I expect more of a flood than a trickle once the release occurs, people start upgrading, and programs start breaking. And that's why I think this IS a big deal...

Of course that's really the crux of the issue. What will users experience when this occurs? I concede I can't know for sure in advance. I don't think anyone can know this. In fact, I've been surprised by a lot of the things that I see users doing. Examples are:

ar << *this; // I can't imagine what this is for

ar << f(...); // function returning a value on the stack to be serialized

I'm not saying they are necessarily bad, but I do find them surprising and, to me, inexplicable. I've become very skeptical of my ability to predict how other programmers are going to address things. I'm willing to wait and see.

So I'm anxious to see where this goes. If in fact we do get a wave of users with problems, I'll be forced to modify my point of view in the face of the facts. If (almost) no one complains (perhaps unlikely) then there will be no issue. If there are a fair number of complaints - but it turns out that a significant portion are due to real mistakes - we might be in for another debate. But given more data, I would expect it to be of a different character. If it turns out (almost) none of the cases are real problems, then I'll have to accept as a fact that it's a bad idea.

So I'm optimistic and not at all hyped up.

Robert Ramey

Jeff Garland

7:01 p.m.

On Sat, 25 Jun 2005 10:54:26 -0700, Robert Ramey wrote

...

...
I'm sorry you've dismissed the rest of us as heretics :-( I'm done with this thread now for good -- you've known my view for a few months and chosen to dismiss it -- that's your right.

I think that's a little unfair. I may disagree with your view, but I never dismissed it. I've spent a lot of time addressing it - even though it's been raised as an issue only by two people who have actually used the library.

Perhaps -- it's this statement:

...

As far as I'm concerned we really have only one data point (Vladimir) which really addresses what the discussion is about.

that makes me think that we are being dismissed. As an example, dismissing Dave's view (which I agree with and expresses my instinct of why this 'felt wrong' better than I could), even though he hasn't used the library is just wrong. His knowledge of C++ is second to none and he has an extremely high standard for correctness and consistency -- exactly what we have all come to expect from boost libraries. Anyway I retract the heretic part -- last thing I want is for boost to become uncivil like the rest of the world that swirls around us daily -- apologies. Of course, I remain unsatisfied with your answers to our concerns ;-) Jeff

Peter Dimov

7:04 p.m.

Robert Ramey wrote:

...

ar << f(...); // function returning a value on the stack to be serialized

I'm not saying they are necessarily bad, but I do find them surprising and, to me, inexplicable.

This is pretty straightforward value-based serialization. It looks inexplicable to you because the library uses an identity-based approach by default, where the address of the object is important.

There are cases where you only care about the value being saved, not its identity. You want '4' in the archive, you don't care which specific '4'.

One situation where this happens often is when you have a specific file format that you need to support. For example, application A uses the serialization library to read/write objects of some type X. Application B has to be compatible with A's file format, but its internal data structures do not resemble X at all. So B does the following: on save, construct a temporary X from its state and serialize it; on load, deserialize a temporary X and construct its state to match. IOW:

ar << construct_X_from( *this );

and

X x; ar >> x; construct_from_X( *this, x );

You have repeatedly said that such cases are atypical, but this is not so. File formats are important; old data is important.
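The save/load pattern above can be sketched end to end as follows. Everything except the two function names `construct_X_from` and `construct_from_X` is an assumption made up for illustration: the fields of `X`, the `app_b_state` struct, and the toy archives, which stand in for real archive classes and simply write whitespace-separated integers.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical file-format type shared by applications A and B.
struct X {
    int width;
    int height;
};

// Hypothetical internal state of application B, which does not resemble X.
struct app_b_state {
    int area;
    int width;
};

X construct_X_from(const app_b_state& s) {
    assert(s.width != 0);  // precondition of this toy conversion
    return X{s.width, s.area / s.width};
}

void construct_from_X(app_b_state& s, const X& x) {
    s.width = x.width;
    s.area = x.width * x.height;
}

// Toy text archives (assumptions, not Boost classes). operator<< takes
// "const X&", so it accepts a temporary.
struct toy_oarchive {
    std::ostringstream os;
    toy_oarchive& operator<<(const X& x) {
        os << x.width << ' ' << x.height << ' ';
        return *this;
    }
};

struct toy_iarchive {
    std::istringstream is;
    explicit toy_iarchive(const std::string& data) : is(data) {}
    toy_iarchive& operator>>(X& x) {
        is >> x.width >> x.height;
        return *this;
    }
};

app_b_state round_trip(const app_b_state& original) {
    toy_oarchive oa;
    oa << construct_X_from(original);  // save: temporary X, value only

    toy_iarchive ia(oa.os.str());
    X x{};
    ia >> x;                           // load: a temporary X ...
    app_b_state restored{};
    construct_from_X(restored, x);     // ... then rebuild B's own state
    return restored;
}
```

Only the value of the temporary matters here; its address is meaningless once the expression ends, which is why identity-based tracking has nothing useful to say about it.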

Robert Ramey

8:34 p.m.

Peter Dimov wrote:

...

Robert Ramey wrote:

...
ar << f(...); // function returning a value on the stack to be serialized

I'm not saying they are necessarily bad, but I do find them surprising and, to me, inexplicable.

This is pretty straightforward value-based serialization. It looks inexplicable to you because the library uses an identity-based approach by default, where the address of the object is important.

No argument there.

...

There are cases where you only care about the value being saved, not its identity. You want '4' in the archive, you don't care which specific '4'.

That's why the STATIC_ASSERT doesn't trap if the type is marked as "track_never". For such items where you only care about the value the "track_never" trait should be set for efficiency reasons if nothing else. The case in my mind was something like pixels in a photograph. So I would amplify my statement above that I can't understand why someone would use the above on a type that might track the address. Sorry that wasn't more obvious. Of course it raises the question as to what the inverse of the operation is. It can't be ar >> f(..) unless f(..) returns a reference - but then ... etc. As I said, when I see something like that I can't discern the programmer's intention and it certainly seems that it might be something that I didn't provide for when I wrote the library.

...

One situation where this happens often is when you have a specific file format that you need to support. For example, application A uses the serialization library to read/write objects of some type X. Application B has to be compatible with A's file format, but its internal data structures do not resemble X at all. So B does the following: on save, construct a temporary X from its state and serialize it; on load, deserialize a temporary X and construct its state to match. IOW:

ar << construct_X_from( *this );

and

X x; ar >> x; construct_from_X( *this, x );

You have repeatedly said that such cases are atypical, but this is not so. File formats are important; old data is important.

Well, I think they are atypical. This particular situation has indeed come up. That is, there have been attempts to use the library to match a particular external format but these have not been successful as it conflicts with another fundamental feature of the serialization system. That being that the data is driven by the C++ data structures. I concede that your example above would trap but in fact, type construct_X_from should in this case be marked as "track_never". It can never make sense to try to track this type in this context and marking it as "track_never" will be much more efficient to serialize/de-serialize as well. This is exactly the thing I'm thinking about. Without the trap, the user would have no clue that he is doing something he really doesn't intend to do. Of course he might say "I want to track it anyway" or he might say - well maybe in this example but ... Robert Ramey

Peter Dimov

9:35 p.m.

Robert Ramey wrote:

...

Peter Dimov wrote:

...
Robert Ramey wrote:

There are cases where you only care about the value being saved, not its identity. You want '4' in the archive, you don't care which specific '4'.

That's why the STATIC_ASSERT doesn't trap if the type is marked as "track_never". For such items where you only care about the value the "track_never" trait should be set for efficiency reasons if nothing else. The case in my mind was something like pixels in a photograph.

So I would amplify my statement above that I can't understand why someone would use the above on a type that might track the address.

For several reasons, the most important one being that tracking is on by default.

But more fundamentally, address tracking is not a property of the type. It is quite possible that the same type X is used in one serialization session as non-trackable and in another as trackable. The author of X doesn't know how the type will be used, so he can't mark it as track_never. It becomes the responsibility of the person who actually serializes X to mark it appropriately (and it's not possible to supply different tracking levels for two parts of the same program AFAICS.)

Most of these users of both the serialization library and X are very likely to not mess with the default tracking level; even if they do, they might forget about a type or two; and sometimes they'll never see the X being serialized because it might only be part of another object Y.

I think that the default behavior on a sequence of two saves with the same address should be to write the two objects, as if address tracking isn't on. If later a pointer to that address is saved, an exception should occur. Or more generally,

- one value save, then N pointer saves sharing the same address should be OK;

- M pointer saves sharing the same address should be OK; (*)

- K value saves sharing the same address should be OK and result in K copies in the archive;

- all other sequences raise an exception at first opportunity.

Is there any reasonable use case which is prohibited by the above rules?

(*) This promotes questionable coding practices but is consistent with the current behavior. :-)
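The four rules above can be read as a small state machine per address. The sketch below is one possible reading, not library code: `address_rules`, `save_kind`, `action`, and `accepted` are names invented for this example, and "write_reference" stands for whatever the archive emits when it refers back to an object already written.

```cpp
#include <cassert>
#include <initializer_list>
#include <map>
#include <stdexcept>

enum class save_kind { value, pointer };
enum class action { write_object, write_reference };

// Hypothetical per-address bookkeeping for the proposed rules.
class address_rules {
    enum class state { unseen, one_value, many_values,
                       pointers_only, value_then_pointers };
    std::map<const void*, state> states_;

public:
    action on_save(const void* addr, save_kind kind) {
        assert(addr != nullptr);
        state& s = states_[addr];  // new entries start as state::unseen
        switch (s) {
        case state::unseen:
            s = (kind == save_kind::value) ? state::one_value
                                           : state::pointers_only;
            return action::write_object;
        case state::one_value:
            if (kind == save_kind::value) {
                s = state::many_values;       // K value saves: K copies
                return action::write_object;
            }
            s = state::value_then_pointers;   // one value, then pointers
            return action::write_reference;
        case state::many_values:
            if (kind == save_kind::value) return action::write_object;
            break;
        case state::pointers_only:
        case state::value_then_pointers:
            if (kind == save_kind::pointer) return action::write_reference;
            break;
        }
        throw std::logic_error("unsupported mix of value and pointer saves");
    }
};

// Replays a sequence of saves against one address; true if all are accepted.
bool accepted(std::initializer_list<save_kind> sequence) {
    address_rules rules;
    int object = 0;
    try {
        for (save_kind k : sequence) rules.on_save(&object, k);
    } catch (const std::logic_error&) {
        return false;
    }
    return true;
}
```

Under this reading, value/pointer/pointer and value/value/value are accepted, while value/value/pointer and pointer/value are rejected at the offending save.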

Robert Ramey

26 Jun

3:59 a.m.

Peter Dimov wrote:

...

Robert Ramey wrote:

...
Peter Dimov wrote:

...
Robert Ramey wrote:

There are cases where you only care about the value being saved, not its identity. You want '4' in the archive, you don't care which specific '4'.

That's why the STATIC_ASSERT doesn't trap if the type is marked as "track_never". For such items where you only care about the value the "track_never" trait should be set for efficiency reasons if nothing else. The case in my mind was something like pixels in a photograph.

So I would amplify my statement above that I can't understand why someone would use the above on a type that might track the address.

For several reasons, the most important one being that tracking is on by default.

That's why I like the trapping behavior. It will indicate that the default - track_selectively - is probably not appropriate for this case.

BTW - I thought a little more about your example:

#include <fstream>
#include <boost/archive/text_oarchive.hpp>
#include <boost/static_assert.hpp>

class X { };

X construct_X_from(int i);

int main(){
    std::ofstream os("file");
    boost::archive::text_oarchive oa(os);
    int i;
    // the following traps with vc 7.1
    // fails to compile with gcc 3.3 and borland
    // compiles - doesn't trap with comeau - I believe that it traps
    // but BOOST_STATIC_ASSERT doesn't work for comeau in this case
    oa << construct_X_from(i);
}

sure enough, it fails to compile on at least a couple of compilers. Apparently those compilers don't like taking a reference to something on the argument stack. So with these compilers the whole issue of the trap never presents itself. This behavior seems correct to me. I'm not sure why the others compile it.

...

But more fundamentally, address tracking is not a property of the type.

Maybe - maybe not. I would say that's a topic for another time.

...

It is quite possible that the same type X is used in one serialization session as non-trackable and in another as trackable. The author of X doesn't know how the type will be used, so he can't mark it as track_never. It becomes the responsibility of the person who actually serializes X to mark it appropriately

LOL - I'm sure that marking every usage of << as to whether it should be tracked or not would be very popular. In fact the default is almost always what one wants. The trap is to highlight likely cases where it may not be.

...

Most of these users of both the serialization library and X are very likely to not mess with the default tracking level; even if they do, they might forget about a type or two; and sometimes they'll never see the X being serialized because it might only be part of another object Y.

again - generally the defaults are what most people would want. No one has raised any issue regarding the defaults until now.

...

I think that the default behavior on a sequence of two saves with the same address should be to write the two objects, as if address tracking isn't on. If later a pointer to that address is saved, an exception should occur. Or more generally,

- one value save, then N pointer saves sharing the same address should be OK;

- M pointer saves sharing the same address should be OK; (*)

- K value saves sharing the same address should be OK and result in K copies in the archive;

Currently you'll only get one copy for this case unless you suppress tracking.

...

- all other sequences raise an exception at first opportunity.

Is there any reasonable use case which is prohibited by the above rules?

(*) This promotes questionable coding practices but is consistent with the current behavior. :-)

Except as noted above, that's how it works now. The question isn't really the defaults. But whether my attempts to trap likely cases where they should be overridden are misguided. Robert Ramey

Peter Dimov

9:15 a.m.

Robert Ramey wrote:

...

Peter Dimov wrote:

...
Robert Ramey wrote:

...
Peter Dimov wrote:

...

...
...
So I would amplify my statement above that I can't understand why someone would use the above on a type that might track the address.

For several reasons, the most important one being that tracking is on by default.

That's why I like the trapping behavior. It will indicate that the default - track_selectively - is probably not appropriate for this case.

BTW - I thought a little more about your example:

    #include <fstream>
    #include <boost/archive/text_oarchive.hpp>
    #include <boost/static_assert.hpp>

    class X { };
    X construct_X_from(int i);

    int main(){
        std::ofstream os("file");
        boost::archive::text_oarchive oa(os);
        int i;
        // the following traps with vc 7.1
        // fails to compile with gcc 3.3 and borland
        // compiles - doesn't trap with comeau - I believe that it traps
        // but BOOST_STATIC_ASSERT doesn't work for comeau in this case
        oa << construct_X_from(i);
    }

sure enough, it fails to compile on at least a couple of compilers. Apparently those compilers don't like taking a reference to something on the argument stack. So with these compilers the whole issue of the trap never presents itself. This behavior seems correct to me.

Right, that's because your operator<< takes a non-const reference. So the code above is not prevented by your STATIC_ASSERT trap, but it's still not supported by the library. Change the function to return "X const", as a user might do when faced with the error, and it will compile, not trap, and probably not do what one wants. (The other workaround would be to use a named variable and save that. This case will be trapped if the variable is not const.)
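The binding rule behind this point can be checked at compile time. The following is a minimal sketch, not library code: `save_by_ref` and `save_by_const_ref` are hypothetical stand-ins for an `operator<<` taking `T&` versus `const T&`, and it assumes a C++17 compiler for `std::is_invocable_v`.

```cpp
#include <cassert>
#include <type_traits>

struct X {};

// Two hypothetical save functions that differ only in the reference type.
void save_by_ref(X&) {}
void save_by_const_ref(const X&) {}

// Can each function be called with a temporary (rvalue) X?
constexpr bool rvalue_binds_to_ref =
    std::is_invocable_v<decltype(&save_by_ref), X>;
constexpr bool rvalue_binds_to_const_ref =
    std::is_invocable_v<decltype(&save_by_const_ref), X>;

// Can the non-const version be called with a named, non-const X?
constexpr bool lvalue_binds_to_ref =
    std::is_invocable_v<decltype(&save_by_ref), X&>;

static_assert(!rvalue_binds_to_ref, "a temporary cannot bind to X&");
static_assert(rvalue_binds_to_const_ref, "a temporary can bind to const X&");
static_assert(lvalue_binds_to_ref, "a named non-const object binds to X&");
```

On a conforming compiler the temporary is rejected by the `X&` overload before any const check in the function body is ever reached, which is why the rvalue case fails to compile rather than trapping.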

...

I'm not sure why the others compile it.

For backward compatibility; many years ago non-const references to temporaries were allowed.

...

...
But more fundamentally, address tracking is not a property of the type.

Maybe - maybe not. I would say that's a topic for another time.

[...]

...

LOL - I'm sure that marking every usage of << as to whether it should be tracked or not would be very popular. In fact the default is almost always what one wants. The trap is to highlight likely cases where it may not be.

[...]

...

Again, generally the defaults are what most people would want. No one has raised any issue regarding the defaults until now.

...
I think that the default behavior on a sequence of two saves with the same address should be to write the two objects, as if address tracking isn't on. If later a pointer to that address is saved, an exception should occur. Or more generally,

- one value save, then N pointer saves sharing the same address should be OK;

- M pointer saves sharing the same address should be OK; (*)

- K value saves sharing the same address should be OK and result in K copies in the archive;

Currently you'll only get one copy for this case unless you suppress tracking.

...
- all other sequences raise an exception at first opportunity.

Is there any reasonable use case which is prohibited by the above rules?

(*) This promotes questionable coding practices but is consistent with the current behavior. :-)

Except as noted above, that's how it works now.

The question isn't really the defaults, but whether my attempts to trap likely cases where they should be overridden are misguided.

You keep dismissing everything I say. It's a topic for another time; it's not about the defaults. It is. The _only_ reason that you had to trap some cases is because your defaults don't work for them! If you change the default behavior to handle these cases, while still allowing previous uses to work as they did, you will no longer need to trap anything. Which is why I proposed one possible default behavior that seems to fit that description, and asked you whether it sounds reasonable.

Robert Ramey

27 Jun

3:13 p.m.

Peter Dimov wrote:

...

...
...
I think that the default behavior on a sequence of two saves with the same address should be to write the two objects, as if address tracking isn't on. If later a pointer to that address is saved, an exception should occur. Or more generally,

- one value save, then N pointer saves sharing the same address should be OK;

- M pointer saves sharing the same address should be OK; (*)

- K value saves sharing the same address should be OK and result in K copies in the archive;

Currently you'll only get one copy for this case unless you suppress tracking.

...
- all other sequences raise an exception at first opportunity.

Is there any reasonable use case which is prohibited by the above rules?

(*) This promotes questionable coding practices but is consistent with the current behavior. :-)

Except as noted above, that's how it works now.

...

...

If you change the default behavior to handle these cases, while still allowing previous uses to work as they did, you will no longer need to trap anything. Which is why I proposed one possible default behavior that seems to fit that description, and asked you whether it sounds reasonable.

Actually I realize now that I misspoke. Default tracking behavior is "track selectively". That means that value saves are tracked only if an object of the same type is serialized through a pointer anywhere in the program. I believe that the current behavior matches your preference as described above. Robert Ramey

Peter Dimov

7:12 p.m.

Robert Ramey wrote:

...

Actually I realize now that I misspoke. Default tracking behavior is "track selectively". That means that value saves are tracked only if an object of the same type is serialized through a pointer anywhere in the program.

Right, but the problem is that your static assertion fires even when an object is never serialized through a pointer.

Vladimir Prus

6:41 a.m.

Peter Dimov wrote:

...

Or more generally,

- one value save, then N pointer saves sharing the same address should be OK;

- M pointer saves sharing the same address should be OK; (*)

- K value saves sharing the same address should be OK and result in K copies in the archive;

- all other sequences raise an exception at first opportunity.

Is there any reasonable use case which is prohibited by the above rules?

To make myself clear: I think *this* is the core question of this entire thread, and I agree 100% with what Peter proposes. This behaviour will make the case that the current assertion is meant to catch just work. - Volodya
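A toy model may make the proposed rules concrete. The sketch below is not Boost.Serialization code: `ToyTracker`, `save_sequence_error`, and the per-address counters are all hypothetical names, and it only models which save sequences are accepted and how many copies of an object end up in the archive.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>

// Hypothetical exception for a save sequence the proposed rules reject.
struct save_sequence_error : std::runtime_error {
    explicit save_sequence_error(const char* what) : std::runtime_error(what) {}
};

class ToyTracker {
public:
    // A by-value save always writes a fresh copy, unless a pointer to the
    // same address has already been saved.
    void save_value(const void* addr) {
        State& s = seen_[addr];
        if (s.pointer_saves > 0)
            throw save_sequence_error("value save after pointer save");
        ++s.value_saves;
        ++copies_written_;
    }

    // A pointer save is fine after zero or one value saves; after two or
    // more value saves it is ambiguous which copy the pointer refers to.
    void save_pointer(const void* addr) {
        State& s = seen_[addr];
        if (s.value_saves > 1)
            throw save_sequence_error("pointer save after repeated value saves");
        if (s.value_saves == 0 && s.pointer_saves == 0)
            ++copies_written_;  // first pointer save writes the object itself
        ++s.pointer_saves;
    }

    int copies_written() const { return copies_written_; }

private:
    struct State {
        int value_saves = 0;
        int pointer_saves = 0;
    };
    std::map<const void*, State> seen_;
    int copies_written_ = 0;
};
```

Under this model K value saves of one address yield K copies, and the only rejected sequences are those mixing a pointer save with more than one value save or placing a value save after a pointer save.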

Robert Ramey

4:19 p.m.

Vladimir Prus wrote:

...

This behaviour will make the case that the current assertion is meant to catch just work.

I believe

a) that it currently works exactly as Peter thinks it should,
b) that problems still can and will occur,
c) that the trap is useful in detecting a significant and worthwhile portion of those problems,
d) at a small cost in convenience.

To illustrate my case, let's take Peter's interesting example.

    class construct_from { ... };

    int main(){
        ...
        Y y;
        ar << construct_from(y);
    }

I never considered this specific example and I would consider it uncommon, but it is a legitimate and plausible usage of the library. First let's assume that the trap is commented out.

1) As written above, it fails to compile on a conforming compiler. This is because the << operator takes a reference as its argument. So we make a slight change to make it pass.

    int main(){
        ...
        Y y;
        construct_from x(y);
        ar << x;
    }

2) This compiles and executes fine. No tracking is done because construct_from has never been serialized through a pointer. Now some time later, the next programmer (2) comes along and makes an enhancement. He wants the archive to be sort of a log.

    int main(){
        ...
        Y y;
        construct_from x(y);
        ar << x;
        ...
        x.f(); // change x in some way
        ...
        ar << x;
    }

Again no problem. He gets two copies in the archive, each one different. That is, he gets exactly what he expects and is naturally delighted.

3) Now sometime later, a third programmer (3) sees construct_from and says: oh cool, just what I need. He writes a function in a totally disjoint module (the project is so big that he doesn't even realize the original usage exists) and writes something like:

    class K {
        shared_ptr<construct_from> z;
        template<class Archive>
        void serialize(Archive & ar, const unsigned version){
            ar << z;
        }
    };

He builds and runs the program and tests his functionality, and it works great and he's delighted.

4) Things continue smoothly as before, and a month goes by before it's discovered that loading the archives made in the last month (reading the log) doesn't work. The second log entry is always the same as the first. After a very long and acrimonious email exchange, it's discovered that programmer (3) accidentally broke programmer (2)'s code: by serializing via a pointer, he changed the behavior of an unrelated piece of code. Bad enough, but worse yet, the data wasn't being saved and cannot be recovered. People are really upset and disappointed with boost (at least the serialization system).

Now suppose the trap is turned on. How are things different?

1) Right away, the program traps at

    ar << x;

The programmer curses (another %^&*&*( hoop to jump through). If he's in a hurry (and who isn't) and would prefer not to const_cast because it looks bad, he'll just make the following change and move on.

    const construct_from x(y);
    ar << x;

Things work fine and he moves on.

2) Now programmer (2) wants to make his change, and again (^&%*^%) another annoying const issue:

    const construct_from x(y);
    ar << x;
    ...
    x.f(); // change x in some way; compile error: f() is not const
    ...
    ar << x;

He's mildly annoyed now. He tries the following:

a) He considers making f() const, but presumably that shifts the const error to somewhere else. And he doesn't want to fiddle with "his" code to work around a quirk in the serialization system.

b) He removes the "const" from "const construct_from" above. Damn, now he gets the trap. If he looks at the comment in the code where the BOOST_STATIC_ASSERT occurs, he'll do one of two things:

i) This is just B.S. It's making my life needlessly difficult and flagging code that is just fine. So I'll fix this with a const_cast and fire off a complaint to the list, and maybe they will fix it.

ii) Oh, this trap is suggesting that the default serialization isn't really what I want. Of course in this particular program it doesn't matter. But then the code in the trap can't really evaluate code in other modules (which might not even be written yet). OK, I'll add the following to my construct_from.hpp to solve the problem.

    BOOST_SERIALIZATION_TRACKING(construct_from, track_never)

The program compiles with no casts and no traps and executes just as expected (as above).

3) Now programmer (3) comes along and makes his change. The behavior of the original (and distant) module remains unchanged because the construct_from trait has been set to track_never, so we always get copies and the log is always what we expect. His program also works as expected even though it saves/loads multiple copies.

// Alert: Ironic humor follows

On the other hand, now he gets another trap: trying to save an object of a class marked "track_never" through a pointer. So he goes back to construct_from.hpp and comments out the BOOST_SERIALIZATION_TRACKING that was inserted. Now the second trap is avoided. But damn, the first trap is popping up again. After much acrimonious email the situation is resolved to no one's real satisfaction.

// End humor alert

This second trap doesn't currently exist. But as writing the above made me think about it, I now believe it should also be implemented. So contrary to my original intent, the above isn't funny after all; it's serious. Now (without the second trap) what is going to happen is the following.

4) Actually, programmer (3)'s changes are not going to function, because shared_ptr<construct_from> will not have a single raw pointer shared amongst the instances but rather multiple ones. Things won't work. Of course this won't be obvious until a month from now, and we're back in the same boat.

This is my best attempt to illustrate why the trap (now traps?) is important. Not perfect, but better than nothing.

Vladimir started this thread with the question of whether the inconvenience created by the traps was a worthwhile trade-off for any benefit. I agree that this is the fundamental question and have done my best to address it here. I believe that this scenario illustrates at least one case where it would be. The inconvenience is small and actually leads to a more correct program.

Note that I didn't contrive this example; it's Peter's example. I just took it to its logical conclusions.

Robert Ramey
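The "second log entry is always the same as the first" failure can be reproduced in a toy model. This is only a sketch under stated assumptions: `ToyLogArchive` and `write_log` are hypothetical, the `tracked` flag stands in for "some code elsewhere serializes this type through a pointer", and the real library's archive format is not modelled.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

class ToyLogArchive {
public:
    // tracked == true models a type that is serialized through a pointer
    // somewhere in the program, so by-value saves are address-tracked.
    explicit ToyLogArchive(bool tracked) : tracked_(tracked) {}

    void save_value(const void* addr, const std::string& state) {
        // With tracking on, a repeated address is written as a reference
        // to the earlier record instead of a fresh copy.
        if (tracked_ && !seen_.insert(addr).second) {
            records_.push_back("<ref to earlier object>");
            return;
        }
        records_.push_back(state);
    }

    const std::vector<std::string>& records() const { return records_; }

private:
    bool tracked_;
    std::set<const void*> seen_;
    std::vector<std::string> records_;
};

// Programmer (2)'s log: save, modify, save again.
inline std::vector<std::string> write_log(bool tracked) {
    ToyLogArchive ar(tracked);
    std::string x = "first";
    ar.save_value(&x, x);
    x = "second";
    ar.save_value(&x, x);
    return ar.records();
}
```

With `write_log(false)` both states are recorded; with `write_log(true)` the second record is only a back-reference, so a loader would reproduce the first state twice and the modified state is never written.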

Peter Dimov

7:38 p.m.

Robert Ramey wrote:

...

Vladimir Prus wrote:

...
This behaviour will make the case that the current assertion is meant to catch just work.

I believe

a) that it currently works exactly as Peter thinks it should

Almost. This rule:

- K value saves sharing the same address should be OK and result in K copies in the archive;

isn't being followed, and it makes all the difference. In your example:

...

    int main(){
        ...
        Y y;
        construct_from x(y);
        ar << x;
        ...
        x.f(); // change x in some way
        ...
        ar << x;
    }

the above rule will always output two copies of x, regardless of whether construct_from is serialized through a pointer in a distant and unrelated part of the program. I can think of one situation that could be adversely affected by this change: virtual base classes will be saved more than once unless their tracking level is set to 'always'.

David Abrahams

8:23 p.m.

"Peter Dimov" <pdimov@mmltd.net> writes:

...

I can think of one situation that could be adversely affected by this change: virtual base classes will be saved more than once unless their tracking level is set to 'always'.

That might be strangely consistent. Don't forget that they might be constructed more than once anyway ;-) -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams

8:11 p.m.

"Robert Ramey" <ramey@rrsd.com> writes:

...

1) as written above, it fails to compile on a conforming compiler. This is because the << operator takes a reference as its argument.

Why is it a good idea for << to take its 2nd argument by non-const reference?? -- Dave Abrahams Boost Consulting www.boost-consulting.com

Peter Dimov

9:36 p.m.

David Abrahams wrote:

...

"Robert Ramey" <ramey@rrsd.com> writes:

...
1) as written above, it fails to compile on a conforming compiler. This is because the << operator takes a reference as its argument.

Why is it a good idea for << to take its 2nd argument by non-const reference??

To trap non-const cases and refuse to compile them. ;-) (Back to square one.)

Robert Ramey

28 Jun

2:53 a.m.

Peter Dimov wrote:

...

David Abrahams wrote:

...
"Robert Ramey" <ramey@rrsd.com> writes:

...
1) as written above, it fails to compile on a conforming compiler. This is because the << operator takes a reference as its argument.

Why is it a good idea for << to take its 2nd argument by non-const reference??

To trap non-const cases and refuse to compile them. ;-) (Back to square one.)

Actually that's not funny; it's true. But also, the prototypes for specializations of save all use const, e.g. the member function

    template<class Archive>
    void save(Archive &ar, const unsigned int) const;

and

    template<class Archive, class T>
    void save(Archive &ar, const T &t, const unsigned int);

So I originally made operator<<(Archive &ar, const T &t), as it seemed natural and consistent to me. Of course this produces no errors, as it converts a non-const to a const. However, upon encountering examples like my extrapolation of Peter's above, I concluded I wanted to be able to trap a probable misuse of the library such as illustrated by the previous example, and I didn't want to throw the opportunity away.

Also, I came upon a warning somewhere that the compiler may copy a value when converting to a const. Now I don't know if that applies to const references, and in fact I never saw it happen with any of my compilers.

So my motivation was and is purely practical. I expected that the only people inconvenienced by this are those that are inconvenienced by const in general. And that they would handle it the same way. They would just grumble and apply a const_cast and forget about it. I'm certainly not prohibiting that. Robert Ramey
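The mechanism described here, taking `T&` so that constness survives template deduction, can be sketched without the library. Everything below is hypothetical (`toy_tracking_level`, `trap_would_fire`, the `LogEntry` and `Plain` types); the check that Boost.Serialization actually performs may differ in its details.

```cpp
#include <cassert>
#include <type_traits>

enum toy_tracking { toy_track_never, toy_track_selectively };

// Hypothetical trait: every type is tracked selectively by default.
template<class T>
struct toy_tracking_level
    : std::integral_constant<int, toy_track_selectively> {};

struct LogEntry {};  // keeps the default tracking level
struct Plain {};     // opted out of tracking below

template<>
struct toy_tracking_level<Plain>
    : std::integral_constant<int, toy_track_never> {};

// Because the parameter is T& rather than const T&, T is deduced as
// "const LogEntry" for a const object, so the constness is still visible.
template<class T>
constexpr bool trap_would_fire(T&) {
    using bare = typename std::remove_const<T>::type;
    return !std::is_const<T>::value
        && toy_tracking_level<bare>::value != toy_track_never;
}
```

In a real trap the same condition would presumably feed a static assertion inside `operator<<`; here it is returned as a `bool` so that the three outcomes (non-const tracked object, const object, untracked type) can be inspected at run time.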

David Abrahams

11:36 a.m.

"Robert Ramey" <ramey@rrsd.com> writes:

...

So my motivation was and is purely practical. I expected that the only people inconvenienced by this are those that are inconvenienced by const in general. And that they would handle it the same way. They would just grumble and apply a const_cast and forget about it. I'm certainly not prohibiting that.

Robert, How is your rule any different from trapping serialization of all non-pointer types whose name begins with an uppercase letter? There's an easy workaround (just serialize via pointer) and the conditions of the trap have as much to do with the problem you're trying to detect. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Robert Ramey

1:57 p.m.

David Abrahams wrote:

...

"Robert Ramey" <ramey@rrsd.com> writes:

...
So my motivation was and is purely practical. I expected that the only people inconvenienced by this are those that are inconvenienced by const in general. And that they would handle it the same way. They would just grumble and apply a const_cast and forget about it. I'm certainly not prohibiting that.

Robert,

How is your rule any different from trapping serialization of all non-pointer types whose name begins with an uppercase letter? There's an easy workaround (just serialize via pointer) and

...

the conditions of the trap have as much to do with the problem you're trying to detect.

Of course this is what I'm not seeing. I believe that if someone has a situation where he feels that he has to use a const_cast, it's likely that he plans to change the value during the course of the creation of the archive. If it's not marked "track_never" he will set himself up for lots of pain he doesn't need. I concede this is speculative on my part. But the examples presented so far have reinforced my beliefs on this. I grepped the serialization library code and found 40 BOOST_STATIC_ASSERTs. I don't know how many are cases similar to this one, but at least some of them are. In a number of cases there is a BOOST_STATIC_WARNING commented out. So looking back, I've wrestled with this on a regular basis. One way to address the concerns of those who don't want this "help" from the library writer would be to introduce the concept of a warning level (ignore, warn, error) to be used in the places where these macros are. Unfortunately, this entails a whole new concept to be implemented, tested, and documented, and I don't see how I would have time for this. I'm tying up a couple of loose ends regarding shared_ptr serialization, and then there are the compiler/tool issues which generate a large number of failures. Robert Ramey

David Abrahams

3:17 p.m.

"Robert Ramey" <ramey@rrsd.com> writes:

...

David Abrahams wrote:

...
"Robert Ramey" <ramey@rrsd.com> writes:

...
So my motivation was and is purely practical. I expected that the only people inconvenienced by this are those that are inconvenienced by const in general. And that they would handle it the same way. They would just grumble and apply a const_cast and forget about it. I'm certainly not prohibiting that.

Robert,

How is your rule any different from trapping serialization of all non-pointer types whose name begins with an uppercase letter? There's an easy workaround (just serialize via pointer) and

...
the conditions of the trap have as much to do with the problem you're trying to detect.

Of course this is what I'm not seeing. I believe that if someone has a situation where he feels that he has to use a const_cast,

I never feel I have to use a const_cast to const-ify something. If the code I'm working with isn't broken, I almost never have to do anything, because a mutable object is-a const object in C++. But if I did have to do something, I'd let implicit conversion take care of it or use implicit_cast.

...

it's likely that he plans to change the value during the course of the creation of the archive.

And you hold this belief based on... what? As far as I can tell, it's just a misunderstanding about the meaning of "const."

...

If it's not marked "track_never" he will set himself up for lots of pain he doesn't need. I concede this is speculative on my part. But the examples presented so far have reinforced my beliefs on this.

...

I grepped the serialization library code and found 40 BOOST_STATIC_ASSERTs. I don't know how many are cases similar to this one, but at least some of them are. In a number of cases there is a BOOST_STATIC_WARNING commented out. So looking back, I've wrestled with this on a regular basis. One way to address the concerns of those who don't want this "help" from the library writer would be to introduce the concept of a warning level (ignore, warn, error) to be used in the places where these macros are. Unfortunately, this entails a whole new concept to be implemented, tested, and documented, and I don't see how I would have time for this.

Or you could take the simple way out and implement Peter Dimov's suggested semantics. -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams

27 Jun

11:16 a.m.

"Peter Dimov" <pdimov@mmltd.net> writes:

...

I think that the default behavior on a sequence of two saves with the same address should be to write the two objects, as if address tracking isn't on. If later a pointer to that address is saved, an exception should occur. Or more generally,

- one value save, then N pointer saves sharing the same address should be OK;

- M pointer saves sharing the same address should be OK; (*)

- K value saves sharing the same address should be OK and result in K copies in the archive;

- all other sequences raise an exception at first opportunity.

I'm sure you mean trigger an assertion, right? ;-) -- Dave Abrahams Boost Consulting www.boost-consulting.com

Peter Dimov

noon

David Abrahams wrote:

...

"Peter Dimov" <pdimov@mmltd.net> writes:

...
I think that the default behavior on a sequence of two saves with the same address should be to write the two objects, as if address tracking isn't on. If later a pointer to that address is saved, an exception should occur. Or more generally,

- one value save, then N pointer saves sharing the same address should be OK;

- M pointer saves sharing the same address should be OK; (*)

- K value saves sharing the same address should be OK and result in K copies in the archive;

- all other sequences raise an exception at first opportunity.

I'm sure you mean trigger an assertion, right? ;-)

No, I really mean an exception. Asserting while saving isn't a good thing; the program goes down, taking the user's document with it. Saving into a different format may be successful and the opportunity shouldn't be denied. A release build with assertions disabled that silently produces unreadable files isn't a good thing, either.

David Abrahams

1:26 p.m.

"Peter Dimov" <pdimov@mmltd.net> writes:

...

David Abrahams wrote:

...
"Peter Dimov" <pdimov@mmltd.net> writes:

...
I think that the default behavior on a sequence of two saves with the same address should be to write the two objects, as if address tracking isn't on. If later a pointer to that address is saved, an exception should occur. Or more generally,

- one value save, then N pointer saves sharing the same address should be OK;

- M pointer saves sharing the same address should be OK; (*)

- K value saves sharing the same address should be OK and result in K copies in the archive;

- all other sequences raise an exception at first opportunity.

I'm sure you mean trigger an assertion, right? ;-)

No, I really mean an exception. Asserting while saving isn't a good thing; the program goes down, taking the user's document with it. Saving into a different format may be successful and the opportunity shouldn't be denied.

An assertion can always be set up to throw in release mode. I was thinking of BOOST_ASSERT or something like it, not plain old assert.

...

A release build with assertions disabled that silently produces unreadable files isn't a good thing, either.

But don't you want to be able to debug this coding error when you make it during development? -- Dave Abrahams Boost Consulting www.boost-consulting.com

Peter Dimov

2:20 p.m.

David Abrahams wrote:

...

"Peter Dimov" <pdimov@mmltd.net> writes:

...
No, I really mean an exception. Asserting while saving isn't a good thing; the program goes down, taking the user's document with it. Saving into a different format may be successful and the opportunity shouldn't be denied.

An assertion can always be set up to throw in release mode. I was thinking of BOOST_ASSERT or something like it, not plain old assert.

No, no. :-)

An assertion looks like this:

Requires: Cond.

An exception looks like this:

Throws: pointer_conflict when Cond. Exception safety: If an exception is thrown, there are no effects.
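The two specification styles translate into code roughly as follows. This is a hedged sketch: `ToyCheckedArchive`, `would_conflict`, and both `save_pointer_*` members are hypothetical, the conflict condition (an address already saved by value more than once) is only one possible choice of Cond, and `pointer_conflict` borrows the exception name used above.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>

struct pointer_conflict : std::runtime_error {
    pointer_conflict() : std::runtime_error("pointer conflict") {}
};

class ToyCheckedArchive {
public:
    void save_value(const void* addr) {
        ++value_saves_[addr];
        ++writes_;
    }

    // The query a caller needs in order to honour a "Requires" clause.
    bool would_conflict(const void* addr) const {
        auto it = value_saves_.find(addr);
        return it != value_saves_.end() && it->second > 1;
    }

    // Contract style. Requires: !would_conflict(addr).
    // A violation is a coding error; nothing is promised if it happens.
    void save_pointer_requires(const void* addr) {
        assert(!would_conflict(addr));
        ++writes_;
    }

    // Exception style. Throws: pointer_conflict when would_conflict(addr).
    // If an exception is thrown, the archive is left unchanged.
    void save_pointer_throws(const void* addr) {
        if (would_conflict(addr)) throw pointer_conflict();
        ++writes_;
    }

    int writes() const { return writes_; }

private:
    std::map<const void*, int> value_saves_;
    int writes_ = 0;
};
```

The contract version is only usable when the caller can establish the precondition, which is why the sketch exposes `would_conflict`; the throwing version checks the condition itself and leaves the archive untouched on failure.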

...

...
A release build with assertions disabled that silently produces unreadable files isn't a good thing, either.

But don't you want to be able to debug this coding error when you make it during development?

Yes, probably. But I don't want the undefined behavior.

David Abrahams

3:26 p.m.

"Peter Dimov" <pdimov@mmltd.net> writes:

...

David Abrahams wrote:

...
"Peter Dimov" <pdimov@mmltd.net> writes:

...
No, I really mean an exception. Asserting while saving isn't a good thing; the program goes down, taking the user's document with it. Saving into a different format may be successful and the opportunity shouldn't be denied.

An assertion can always be set up to throw in release mode. I was thinking of BOOST_ASSERT or something like it, not plain old assert.

No, no. :-)

An assertion looks like this:

Requires: Cond.

Yes. Why wouldn't you want this function to require that condition? Anything else is a coding error.

...

An exception looks like this:

Throws: pointer_conflict when Cond. Exception safety: If an exception is thrown, there are no effects.

...
...
A release build with assertions disabled that silently produces unreadable files isn't a good thing, either.

But don't you want to be able to debug this coding error when you make it during development?

Yes, probably. But I don't want the undefined behavior.

BOOST_ASSERT doesn't have to induce undefined behavior. I just want a clear separation between coding errors and other conditions. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Peter Dimov

7:03 p.m.

David Abrahams wrote:

...

"Peter Dimov" <pdimov@mmltd.net> writes:

...
David Abrahams wrote:

...
"Peter Dimov" <pdimov@mmltd.net> writes:

...
No, I really mean an exception. Asserting while saving isn't a good thing; the program goes down, taking the user's document with it. Saving into a different format may be successful and the opportunity shouldn't be denied.

An assertion can always be set up to throw in release mode. I was thinking of BOOST_ASSERT or something like it, not plain old assert.

No, no. :-)

An assertion looks like this:

Requires: Cond.

Yes. Why wouldn't you want this function to require that condition? Anything else is a coding error.

I am reluctant to label it a coding error ("should never happen in a correct program"), because in general it can be very hard to ensure that the condition isn't violated. It's a bit like: Requires: a particular state of the library that you have no way of querying.

David Abrahams

8:20 p.m.

"Peter Dimov" <pdimov@mmltd.net> writes:

...

...
Yes. Why wouldn't you want this function to require that condition? Anything else is a coding error.

I am reluctant to label it a coding error ("should never happen in a correct program"), because in general it can be very hard to ensure that the condition isn't violated. It's a bit like:

Requires: a particular state of the library that you have no way of querying.

Harrumph. Isn't it the archive under construction whose state is constrained? If you know you've put 5 elements in a vector, you don't have to query it before calling pop_back(), do you? -- Dave Abrahams Boost Consulting www.boost-consulting.com

Peter Dimov

9:33 p.m.

David Abrahams wrote:

...

"Peter Dimov" <pdimov@mmltd.net> writes:

...
...
Yes. Why wouldn't you want this function to require that condition? Anything else is a coding error.

I am reluctant to label it a coding error ("should never happen in a correct program"), because in general it can be very hard to ensure that the condition isn't violated. It's a bit like:

Requires: a particular state of the library that you have no way of querying.

Harrumph. Isn't it the archive under construction whose state is constrained? If you know you've put 5 elements in a vector, you don't have to query it before calling pop_back(), do you?

So it's safe to remove size() and empty() from the vector interface as users always know how large their vectors are, right? If you are given a vector you don't know whether you can pop_back. If you are given an archive you don't know whether you are allowed to save by value, or whether you can safely save by pointer. (In general.)
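The vector half of the analogy looks like this in code. It is a minimal sketch with a hypothetical helper name, `try_pop_back`, showing what a function must do when it is handed a vector it knows nothing about.

```cpp
#include <cassert>
#include <vector>

// Removes the last element if there is one and reports whether it did.
inline bool try_pop_back(std::vector<int>& v) {
    if (v.empty()) return false;  // pop_back() on an empty vector is undefined
    v.pop_back();
    return true;
}
```

The check is only possible because `std::vector` exposes `empty()`; the argument in the thread is that an archive offers no comparable query for whether a by-value or by-pointer save is currently safe.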

David Abrahams

10:23 p.m.

"Peter Dimov" <pdimov@mmltd.net> writes:

...

David Abrahams wrote:

...
"Peter Dimov" <pdimov@mmltd.net> writes:

...
...
Yes. Why wouldn't you want this function to require that condition? Anything else is a coding error.

I am reluctant to label it a coding error ("should never happen in a correct program"), because in general it can be very hard to ensure that the condition isn't violated. It's a bit like:

Requires: a particular state of the library that you have no way of querying.

Harrumph. Isn't it the archive under construction whose state is constrained? If you know you've put 5 elements in a vector, you don't have to query it before calling pop_back(), do you?

So it's safe to remove size() and empty() from the vector interface as users always know how large their vectors are, right?

No, because people need that functionality, and not just to check whether it's okay to pop_back.

...

If you are given a vector you don't know whether you can pop_back.

Right. But then you don't want to pop elements off a vector you don't know anything about.

...

If you are given an archive you don't know whether you are allowed to save by value, or whether you can safely save by pointer.

(In general.)

Yeah, but what's the use case for this? Someone you don't know hands you an archive and tells you to save something in it without telling you how it's safe to do so? -- Dave Abrahams Boost Consulting www.boost-consulting.com

Peter Dimov

10:52 p.m.

David Abrahams wrote:

...

"Peter Dimov" <pdimov@mmltd.net> writes:

...
If you are given an archive you don't know whether you are allowed to save by value, or whether you can safely save by pointer.

(In general.)

Yeah, but what's the use case for this? Someone you don't know hands you an archive and tells you to save something in it without telling you how it's safe to do so?

Yep, that's pretty much what happens every time your 'save' function is invoked.

David Abrahams

28 Jun

2:27 a.m.

"Peter Dimov" <pdimov@mmltd.net> writes:

...

David Abrahams wrote:

...
"Peter Dimov" <pdimov@mmltd.net> writes:

...
If you are given an archive you don't know whether you are allowed to save by value, or whether you can safely save by pointer.

(In general.)

Yeah, but what's the use case for this? Someone you don't know hands you an archive and tells you to save something in it without telling you how it's safe to do so?

Yep, that's pretty much what happens every time your 'save' function is invoked.

Then it's almost impossible to write a reliable save function. How is the exception going to help? -- Dave Abrahams Boost Consulting www.boost-consulting.com

Peter Dimov

11:04 a.m.

David Abrahams wrote:

...

"Peter Dimov" <pdimov@mmltd.net> writes:

...
David Abrahams wrote:

...
"Peter Dimov" <pdimov@mmltd.net> writes:

...
If you are given an archive you don't know whether you are allowed to save by value, or whether you can safely save by pointer.

(In general.)

Yeah, but what's the use case for this? Someone you don't know hands you an archive and tells you to save something in it without telling you how it's safe to do so?

Yep, that's pretty much what happens every time your 'save' function is invoked.

Then it's almost impossible to write a reliable save function.

It's impossible to write a reliable save function regardless of that.

...

How is the exception going to help?

By detecting that this particular sequence of saves (which in general depends on the structure being saved and is a runtime property) would have produced an unreadable archive.

David Abrahams

11:56 a.m.

"Peter Dimov" <pdimov@mmltd.net> writes:

...

David Abrahams wrote:

...
"Peter Dimov" <pdimov@mmltd.net> writes:

...
David Abrahams wrote:

...
"Peter Dimov" <pdimov@mmltd.net> writes:

...
If you are given an archive you don't know whether you are allowed to save by value, or whether you can safely save by pointer.

(In general.)

Yeah, but what's the use case for this? Someone you don't know hands you an archive and tells you to save something in it without telling you how it's safe to do so?

Yep, that's pretty much what happens every time your 'save' function is invoked.

Then it's almost impossible to write a reliable save function.

It's impossible to write a reliable save function regardless of that.

This just seems cryptic. What do you mean?

...

...
How is the exception going to help?

By detecting that this particular sequence of saves (which in general depends on the structure being saved and is a runtime property) would have produced an unreadable archive.

An assertion can detect that, too. My point is that you have to have enough knowledge about how the program is structured, etc., to do the save reliably or there's little point in trying. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Peter Dimov

29 Jun

2:03 p.m.

David Abrahams wrote:

...

"Peter Dimov" <pdimov@mmltd.net> writes:

...
David Abrahams wrote:

[...]

...

...
...
Then it's almost impossible to write a reliable save function.

It's impossible to write a reliable save function regardless of that.

This just seems cryptic. What do you mean?

A save function can always fail; you can't write a reliable save function if you use "reliable" as "will succeed". If you use "reliable" as "will not invoke undefined behavior", then the exception version is reliable, the precondition version is not. I'm very surprised by this suggestion to throw on undefined behavior coming from you. You can't have it both ways. Either the behavior is defined for the problematic cases, or it isn't. If it isn't, you can't expect an exception. You can't expect anything. Corrupting the archive beyond redemption with no warning is perfectly within the specification.

...

...
...
How is the exception going to help?

By detecting that this particular sequence of saves (which in general depends on the structure being saved and is a runtime property) would have produced an unreadable archive.

An assertion can detect that, too.

No, an assertion does not detect anything. An assertion is an implementation detail, one particular manifestation of undefined behavior. You can't use "reliable" and "assertion" in the same sentence.

David Abrahams

3 p.m.

"Peter Dimov" <pdimov@gmail.com> writes:

...

David Abrahams wrote:

...
"Peter Dimov" <pdimov@mmltd.net> writes:

...
David Abrahams wrote:

[...]

...
...
...
Then it's almost impossible to write a reliable save function.

It's impossible to write a reliable save function regardless of that.

This just seems cryptic. What do you mean?

A save function can always fail; you can't write a reliable save function if you use "reliable" as "will succeed".

Sure, if the save function is just doing hashing, you can ;-)

But seriously, that's not what I mean by reliable. I mean "a save function that's doing something sensical." Either the author of the save function tells the caller whether the object will be tracked, and the caller knows whether that behavior is appropriate given the context, or there's little point in making the save call to begin with. I mean, you're not going to do

try { save_call_with_tracking(ar, x); } catch(tracking_error&) { save_call_without_tracking(ar, x); }

right?

...

If you use "reliable" as "will not invoke undefined behavior", then the exception version is reliable, the precondition version is not.

I'm very surprised by this suggestion to throw on undefined behavior coming from you. You can't have it both ways. Either the behavior is defined for the problematic cases, or it isn't. If it isn't, you can't expect an exception. You can't expect anything. Corrupting the archive beyond redemption with no warning is perfectly within the specification.

Well, it must've been a moment of weakness. I never really thought the exception was a good idea no matter how you classify the condition.

...

...
...
...
How is the exception going to help?

By detecting that this particular sequence of saves (which in general depends on the structure being saved and is a runtime property) would have produced an unreadable archive.

An assertion can detect that, too.

No, an assertion does not detect anything. An assertion is an implementation detail, one particular manifestation of undefined behavior. You can't use "reliable" and "assertion" in the same sentence.

Just did ;-) Seriously, though: the assert macro in <cassert> has defined behavior. You can definitely document that a function uses assert(x) for any condition x; when you do that, you're invoking defined behavior. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Peter Dimov

5:40 p.m.

David Abrahams wrote:

...

But seriously, that's not what I mean by reliable. I mean "a save function that's doing something sensical." Either the author of the save function tells the caller whether the object will be tracked, and the caller knows whether that behavior is appropriate given the context, or there's little point in making the save call to begin with. I mean, you're not going to do

try { save_call_with_tracking(ar, x); } catch(tracking_error&) { save_call_without_tracking(ar, x); }

right?

The user of my application will do something along these lines, yes. :-) When you have a complex data structure, it isn't exactly trivial to determine in advance whether it's "tracking safe" to attempt a save. If you try to do that, you'll end up duplicating the save functionality. Consider something like vector< shared_ptr<A> > v; where the A is an abstract class and the implementations are put into v from separate parts of the program written by different people. If you absolutely insist on an assert, you should provide a way to test whether the precondition is met.

David Abrahams

1 Jul

7:55 a.m.

"Peter Dimov" <pdimov@mmltd.net> writes:

...

David Abrahams wrote:

...
But seriously, that's not what I mean by reliable. I mean "a save function that's doing something sensical." Either the author of the save function tells the caller whether the object will be tracked, and the caller knows whether that behavior is appropriate given the context, or there's little point in making the save call to begin with. I mean, you're not going to do

try { save_call_with_tracking(ar, x); } catch(tracking_error&) { save_call_without_tracking(ar, x); }

right?

The user of my application will do something along these lines, yes. :-)

The /user/ of your /application/ is a programmer? I'm sorry, but could you fill in some detail here? You're baffling me.

...

When you have a complex data structure, it isn't exactly trivial to determine in advance whether it's "tracking safe" to attempt a save. If you try to do that,

You mean "to determine in advance...?"

...

you'll end up duplicating the save functionality.

How?

...

Consider something like

vector< shared_ptr<A> > v;

where the A is an abstract class and the implementations are put into v from separate parts of the program written by different people.

Well, aside from not being able to deserialize deleters, I'm not sure what the problem is. In your scheme, AFAICT, a pointer save never directly triggers an exception anyway. Am I missing something there? -- Dave Abrahams Boost Consulting www.boost-consulting.com

Peter Dimov

9:56 a.m.

David Abrahams wrote:

...

"Peter Dimov" <pdimov@mmltd.net> writes:

...
David Abrahams wrote:

...
[...] I mean, you're not going to do

try { save_call_with_tracking(ar, x); } catch(tracking_error&) { save_call_without_tracking(ar, x); }

right?

The user of my application will do something along these lines, yes. :-)

The /user/ of your /application/ is a programmer? I'm sorry, but could you fill in some detail here? You're baffling me.

I may not write the above fallback explicitly, but if I have menu items for "Save with tracking" and "Save without tracking" and the user tries tracking and it fails, he will likely try the other option.

...

...
When you have a complex data structure, it isn't exactly trivial to determine in advance whether it's "tracking safe" to attempt a save. If you try to do that,

You mean "to determine in advance...?"

...
you'll end up duplicating the save functionality.

How?

If you have an arbitrary data structure, the easiest method to determine whether it's "tracking safe" is to attempt a save and watch for the exception. If you try to write an is_tracking_safe algorithm, you'll see that it basically follows the same pattern, except that it doesn't write to an archive.
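The "attempt a save and watch for the exception" idea can be sketched as a dry run against an archive that writes nothing. This is only an illustration under stated assumptions: DryRunArchive, tracking_error and is_tracking_safe are hypothetical names, not part of Boost.Serialization, and the only rule the toy archive applies is the one discussed in this thread (a pointer save fails once the same address has already been saved by value twice).

```cpp
#include <cassert>
#include <map>
#include <stdexcept>

// Hypothetical exception type for this sketch.
struct tracking_error : std::runtime_error {
    tracking_error() : std::runtime_error("unsafe save sequence") {}
};

// Toy archive that discards all data and only checks the save sequence.
class DryRunArchive {
public:
    // A by-value save just counts how often this address was saved by value.
    void save_value(const void* addr) { ++value_saves_[addr]; }

    // A pointer save fails if the address was already saved by value twice.
    void save_pointer(const void* addr) {
        auto it = value_saves_.find(addr);
        if (it != value_saves_.end() && it->second >= 2) throw tracking_error();
    }

private:
    std::map<const void*, int> value_saves_;
};

// Runs the caller's save routine against the dry-run archive and reports
// whether it completed without a tracking_error.
template <class SaveFn>
bool is_tracking_safe(SaveFn save) {
    DryRunArchive ar;
    try {
        save(ar);
        return true;
    } catch (const tracking_error&) {
        return false;
    }
}
```

Note that the check has the same shape as the save itself: the caller passes in the very routine that would perform the real save, which is the duplication being described.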

...

...
Consider something like

vector< shared_ptr<A> > v;

where the A is an abstract class and the implementations are put into v from separate parts of the program written by different people.

Well, aside from not being able to deserialize deleters, I'm not sure what the problem is. In your scheme, AFAICT, a pointer save never directly triggers an exception anyway. Am I missing something there?

A pointer save triggers an exception if you've already had two value saves from the same address. The problem in the above is not in serializing shared_ptr<A>, it's that in general you don't know what's behind these A's and what save sequence they are going to produce.

David Abrahams

12:53 p.m.

"Peter Dimov" <pdimov@mmltd.net> writes:

...

David Abrahams wrote:

...
"Peter Dimov" <pdimov@mmltd.net> writes:

...
David Abrahams wrote:

...
[...] I mean, you're not going to do

try { save_call_with_tracking(ar, x); } catch(tracking_error&) { save_call_without_tracking(ar, x); }

right?

The user of my application will do something along these lines, yes. :-)

The /user/ of your /application/ is a programmer? I'm sorry, but could you fill in some detail here? You're baffling me.

I may not write the above fallback explicitly, but if I have menu items for "Save with tracking" and "Save without tracking" and the user tries tracking and it fails, he will likely try the other option.

You're not seriously proposing that a real GUI application would have both those options, are you? Can you really imagine an application where either of those two strategies might be applicable?

...

...
...
When you have a complex data structure, it isn't exactly trivial to determine in advance whether it's "tracking safe" to attempt a save. If you try to do that,

You mean "to determine in advance...?"

...
you'll end up duplicating the save functionality.

How?

If you have an arbitrary data structure, the easiest method to determine whether it's "tracking safe" is to attempt a save and watch for the exception. If you try to write an is_tracking_safe algorithm, you'll see that it basically follows the same pattern, except that it doesn't write to an archive.

Okay.

...

...
...
Consider something like

vector< shared_ptr<A> > v;

where the A is an abstract class and the implementations are put into v from separate parts of the program written by different people.

Well, aside from not being able to deserialize deleters, I'm not sure what the problem is. In your scheme, AFAICT, a pointer save never directly triggers an exception anyway. Am I missing something there?

A pointer save triggers an exception if you've already had two value saves from the same address.

Oh, OK.

...

The problem in the above is not in serializing shared_ptr<A>, it's that in general you don't know what's behind these A's and what save sequence they are going to produce.

I understand that you're saying it's hard to control the order and manner in which things are serialized overall, and it's especially hard for the author of a small component to make local decisions whose rightness can only be determined with knowledge he doesn't have. What I don't understand is how any program can have a reasonable fallback strategy for the case when things go wrong, nor how a program where things go wrong can be considered correct. In other words, hard as it may be, it seems to me that you have to control the order and manner in which things are serialized in your program, or the program just won't work. To have a fighting chance, you probably need to establish some program-wide policies that allow you to know enough about how other parts of the program are serializing things. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Robert Ramey

5:09 p.m.

David Abrahams wrote:

...

To have a fighting chance, you probably need to establish some program-wide policies that allow you to know enough about how other parts of the program are serializing things.

Hmmm - sounds like a "tracking trait" to me. Robert Ramey

David Abrahams

25 Jun

11:14 p.m.

"Robert Ramey" <ramey@rrsd.com> writes:

...

This is exactly the thing I'm thinking about. Without the trap, the user would have no clue that he is doing something he really doesn't intend to do. Of course he might say "I want to track it anyway" or he might say - well maybe in this example but ...

That might make sense if the set of circumstances you trap actually had a relationship to the set of problematic circumstances, but it doesn't. It's almost like choosing a random set of cases in which to issue a diagnostic in order to "force the user to think about what she's doing." (**) The only difference here is that you're telling the user which cases will cause the diagnostic. She may preemptively const-ify all of her tracked serializations, but how will that help? Adding const doesn't prevent any bugs; it just silences your diagnostic. (**) I don't intend to imply that you used those words, but I read them often in postings from those who want to deny users abstraction: exception-handling, garbage collection, classes, operator overloading, etc. Yours seems like a similarly patronizing approach. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Martin Slater

26 Jun

7:04 a.m.

...

ar << *this; // I can't imagine what this is for

Is this going to break with the new release? I use this idiom quite a lot and would hate to have to hack the boost source immediately 1.33 is released. From my point of view this lets me serialise some complex objects without exposing the actual serialisation mechanism used, hence it's an implementation detail that I am free to change without breaking client code. These objects take a file path to load from in a constructor and initialise from that, and also have a save member function to write out the archive. What's so odd about that? Martin

Robert Ramey

3:28 p.m.

I didn't mean to say that there's anything wrong with it. Just that I couldn't imagine a situation where one would want to do it. I see it now. Thanks. Robert Ramey

Martin Slater wrote:

...

...
ar << *this; // I can't imagine what this is for

Is this going to break with the new release? I use this idiom quite a lot and would hate to have to hack the boost source immediately 1.33 is released. From my point of view this lets me serialise some complex objects without exposing the actual serialisation mechanism used, hence it's an implementation detail that I am free to change without breaking client code. These objects take a file path to load from in a constructor and initialise from that, and also have a save member function to write out the archive. What's so odd about that?

Martin

martin.ecker@tab.at

27 Jun

10:37 a.m.

Hi, Martin Slater wrote:

...

...
ar << *this; // I can't imagine what this is for

Is this going to break with the new release? I use this idiom quite a lot and would hate to have to hack the boost source immediately 1.33 is released.

Same here. We use it for the same reasons. For example, we have some classes that expose a load and save member function, which internally uses Boost.Serialization, something like:

void my_class::save() { my_archive ar(some_stream); ar << *this; }

Best Regards, Martin

TAB Austria Haiderstraße 40 4052 Ansfelden Austria Phone: +43 7229 78040-218 Fax: +43 7229 78040-209 E-mail: martin.ecker@tab.at http://www.tab.at

Robert Ramey

3:12 p.m.

martin.ecker@tab.at wrote:

...

Hi,

Martin Slater wrote:

...
...
ar << *this; // I can't imagine what this is for

Is this going to break with the new release? I use this idiom quite a lot and would hate to have to hack the boost source immediately 1.33 is released.

Same here. We use it for the same reasons. For example, we have some classes that expose a load and save member function, which internally uses Boost.Serialization, something like:

void my_class::save() { my_archive ar(some_stream); ar << *this; }

Will it kill you if you have to change it to the following?

void my_class::save() const { my_archive ar(some_stream); ar << *this; }

Robert Ramey
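Why the const qualifier matters here can be shown with a small stand-alone sketch. It assumes an archive whose operator<< takes T& (as in the interface_oarchive signature quoted at the start of the thread); ToyArchive and the two member functions are hypothetical stand-ins, not the library's code. Inside a const member function, *this is a const object, so T is deduced as a const type.

```cpp
#include <cassert>
#include <type_traits>

// Toy stand-in for an output archive: it only records whether the
// deduced type of the saved object was const-qualified.
struct ToyArchive {
    bool last_was_const = false;

    template <class T>
    ToyArchive& operator<<(T& t) {
        (void)t;  // nothing is actually written in this sketch
        last_was_const = std::is_const<T>::value;
        return *this;
    }
};

struct my_class {
    int payload = 0;

    // Non-const member: *this is my_class&, so T is deduced as my_class.
    bool save_nonconst() {
        ToyArchive ar;
        ar << *this;
        return ar.last_was_const;
    }

    // Const member: *this is const my_class&, so T is deduced as const my_class.
    bool save_const() const {
        ToyArchive ar;
        ar << *this;
        return ar.last_was_const;
    }
};
```

A compile-time check on std::is_const<T> would therefore accept the const version and reject the non-const one, even though both save the same object.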

Robert Ramey

24 Jun

4:03 a.m.

...

Each check is only reasonable if it finds more bugs than it causes problems. We seem to disagree about the proportion for the *specific case* of the check in the serialization library.

I guess so. But you might check Joaquin's previous message regarding this.

...

...
Of course you're correct on this. But lots of people drive without seat belts too. That doesn't mean that the rest of us should be prohibited from using them.

...

I don't think the analogy is correct. I guess if you were required to refasten the belt each time you change gear, you won't be using it.

If this is happening more than occasionally, I would guess you're using the library in ways that I failed to envision. In my demos and examples it was easy to avoid the problem without casts, and the tests/demos do things like save and load within the same function, which is not the most usual case.

...

...
The default trait for non-primitive objects is "track_selectively". This means that objects will be tracked if and only if, anywhere in the source, some object of this class is serialized through a pointer. So when I'm compiling one module above and checking at compile time, I really don't know that the object will in fact be tracked. So I trap unless the serialization trait is set to "track never". To reiterate, in this case the object won't be tracked unless somewhere else it's serialized as a pointer.

...

I'm not sure this behaviour is right. It certainly matters if I save the *same* object as pointer or not. Why does it matter if I have *another* object by pointer.

...

Suppose you're saving an object with the same address twice. The possible situations are:

...

1. Both saves are via pointers. You enable tracking for this address; only one object is actually saved. User is responsible for making sure that the object does not change between saves.

...

2. First save is via pointer, the second is not by pointer. You throw pointer_conflict.

...

3. First save is not by pointer, second is by pointer. You enable tracking for this address.

...

4. Both saves are not via pointer. You don't track anything.

...

Is there anything wrong with above behaviour?

#3 -for this address-. We can't know at compile time what the addresses of the objects are. If an object of a certain type is serialized anywhere in the program through a pointer, we have to instantiate code to track pointers.

...

...
class a { X & m_x; .... };

...

And how would you deserialize this, given that references are not rebindable?

Specialization of save/load_construct_data

...

...
...
...
for(...){
    X x = *it; // create a copy
    ar << x;
}

How do we know that one of the x's saved in the loop is not serialized as a pointer somewhere else?

...

You keep a set of addresses of all saved objects.

Isn't it easier and better just to let the serialization system do it by reformulating the above as:

for(...){
    const X & x = *it;
    ar << x;
}

...

...
We have to track ALL x's because we don't know which ones if any are being tracked somewhere else. It could even be in a different module.

...

Right, you need to track all addresses while saving, but in the archive the saves from the above loop need not be marked as tracked.

Currently there is no way (and no need in my opinion) to assign tracking behavior for each invocation of the << and >> operators. If you want to suppress tracking altogether for type X you can assign the track_never trait. In that case the STATIC_ASSERT won't trap - so there you are.

...

const X x; const X& x2 = x;

...

are you saying that saving them works differently?

I'm saying that

const X &x = m_x; ar << x;

is different than

const X x = m_x; ar << x;

...

...
which we are quite comfortable with. In fact I believe

auto_ptr<const Data> finish() {
    //auto_ptr<Data> result; // whoops
    auto_ptr<const Data> result;
    // modify result
    return result;
}

expresses your intention quite well. That you're returning an auto_ptr to an object that you don't expect to be changed by anyone who gets the pointer this way.

Except that it does not compile.

...

I recall that in earlier versions users simply *could not* serialize a pointer to int. Did that change, or am I wrong?

It could probably be done by tweaking the serialization traits for int but that would ripple through the whole program. Robert Ramey

Vladimir Prus

7:33 a.m.

Robert Ramey wrote:

...

...
Each check is only reasonable if it finds more bugs than it causes problems. We seem to disagree about the proportion for the *specific case* of the check in the serialization library.

I guess so. But you might check Joaquin's previous message regarding this.

I can't find any messages from Joaquin in this thread? Maybe you can send a link or forward that message to me?

...

...
I'm not sure this behaviour is right. It certainly matters if I save the *same* object as pointer or not. Why does it matter if I have *another* object by pointer.

...
Suppose you're saving an object with the same address twice. The possible situations are:

...
1. Both saves are via pointers. You enable tracking for this address; only one object is actually saved. User is responsible for making sure that the object does not change between saves.

...
2. First save is via pointer, the second is not by pointer. You throw pointer_conflict.

...
3. First save is not by pointer, second is by pointer. You enable tracking for this address.

...
4. Both saves are not via pointer. You don't track anything.

...
Is there anything wrong with above behaviour?

#3 -for this address-. We can't know at compile time what the addresses of the objects are. If an object of a certain type is serialized anywhere in the program through a pointer, we have to instantiate code to track pointers.

Sure, you have to *instantiate code*. But you can decide at runtime if each specific address needs tracking.

...

...
...
class a { X & m_x; .... };

...
And how would you deserialize this, given that references are not rebindable?

Specialization of save/load_construct_data

If the user is required to provide special hooks here, he might as well take special care when saving reference, no? In "save_construct_data"?

...

...
...
...
...
for(...){
    X x = *it; // create a copy
    ar << x;
}

How do we know that one of the x's saved in the loop is not serialized as a pointer somewhere else?

...
You keep a set of addresses of all saved objects.

Isn't it just easier and better just to let the serialization system do it by reformulating the above as:

for(...){
    const X & x = *it;
    ar << x;
}

1. To clarify, "you keep a set.." above means "serialization library keeps".
2. If the serialization library can keep the set of saved objects, then wouldn't it be easier if serialization worked in both cases?
3. I'm getting dizzy. Does 'const' have any other effect than disabling your STATIC_ASSERT? If yes, then it does not make it easier to figure out if "one of the x's saved in the loop is serialized as a pointer somewhere else".

Inside:

X x = *it; ar << x;

you record the address of 'x' and the offset in the archive where it's saved (or some other id). When the user later does:

X* x = whatever; ar << x;

you check your set of addresses. If the address is found, you write a reference to the previous object to the archive. Looks pretty simple to me.

...

...
...
We have to track ALL x's because we don't know which ones if any are being tracked somewhere else. It could even be in a different module.

...
Right, you need to track all addresses while saving, but in the archive the saves from the above loop need not be marked as tracked.

Currently there is no way (and no need in my opinion) to assign tracking behavior for each invocation of the << and >> operators. If you want to suppress tracking altogether for type X you can assign the track_never trait. In that case the STATIC_ASSERT won't trap - so there you are.

Returning to example you gave in the first email in this thread:

...

It's very easy to write

for(...){
    X x = *it; // create a copy
    ar << x;
}

all the x's are at the same address, so if they happen to be tracked because a pointer to some X is serialized somewhere in the program, then the subsequent copies would be suppressed by tracking.

What does "the subsequent copies would be suppressed by tracking" mean? Are you saying that only one 'X' will be stored in the archive? And what will happen during deserialization? Will you load just one 'X' object, and then for each iteration of the deserialization loop, make a copy of it? Then it's very strange behaviour, and what's even more worrying, that behaviour is activated if some *unrelated* object of the same type is saved in some other module that I have no idea about. That's very fragile. Or do you expect that every given type 'X' is either serialized by value, or by pointer, but never both ways? That sounds like an artificial restriction. - Volodya

David Abrahams

1:24 p.m.

Vladimir Prus <ghost@cs.msu.su> writes:

...

...
It's very easy to write

for(...){
    X x = *it; // create a copy
    ar << x;
}

all the x's are at the same address, so if they happen to be tracked because a pointer to some X is serialized somewhere in the program, then the subsequent copies would be suppressed by tracking.

What does "the subsequent copies would be suppressed by tracking" mean? Are you saying that only one 'X' will be stored in the archive?

Yes, if (as is highly likely) the x happens to have the same address in each iteration of the loop. To reconstruct object graphs correctly (that's what tracking is for), you need to do it that way.

...

And what will happen during deserialization? Will you load just one 'X' object, and then for each iteration of the deserialization loop, make a copy of it?

Surely it depends what the deserialization loop looks like, but the serialization library itself won't make any copies: it will return a reference or pointer to that same object each time. -- Dave Abrahams Boost Consulting www.boost-consulting.com
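The suppression effect described above can be reproduced with a toy address-tracking archive. This is a hedged sketch, not Boost.Serialization: TrackingArchive, Record and save_all are hypothetical names, and the reused variable is declared outside the loop so that the address reuse is guaranteed rather than merely "highly likely".

```cpp
#include <cassert>
#include <map>
#include <vector>

// One archive entry: either the saved value, or a back-reference
// to the id of an object saved earlier from the same address.
struct Record {
    bool is_reference;
    int payload;  // the value, or the id of the earlier object
};

class TrackingArchive {
public:
    void save(const int& x) {
        auto it = ids_.find(&x);
        if (it != ids_.end()) {
            // Address already seen: only a reference is stored.
            records_.push_back(Record{true, it->second});
            return;
        }
        int id = static_cast<int>(ids_.size());
        ids_.emplace(&x, id);
        records_.push_back(Record{false, x});
    }

    const std::vector<Record>& records() const { return records_; }

private:
    std::map<const int*, int> ids_;
    std::vector<Record> records_;
};

// Saves each value through one reused variable, mimicking a loop-local copy.
std::vector<Record> save_all(const std::vector<int>& values) {
    TrackingArchive ar;
    int x = 0;  // same address on every iteration
    for (int v : values) {
        x = v;
        ar.save(x);
    }
    return ar.records();
}
```

With three distinct input values, only the first is stored as data; the other two become references to it, so a loader would see the first value three times.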

Vladimir Prus

27 Jun

6:17 a.m.

David Abrahams wrote:

...

Vladimir Prus <ghost@cs.msu.su> writes:

...
...
It's very easy to write

for(...){
    X x = *it; // create a copy
    ar << x;
}

all the x's are at the same address, so if they happen to be tracked because a pointer to some X is serialized somewhere in the program, then the subsequent copies would be suppressed by tracking.

What does "the subsequent copies would be suppressed by tracking" mean? Are you saying that only one 'X' will be stored in the archive?

Yes, if (as is highly likely) the x happens to have the same address in each iteration of the loop. To reconstruct object graphs correctly (that's what tracking is for), you need to do it that way.

If you're serializing object graphs, then you either:

1. Always store an object by pointer
2. Save it by value the first time, and by pointer the other times.

Both situations are easy to detect. But another situation is when an object with the same address is saved several times by value. In that case, it's very reasonable to do no tracking at all. Implementing this should be easy. For each stored pointer it's necessary to keep a flag, telling if:

- there was one save by value
- there were several saves by value
- there was a save by pointer.

The logic is:

- Save by value
  - if there were saves by pointer, throw pointer_conflict
  - if there were (several) saves by value, save the data, change flag to "several saves by value"
- Save by pointer
  - if there was a save by value, store id of previously saved object, change flag to "save by pointer"
  - if there were several saves by value, throw something
  - if there was a save by pointer, store id of previously saved object

I can't find anything wrong with the above logic. Of course, I might be missing something too.

- Volodya
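The per-address flag logic proposed in this message can be sketched as a small state machine. This is an illustration under assumptions, not the library's implementation: ToyTracker, SaveState and tracking_conflict are hypothetical names, the first save of an unseen address (which the message leaves implicit) is assumed to write the data, and a single exception type stands in for both "pointer_conflict" and "throw something".

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// The three flags from the message above.
enum class SaveState { OneByValue, SeveralByValue, ByPointer };

// Hypothetical exception used for both conflict cases.
struct tracking_conflict : std::runtime_error {
    explicit tracking_conflict(const std::string& what) : std::runtime_error(what) {}
};

class ToyTracker {
public:
    void save_by_value(const void* addr) {
        auto it = seen_.find(addr);
        if (it == seen_.end()) {
            seen_.emplace(addr, Entry{SaveState::OneByValue, next_id_++});
        } else if (it->second.state == SaveState::ByPointer) {
            throw tracking_conflict("value save after pointer save");
        } else {
            it->second.state = SaveState::SeveralByValue;
        }
        log_.push_back("data");  // a by-value save always writes the data
    }

    void save_by_pointer(const void* addr) {
        auto it = seen_.find(addr);
        if (it == seen_.end()) {
            // Assumption: first sighting through a pointer writes the data.
            seen_.emplace(addr, Entry{SaveState::ByPointer, next_id_++});
            log_.push_back("data");
        } else if (it->second.state == SaveState::SeveralByValue) {
            throw tracking_conflict("pointer save after several value saves");
        } else {
            it->second.state = SaveState::ByPointer;
            log_.push_back("ref:" + std::to_string(it->second.id));
        }
    }

    // What would have been written: "data" or "ref:<id>" per save.
    const std::vector<std::string>& log() const { return log_; }

private:
    struct Entry {
        SaveState state;
        std::size_t id;
    };
    std::map<const void*, Entry> seen_;
    std::size_t next_id_ = 0;
    std::vector<std::string> log_;
};
```

Because the decision depends on the order of saves per address, it is a runtime property of the save sequence, which is why the thread keeps returning to the question of whether a failure should be an exception or an assertion.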

David Abrahams

11:15 a.m.

Vladimir Prus <ghost@cs.msu.su> writes:

...

If you're serializing object graphs, then you either:

1. Always store an object by pointer
2. Save it by value the first time, and by pointer the other times.

Both situations are easy to detect. But another situation is when object with the same address is saved several times by value. In that case, it's very reasonable to do no tracking at all.

Yes, that's a reasonable option; I think it's what Peter has been advocating.

...

Implementing should be easy. For each stored pointer it's necessary to keep a flag, telling if:

- there was one save by value
- there were several saved by value
- there were save by pointer.

The logic is.

- Save by value
  - if there were saves by pointer, throw pointer_conflict
  - if there were (several) saves by value, save the data, change flag to "several saves by value"

- Save by pointer
  - if there were save by value, store id of previously saved object, change flag to "save by pointer"
  - if there were several saves by value, throw something

I'm sure you mean assert something, right? ;-)

...

- if there were save by pointer, store id of previously saved object

I can't find anything wrong with above logic. Of course, I might be missing something too.

It's certainly reasonable and predictable. Since the library can know when it is dealing with a reference member and when it is dealing with a value member, if reference members are just saved by pointer, it all holds together. -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams

24 Jun

12:41 p.m.

"Robert Ramey" <ramey@rrsd.com> writes:

...

I'm saying that

const X &x = m_x; ar << x;

is different than

const X x = m_x; ar << x;

I'm sorry, but unless you're doing something to detect that X's copy ctor is called in the 2nd case, what you just said is impossible. Aside from one copy ctor the 2 snippets are identical from the language's point-of-view. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Jeff Garland

11 May

6:19 a.m.

On Tue, 10 May 2005 11:36:27 +0400, Vladimir Prus wrote

...

Robert Ramey wrote:

...
...
...
instantiated from this code of mine:

    ofstream ofs(av[2]);
    boost::archive::binary_oarchive oa(ofs);
    oa << *data_builder.finish();

This code should be considered erroneous. The documentation in 1.32 addressed this but unfortunately the enforcement was lost in the shuffle.

Excuse me, but I refuse to accept that this code is erroneous. As I've said, this is my existing design -- the 'finish' function returns non-const pointer. The fact that somewhere else, I serialize the result of 'finish' has nothing to do with 'finish' return type, and I should not be forced to change my existing interface just because I use serialization somewhere.

...
The intention is to trap the saving of tracked non-const objects. This is to prevent users from doing something like

    for (...) {
        A a;
        ...
        ar << a; // will save the address of a - which is on the stack
    }

If a is tracked here, instances of a after the first will be stored only as reference ids. When the data is restored, all the as will be the same. Not what the programmer intended - and a bear to find. This is really the save counterpart to the load situation which required the implementation of reset_object_address.
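The failure mode described above can be reproduced with a toy address-tracking archive. This is a sketch under stated assumptions, not Boost.Serialization code: ToyTrackingArchive, Record and save_three_from_one_slot are hypothetical names. To keep the result deterministic, the sketch reuses one object declared outside the loop; a loop-local object usually lands at the same stack address on every iteration, but that is not guaranteed.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

struct A { int value; };

// One archived record: full data, or a back-reference to an earlier id.
struct Record {
    bool is_reference;
    std::size_t id;
    int value;
};

// Toy archive that tracks objects by address (hypothetical, for illustration).
class ToyTrackingArchive {
public:
    ToyTrackingArchive& operator<<(const A& a) {
        auto it = ids_.find(&a);
        if (it != ids_.end()) {
            // Address seen before: store only the reference id.
            records_.push_back(Record{true, it->second, 0});
        } else {
            std::size_t id = ids_.size();
            ids_[&a] = id;
            records_.push_back(Record{false, id, a.value});
        }
        return *this;
    }

    // Replays the records the way a loader would resolve them.
    std::vector<int> restore() const {
        std::vector<int> by_id;   // value stored for each id
        std::vector<int> out;
        for (const Record& r : records_) {
            if (!r.is_reference) by_id.push_back(r.value);
            assert(r.id < by_id.size());
            out.push_back(by_id[r.id]);
        }
        return out;
    }

private:
    std::map<const A*, std::size_t> ids_;
    std::vector<Record> records_;
};

std::vector<int> save_three_from_one_slot() {
    ToyTrackingArchive ar;
    A a{0};                       // same address on every iteration
    for (int i = 1; i <= 3; ++i) {
        a.value = i * 10;         // 10, 20, 30
        ar << a;
    }
    return ar.restore();          // every entry resolves to the first save
}
```

Three different values go in, but the restored sequence contains the first value three times, which is the "all the as will be the same" outcome.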

So, you're saying that "const" == "not allocated on stack"? I don't see why this statement is true. I can just as well do this:

void foo(const A& a) { ar << a; }

and circumvent your protection. Further, how often is it that a non-pointer object is tracked? I think it's a rare case, while saving a pointer is a common case, and making the common case inconvenient for the sake of the uncommon case does not seem right.
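A minimal sketch of that circumvention, using a toy archive whose operator<< statically asserts that its argument is const. CheckedArchive and the helper names are hypothetical; the real library check is described above as applying to tracked types only, whereas this toy applies it to everything.

```cpp
#include <type_traits>

struct A { int value; };

// Toy archive that refuses non-const arguments at compile time.
struct CheckedArchive {
    int saved = 0;
    template <class T>
    CheckedArchive& operator<<(T& t) {
        static_assert(std::is_const<T>::value,
                      "only const objects may be saved");
        (void)t;
        ++saved;
        return *this;
    }
};

// Forwarding through a const reference satisfies the check,
// wherever the object actually lives.
void save_via_const_ref(CheckedArchive& ar, const A& a) {
    ar << a;
}

int save_stack_objects() {
    CheckedArchive ar;
    for (int i = 0; i < 3; ++i) {
        A a{i};                      // non-const object on the stack
        // ar << a;                  // would trip the static_assert
        save_via_const_ref(ar, a);   // compiles: T is deduced as const A
    }
    return ar.saved;
}
```

The const check constrains the static type at the call site, not the storage of the object, so the stack object is saved all the same.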

... more stuff...

FWIW, I agree that this new change in serialization is wrong. Even though the docs warned that this was the design in the last release, it was easy to overlook since it was not enforced and even the example code did not live up to the 'everything to be serialized must be const' rule. So basically this is going to break a lot of existing code and I think for dubious benefit....

Jeff

Vladimir Prus

7:05 a.m.

Jeff Garland wrote:

...

...
... more stuff...

FWIW, I agree that this new change in serialization is wrong. Even though the docs warned that this was the design in the last release, it was easy to overlook since it was not enforced and even the example code did not live up to the 'everything to be serialized must be const' rule. So basically this is going to break a lot of existing code and I think for dubious benefit....

And additionally, the change complicates the interface very much. Not everybody wants to read all the serialization docs. With the previous version, I could say to a colleague that boost::text_oarchive is just like ostream. Now, I'd have to explain why he should write ar << const_cast<const Whatever*>(whatever); Since neither of those usages is standard for streams. I believe this is a common situation where error checking is introduced with good intentions, but that error checking just gets in the way and complicates common usage.

- Volodya

Robert Ramey

6 May

4:57 p.m.

I've just had a chance to review your patches.

...

...
split_member.hpp.

I see no problem here. Actually I have that in my next pending upload

...

...
interface_oarchive.hpp

As explained in my previous posting, I would consider this erroneous and I believe it should be backed out. Did you re-run the serialization tests on your local machine before you checked this in? I would be surprised if this change doesn't generate a number of test failures in the test_const* tests as well as array serialization on the Borland compiler. Compilers are also permitted to convert a const to a non-const by making a copy (although I haven't seen this) so I think this is wrong for that reason as well.
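For readers following the interface_oarchive.hpp dispute, the language rule behind the original complaint (an rvalue passed to operator<<(T&)) can be sketched with two toy archives. RefArchive, ConstRefArchive and can_save are hypothetical names, and the sketch does not reproduce the actual patch, whose contents are not shown in this thread.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Toy archive whose operator<< takes T&, as in the signature quoted
// in the error message at the start of the thread.
struct RefArchive {
    template <class T>
    RefArchive& operator<<(T&) { return *this; }
};

// Toy archive whose operator<< takes const T& instead.
struct ConstRefArchive {
    template <class T>
    ConstRefArchive& operator<<(const T&) { return *this; }
};

// Detects at compile time whether "archive << argument" is well-formed.
template <class Ar, class Arg, class = void>
struct can_save : std::false_type {};

template <class Ar, class Arg>
struct can_save<Ar, Arg,
                decltype(void(std::declval<Ar&>() << std::declval<Arg>()))>
    : std::true_type {};

// A non-const rvalue (e.g. the result of lexical_cast<std::string>).
bool ref_accepts_rvalue()       { return can_save<RefArchive, std::string>::value; }
// A non-const lvalue.
bool ref_accepts_lvalue()       { return can_save<RefArchive, std::string&>::value; }
// A const lvalue: T is deduced as const std::string.
bool ref_accepts_const_lvalue() { return can_save<RefArchive, const std::string&>::value; }
// const T& binds to temporaries as well.
bool cref_accepts_rvalue()      { return can_save<ConstRefArchive, std::string>::value; }
```

With T&, lvalues (const or not) are accepted and a non-const temporary is rejected; with const T&, the temporary is accepted but the archive can no longer see whether the caller's object was const, which is what the const-ness check relies on.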

...

...
iserializer.hpp

This is somewhat puzzling to me. Actually I suspect it is a secondary effect from the above change. The issue of const-ness has resulted in tweaks in the archive overrides. The above change would have subtle effects in archives derived from basicarchive. This is due to the CRTP design. After the above change is backed out, this should be checked again.

Robert Ramey
