
It will be Wednesday before I can start pulling the 1.44 beta together. In the meantime, does anyone have any serious issues we need to tackle before the beta? Thanks, --Beman

I have one that I'm working on right now. I made a mistake in release 1.42 and 1.43 which will mean that archives created by these versions won't be readable by subsequent versions of the serialization library. I've had a lot of problems with this but with the help of interested parties have been making progress. I don't know if it'll be ready by Wednesday, but I'm just letting you know. Robert Ramey Beman Dawes wrote:
It will be Wednesday before I can start pulling the 1.44 beta together. In the meantime, does anyone have any serious issues we need to tackle before the beta?
Thanks,
--Beman _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

On Jul 19, 2010, at 12:36 PM, Robert Ramey wrote:
I have one that I'm working on right now.
I made a mistake in release 1.42 and 1.43 which will mean that archives created by these versions won't be readable by subsequent versions of the serialization library. I've had a lot of problems with this but with the help of interested parties have been making progress.
Hi Robert, I don't see how anything could stop you from making them work with future versions of serialization unless somehow the archives weren't versioned. Weren't they? -- David Abrahams BoostPro Computing http://boostpro.com

David Abrahams wrote:
On Jul 19, 2010, at 12:36 PM, Robert Ramey wrote:
I have one that I'm working on right now.
I made a mistake in release 1.42 and 1.43 which will mean that archives created by these versions won't be readable by subsequent versions of the serialization library. I've had a lot of problems with this but with the help of interested parties have been making progress.
Hi Robert,
I don't see how anything could stop you from making them work with future versions of serialization unless somehow the archives weren't versioned. Weren't they?
Here's the mistake I made. I have a library version # in the header of every archive. In preparing 1.42 I made some "minor" changes to suppress level 4 warnings. A number of warnings were of the nature of moving data between different-sized types. So I used integer traits to specify types of the "right size" and diminish/eliminate the warnings. All in all I felt that the effort was worthwhile in that it eliminated almost all the warnings and left visible those which should be visible (very few). In the course of this I fixed a couple of previously undetected bugs.

So 1.42 was released and everyone was happy with it. Until it was discovered that a few type changes had broken the ability to read previous binary_?archives. OK, no problem - just wait until 1.43 comes out. EXCEPT I had neglected to bump the library version #. Doh!!! THAT is the real problem.

I've got a solution to fix the problem, but it'll take a couple of days to check in - test with those who have helped me find it, update the docs, etc. There are a couple of options:

a) roll back to 1.41
b) roll back to 1.43 - i.e. don't update
c) hang on a couple of days

In contrast to other problems, this one will just keep getting worse (more archives will need a "fix" applied) as long as nothing is done. It's like the oil spill. So, I'm very interested in getting this in asap.

Robert Ramey
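The failure mode Robert describes can be sketched in a few lines. This is purely hypothetical illustration code, not the actual Boost.Serialization implementation: the archive records a library version number in its header, and the loader branches on that number to pick the right on-disk layout. If the format changes but the recorded version does not, the loader takes the wrong branch on data written by the changed release.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Toy archive header carrying the library version number.
struct Header { std::uint16_t library_version; };

std::string save(const Header& h, const std::string& payload) {
    std::ostringstream os;
    os << h.library_version << ' ' << payload;
    return os.str();
}

std::string load(const std::string& archive) {
    std::istringstream is(archive);
    Header h;
    std::string payload;
    is >> h.library_version >> payload;
    // A fixup for the old format can only be applied if the version
    // number was actually bumped when the format changed.
    if (h.library_version < 5)
        return "fixup:" + payload;  // compensate for the legacy layout
    return payload;
}
```

If 1.42 had bumped the number along with the type changes, readers could have applied the fixup branch automatically; without the bump, old and new archives are indistinguishable from the header alone.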

[sent from tiny mobile device] On Jul 19, 2010, at 5:48 PM, "Robert Ramey" <ramey@rrsd.com> wrote:
I had neglected to bump the library version # . Doh!!!.
THAT is the real problem.
That'll do it. Consider inserting the boost version number as well; that gets bumped by the release managers so you can't forget ;-)

David Abrahams wrote:
[sent from tiny mobile device]
On Jul 19, 2010, at 5:48 PM, "Robert Ramey" <ramey@rrsd.com> wrote:
I had neglected to bump the library version # . Doh!!!.
THAT is the real problem.
That'll do it. Consider inserting the boost version number as well; that gets bumped by the release managers so you can't forget ;-)
actually the library version # has been bumped only when there was a change in file format - which was less frequent than boost releases.

lol - hold on to your hat - here we go.

But I think this question raises a much more interesting aspect of boost. I think the concept of a boost release version is rapidly becoming out of date and irrelevant. Everything points to a future of less coupling between libraries. In the future, I think each library will have its own interface version number and a separate implementation version number. A "boost release" will only be a list of library version numbers - and of course a snapshot of a set of libraries. In general, a library won't be able to know what boost release it will be part of.

Another way of saying this is that to me a boost release is "deployment". In the future there will be different "deployments": TRx subset, reviewed, current and maintained, etc. So the idea of putting the boost release into a library would be circular and not actually doable.

Robert Ramey

On 19 Jul 2010, at 22:22, Robert Ramey wrote:
David Abrahams wrote:
[sent from tiny mobile device]
On Jul 19, 2010, at 5:48 PM, "Robert Ramey" <ramey@rrsd.com> wrote:
I had neglected to bump the library version # . Doh!!!.
THAT is the real problem.
That'll do it. Consider inserting the boost version number as well; that gets bumped by the release managers so you can't forget ;-)
actually the library version # has been bumped only when there was a change in file format - which was less frequent than boost releases.
lol - hold on to your hat - here we go.
But I think this question raises a much more interesting aspect of boost.
I think the concept of a boost release version is rapidly becoming out of date and irrelevant. Everything points to a future of less coupling between libraries. In the future, I think each library will have its own interface version number and a separate implementation version number. A "boost release" will only be a list of library version numbers - and of course a snapshot of a set of libraries. In general, a library won't be able to know what boost release it will be part of.
Another way of saying this is that to me a boost release is "deployment". In the future there will be different "deployments": TRx subset, reviewed, current and maintained, etc. So the idea of putting the boost release into a library would be circular and not actually doable.
Hi Robert, What you can do is just bump the version number with every release of Boost.Serialization, even if you think that nothing has changed. Matthias

Hi Robert,
What you can do is just bump the version number with every release of Boost.Serialization, even if you think that nothing has changed.
That would be prudent and I'm considering it.

Actually, I think the whole issue of the library version # is an interesting one which merits consideration. I see a couple of version #'s:

a) api version - like library_version is now
b) implementation version - bumped on every library change
c) ABI version

The inclusion of these version #'s in the library would permit library user code to verify that dynamically loaded or statically linked libraries are compatible with the libraries being used. So program x would include code which checks the api level and can abort if the linked-in library isn't recent enough. Finding "dll hell" bugs like this can be a nightmare.

Robert Ramey
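A minimal sketch of the check Robert proposes, under the assumption that the library exposes its version both as a header macro and as a compiled-in function (all names here are hypothetical, not real Boost APIs): the macro reflects the headers the user compiled against, while the function reflects the binary actually linked in, so comparing the two catches header/library mismatches at startup.

```cpp
// What the user's translation unit sees in the (hypothetical) headers.
#define MYLIB_API_VERSION 3

#include <cassert>

namespace mylib {
    // In a real build this definition lives in the compiled library, so at
    // run time it reports the version of the binary actually linked in,
    // which may differ from the headers used at compile time.
    int linked_api_version() { return 3; }
}

// True when headers and linked binary agree; a real program would call
// this at startup and abort on mismatch rather than debug "dll hell" later.
bool check_api_compatibility() {
    return mylib::linked_api_version() == MYLIB_API_VERSION;
}
```

The same pattern extends to the implementation and ABI numbers Robert lists; each just becomes another macro/function pair to compare.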

On 20 Jul 2010, at 14:34, Robert Ramey wrote:
Hi Robert,
What you can do is just bump the version number with every release of Boost.Serialization, even if you think that nothing has changed.
That would be prudent and I'm considering it.
Actually, I think the whole issue of library version # is an interesting one which merits consideration. I see a couple of version #'s
a) api version - like library_version is now
b) implementation version - bumped on every library change
c) ABI version
The inclusion of these version # in the library would permit library user code to verify that dynamically loaded or statically linked libraries are compatible with the libraries being used. So program x would include code which checks the api level and can abort if the linked in library isn't recent enough. Finding "dll hell" bugs like this can be a nightmare.
Robert, another related issue is whether I need to update Boost.MPI on the release branch. This will depend on whether you push your recent changes to the release branch or not. Can you please keep me updated? Matthias

Matthias Troyer wrote:
Robert, another related issue is whether I need to update Boost.MPI on the release branch. This will depend on whether you push your recent changes to the release branch or not. Can you please keep me updated?
Most of my recent changes have already been pushed to the release branch. I had just assumed that your changes were already there. I had been checking the trunk results of the MPI library and they looked OK as far as I could tell. Only now do I realize that I hadn't been checking the ones in the release branch and of course they fail. So mpi trunk changes need to be merged over.
From a very cursory look at your changes, I think that you might have made more changes than are actually required so you might want to double check them.
Robert Ramey
Matthias

On Jul 20, 2010, at 3:47 PM, Robert Ramey wrote:
Matthias Troyer wrote:
Robert, another related issue is whether I need to update Boost.MPI on the release branch. This will depend on whether you push your recent changes to the release branch or not. Can you please keep me updated?
Most of my recent changes have already been pushed to the release branch. I had just assumed that your changes were already there. I had been checking the trunk results of the MPI library and they looked OK as far as I could tell.
Actually, Robert, your serialization changes broke the Sandia-sun tester (their MPI tests used to work correctly). I'd really appreciate it if you'd fix these on trunk. -- Noel

On 20 Jul 2010, at 15:19, Belcourt, Kenneth wrote:
On Jul 20, 2010, at 3:47 PM, Robert Ramey wrote:
Matthias Troyer wrote:
Robert, another related issue is whether I need to update Boost.MPI on the release branch. This will depend on whether you push your recent changes to the release branch or not. Can you please keep me updated?
Most of my recent changes have already been pushed to the release branch. I had just assumed that your changes were already there. I had been checking the trunk results of the MPI library and they looked OK as far as I could tell.
Actually, Robert, your serialization changes broke the Sandia-sun tester (their MPI tests used to work correctly). I'd really appreciate it if you'd fix these on trunk.
I took a quick look at those failures. Many seem to be runtime failures without any output - that is hard to debug remotely. The Sun test failures, though, can all be traced to a change in strong typedefs on the trunk. The "strong typedef'ed" classes no longer have a default constructor, and version_type now has a private default constructor. Adding those constructors again should solve the Sun regressions. Matthias

Matthias Troyer wrote:
On 20 Jul 2010, at 15:19, Belcourt, Kenneth wrote:
On Jul 20, 2010, at 3:47 PM, Robert Ramey wrote:
Matthias Troyer wrote:
Robert, another related issue is whether I need to update Boost.MPI on the release branch. This will depend on whether you push your recent changes to the release branch or not. Can you please keep me updated?
Most of my recent changes have already been pushed to the release branch. I had just assumed that your changes were already there. I had been checking the trunk results of the MPI library and they looked OK as far as I could tell.
Actually, Robert, your serialization changes broke the Sandia-sun tester (their MPI tests used to work correctly). I'd really appreciate it if you'd fix these on trunk.
I took a quick look at those failures. Many seem to be runtime failures without any output - that is hard to debug remotely. The Sun test failures, though, can all be traced to a change in strong typedefs on the trunk. The "strong typedef'ed" classes no longer have a default constructor, and version_type now has a private default constructor. Adding those constructors again should solve the Sun regressions.
The reason I made this change was that the implicit conversions were making it very hard to track what was going on and have any confidence, from looking at the code, of what was happening. Remember that the primitive types are not all the same across all machines - so looking at the code is not enough to know what it's actually doing in this case. So I don't think this is a good fix.

I'm puzzled as to why only the Sun compiler manifests this issue.

The last checkin addressed the fact that class_id_type (and class_id_optional_type) had the same issue as version_type. So I'm thinking that whatever was done for version_type should be done for these types as well.

Robert Ramey
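The hazard Robert is guarding against can be shown with a toy example (hypothetical types, not the real serialization headers): a plain integer typedef converts silently between differently-sized types, so a narrowing site compiles without complaint on every platform even though the sizes involved vary across machines; a strong typedef with only explicit constructors forces each conversion to be written out where it happens.

```cpp
#include <cassert>
#include <cstdint>

// With a plain typedef, values convert implicitly between integer types of
// different sizes, so truncation sites compile silently.
typedef std::uint32_t version_plain;

std::uint8_t narrow(version_plain v) {
    return static_cast<std::uint8_t>(v);  // silent data loss for v > 255
}

// A strong typedef makes every conversion explicit: there is no implicit
// construction from raw integers, so accidental mixing fails to compile
// (e.g. `version_strong s = 511;` is rejected, `version_strong s(511);`
// is the deliberate, visible form).
struct version_strong {
    std::uint32_t v;
    explicit version_strong(std::uint32_t x) : v(x) {}
};
```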

On 20 Jul 2010, at 15:47, Robert Ramey wrote:
Matthias Troyer wrote:
Robert, another related issue is whether I need to update Boost.MPI on the release branch. This will depend on whether you push your recent changes to the release branch or not. Can you please keep me updated?
Most of my recent changes have already been pushed to the release branch. I had just assumed that your changes were already there. I had been checking the trunk results of the MPI library and they looked OK as far as I could tell. Only now do I realize that I hadn't been checking the ones in the release branch and of course they fail. So mpi trunk changes need to be merged over.
I would rather wait for your final version of serialization in 1.44 and get the newly reported regressions on the trunk resolved before merging something to the release branch that might break again when you move more changes to the release branch or roll some back. I will wait for the advice of the release manager. Matthias

On 7/20/2010 11:24 PM, Matthias Troyer wrote:
On 20 Jul 2010, at 15:47, Robert Ramey wrote:
Matthias Troyer wrote:
Robert, another related issue is whether I need to update Boost.MPI on the release branch. This will depend on whether you push your recent changes to the release branch or not. Can you please keep me updated?
Most of my recent changes have already been pushed to the release branch. I had just assumed that your changes were already there. I had been checking the trunk results of the MPI library and they looked OK as far as I could tell. Only now do I realize that I hadn't been checking the ones in the release branch and of course they fail. So mpi trunk changes need to be merged over.
I would rather wait for your final version of serialization in 1.44 and get the newly reported regressions on the trunk resolved before merging something to the release branch that might break again when you move more changes to the release branch or roll some back. I will wait for the advice of the release manager.
I may have missed something. What is left for Robert to merge to release? Can someone pls summarize the current state of MPI and serialization on release for me? -- Eric Niebler BoostPro Computing http://www.boostpro.com

On 20 Jul 2010, at 22:27, Eric Niebler wrote:
On 7/20/2010 11:24 PM, Matthias Troyer wrote:
On 20 Jul 2010, at 15:47, Robert Ramey wrote:
Matthias Troyer wrote:
Robert, another related issue is whether I need to update Boost.MPI on the release branch. This will depend on whether you push your recent changes to the release branch or not. Can you please keep me updated?
Most of my recent changes have already been pushed to the release branch. I had just assumed that your changes were already there. I had been checking the trunk results of the MPI library and they looked OK as far as I could tell. Only now do I realize that I hadn't been checking the ones in the release branch and of course they fail. So mpi trunk changes need to be merged over.
I would rather wait for your final version of serialization in 1.44 and get the newly reported regressions on the trunk resolved before merging something to the release branch that might break again when you move more changes to the release branch or roll some back. I will wait for the advice of the release manager.
I may have missed something. What is left for Robert to merge to release? Can someone pls summarize the current state of MPI and serialization on release for me?
The state of serialization has been mentioned by Robert a few days ago: On 19 Jul 2010, at 10:36, Robert Ramey wrote:
I have one that I'm working on right now.
I made a mistake in release 1.42 and 1.43 which will mean that archives created by these versions won't be readable by subsequent versions of the serialization library. I've had a lot of problems with this but with the help of interested parties have been making progress.
I don't know if it'll be ready by wednesday, but I'm just letting you know.
The state of Boost.MPI is that it is broken since Robert introduced some fundamental changes in the types used to store version, class, and object information. Those classes were completely redesigned, and the redesign caused Boost.MPI to fail to compile on all platforms. On the trunk I have introduced some workarounds that seem to address the problem on most machines but not all. One issue is the newly removed default constructor of those types, which causes compile time failures on Sun (and should actually cause problems on all compilers that strictly conform to the standard).

In addition, Noel reported that the Sandia tests fail, but it is hard for me to say anything about that since, apart from the Sun failures mentioned above, the other failures are runtime failures and I don't see what fails at runtime.

I did not move the workarounds for the solved issues from the trunk to the release branch yet since Robert mentioned that he might roll back to the 1.43 state for 1.44.

Matthias

On Jul 21, 2010, at 1:10 PM, Matthias Troyer wrote:
On 20 Jul 2010, at 22:27, Eric Niebler wrote:
On 7/20/2010 11:24 PM, Matthias Troyer wrote:
On 20 Jul 2010, at 15:47, Robert Ramey wrote:
Matthias Troyer wrote:
Robert, another related issue is whether I need to update Boost.MPI on the release branch. This will depend on whether you push your recent changes to the release branch or not. Can you please keep me updated?
Most of my recent changes have already been pushed to the release branch. I had just assumed that your changes were already there. I had been checking the trunk results of the MPI library and they looked OK as far as I could tell. Only now do I realize that I hadn't been checking the ones in the release branch and of course they fail. So mpi trunk changes need to be merged over.
I would rather wait for your final version of serialization in 1.44 and get the newly reported regressions on the trunk resolved before merging something to the release branch that might break again when you move more changes to the release branch or roll some back. I will wait for the advice of the release manager.
I may have missed something. What is left for Robert to merge to release? Can someone pls summarize the current state of MPI and serialization on release for me?
The state of serialization has been mentioned by Robert a few days ago:
On 19 Jul 2010, at 10:36, Robert Ramey wrote:
I have one that I'm working on right now.
I made a mistake in release 1.42 and 1.43 which will mean that archives created by these versions won't be readable by subsequent versions of the serialization library. I've had a lot of problems with this but with the help of interested parties have been making progress.
I don't know if it'll be ready by wednesday, but I'm just letting you know.
The state of Boost.MPI is that Boost.MPI is broken since Robert introduced some fundamental changes in the types used to store version, class, and object information. Those classes were completely redesigned and the redesign caused Boost.MPI to fail to compile on all platforms. On the trunk I have introduced some workarounds that seem to address the problem on most machines but not all. One issue is the newly removed default constructor of those types that causes compile time failures on Sun (and should actually cause problems on all compilers that strictly conform to the standard).
In addition, Noel reported that the sandia tests fail, but it is hard for me to say anything about that since, apart from the Sun failures mentioned above, the other failures are runtime failures but I don't see what fails at runtime.
Sorry, I should clarify. The Sun tests are compile failures. The other failures are likely caused by the tests running longer than the default time limit (300 seconds). These failures tend to be intermittent due to system load, file system responsiveness, etc... and can be ignored for the most part. If I find any actual runtime failures that aren't load dependent, i.e. issues with the MPI or serialization libraries, I'll certainly point them out. If Matthias or Robert can fix the Sun compilation issues, I'll be quite content. -- Noel

On 21 Jul 2010, at 13:35, Belcourt, Kenneth wrote:
On Jul 21, 2010, at 1:10 PM, Matthias Troyer wrote:
The state of Boost.MPI is that Boost.MPI is broken since Robert introduced some fundamental changes in the types used to store version, class, and object information. Those classes were completely redesigned and the redesign caused Boost.MPI to fail to compile on all platforms. On the trunk I have introduced some workarounds that seem to address the problem on most machines but not all. One issue is the newly removed default constructor of those types that causes compile time failures on Sun (and should actually cause problems on all compilers that strictly conform to the standard).
In addition, Noel reported that the sandia tests fail, but it is hard for me to say anything about that since, apart from the Sun failures mentioned above, the other failures are runtime failures but I don't see what fails at runtime.
Sorry, I should clarify. The Sun tests are compile failures. The other failures are likely caused by the tests running longer than the default time limit (300 seconds). These failures tend to be intermittent due to system load, file system responsiveness, etc... and can be ignored for the most part. If I find any actual runtime failures that aren't load dependent, i.e. issues with the MPI or serialization libraries, I'll certainly point them out.
If Matthias or Robert can fix the Sun compilation issues, I'll be quite content.
Robert should be able to fix it by reintroducing a default constructor for his "strongly typedef'ed" classes, and by making one private default constructor public again. Matthias

Matthias Troyer wrote:
If Matthias or Robert can fix the Sun compilation issues, I'll be quite content.
Robert should be able to fix it by reintroducing default constructor for his "strongly typedef'ed" classes, and by making one private default constructor public again.
Hmmm - I'm not so sure about that but I'll take a look at it. Robert Ramey

Robert Ramey wrote:
Matthias Troyer wrote:
If Matthias or Robert can fix the Sun compilation issues, I'll be quite content.
Robert should be able to fix it by reintroducing default constructor for his "strongly typedef'ed" classes, and by making one private default constructor public again.
Hmmm - I'm not so sure about that but I'll take a look at it.
I'm looking at this now. I see:

item_version_type - private default constructor
version_type - private default constructor
class_id_type - public default constructor

I suspect that I made no conscious decision to make the private ones private. I can make those public if you think that would help. I don't see how it would though.

also I have

BOOST_ARCHIVE_STRONG_TYPEDEF(class_id_type, class_id_optional_type)

which has a public constructor of class_id_optional_type (which I suspect is the culprit). Since class_id_optional_type is derived from class_id_type (via strong_typedef), that would explain why class_id_type has a public default constructor and the others don't.

If I'm understanding this correctly, we're dealing with a compilation error with the Sun compiler whose error message has yet to be revealed to us. I think we need better information to know what the correct fix is.

Robert Ramey

On 21 Jul 2010, at 23:26, Robert Ramey wrote:
Robert Ramey wrote:
Matthias Troyer wrote:
If Matthias or Robert can fix the Sun compilation issues, I'll be quite content.
Robert should be able to fix it by reintroducing default constructor for his "strongly typedef'ed" classes, and by making one private default constructor public again.
Hmmm - I'm not so sure about that but I'll take a look at it.
I'm looking at this now. I see
item_version_type - private default constructor
version_type - private default constructor
class_id_type - public default constructor
I suspect that I made no conscious decision to make the private ones private. I can make those public if you think that would help. I don't see how it would though.
also I have
BOOST_ARCHIVE_STRONG_TYPEDEF(class_id_type, class_id_optional_type)
which has a public constructor of class_id_optional_type (which I suspect is the culprit). Since class_id_optional_type is derived from class_id_type (via strong_typedef), that would explain why class_id_type has a public default constructor and the others don't.
If I'm understanding this correctly, we're dealing with a compilation error with the sun compiler whose error message has yet to be revealed to us. I think we need better information to know what the correct fix is.
The problem comes from this line in boost/mpi/datatype_fwd.hpp:

template<typename T> MPI_Datatype get_mpi_datatype(const T& x = T());

The compilation error is in the regression logs on the web page and, in the case of version_type, complains that the default constructor is private. In the case of class_id_optional_type, the problem is that the BOOST_ARCHIVE_STRONG_TYPEDEF in archive/basic_archive.hpp does not define a default constructor on the trunk, while it did in 1.43 and before. Strangely, the BOOST_STRONG_TYPEDEF in serialization/strong_typedef.hpp still does have a default constructor.

Adding the missing default constructor and making the private ones public will solve the Sun issues.

Matthias
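The interaction Matthias describes can be reproduced with a toy strong-typedef macro (a simplified stand-in, not the actual BOOST_ARCHIVE_STRONG_TYPEDEF): a default argument of the form `T()` instantiates the default constructor, so removing or hiding that constructor breaks every call site that relies on the default.

```cpp
#include <cassert>

// Simplified stand-in for a strong-typedef macro. If the `D() : t() {}`
// line were removed (as happened on the trunk), the default argument
// `T()` in needs_default below would fail to compile, exactly the error
// the Sun tester reports for get_mpi_datatype.
#define STRONG_TYPEDEF(T, D)                    \
    struct D {                                  \
        T t;                                    \
        D() : t() {}                            \
        explicit D(const T& v) : t(v) {}        \
        operator const T&() const { return t; } \
    };

STRONG_TYPEDEF(int, class_id)

// Mirrors the shape of get_mpi_datatype in boost/mpi/datatype_fwd.hpp:
// the default argument requires T to be default constructible at the
// point of instantiation.
template<typename T>
int needs_default(const T& x = T()) { return static_cast<int>(x); }
```

Calling `needs_default<class_id>()` works only because the macro emits the default constructor; a private or missing one turns every defaulted call into a compile error.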

Matthias Troyer wrote:
On 21 Jul 2010, at 23:26, Robert Ramey wrote:
Robert Ramey wrote:
Matthias Troyer wrote:
If Matthias or Robert can fix the Sun compilation issues, I'll be quite content.
Robert should be able to fix it by reintroducing default constructor for his "strongly typedef'ed" classes, and by making one private default constructor public again.
Hmmm - I'm not so sure about that but I'll take a look at it.
I'm looking at this now. I see
item_version_type - private default constructor
version_type - private default constructor
class_id_type - public default constructor
I suspect that I made no conscious decision to make the private ones private. I can make those public if you think that would help. I don't see how it would though.
also I have
BOOST_ARCHIVE_STRONG_TYPEDEF(class_id_type, class_id_optional_type)
which has a public constructor of class_id_optional_type (which I suspect is the culprit). Since class_id_optional_type is derived from class_id_type (via strong_typedef), that would explain why class_id_type has a public default constructor and the others don't.
If I'm understanding this correctly, we're dealing with a compilation error with the sun compiler whose error message has yet to be revealed to us. I think we need better information to know what the correct fix is.
The problem comes from this line in boost/mpi/datatype_fwd.hpp:
template<typename T> MPI_Datatype get_mpi_datatype(const T& x = T());
The compilation error is in the regression logs on the web page and in case of version_type complains that the default constructor is private. In case of class_id_optional_type the problem is that the BOOST_ARCHIVE_STRONG_TYPEDEF in archive/basic_archive.hpp does not define a default constructor on the trunk, while it did in 1.43 and before. Strangely the BOOST_STRONG_TYPEDEF in serialization/strong_typedef.hpp still does have a default constructor.
Adding the missing default constructor and making the private ones public will solve the Sun issues.
I've looked at this in a little more detail and I've got a few questions.

Why does MPI (or any other library) need to construct an object of type "boost::archive::version_type"? The reason I ask is that I'm concerned that a reason for doing this might be to get the size of the type. The class name "skeleton" suggests this. The problem is that the version_type used internally by the archive class is different from the one stored in the file. So making this change would just hide a problem. In fact, I think this (or something like it) is what broke MPI in the first place.

Robert Ramey

On 22 Jul 2010, at 11:32, Robert Ramey wrote:
Matthias Troyer wrote:
On 21 Jul 2010, at 23:26, Robert Ramey wrote:
Robert Ramey wrote:
Matthias Troyer wrote:
If Matthias or Robert can fix the Sun compilation issues, I'll be quite content.
Robert should be able to fix it by reintroducing default constructor for his "strongly typedef'ed" classes, and by making one private default constructor public again.
Hmmm - I'm not so sure about that but I'll take a look at it.
I'm looking at this now. I see
item_version_type - private default constructor
version_type - private default constructor
class_id_type - public default constructor
I suspect that I made no conscious decision to make the private ones private. I can make those public if you think that would help. I don't see how it would though.
also I have
BOOST_ARCHIVE_STRONG_TYPEDEF(class_id_type, class_id_optional_type)
which has a public constructor of class_id_optional_type (which I suspect is the culprit). Since class_id_optional_type is derived from class_id_type (via strong_typedef), that would explain why class_id_type has a public default constructor and the others don't.
If I'm understanding this correctly, we're dealing with a compilation error with the sun compiler whose error message has yet to be revealed to us. I think we need better information to know what the correct fix is.
The problem comes from this line in boost/mpi/datatype_fwd.hpp:
template<typename T> MPI_Datatype get_mpi_datatype(const T& x = T());
The compilation error is in the regression logs on the web page and in case of version_type complains that the default constructor is private. In case of class_id_optional_type the problem is that the BOOST_ARCHIVE_STRONG_TYPEDEF in archive/basic_archive.hpp does not define a default constructor on the trunk, while it did in 1.43 and before. Strangely the BOOST_STRONG_TYPEDEF in serialization/strong_typedef.hpp still does have a default constructor.
Adding the missing default constructor and making the private ones public will solve the Sun issues.
I've looked at this in a little more detail and I've got a few questions:
Why does MPI (or any other library) need to construct an object of type "boost::archive::version_type"? The reason I ask is that I'm concerned that a reason for doing this might be to get the size of the type. The class name skeleton suggests this. The problem is that the version_type used internally by the archive class is different than the one stored in the file. So making this change would just hide a problem. In fact, I think this (or something like it) is what broke MPI in the first place.
No, this is not what caused the break, and making these default constructible again does not "hide a problem". But indeed we need the size of the type IN MEMORY to be able to send it over the net, and we need to default construct the object to receive into it. So far those types have been default constructible, but you have now changed that - this is what causes the compilation errors. Please recall that we send directly from memory to memory via the network and don't use files or the representation in files. Matthias

Matthias Troyer wrote:
On 22 Jul 2010, at 11:32, Robert Ramey wrote:
No, this is not what caused the break, and making these default constructible again does not "hide a problem".
But indeed we need the size of the type IN MEMORY to be able to send it over the net, and we need to default construct the object to receive into it. So far those types have been default constructible, but you have now changed that - this is what causes the compilation errors.
Please recall that we send directly from memory to memory via the network and don't use files or the representation in files.
One other question that I forgot to ask: why does this show up only on the Sun compiler? I'm only seeing the compiler error message:

"/opt/sunstudio12.1/bin/CC" +d -library=stlport4 -features=tmplife -features=tmplrefstatic -g -erroff=%none -m64 -KPIC -DBOOST_ALL_NO_LIB=1 -I".." -I"/opt/SUNWhpc/HPC8.1/sun/include" -I"/opt/SUNWhpc/HPC8.1/sun/include/openmpi" -c -o "/scratch2/kbelco/boost/results/boost/bin.v2/libs/mpi/test/all_gather_test-1.test/sun-5.10/debug/address-model-64/stdlib-sun-stlport/all_gather_test.o" "../libs/mpi/test/all_gather_test.cpp"

"../boost/mpi/datatype_fwd.hpp", line 28: Error: Could not find a match for boost::archive::class_id_optional_type::class_id_optional_type() needed in __dflt_argK().
"../boost/mpi/collectives/broadcast.hpp", line 134: Where: While instantiating "boost::mpi::detail::broadcast_impl<std::string>(const boost::mpi::communicator&, std::string *, int, int, mpl_::bool_<0>)".
"../boost/mpi/collectives/broadcast.hpp", line 134: Where: Instantiated from boost::mpi::broadcast<std::string>(const boost::mpi::communicator&, std::string *, int, int).
"../boost/mpi/collectives/all_gather.hpp", line 44: Where: Instantiated from boost::mpi::detail::all_gather_impl<std::string>(const boost::mpi::communicator&, const std::string *, int, std::string *, mpl_::bool_<0>).
"../boost/mpi/collectives/all_gather.hpp", line 68: Where: Instantiated from boost::mpi::all_gather<std::string>(const boost::mpi::communicator&, const std::string &, std::vector<std::string>&).
"../libs/mpi/test/all_gather_test.cpp", line 40: Where: Instantiated from all_gather_test<string_generator>(const boost::mpi::communicator&, string_generator, const char*).
"../libs/mpi/test/all_gather_test.cpp", line 105: Where: Instantiated from non-template code.
Warning: A reference return value should be an lvalue (if the value of this function is used, the result is unpredictable).
where the offending line 28 of datatype_fwd.hpp is

    template<typename T> MPI_Datatype get_mpi_datatype(const T& x = T());

Also note the Warning. Are you sure that there is no way around the mpi library requiring a default constructor which it should never invoke? That makes the mpi library dependent on an internal detail of the serialization library. Is that the best/only way to handle this? Robert Ramey

On 22 Jul 2010, at 12:47, Robert Ramey wrote:
Are you sure that there is no way around the mpi library requiring a default constructor which it should never invoke? That makes the mpi library dependent on an internal detail of the serialization library. Is that the best/only way to handle this?
The other compilers seem to optimize away the constructor of an unused argument, while the Sun compiler insists on instantiating it. We can redesign the MPI library to work around that breaking change in your library, but I then wonder what you will break next. I totally disagree with your statements that we depend on internal details of Boost.Serialization. Boost.Serialization does publish an (incomplete!) archive concept, and you did intend that others can extend it with new archive classes. Thus the types you use to serialize version information, etc., and the list of "primitive types" which any archive has to explicitly support belong to the public interface of your library. It would be good if you could add information to the documentation that states this list of types and the concepts they satisfy. In the absence of that information I can validly assume that the public members of version_type, etc., are what I can use and will not change. The moment you allow new archive classes to be written, version_type, etc., is no longer an internal detail! Matthias

Matthias Troyer wrote:
On 22 Jul 2010, at 12:47, Robert Ramey wrote:
Are you sure that there is no way around the mpi library requiring a default constructor which it should never invoke? That makes the mpi library dependent on an internal detail of the serialization library. Is that the best/only way to handle this?
The other compilers seem to optimize away the constructor of an unused argument, while the sun compiler seems to insist on instantiating it. We can redesign the MPI library to work around that breaking change in your library but I then wonder what you will break next.
I totally disagree with your statements that we depend on internal details of Boost.Serialization. Boost.Serialization does publish an (incomplete!) archive concept and you did intend that others can extend it with new archive classes.
The published archive concept specifies the concepts that must be fulfilled by any serializable type. The serialization library includes examples of archives which depend only on the documented concepts. The documented archive concepts don't prevent other archive classes from including more functionality. Indeed, facilities such as serialization of pointers through a base class, etc., demand it. And it's true that I haven't discouraged leveraging these "extended archives".
Thus, the types you use to serialize version information, etc. and the list of "primitive types" which any archive has to explicitly support belong to the public interface of your library. It would be good if you could add information to the documentation that states this list of types and the concepts they satisfy.
In the absence of that information I can validly assume that the public members of version_type, etc., are what I can use and will not change.
hmmm - of course you can assume that - but I can't guarantee it.
The moment you allow new archive classes to be written, the version_type, etc. is no longer an internal detail!
hmmm - one could just as well say that once one depends upon undefined behavior, you can't guarantee that the code will not break. If it makes everyone feel better, I take full responsibility for this problem and accept the entire blame.

Now the point that interests me. It's become clear that the question boils down to:

a) make the indicated default constructors public, or
b) tweak the MPI code so it doesn't instantiate code it doesn't use.

I can do a) and it's not too hard, but it opens me up to this kind of problem happening again and makes it easier for errors to creep in. You can do b). I have no idea how hard it is, but if it's easy, it would resolve a valid warning and also eliminate nonsensical code (returning a reference to an object which has disappeared off the stack). Note that if get_mpi_type() were actually called and the result used, it would crash the program. I would prefer that this be done.

So I'll leave it up to you: a) or b)? Robert Ramey

On Jul 22, 2010, at 1:39 PM, Robert Ramey wrote:
a) make the indicated default constructors public or b) tweak the MPI code so it doesn't instantiate code it doesn't use.
So I'll leave it up to you a) or b)?
+1 for a) Once you've fixed the problem and we're past the release, then by all means please carefully consider all other options. -- Noel

On 22 Jul 2010, at 13:39, Robert Ramey wrote:
I totally disagree with your statements that we depend on internal details of Boost.Serialization. Boost.Serialization does publish an (incomplete!) archive concept and you did intend that others can extend it with new archive classes.
The published archive concept specifies the concepts that must be fulfilled by any serializable type. The serialization library includes examples of archives which depend only on the documented concepts.
The documented archive concepts don't prevent other archive classes from including more functionality. Indeed, facilities such as serialization of pointers through a base class, etc., demand it. And it's true that I haven't discouraged leveraging these "extended archives".
I'm talking about the requirements on a minimal archive. Those are not fully documented.
Thus, the types you use to serialize version information, etc. and the list of "primitive types" which any archive has to explicitly support belong to the public interface of your library. It would be good if you could add information to the documentation that states this list of types and the concepts they satisfy.
In the absence of that information I can validly assume that the public members of the version_type, etc. is what I can use and which will not change.
hmmm - of course you can assume that - but I can't guarantee it.
Then we need a list of concepts for these types that you do guarantee, or your next change might again break code,
The moment you allow new archive classes to be written, the version_type, etc. is no longer an internal detail!
hmmm - one could just as well say that once one depends upon undefined behavior, you can't guarantee that the code will not break.
So you are saying that implementing new archives basically depends on undefined behavior and you cannot guarantee that it will not break in the next release? It would be better to publicly define the concepts for an archive and for the primitive types if you do not want us to take the current public interface of those classes as the implicit concepts.
If it makes everyone feel better I take full responsability for this problem and accept the entire blame for this problem.
Now the point that interests me:
It's become clear that the question boils down to:
a) make the indicated default constructors public or b) tweak the MPI code so it doesn't instantiate code it doesn't use.
I can do a) and it's not too hard, but it opens me up to this kind of problem happening again and makes it easier for errors to creep in.
You can do b). I have no idea how hard it is, but if it's easy, it would resolve a valid warning and also eliminate nonsensical code (returning a reference to an object which has disappeared off the stack). Note that if get_mpi_type() were actually called and the result used, it would crash the program. I would prefer that this be done.
Actually not, this warning seems spurious. And indeed the function get_mpi_type() *is* called all the time.
So I'll leave it up to you a) or b)?
b) might break other compilers and will thus take longer to test. I think that it will be easier to do a) which should not break the other compilers but get the Sun compiler to work. Matthias

Matthias Troyer wrote:
On 22 Jul 2010, at 13:39, Robert Ramey wrote:
I totally disagree with your statements that we depend on internal details of Boost.Serialization. Boost.Serialization does publish an (incomplete!) archive concept and you did intend that others can extend it with new archive classes.
The published archive concept specifies the concepts that must be fulfilled by any serializable type. The serialization library includes examples of archives which depend only on the documented concepts.
The documented archive concepts don't prevent other archive classes from including more functionality. Indeed, facilities such as serialization of pointers through a base class, etc., demand it. And it's true that I haven't discouraged leveraging these "extended archives".
I'm talking about the requirements on a minimal archive. Those are not fully documented.
Hmmm - that would be news to me. I compiled the trivial archive from the documentation with serialization code, and the trivial archive is a model of a minimal archive. That is, I believe that serializing any type to a trivial archive will compile without error. If there is a serializable type which the trivial archive fails to work with, I would be interested in hearing about it.
In the absence of that information I can validly assume that the public members of the version_type, etc. is what I can use and which will not change.
hmmm - of course you can assume that - but I can't guarentee it.
Then we need a list of concepts for these types that you do guarantee, or your next change might again break code,
These are the ones in the archive concept.
The moment you allow new archive classes to be written, the version_type, etc. is no longer an internal detail!
hmmm - one could just as well say that once one depends upon undefined behavior, you can't guarentee that the code will not break.
So you are saying that implementing new archives basically depends on undefined behavior and you cannot guarantee that it will not break in the next release? It would be better to publicly define the concepts for an archive and for the primitive types if you do not want us to take the current public interface of those classes as the implicit concepts.
The archives included with the package - binary_archive in this case - have implementation features beyond what is required to satisfy the archive concept. In this case there is a type used called version_type. Actually, until now, version_type wasn't part of binary_archive at all - it was just one more serializable type. So any dependency on some feature of version_type wasn't really a dependency on the archive class at all, but on one of the types used to implement binary_archive. MPI serialization leverages the implementation of binary_archive. This makes a lot of sense, of course. But it means that from time to time this situation will occur. The only way to avoid it is for mpi serialization to depend only on the archive concept as described in the documentation and modeled by the trivial archive - or to never change the implementation of binary_archive again.
Note that if get_mpi_type() was actually called and the result used it would result in a program crash. I would prefer that this be done.
Actually not, this warning seems spurious. And indeed the function get_mpi_type() *is* called all the time.
but the result isn't used. If it were, it would crash the program. Returning a reference to a value on the stack and then truncating the stack with a return invalidates the result. The warning is correct and useful in my opinion.
So I'll leave it up to you a) or b)?
b) might break other compilers and will thus take longer to test. I think that it will be easier to do a) which should not break the other compilers but get the Sun compiler to work.
OK - I'll make the default constructors visible. Robert Ramey

On 22 Jul 2010, at 18:42, Robert Ramey wrote:
Matthias Troyer wrote:
On 22 Jul 2010, at 13:39, Robert Ramey wrote:
I totally disagree with your statements that we depend on internal details of Boost.Serialization. Boost.Serialization does publish an (incomplete!) archive concept and you did intend that others can extend it with new archive classes.
The published archive concept specifies the concepts that must be fulfilled by any serializable type. The serialization library includes examples of archives which depend only on the documented concepts.
The documented archive concepts don't prevent other archive classes from including more functionality. Indeed, facilities such as serialization of pointers through a base class, etc., demand it. And it's true that I haven't discouraged leveraging these "extended archives".
I'm talking about the requirements on a minimal archive. Those are not fully documented.
Hmmm - that would be news to me. I compiled the trivial archive from the documentation with serialization code, and the trivial archive is a model of a minimal archive. That is, I believe that serializing any type to a trivial archive will compile without error. If there is a serializable type which the trivial archive fails to work with, I would be interested in hearing about it.
You misunderstand. The concepts should specify:

1) what primitive types the archive needs to be able to serialize
2) what concepts these primitive types satisfy that can be used.

Do you want to say that the trivial archive should be used to deduce that?

    class trivial_oarchive {
    public:
        //////////////////////////////////////////////////////////
        // public interface used by programs that use the
        // serialization library
        typedef boost::mpl::bool_<true> is_saving;
        typedef boost::mpl::bool_<false> is_loading;
        template<class T> void register_type(){}
        template<class T> trivial_oarchive & operator<<(const T & t){
            return *this;
        }
        template<class T> trivial_oarchive & operator&(const T & t){
            return *this << t;
        }
        void save_binary(void *address, std::size_t count){}
    };

This class satisfies the following:

1) it can deal with ANY type
2) it uses NO property of the types

Clearly, those are not the requirements for all archives. If we next look at the text archives we'll find that

1) they can deal with any primitive type that can be streamed to an i/ostream
2) they use the streaming operator of those types

Are those the requirements? Again not. What we need is an explicit list of all primitive types that an archive is required to support, and a list of the properties of those types that a user can rely on. For example, all the primitive types had a default constructor and I used it, but now you say that was an implementation detail that I should not have used. How can I as an archive implementor know what I may use without risking breakages? Matthias

Matthias Troyer wrote:
On 22 Jul 2010, at 18:42, Robert Ramey wrote:
Matthias Troyer wrote:
On 22 Jul 2010, at 13:39, Robert Ramey wrote:
I totally disagree with your statements that we depend on internal details of Boost.Serialization. Boost.Serialization does publish an (incomplete!) archive concept and you did intend that others can extend it with new archive classes.
The published archive concept specifies the concepts that must be fulfilled by any serializable type. The serialization library includes examples of archives which depend only on the documented concepts.
The documented archive concepts don't prevent other archive classes from including more functionality. Indeed, facilities such as serialization of pointers through a base class, etc., demand it. And it's true that I haven't discouraged leveraging these "extended archives".
I'm talking about the requirements on a minimal archive. Those are not fully documented.
Hmmm - that would be news to me. I compiled the trivial archive from the documentation with serialization code, and the trivial archive is a model of a minimal archive. That is, I believe that serializing any type to a trivial archive will compile without error. If there is a serializable type which the trivial archive fails to work with, I would be interested in hearing about it.
You misunderstand. The concepts should specify:
1) what primitive types the archive needs to be able to serialize 2) what concepts these primitive types satisfy that can be used.
Do you want to say that the trivial archive should be used to deduce that?
    class trivial_oarchive {
    public:
        //////////////////////////////////////////////////////////
        // public interface used by programs that use the
        // serialization library
        typedef boost::mpl::bool_<true> is_saving;
        typedef boost::mpl::bool_<false> is_loading;
        template<class T> void register_type(){}
        template<class T> trivial_oarchive & operator<<(const T & t){
            return *this;
        }
        template<class T> trivial_oarchive & operator&(const T & t){
            return *this << t;
        }
        void save_binary(void *address, std::size_t count){}
    };
This class satisfies the following:
1) it can deal with ANY type 2) it uses NO property of the types
correct. And it fulfills the requirements of the archive concept.
Clearly, those are not the requirements for all archives.
The archive concept doesn't prohibit any model of that concept from adding other facilities of its own choosing. The guarantee is that any archive which models the concept will compile ar << t for any serializable type t.
If we next look at the text archives we'll find that
1) it can deal with any primitive type that can be streamed to in i/ostream.
I believe that this is the same as saying that a text archive models the archive concept as stated in the documentation.

2) it uses the streaming operator of those types

True - but this isn't required by the archive concept, and in fact it's not true for all archives. For example, the native binary_archive doesn't use the streaming operators at all; it makes all calls to the underlying filebuf. In fact, a binary_archive can be constructed either with a stream (from which it just gets the filebuf) or from a filebuf or stringbuf directly.
Are those the requirements? Again not. What we need is an explicit list of all primitive types that an archive is required to support, and a list of the properties of those types that a user can rely on.
if it can't support any serializable type, it's not an archive. All primitive types are serializable types by definition.
For example, all the primitive types had a default constructor
All primitive types may have a default constructor - but the archive concept doesn't require that. There might be some ambiguity here. For purposes of serializability, a primitive type is one that either a) is a C++ primitive type or b) is marked "primitive" via a serialization trait. The "serializable" concept doesn't require that a serializable type have a default constructor - and in fact many do not.
and I used it
You presumed that a type marked "primitive" must have a default constructor. I realize that all C++ built-in types have default constructors - but there is no guarantee that a type marked as serializable via "primitive" has a default constructor.
but now you say that was an implementation detail that I should not have used.
agreed, I think you made an error here.
How can I as an archive implementor know what I may use without risking breakages?
Of course this is the real question. Strictly speaking, the only way would be to not leverage the archives already written, as they add a lot of capability beyond what is required by the concept. Of course, that would be a lot more work, which we want to avoid. You wanted to leverage a huge part of the archive implementation which is beyond the strict archive concept. All the archive classes do this in different ways. xml, text and binary have bridged an incredible breadth of utility and functionality with very little breakage over the years. Honestly, I can't guarantee that this won't happen from time to time. If it makes you feel any better, in order to "fix" this issue with version_type I had to make a few minor changes in the xml and text archives as well. Ideally the archives would be totally independent; in practice, sharing code through base classes makes them "a little bit" interdependent. But it cuts down the work by a large amount. So that's the trade-off we've gotten: a huge reduction in effort, very wide applicability, very minor interface breakage, and backward compatibility over 8 years (so far).

****

So now I've answered your specific questions - here are a couple more random observations. The archive concept is very "thin". It only specifies the calling interface - really nothing more. It doesn't include versioning, it doesn't say anything about pointers, it doesn't say anything about tracking, etc., etc. We know that these issues are essential to a useful serialization library. How is it that the concept doesn't address them? (I suspect that this might be your question.) If we think about this, it turns out there are different ways an archive might be implemented. Take pointers. Someone might make an archive which would just copy the raw pointer. He would say - hey, I'm just using this for in-memory copies and I don't want "deep copies".
The exact same line of thought comes up with regard to tracking (not good for sending data over a line), and all the other aspects of what we now call the serialization library. So I concluded that the archive concept shouldn't specify what the archive does - only its interface. That's why it's as it is today. When the serialization documentation was being disputed, it became clear that there was no agreement on what serialization should mean; by limiting the concept to the interface, I was able to get past this dispute and move on. Do you see a pattern here? It's fascinating to me that this is how boost helps software quality. Failing to agree, it became clear that the only way to move on was to leave any "features" unspecified - which turned out to be the correct decision.

Besides shortening the effort considerably, this has had huge practical benefits. First, it has permitted the extension to new kinds of archives which were totally unanticipated. For an interesting example, look at the simple logging archive in the documentation. That archive is output only and can dump any serializable type to any output stream in a formatted way. It's header only. It implements the archive concept but doesn't rely on the base class implementations of the other archives, so it's very lightweight. I see this as being very interesting for logging and debugging. (Of course public reaction has been underwhelming, but that's not my point here.)

There are other things that one could make archives for:

a) a template "deep copy"
b) an editing archive for gui editing
c) a diff archive where two archives are compared and their difference is produced
d) an inverse of the above
e) c + d above would lead to a whole "archive algebra" for rolling archives back and forward

All of the above would be permitted by the archive concept and would work with all serializable types - without changing any current code! I hope that clarifies the reasoning for why things are the way they are.
Now, taking a look at the mpi usage of serialization. I really haven't looked at it enough to understand it fully, so I may be wrong about this - these are only casual observations.

a) The "skeleton" idea seems to depend on the size of the data stored in the binary archive being the same as the size of the underlying data type. Up until now that has been true, even though there was never any explicit guarantee to that effect. I had to change the behavior in order to extract myself from some other fiasco, and this "feature" was no longer true. I think this is where the problem started. It's no one's fault.

b) The MPI library sends the class versions over the wire. It doesn't need to do this. If you look at some of the archives, there is class_id_optional_type, which is trapped by the archive classes and suppressed both on input and output because that particular archive class doesn't need it. But it's there if someone wants to hook it (like an editing archive). I think MPI might want to do the same thing with version_type.

c) I'm not sure how MPI uses the portable binary archive (if at all). Seems like that might be interesting.

d) What is really needed to send data "over the wire" is to be able to suppress tracking at the archive level. That would permit the same data to be sent over and over and wouldn't presume that the data is constant. So you wouldn't have to create a new archive for each transaction. I've puzzled about how to do this without breaking the archive concept. It turns out to be a little tricky. And there doesn't seem to be much demand for it - but maybe there would be if I did it.

e) This bit of code is what created the issue with the Sun compiler.
The problem comes from this line in boost/mpi/datatype_fwd.hpp:
template<typename T> MPI_Datatype get_mpi_datatype(const T& x = T());
Frankly, it's just plain wrong and should be fixed. You might say that you know it's wrong but it works around this or that template or compiler quirk and it's too hard to fix. I could accept that. But if it's fixable, it should be fixed. I did make the constructors of version_type public. I had made them private to trap errors in code where such objects were constructed but not initialized; now errors like that aren't trapped. So I think you should fix this.

f) I believe that MPI uses binary_archive_base? as a basis. You could have used a higher level class as a basis. I don't know whether that would have made things easier or harder, but it's worth looking into. The binary_archive is actually very small - only a few hundred lines of code. This could have been cloned and edited. That might or might not have made things more/less intertwined with the other archive classes. This isn't a suggestion - just an observation that it might be worth looking into. Robert Ramey

On 23 Jul 2010, at 18:11, Robert Ramey wrote:
[snip] Robert Ramey
Let me summarize concisely: no user-defined archive type may use the serialize functions, pointer handling, or any other aspect of the serialization library, since all those are implementation details that you might change at any time. I think that concludes the discussion, and I will stop maintaining Boost.MPI. Matthias

On 7/23/2010 11:31 PM, Matthias Troyer wrote:
On 23 Jul 2010, at 18:11, Robert Ramey wrote:
[snip] Robert Ramey
Let me summarize concisely:
<snip> Guys, we're trying to get a release out. IIUC, a course of action has already been chosen: Robert is to add back the public default constructors that were causing MPI to fail. What's the status of that? Has it in fact fixed the problem on release? Let's get the release out first. Then you guys can bicker. ;-) -- Eric Niebler BoostPro Computing http://www.boostpro.com

On 23 Jul 2010, at 21:57, Eric Niebler wrote:
On 7/23/2010 11:31 PM, Matthias Troyer wrote:
On 23 Jul 2010, at 18:11, Robert Ramey wrote:
[snip] Robert Ramey
Let me summarize concisely:
<snip>
Guys, we're trying to get a release out. IIUC, a course of action has already been chosen: Robert is to add back the public default constructors that were causing MPI to fail. What's the status of that? Has it in fact fixed the problem on release?
I'm still waiting for this to be done, and we need to test it on trunk first where the Sun regression tests are run. Matthias

Matthias Troyer wrote:
On 23 Jul 2010, at 21:57, Eric Niebler wrote:
On 7/23/2010 11:31 PM, Matthias Troyer wrote:
On 23 Jul 2010, at 18:11, Robert Ramey wrote:
[snip] Robert Ramey
Let me summarize concisely:
<snip>
Guys, we're trying to get a release out. IIUC, a course of action has already been chosen: Robert is to add back the public default constructors that were causing MPI to fail. What's the status of that? Has it in fact fixed the problem on release?
I'm still waiting for this to be done, and we need to test it on trunk first where the Sun regression tests are run.
I did this yesterday. Looking at the sun test results, a similar problem has cropped up with a couple of other types. Robert Ramey
Matthias

Eric Niebler wrote:
On 7/23/2010 11:31 PM, Matthias Troyer wrote:
On 23 Jul 2010, at 18:11, Robert Ramey wrote:
[snip] Robert Ramey
Let me summarize concisely:
<snip>
Guys, we're trying to get a release out. IIUC, a course of action has already been chosen: Robert is to add back the public default constructors that were causing MPI to fail. What's the status of that? Has it in fact fixed the problem on release?
Let's get the release out first. Then you guys can bicker. ;-)
I don't think it's fair to characterize this as bickering. It's a real and difficult problem that can't really be solved to everyone's total satisfaction. Robert Ramey

On 24 Jul 2010, at 00:19, Robert Ramey wrote:
Eric Niebler wrote:
On 7/23/2010 11:31 PM, Matthias Troyer wrote:
On 23 Jul 2010, at 18:11, Robert Ramey wrote:
[snip] Robert Ramey
Let me summarize concisely:
<snip>
Guys, we're trying to get a release out. IIUC, a course of action has already been chosen: Robert is to add back the public default constructors that were causing MPI to fail. What's the status of that? Has it in fact fixed the problem on release?
Let's get the release out first. Then you guys can bicker. ;-)
I don't think it's fair to characterize this as bickering. It's a real and difficult problem that can't really be solved to everyone's total satisfaction.
I think this could be solved satisfactorily if you could make a list of all primitive types that an archive has to support, and specify which concepts one can assume for those types. Matthias

On Jul 24, 2010, at 7:34 AM, Matthias Troyer wrote:
On 24 Jul 2010, at 00:19, Robert Ramey wrote:
Eric Niebler wrote:
On 7/23/2010 11:31 PM, Matthias Troyer wrote:
On 23 Jul 2010, at 18:11, Robert Ramey wrote:
[snip] Robert Ramey
Let me summarize concisely:
<snip>
Guys, we're trying to get a release out. IIUC, a course of action has already been chosen: Robert is to add back the public default constructors that were causing MPI to fail. What's the status of that? Has it in fact fixed the problem on release?
No, it's still broken on trunk. As such, it represents a regression from the previous release. Once it's fixed on trunk and merged into release, I volunteer to swap all trunk testing over to release for a few days to ensure MPI is working correctly before cutting the release. Just as a side note, the only two release branch testers running MPI tests are showing wholesale failure as well (perhaps caused by a premature or partial merge?). -- Noel

Belcourt, Kenneth wrote:
On Jul 24, 2010, at 7:34 AM, Matthias Troyer wrote:
On 24 Jul 2010, at 00:19, Robert Ramey wrote:
Eric Niebler wrote:
On 7/23/2010 11:31 PM, Matthias Troyer wrote:
On 23 Jul 2010, at 18:11, Robert Ramey wrote:
[snip] Robert Ramey
Let me summarize concisely:
<snip>
Guys, we're trying to get a release out. IIUC, a course of action has already been chosen: Robert is to add back the public default constructors that were causing MPI to fail. What's the status of that? Has it in fact fixed the problem on release?
No, it's still broken on trunk. As such, it represents a regression from the previous release.
Is this true? I thought this has been broken since 1.42?
Once it's fixed on trunk and merged into release, I volunteer to swap all trunk testing over to release for a few days to ensure MPI is working correctly before cutting the release. Just as a side note, the only two release branch testers running MPI tests are showing wholesale failure as well (perhaps caused by a premature or partial merge?).
As far as the serialization library is concerned, the only difference between the two branches is that the trunk includes the publication of a default constructor for a couple of types. I believe that Matthias has changes in the trunk that haven't been merged to release yet. Robert Ramey
-- Noel

On Jul 24, 2010, at 11:38 AM, Robert Ramey wrote:
Belcourt, Kenneth wrote:
On Jul 24, 2010, at 7:34 AM, Matthias Troyer wrote:
Eric Niebler wrote:
Guys, we're trying to get a release out. IIUC, a course of action has already been chosen: Robert is to add back the public default constructors that were causing MPI to fail. What's the status of that? Has it in fact fixed the problem on release?
No, it's still broken on trunk. As such, it represents a regression from the previous release.
Is this true? I thought this has been broken since 1.42?
Robert, please (re)read this email I sent to both the testing mailing list and copied you on directly (Note the date was early June, 2010). This commit broke all MPI tests including the remaining broken Sun MPI tests. Begin forwarded message:
From: "Belcourt, Kenneth" <kbelco@sandia.gov> Date: June 7, 2010 7:43:16 PM MDT To: Running Boost regression tests <boost-testing@lists.boost.org> Cc: Robert Ramey <ramey@rrsd.com> Subject: [boost]Trunk MPI tests broken (Serialization bug)
Hi Robert,
It looks like most all the MPI tests may have broken with this checkin.
https://svn.boost.org/trac/boost/changeset/62358/trunk
or at least one of your checkins that touched the archive version_type.
The error points to ../boost/mpi/datatype.hpp:184:3: error: no matching function for call to 'assertion_failed(mpl_::failed************ boost::mpi::is_mpi_datatype<boost::archive::version_type>::************)' but from what I can tell neither this file (nor any other MPI source file) has changed in months.
Once it's fixed on trunk and merged into release, I volunteer to swap all trunk testing over to release for a few days to ensure MPI is working correctly before cutting the release. Just as a side note, the only two release branch testers running MPI tests are showing wholesale failure as well (perhaps caused by a premature or partial merge?).
As far as the serialization library is concerned, the only difference between the two branches is that the trunk includes the publication of a default constructor for a couple of types.
I believe that Matthias has changes in the trunk that haven't been merged to release yet.
Okay, that would explain it, thanks. -- Noel

Matthias Troyer wrote:
On 24 Jul 2010, at 00:19, Robert Ramey wrote:
Eric Niebler wrote:
On 7/23/2010 11:31 PM, Matthias Troyer wrote:
On 23 Jul 2010, at 18:11, Robert Ramey wrote:
[snip] Robert Ramey
Let me summarize concisely:
<snip>
Guys, we're trying to get a release out. IIUC, a course of action has already been chosen: Robert is to add back the public default constructors that were causing MPI to fail. What's the status of that? Has it in fact fixed the problem on release?
Let's get the release out first. Then you guys can bicker. ;-)
I don't think it's fair to characterize this as bickering. It's a real and difficult problem that can't really be solved to everyone's total satisfaction.
I think this could be solved satisfactorily if you could make a list of all primitive types that an archive has to support, and specify which concepts one can assume for those types.
There is a list of types in common archive. These were created because some archives - specifically the xml_archives - needed to provide special handling for certain types. For example, take class id. The xml documentation I looked at indicates that this should be placed as an attribute after the tag, while other types should be bracketed with <name>...</name>. On the other hand, other archive types such as text and binary didn't need to implement any special handling for these types. In fact, these archive classes didn't need any special handling for ANY types. It is for this very reason that I was very reluctant to support serialization as xml and implemented it only under duress. It sort of broke the idea that archives could be completely independent of the types being serialized. The final result is that there was such a dependency, though it wasn't obvious because for most archives it didn't come up.
And now it has come up in an entirely different context. version_type was too loosely defined, and this created a number of difficulties which were pointed out by compiling at a higher warning level. And several different version types existed (library version type, version_type, collection item version type), which created confusion and possible bugs. Another case was collection_size_type. This one has morphed a couple of times from unsigned int, unsigned long, int64, etc. depending on the compiler and platform.
It looks like you derive from basic_binary_archive in the mpi library. Looking at this class, the following types have special handling: class_id, class_id_optional, class_id_reference, version_type, item_version_type, collection_size_type and class name. You could override these with your own special handling. A better idea in my opinion would be to derive mpi_?archive from one level up - common_iarchive<mpi_?archive>. Then any changes to binary_archive wouldn't affect you. You could then suppress transmission of version_type entirely, thus saving bandwidth.
But you'd have to clone/reimplement basic_binary_archive.ipp - though that's only 90 lines.
Looking at this more carefully, it seems to me that there are two issues.
a) mpi assumes that the size of a type in memory is the same as the size of the type when rendered in an archive. This has been true for most types, but now the size of the type in memory may not be the same as the size of the type in the binary_?archive. Since the binary archive has to address history and the mpi archive doesn't, I don't see a way to reconcile this. So I think the only way to really address this is to not use binary_archive as a base but to move one level up to common_archive. I'm guessing that this will result in mpi_?archive being very small - maybe 50 lines. Then there's 100 lines cloned from basic_binary_archive.
b) it seems that mpi depends on default constructability as part of the mechanism to account for the amount of space taken by the type in an archive. This is creating a problem but doesn't seem too hard to fix using sizeof(T) in some form (maybe wrapping it in a function template).
Another useful idea would be to slightly refactor mpi_?archive so that it can be compiled and tested without having the mpi headers available. That would be very helpful.
In any case, I see the only realistic option for now would be to release 1.44 with mpi in its current state. It's been broken since 1.42. A good definitive fix - regardless of who/how it gets done - requires more time than the release schedule permits. Robert Ramey
Matthias

Matthias Troyer wrote:
On 23 Jul 2010, at 18:11, Robert Ramey wrote:
[snip] Robert Ramey
Let me summarize concisely: no user-defined archive type may use any of the serialize function, pointer handling, or any other aspect of the serialization library since all those are implementation details that you might change at any time.
Naturally I think that's a little harsh. I have strived over the years to minimize situations such as this and feel I've been pretty successful at it - especially considering the breadth and complexity of the library. Not to mention the difficulty of a library which depends upon behavior undefined by the C++ standard, which varies from compiler to compiler; not to mention that it's the only library which has to consider the whole past history of all its predecessors; and not to mention that it's being applied to an ever expanding area (e.g. dynamically loaded/unloaded DLLs). I'm sorry my efforts haven't been to your satisfaction.
I think that concludes the discussion and I will stop maintaining Boost.MPI
and I'm sorry to hear that. Robert Ramey

At Fri, 23 Jul 2010 22:30:03 -0800, Robert Ramey wrote:
Matthias Troyer wrote:
On 23 Jul 2010, at 18:11, Robert Ramey wrote:
[snip] Robert Ramey
Let me summarize concisely: no user-defined archive type may use any of the serialize function, pointer handling, or any other aspect of the serialization library since all those are implementation details that you might change at any time.
Naturally I think that's a little harsh.
As far as I can tell there was no criticism, express or implied, in Matthias's statement, so if it sounds harsh perhaps it is because the predicament in which he currently finds himself is harsh. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

David Abrahams wrote:
At Fri, 23 Jul 2010 22:30:03 -0800, Robert Ramey wrote:
Matthias Troyer wrote:
On 23 Jul 2010, at 18:11, Robert Ramey wrote:
[snip] Robert Ramey
Let me summarize concisely: no user-defined archive type may use any of the serialize function, pointer handling, or any other aspect of the serialization library since all those are implementation details that you might change at any time.
Naturally I think that's a little harsh.
As far as I can tell there was no criticism, express or implied, in Matthias's statement, so if it sounds harsh perhaps it is because the predicament in which he currently finds himself is harsh.
It's actually demonstrably incorrect - I was just being diplomatic. Robert Ramey

On 25 Jul 2010, at 00:12, Robert Ramey wrote:
David Abrahams wrote:
At Fri, 23 Jul 2010 22:30:03 -0800, Robert Ramey wrote:
Matthias Troyer wrote:
On 23 Jul 2010, at 18:11, Robert Ramey wrote:
[snip] Robert Ramey
Let me summarize concisely: no user-defined archive type may use any of the serialize function, pointer handling, or any other aspect of the serialization library since all those are implementation details that you might change at any time.
Naturally I think that's a little harsh.
As far as I can tell there was no criticism, express or implied, in Matthias's statement, so if it sounds harsh perhaps it is because the predicament in which he currently finds himself is harsh.
It's actually demonstrably incorrect - I was just being diplomatic
Then please demonstrate how to implement an archive that actually does anything sensible and supports pointers, etc. without depending on what you call implementation details. The only way is implementing all the functionality from scratch. Matthias

Matthias Troyer wrote:
On 25 Jul 2010, at 00:12, Robert Ramey wrote:
David Abrahams wrote:
At Fri, 23 Jul 2010 22:30:03 -0800, Robert Ramey wrote:
Matthias Troyer wrote:
On 23 Jul 2010, at 18:11, Robert Ramey wrote:
[snip] Robert Ramey
Let me summarize concisely: no user-defined archive type may use any of the serialize function, pointer handling, or any other aspect of the serialization library since all those are implementation details that you might change at any time.
Naturally I think that's a little harsh.
As far as I can tell there was no criticism, express or implied, in Matthias's statement, so if it sounds harsh perhaps it is because the predicament in which he currently finds himself is harsh.
It's actually demonstrably incorrect - I was just being diplomatic
Then please demonstrate how to implement an archive that actually does anything sensible and supports pointers, etc. without depending on what you call implementation details. The only way is implementing all the functionality from scratch.
Here's what to do:
a) derive from common archive instead of binary_archive. This is what all the other archives do. At this level and above there is only interface and no implementation. By doing this you will have total control of the relationship between the native types as represented in memory and those written to the storage. The interface at this level hasn't been frozen in any specification - but as far as I can recall, it's never been changed since the beginning.
As an aside, there is precedent for this. At one time there was a section of the documentation, "Deriving from an archive class". It used binary_archive as a base to create a portable binary archive. This seemed a good idea at the time. But the implementation of binary_archive was changed (by you actually - though that's beside the point), and this example broke. As it was, I had just contracted with a customer to improve the portable binary archive - for a flat fee, no less. I was pretty unhappy to discover that it no longer worked and was in fact unfixable. The upside is that it wasn't hard to make the kind of change described in a) above, and the result actually came out simpler.
Here are some other suggestions which would be helpful.
b) look into this issue of requiring default constructability of types. Generally, default constructability is not a requirement of serializable types. I suspect that this came about by accident, and I also suspect it would be easy to fix.
c) look into the possibility of factoring out the MPI part so that the archive can be tested independently of the MPI facility. For example, if there were an mpibuf similar to filebuf, then the mpi_archive could be verified orthogonally to mpi. The mpi_archive would in fact become a "super_binary" archive - which would presumably be even faster than the current binary one and might have applicability beyond mpi itself.
All the other archives do the above, so I don't think these enhancements would be very difficult.
Benefits would be:
a) make things more robust - independent of binary archive. binary archive is sometimes hard because the types actually used are sometimes hidden behind typedefs, so it's hard to see what's going on.
b) make things more easily testable on all platforms.
c) make mpi_archive useful for things (which of course I can't foresee) beyond just mpi.
I'm focused on getting past this current problem, and I think that implementing this suggestion is the only practical way to do it. I realize that this is a pain, but on the upside it would make a huge improvement to the mpi_archive at quite a small investment of effort. Robert Ramey

I'm focused on getting past this current problem. And I think that implementing this suggestion is the only practical way to do it. I realize that this is a pain but on the upside, it would make a huge improvement to the mpi_archive at quite a small investment of effort.
I don't see how we can rewrite the archive class at this late stage...
FYI: simply changing BOOST_ARCHIVE_STRONG_TYPEDEF to:
#define BOOST_ARCHIVE_STRONG_TYPEDEF(T, D) \
class D : public T {                       \
public:                                    \
    explicit D(const T t) : T(t){}         \
    D() : T() {}                           \
};                                         \
/**/
is sufficient to get MPI compiling with Sun's compiler on Linux (this is on trunk... I assume that both libraries have unmerged fixes awaiting the release branch, as that looks hopelessly broken at present?).
I still haven't been able to run MPI's regression tests though, due to linker invocation errors:
sun.link.dll /home/john/bin/boost/bin.v2/libs/mpi/build/sun-12.1/debug/stdlib-sun-stlport/threading-multi/libboost_mpi.so.1.44.0
CC: Warning: Option -Wl,--export-dynamic passed to ld, if ld is invoked, ignored otherwise
/home/john/SunStudio/sunstudio12.1/prod/lib/ld: unrecognized option '-Wl,--export-dynamic'
/home/john/SunStudio/sunstudio12.1/prod/lib/ld: use the --help option for usage information
But maybe this gives us a way forward for *this* release? Perhaps with the proviso that Boost.Serialization is allowed to move forward and make breaking changes to its implementation for the next release?
Just my 2c worth... John.

John Maddock wrote:
I'm focused on getting past this current problem. And I think that implementing this suggestion is the only practical way to do it. I realize that this is a pain but on the upside, it would make a huge improvement to the mpi_archive at quite a small investment of effort.
I don't see how we can rewrite the archive class at this late stage...
FYI: simply changing BOOST_ARCHIVE_STRONG_TYPEDEF to:
#define BOOST_ARCHIVE_STRONG_TYPEDEF(T, D) \
class D : public T {                       \
public:                                    \
    explicit D(const T t) : T(t){}         \
    D() : T() {}                           \
};                                         \
/**/
is sufficient to get MPI compiling with Sun's compiler on Linux (this is on trunk... I assume that both libraries have unmerged fixes awaiting the release branch, as that looks hopelessly broken at present?).
<snip>
But maybe this gives us a way forward for *this* release? Perhaps with the proviso that Boost.Serialization is allowed to move forward and make breaking changes to its implementation for the next release?
Just my 2c worth... John.
OK, I can see how that might help. I can check in this change. Before doing this, I checked the test matrix and now see that the compile failures on the Sun compiler are gone, so now I'm not sure whether I should do this or not. Matthias? Robert Ramey

Hi Robert, On 25 Jul 2010, at 10:28, Robert Ramey wrote:
Matthias Troyer wrote:
Then please demonstrate how to implement an archive that actually does anything sensible and supports pointers, etc. without depending on what you call implementation details. The only way is implementing all the functionality from scratch.
Here's what to do:
a) derive from common archive instead of binary_archive. This is what all the other archives do. At this level and above there is only interface and no implementation. By doing this you will have total control of the relationship between the native types as represented in memory and those written to the storage. The interface at this level hasn't been frozen in any specification - but as far as I can recall, it's never been changed since the beginning.
This still does not solve the basic issue I'm trying to tell you about. The problem we have now does not come about because of deriving from binary_archive. Even the common archive uses version_type, etc. In order to implement an archive I will need to know the list of these primitive types that I have to support and their interface. Contrary to what you say, the interface to those types has changed now. If you declare those types to be implementation details then I still cannot implement an archive without making the "mistake" of relying on implementation details. I realize that those might change occasionally, but then I want an updated list and a heads-up announcement of potentially breaking changes.
b) look into this issue of requiring default constructability of types. Generally, default constructability is not a requirement of serializable types. I suspect that this came about by accident, and I also suspect it would be easy to fix.
Sure, it can be fixed by a redesign of the MPI datatype creation. I just fear that such a redesign of a core part of the library a few days before a new release is not a good idea.
c) look into the possibility of factoring out the MPI part so that the archive can be tested independently of the MPI facility. For example, if there were an mpibuf similar to filebuf, then the mpi_archive could be verified orthogonally to mpi. The mpi_archive would in fact become a "super_binary" archive - which would presumably be even faster than the current binary one and might have applicability beyond mpi itself.
All the other archives do the above so I don't think these enhancements would be very difficult. Benefits would be:
a) make things more robust - independent of binary archive. binary archive is sometimes hard because the types actually used are sometimes hidden behind typedefs, so it's hard to see what's going on.
b) make things more easily testable on all platforms.
c) make mpi_archive useful for things (which of course I can't foresee) beyond just mpi.
I'm focused on getting past this current problem. And I think that implementing this suggestion is the only practical way to do it. I realize that this is a pain but on the upside, it would make a huge improvement to the mpi_archive at quite a small investment of effort.
Actually it seems you do not understand MPI well. Your proposal is not feasible for either of the archives we use: The "content" mechanism just sends from memory to memory, never going through any buffer. The packed archives use the MPI library to pack a buffer. Both require an MPI library. Matthias

Matthias Troyer wrote:
Hi Robert,
On 25 Jul 2010, at 10:28, Robert Ramey wrote:
Matthias Troyer wrote:
Then please demonstrate how to implement an archive that actually does anything sensible and supports pointers, etc. without depending on what you call implementation details. The only way is implementing all the functionality from scratch.
Here's what to do:
a) derive from common archive instead of binary_archive. This is what all the other archives do. At this level and above there is only interface and no implementation. By doing this you will have total control of the relationship between the native types as represented in memory and those written to the storage. The interface at this level hasn't been frozen in any specification - but as far as I can recall, it's never been changed since the beginning.
This still does not solve the basic issue I'm trying to tell you. The problem we have now does not come about because of deriving from binary_archive.
OK - I had thought the problem came from the fact that now the size of certain types as stored in the binary archive is not the same as the native size of the type. For example, version_type internally is now 32 bits but is stored as 8, 16, or 32 bits depending on the version of the archive. It had not occurred to me until just now that the mpi library might be dependent on the size of the type. I suppose I jumped to the wrong conclusion, as I first noticed the problem when I made the rendering of version_type in the file different from the size of version_type as stored in memory.
Even the common archive uses version_type, etc.. In order to implement an archive I will need to know the list of these primitive types that I have to support and their interface.
Contrary to what you say, the interface to those types has changed now.
Hmmm - I did change the size of types. And I did restrict the usage of the types, as opposed to STRONG_TYPEDEF which permits anything - conversions, arithmetic, etc. I did this to make the system more robust and to avoid getting surprised by "automatic" behavior. None of the archive classes in the library "complained" about these restrictions. (Actually, that's not quite true - but the complaints were easily fixed and made the code more robust by minimizing conversions etc.) And the truth is, it just never occurred to me that other archives might perform these operations on things like version_type, class_type etc. If it had occurred to me, I likely would have assumed that any fixes would be trivial, as they were in my case.
If you declare those types to be implementation details then I still cannot implement an archive without making the "mistake" of relying on implementation details.
I think what I'm not seeing is why you need to rely upon how these types are implemented. The binary archive has to do this since there is the question of historical archives to be addressed. But in your case I don't see where the problem is coming from. It seems to me that the only connection between mpi_archives and specific types would be to skip version_type and class_optional type in the mpi_archive class, since you don't need them.
b) look into this issue of requiring default constructability of types. Generally, default constructability is not a requirement of serializable types. I suspect that this came about by accident, and I also suspect it would be easy to fix.
Sure, it can be fixed by a redesign of the MPI datatype creation. I just fear that such a redesign of a core part of the library a few days before a new release is not a good idea.
I already checked in a modification which makes the version_type default constructor public. I did this in an effort to get things over the hump and don't think it's a big deal, even though I'm not crazy about it. If this were the only issue there would be no problem here.
Update - John Maddock has suggested a similar change in STRONG_TYPEDEF which I could do. But now I look at the Sun results and all these compile errors are gone - so I assume you made some sort of adjustment here.
It's become clear to me that a definitive solution to this won't happen in the next few days. It's really going to take more time. This is not because I think it's a lot of work; it's just that the required back and forth is a time consuming process which can't be hurried. Sort of like a chess match where each side has to think about what the best move is.
c) look into the possibility of factoring out the MPI part so that the archive can be tested independently of the MPI facility. For example, if there were an mpibuf similar to filebuf, then the mpi_archive could be verified orthogonally to mpi. The mpi_archive would in fact become a "super_binary" archive - which would presumably be even faster than the current binary one and might have applicability beyond mpi itself.
All the other archives do the above so I don't think these enhancements would be very difficult. Benefits would be:
a) make things more robust - independent of binary archive. binary archive is sometimes hard because the types actually used are sometimes hidden behind typedefs, so it's hard to see what's going on.
b) make things more easily testable on all platforms.
c) make mpi_archive useful for things (which of course I can't foresee) beyond just mpi.
I'm focused on getting past this current problem. And I think that implementing this suggestion is the only practical way to do it. I realize that this is a pain but on the upside, it would make a huge improvement to the mpi_archive at quite a small investment of effort.
Actually it seems you do not understand MPI well.
lol - at last one thing we can agree on!
Your proposal is not feasible for either of the archives we use: The "content" mechanism just sends from memory to memory, never going through any buffer. The packed archives use the MPI library to pack a buffer. Both require an MPI library.
No dispute here. I tried to compile mpi_archive and took a cursory look at the code. I have no idea how feasible my suggestions are; I just thought they might make things better. Feel free to ignore them. I think I said this, but I got the idea that the skeleton presumed that the size of data as stored in the binary archive was the same as the size of the data type. This used to be true, but I had to break that to maintain compatibility with historical archives. So if I'm wrong about the skeleton, and it's only a question of either my adding operations to version_type etc. or your tweaking your code to use only the subset of operations that these types now permit, the problem is much smaller than I thought. Robert Ramey

update - John Maddock has suggested a similar change in STRONG_TYPEDEF which I could do. But now I look at the sun results and all these compile errors are gone - so I assumed you made some sort of adjustment here.
Looks like there were changes to both the serialization and mpi libraries yesterday... and re-running the tests today the compiler errors are indeed now gone. Just the linker errors from -Wl,--export-dynamic, not sure where that's coming from though.... John.

On 26 Jul 2010, at 03:15, John Maddock wrote:
update - John Maddock has suggested a similar change in STRONG_TYPEDEF which I could do. But now I look at the sun results and all these compile errors are gone - so I assumed you made some sort of adjustment here.
Looks like there were changes to both the serialization and mpi libraries yesterday... and re-running the tests today the compiler errors are indeed now gone. Just the linker errors from -Wl--export-dynamic, not sure where that's coming from though....
I added a quick workaround that might still break user code but avoids the spurious instantiation of that default constructor by the Sun compiler. This is after all actually a compiler bug and not a bug in Boost.Serialization or Boost.MPI. Matthias

On Jul 26, 2010, at 9:55 AM, Matthias Troyer wrote:
On 26 Jul 2010, at 03:15, John Maddock wrote:
update - John Maddock has suggested a similar change in STRONG_TYPEDEF which I could do. But now I look at the sun results and all these compile errors are gone - so I assumed you made some sort of adjustment here.
Looks like there were changes to both the serialization and mpi libraries yesterday... and re-running the tests today the compiler errors are indeed now gone. Just the linker errors from -Wl,--export-dynamic, not sure where that's coming from though....
I added a quick workaround that might still break user code but avoids the spurious instantiation of that default constructor by the Sun compiler. This is after all actually a compiler bug and not a bug in Boost.Serialization or Boost.MPI.
Yes, and I appreciate your collective efforts to get MPI working again on the Sun. With the next reporting cycle the MPI tests on the Sun should be (mostly) all passing. If tonight's normal test cycle still looks good for MPI, then I'd say move the changes into the branch. Again, thanks for your efforts in getting this fixed. -- Noel

On 26 Jul 2010, at 11:53, Belcourt, Kenneth wrote:
Yes, and I appreciate your collective efforts to get MPI working again on the Sun. With the next reporting cycle the MPI tests on the Sun should be (mostly) all passing. If tonight's normal test cycle still looks good for MPI, then I'd say move the changes into the branch.
Would you say we are stable enough to move the changes into the release branch? Matthias

On Jul 27, 2010, at 10:28 PM, Matthias Troyer wrote:
On 26 Jul 2010, at 11:53, Belcourt, Kenneth wrote:
Yes, and I appreciate your collective efforts to get MPI working again on the Sun. With the next reporting cycle the MPI tests on the Sun should be (mostly) all passing. If tonight's normal test cycle still looks good for MPI, then I'd say move the changes into the branch.
Would you say we are stable enough to move the changes into the release branch?
Yes. -- Noel

On 25 Jul 2010, at 10:28, Robert Ramey wrote:
Matthias Troyer wrote:
Then please demonstrate how to implement an archive that actually does anything sensible and supports pointers, etc. without depending on what you call implementation details. The only way is implementing all the functionality from scratch.
Here's what to do:
a) derive from common archive instead of binary_archive.
I have one more question in addition to my previous comment: common_oarchive is in namespace archive::detail while basic_binary_oarchive is in the top namespace archive. Do I understand you correctly that deriving from archive::detail::common_oarchive is safe and not considered depending on implementation details, while deriving from archive::basic_binary_oarchive is not? I can easily change all the Boost.MPI archives to use archive::detail::common_oarchive where they now use archive::basic_binary_oarchive (although this will not solve the issue we have right now). Matthias

Matthias Troyer wrote:
On 25 Jul 2010, at 10:28, Robert Ramey wrote:
Matthias Troyer wrote:
Then please demonstrate how to implement an archive that actually does anything sensible and supports pointers, etc. without depending on what you call implementation details. The only way is implementing all the functionality from scratch.
Here's what to do:
a) derive from common archive instead of binary_archive.
I have one more question in addition to my previous comment:
common_oarchive is in namespace archive::detail while basic_binary_oarchive is in the top namespace archive.
Do I understand you correctly that deriving from archive::detail::common_oarchive is safe and not considered depending on implementation details, while deriving from archive::basic_binary_oarchive is not?
I can easily change all the Boost.MPI archives to use archive::detail::common_oarchive where they now use archive::basic_binary_oarchive (although this will not solve the issue we have right now).
I can see where this would be confusing. Let me indicate what I mean by a few of the terms being used. archive concept - minimal concept which states the function interface that an archive class has to support. It doesn't say anything specific about the semantics or functionality.
From this I made some models of this concept. These models implemented behavior that was deemed useful. I factored the implementation of common functionality in a few different places:
base_archive - library code
common_archive - library and types needed to implement the desired functionality
interface - iserialization - common code for user's types

Now when I thought of "user", I was thinking of someone who just uses an archive already built. I didn't really think of a person making a new archive as a "user". Truth is, I was just factoring code. I put all this stuff in the "detail" namespace because I didn't think it was interesting to users. And of course it isn't documented like other stuff is, and one might change it because after all it's a "detail". And the stuff in "detail" has its own "public" and "implementation detail" aspects. For an archive developer who wants to develop an archive with functionality similar to the existing ones, it's not a detail. He wants to know that the public functions aren't going to change. As I've said - I just never thought about this. On the other hand, I don't think the "detail" interface has changed very much (if at all) over time. I can't honestly say I know this - because as I've said - I never thought about it. I suspect that it hasn't changed much because we haven't had much if any breakage originating in this area.

So - back to our problem. I had thought that the source of the issue was coupling mpi_archive/skeleton to the binary_archive implementation. That's why I thought deriving from common_archive would help. If I'm wrong about the above then deriving mpi_archive from common_archive won't help - though it would probably be a good idea. If the only problem is that version_type eliminated some operations mpi_archive depended on all integer types to have (STRONG_TYPEDEF), this can also be worked out one way or the other without too much difficulty. If all the above is true - this shouldn't be so hard to address. Given that we've had so much difficulty with this, it's possible that one of the above is not true.
Finally, you've indicated that an archive writer needs to know the list of internal types in the archive and that they'll never change. This would suggest to me that perhaps a separate section in the documentation, describing the "common archive implementation" (text_archive, etc.) distinct from other "sample implementations" (trivial archive, simple_log_archive, etc.), should give a description of the functionality of these archives. Basically this supplies the missing "semantics" left undefined by the archive concept. Basically it would list this functionality: pointers, tracking, versioning, etc. common - this implemented special types used internally and their interface. We can discuss whether these types should have the rich interface permitted by STRONG_TYPEDEF or a narrow one which is useful for catching coding errors. What is still missing here? Robert Ramey

[sent from tiny mobile device] On Jul 26, 2010, at 1:56 AM, "Robert Ramey" <ramey@rrsd.com> wrote:
As I've said - I just never thought about this. On the other hand, I don't think the "detail" interface has changed very much (if at all) over time. I can't honestly say I know this - because as I've said - I never thought about it. I suspect that it hasn't changed much because we haven't had much if any breakage originating in this area.
Careful and consistent application of the boost concept check library would have caught any problem arising from models not satisfying stated concepts and operations relying on more than documented concept requirements. In fact, after this release would be a good time to apply BCCL to both libraries, to avoid such issues in the future.

On 26 Jul 2010, at 05:05, David Abrahams wrote:
[sent from tiny mobile device]
On Jul 26, 2010, at 1:56 AM, "Robert Ramey" <ramey@rrsd.com> wrote:
As I've said - I just never thought about this. On the other hand, I don't think the "detail" interface has changed very much (if at all) over time. I can't honestly say I know this - because as I've said - I never thought about it. I suspect that it hasn't changed much because we haven't had much if any breakage originating in this area.
Careful and consistent application of the boost concept check library would have caught any problem arising from models not satisfying stated concepts and operations relying on more than documented concept requirements. In fact, after this release would be a good time to apply BCCL to both libraries, to avoid such issues in the future.
Dave, the issue is that no concepts were defined for the classes under discussion. That is more the problem than not using BCCL, and is what I will need more urgently if Boost.MPI is to remain maintainable. Matthias

On Jul 26, 2010, at 11:51 AM, Matthias Troyer wrote:
On 26 Jul 2010, at 05:05, David Abrahams wrote:
[sent from tiny mobile device]
On Jul 26, 2010, at 1:56 AM, "Robert Ramey" <ramey@rrsd.com> wrote:
As I've said - I just never thought about this. On the other hand, I don't think the "detail" interface has changed very much (if at all) over time. I can't honestly say I know this - because as I've said - I never thought about it. I suspect that it hasn't changed much because we haven't had much if any breakage originating in this area.
Careful and consistent application of the boost concept check library would have caught any problem arising from models not satisfying stated concepts and operations relying on more than documented concept requirements. In fact, after this release would be a good time to apply BCCL to both libraries, to avoid such issues in the future.
Dave, the issue is that no concepts were defined for the classes under discussion.
Are you saying that "classes under discussion" (by which I suppose you mean the primitive size type) were assumed by the serialization library to model a particular concept that was never defined? -- David Abrahams BoostPro Computing http://boostpro.com

On 26 Jul 2010, at 13:22, David Abrahams wrote:
On Jul 26, 2010, at 11:51 AM, Matthias Troyer wrote:
On 26 Jul 2010, at 05:05, David Abrahams wrote:
[sent from tiny mobile device]
On Jul 26, 2010, at 1:56 AM, "Robert Ramey" <ramey@rrsd.com> wrote:
As I've said - I just never thought about this. On the other hand, I don't think the "detail" interface has changed very much (if at all) over time. I can't honestly say I know this - because as I've said - I never thought about it. I suspect that it hasn't changed much because we haven't had much if any breakage originating in this area.
Careful and consistent application of the boost concept check library would have caught any problem arising from models not satisfying stated concepts and operations relying on more than documented concept requirements. In fact, after this release would be a good time to apply BCCL to both libraries, to avoid such issues in the future.
Dave, the issue is that no concepts were defined for the classes under discussion.
Are you saying that "classes under discussion" (by which I suppose you mean the primitive size type) were assumed by the serialization library to model a particular concept that was never defined?
Indeed. The primitive types were implemented using a "strong typedef" which was designed to make them model most of the concepts that the underlying integral type models. However, this has never been explicitly stated. What would be needed is that the concept those types model is explicitly defined - then I can easily design Boost.MPI to conform to that concept. However, if this is declared to be an implementation detail that I am not allowed to use without risking the code being broken at any time, then I have to essentially reimplement Boost.Serialization from scratch. In 1.44 Robert has changed the implementation of the "strong typedef", greatly reducing the concepts the type models. To cope with that Robert had to rewrite parts of Boost.Serialization, but the changes also led to the breaking of Boost.MPI and most likely also other archives based on Boost.Serialization. As far as I can see Robert removed the default constructor since he did not need it anymore after changing his archives - but he did not realize that there was other code that might get broken. Matthias

Matthias Troyer wrote:
On 26 Jul 2010, at 13:22, David Abrahams wrote: In 1.44 Robert has changed the implementation of the "strong typedef", greatly reducing the concepts the type models. To cope with that Robert had to rewrite parts of Boost.Serialization, but the changes also led to the breaking of Boost.MPI and most likely also other archives based on Boost.Serialization. As far as I can see Robert removed the default constructor since he did not need it anymore after changing his archives - but he did not realize that there was other code that might get broken.
I've been thinking about this some more and now I remember a little more about the history and rationale for the way things are the way they are. When I factored out the class ?_?primitive I had in mind that this would be the "C primitive" layer which would include serialization for the C++ primitive types. The _?archive layer would handle other types with special handling or pass them on to the "primitive" class. I ended up with two primitive classes: text and binary. text used streams - binary just saved/loaded the raw bytes. At the time you posed the question of why not use least16_t, etc. in the primitive class. I wasn't sold on the idea as it broke my original concept, but it did make me wonder if maybe it wasn't a better idea. In fact, I think the original C made a big mistake in making primitives (int, etc.) whose size varies from machine to machine. I'm thinking we would have been better off using int16, etc. as primitives and typedefing int for each machine.

So when I made the internal serializable types for archives, I made sure that they would all be convertible to int, unsigned int, etc. so that text_primitive could handle them without having to list them one by one and tie them to some specific sort of integer. I got all this for free using STRONG_TYPEDEF. When a few of the types changed and I had to make them more complicated, I had some problems and made these types "less" than integers. This let me trap misuse of these numbers, which don't have all the features that integers do - e.g. it makes no sense to add version #s. Since I included conversions to C++ primitives - all my archives worked. The only changes I had to make were fixing inadvertent usage of these types as integers which before were generating warnings and now were generating errors. I never thought of other archives. I guess I just presumed that they would use either text_primitive or binary_primitive or that they wouldn't be any more problem than my archive classes were.
I knew little of MPI but I did know it derived from binary_archive so it would never occur to me that there would be a problem. I just assumed it worked just as binary_archive does - just send the bytes. Looking a little more, it seems that it sends the data as MPI types, so you have to convert each kind of integer. I'm still not getting why implicit conversion operators don't do class_id_type -> int16_t -> int -> send as mpi_integer or something like that. In other words, if this works for text_archives, why doesn't it work for mpi_archives? The "new" types do convert to integers or typedefs of integers (note: NOT STRONG_TYPEDEF) so I'm surprised this comes up - even in mpi. Anyway - the underlying concept for STRONG_TYPEDEF is "convertible to underlying type" which I thought I implemented in the "hand rolled" implementations. So, I'm not sure what should be added to the hand rolled implementations to address this. Note that I'm not declining to do anything, I'm just not sure what the best thing to do is. Robert Ramey

I'm still not getting why implicit conversion operators don't do
class_id_type -> int16_t -> int -> send as mpi_integer
or something like that. In other words, if this works for text_archives, why doesn't it work for mpi_archives?
The "new" types do convert to integers or typedefs of integers (note: NOT STRONG_TYPEDEF) so I'm surprised this comes up - even in mpi.
The reason is that the MPI archives can deal with more types than just the builtin C++ types. For compound types we use their serialize function to actually create the corresponding MPI datatype. The problem now is that these "primitive" types do not have a serialize function, and thus, like for the builtin types, we have to manually specify the correct MPI datatype. What I need for that is not the integer value of the variable that is stored in the type (which I could get by casting to any integer), but the type of it. Now I could probably write a set of overloaded functions to figure out which conversion is the best and from that deduce that type, but it would be easier and safer to have the type accessible as a member. In any case I will not do anything unless I can expect a stable interface and do not have to rely again on what you call implementation details.
Anyway - the underlying concept for STRONG_TYPEDEF is "convertible to underlying type" which I thought I implemented in the "hand rolled" implementations. So, I'm not sure what should be added to the hand rolled implementations to address this.
Note that I'm not declining to do anything, I'm just not sure what the best thing to do is.
Having the base_type, or whatever you call it, accessible, and having a documented interface to these primitive types and a stable list of them should be enough. Any additional type will need a change to Boost.MPI, just as any change in the interface of these types. Matthias

Matthias Troyer wrote:
Note that I'm not declining to do anything, I'm just not sure what the best thing to do is.
Having the base_type, or whatever you call it, accessible, and having a documented interface to these primitive types and a stable list of them should be enough. Any additional type will need a change to Boost.MPI, just as any change in the interface of these types.
I just looked at STRONG_TYPEDEF. It has always included a default constructor for the derived type. Would making sure that the "new" type implementations include a default constructor fix the problem? I found it helpful to exclude it, but now that I've got the potential bugs out of my archives, it's not really that big a deal for me. That is, for me it's been helpful to exclude it, but if you find it helpful to include it, I can put it back in so it will look just like all other STRONG_TYPEDEFS. Robert Ramey

Robert Ramey wrote:
Matthias Troyer wrote:
Note that I'm not declining to do anything, I'm just not sure what the best thing to do is.
Having the base_type, or whatever you call it, accessible, and having a documented interface to these primitive types and a stable list of them should be enough. Any additional type will need a change to Boost.MPI, just as any change in the interface of these types.
I just looked at STRONG_TYPEDEF. It has always included a default constructor for the derived type. Would making sure that the "new" type implementations include a default constructor fix the problem? I found it helpful to exclude it, but now that I've got the potential bugs out of my archives, it's not really that big a deal for me.
That is, for me it's been helpful to exclude it, but if you find it helpful to include it, I can put it back in so it will look just like all other STRONG_TYPEDEFS.
Actually I misspoke here. We could add a default constructor to BOOST_ARCHIVE_STRONG_TYPEDEF which is defined in base_archive.hpp where all the types in question are defined. Robert Ramey

On 27 Jul 2010, at 11:45, Robert Ramey wrote:
Robert Ramey wrote:
Matthias Troyer wrote:
Note that I'm not declining to do anything, I'm just not sure what the best thing to do is.
Having the base_type, or whatever you call it, accessible, and having a documented interface to these primitive types and a stable list of them should be enough. Any additional type will need a change to Boost.MPI, just as any change in the interface of these types.
I just looked at STRONG_TYPEDEF. It has always included a default constructor for the derived type. Would making sure that the "new" type implementations include a default constructor fix the problem? I found it helpful to exclude it, but now that I've got the potential bugs out of my archives, it's not really that big a deal for me.
The default constructor will be useful and fix the Sun problem.
That is, for me it's been helpful to exclude it, but if you find it helpful to include it, I can put it back in so it will look just like all other STRONG_TYPEDEFS.
Actually I misspoke here. We could add a default constructor to BOOST_ARCHIVE_STRONG_TYPEDEF which is defined in base_archive.hpp where all the types in question are defined.
Yes, it would be good if all the STRONG_TYPEDEFS model the same concepts. Matthias

On Jul 26, 2010, at 4:15 PM, Matthias Troyer wrote:
Careful and consistent application of the boost concept check library would have caught any problem arising from models not satisfying stated concepts and operations relying on more than documented concept requirements. In fact, after this release would be a good time to apply BCCL to both libraries, to avoid such issues in the future.
Dave, the issue is that no concepts were defined for the classes under discussion.
Are you saying that "classes under discussion" (by which I suppose you mean the primitive size type) were assumed by the serialization library to model a particular concept that was never defined?
Indeed. The primitive types were implemented using a "strong typedef" which was designed to make them model most of the concepts that the underlying integral type models. However, this has never been explicitly stated. What would be needed is that the concept those types model is explicitly defined - then I can easily design Boost.MPI to conform to that concept. However, if this is declared to be an implementation detail that I am not allowed to use without risking the code being broken at any time, then I have to essentially reimplement Boost.Serialization from scratch.
In 1.44 Robert has changed the implementation of the "strong typedef", greatly reducing the concepts the type models.
I understand all that, but none of that seems to be an example of the serialization library making concept assumptions that were not defined. Such an assumption would basically always take the form:

template <class T>
void some_serialization_component( T x ) { some_operation_on( x ); }

or

template <class T>
class some_serialization_component { ... some_operation_on( T ) ... };

where T is not constrained by any concept in documentation, or some_operation_on( x ) is not a requirement of any concept that constrains T.
To cope with that Robert had to rewrite parts of Boost.Serialization, but the changes also led to the breaking of Boost.MPI and most likely also other archives based on Boost.Serialization. As far as I can see Robert removed the default constructor since he did not need it anymore after changing his archives - but he did not realize that there was other code that might get broken.
All this I understand too. And yet, listening carefully in this thread, I haven't yet heard of any instances of under-documentation of concept requirements (or failure to model stated concepts, for that matter) on the part of Boost.Serialization. I think it's crucially important to _correctly_ identify the cause of this impedance mismatch, and so far, I don't think that has happened. -- David Abrahams BoostPro Computing http://boostpro.com

On 26 Jul 2010, at 18:25, David Abrahams wrote:
On Jul 26, 2010, at 4:15 PM, Matthias Troyer wrote:
Careful and consistent application of the boost concept check library would have caught any problem arising from models not satisfying stated concepts and operations relying on more than documented concept requirements. In fact, after this release would be a good time to apply BCCL to both libraries, to avoid such issues in the future.
Dave, the issue is that no concepts were defined for the classes under discussion.
Are you saying that "classes under discussion" (by which I suppose you mean the primitive size type) were assumed by the serialization library to model a particular concept that was never defined?
Indeed. The primitive types were implemented using a "strong typedef" which was designed to make them model most of the concepts that the underlying integral type models. However, this has never been explicitly stated. What would be needed is that the concept those types model is explicitly defined - then I can easily design Boost.MPI to conform to that concept. However, if this is declared to be an implementation detail that I am not allowed to use without risking the code being broken at any time, then I have to essentially reimplement Boost.Serialization from scratch.
In 1.44 Robert has changed the implementation of the "strong typedef", greatly reducing the concepts the type models.
I understand all that, but none of that seems to be an example of the serialization library making concept assumptions that were not defined.
Such an assumption would basically always take the form:
template <class T> void some_serialization_component( T x ) { some_operation_on( x ); }
or
template <class T> class some_serialization_component { ... some_operation_on( T ) ... };
where T is not constrained by any concept in documentation, or some_operation_on( x ) is not a requirement of any concept that constrains T.
To cope with that Robert had to rewrite parts of Boost.Serialization, but the changes also led to the breaking of Boost.MPI and most likely also other archives based on Boost.Serialization. As far as I can see Robert removed the default constructor since he did not need it anymore after changing his archives - but he did not realize that there was other code that might get broken.
All this I understand too. And yet, listening carefully in this thread, I haven't yet heard of any instances of under-documentation of concept requirements (or failure to model stated concepts, for that matter) on the part of Boost.Serialization.
I think it's crucially important to _correctly_ identify the cause of this impedance mismatch, and so far, I don't think that has happened.
Here is the issue: if one does not want to implement serialization from scratch one has to derive from boost::archive::detail::common_[io]archive. The *implicit* and not documented requirements for such a derived archive are to deal with serialization of:

- std::string (and optionally std::wstring?)
- fundamental C++ integral, boolean, and floating point types
- an *unspecified* list of "primitive" types

Those primitive types had the semantics of integral types, but no concepts were documented for those types, and neither were those types one had to support part of the public interface. However, any archive still had to correctly serialize those types. Would you call this missing documentation or specification? Matthias

Matthias Troyer wrote:
On 26 Jul 2010, at 18:25, David Abrahams wrote:
I think it's crucially important to _correctly_ identify the cause of this impedance mismatch, and so far, I don't think that has happened.
Here is the issue: if one does not want to implement serialization from scratch one has to derive from boost::archive::detail::common_[io]archive. The *implicit* and not documented requirements for such a derived archive are to deal with serialization of:
- std::string (and optionally std::wstring?)
- fundamental C++ integral, boolean, and floating point types
- an *unspecified* list of "primitive" types
Those primitive types had the semantics of integral types, but no concepts were documented for those types, and neither were those types one had to support part of the public interface. However, any archive still had to correctly serialize those types. Would you call this missing documentation or specification?
First of all - these are serializable types by virtue of the fact that they are convertible to integers and references to integers. This is why the issue never came to my attention. When I "improved" them, I did remove default constructability because it was convenient and helpful - and it did trip some compile errors in my archives. I just fixed them. The whole exercise diminished warnings and potential code breakage so I was happy with it. I confess that the impact on other archives didn't occur to me. Sorry about that.

There is a section in the documentation, Implementation Notes/Code Structure/Files which implement the library/Archive Development, which includes information which I thought would be helpful to those creating their own versions of the archives implemented inside the library. I'm certainly willing to entertain submissions that enhance and/or improve this part of the documentation. I can easily add the list of types used internally (version_type etc.) to the documentation along with the guarantee that they will always be convertible to integers and they will be marked as "primitive". Would that be helpful?

If you say that you really need to know about each type I'm not going to dispute it. But I will say I'm not seeing it. I sort of expect to see something like:

    class mpi_oprimitive ... {
        mpioutputfunction m_out;
        template<typename T, std::size_t s>
        mpi_out {
            // uh-oh programming error - no mpi size
            BOOST_STATIC_ASSERT(0 == s);
        }

        mpiout_int<int, 2>(int t){ mpi_out(MPI_SHORT_INT(t)); }
        mpiout_int<int, 4>(int t){ mpi_out(MPI_LONG_INT(t)); }
        ...?
        void save(int t){ mpiout_int(t, sizeof(t)); }
        void save(short int t){ mpiout_int(t, sizeof(t)); }
        void save(long int t){ mpiout_int(t, sizeof(t)); }
        ..?

        mpiout_uint<int, 2>(unsigned int t){ mpi_out(MPI_SHORT_INT(t)); }
        mpiout_uint<int, 4>(unsigned int t){ mpi_out(MPI_LONG_INT(t)); }
        ...?
        void save(unsigned int t){ mpiout_uint(t, sizeof(t)); }
        void save(const unsigned short int t){ mpiout_uint(t, sizeof(t)); }
        void save(const unsigned long int t){ mpiout_int(t, sizeof(t)); }
        ..?
    }
    ...

I'm not even suggesting this, I'm just explaining why adding "more" to the documentation or spec or whatever would have occurred to me. Robert Ramey

On 27 Jul 2010, at 11:19, Robert Ramey wrote:
Matthias Troyer wrote:
On 26 Jul 2010, at 18:25, David Abrahams wrote:
I think it's crucially important to _correctly_ identify the cause of this impedance mismatch, and so far, I don't think that has happened.
Here is the issue: if one does not want to implement serialization from scratch one has to derive from boost::archive::detail::common_[io]archive. The *implicit* and not documented requirements for such a derived archive are to deal with serialization of:
- std::string (and optionally std::wstring?)
- fundamental C++ integral, boolean, and floating point types
- an *unspecified* list of "primitive" types
Those primitive types had the semantics of integral types, but no concepts were documented for those types, and neither were those types one had to support part of the public interface. However, any archive still had to correctly serialize those types. Would you call this missing documentation or specification?
First of all - these are serializable types by virtue of the fact that they are convertible to integers and references to integers.
Robert, I don't see how implicit serializability by virtue of implicit conversion is a part of the serializability concept. Yes, it works for the text archive, and it works for the binary archive since you just copy the bits there - but does every archive have to check all possible conversions?
This is why the issue never came to my attention. When I "improved" them, I did remove default constructability because it was convenient and helpful - and it did trip some compile errors in my archives. I just fixed them. The whole exercise diminished warnings and potential code breakage so I was happy with it. I confess that the impact on other archives didn't occur to me. Sorry about that.
Let's try to improve the situation.
There is a section in the documentation, Implementation Notes/Code Structure/Files which implement the library/Archive Development, which includes information which I thought would be helpful to those creating their own versions of the archives implemented inside the library. I'm certainly willing to entertain submissions that enhance and/or improve this part of the documentation. I can easily add the list of types used internally (version_type etc.) to the documentation along with the guarantee that they will always be convertible to integers and they will be marked as "primitive". Would that be helpful?
Yes, that would be very useful.
If you say that you really need to know about each type I'm not going to dispute it. But I will say I'm not seeing it.
I sort of expect to see something like:
class mpi_oprimitive ... {
    mpioutputfunction m_out;

    template<typename T, std::size_t s>
    void mpi_out(T t){
        // uh-oh - programming error - no mpi size
        BOOST_STATIC_ASSERT(0 == s);
    }

    void mpiout_int<int, 2>(int t){ mpi_out(MPI_SHORT_INT(t)); }
    void mpiout_int<int, 4>(int t){ mpi_out(MPI_LONG_INT(t)); }
    ...?
    void save(int t){ mpiout_int(t, sizeof(t)); }
    void save(short int t){ mpiout_int(t, sizeof(t)); }
    void save(long int t){ mpiout_int(t, sizeof(t)); }
    ...?

    void mpiout_uint<unsigned int, 2>(unsigned int t){ mpi_out(MPI_SHORT_INT(t)); }
    void mpiout_uint<unsigned int, 4>(unsigned int t){ mpi_out(MPI_LONG_INT(t)); }
    ...?
    void save(unsigned int t){ mpiout_uint(t, sizeof(t)); }
    void save(const unsigned short int t){ mpiout_uint(t, sizeof(t)); }
    void save(const unsigned long int t){ mpiout_uint(t, sizeof(t)); }
    ...?
};
...
I'm not even suggesting this - I'm just explaining why adding "more" to the documentation or spec or whatever would never have occurred to me.
Unfortunately it is not as easy since an arbitrary number of datatypes can be created by the user, e.g. for pairs, structs, arrays, ... . Hence there is not a limited list of types but a general template, and thus no automatic conversion takes place. Matthias

Matthias Troyer wrote:
If you say that you really need to know about each type I'm not going to dispute it. But I will say I'm not seeing it.
I sort of expect to see something like:
class mpi_oprimitive ... {
    mpioutputfunction m_out;

    template<typename T, std::size_t s>
    void mpi_out(T t){
        // uh-oh - programming error - no mpi size
        BOOST_STATIC_ASSERT(0 == s);
    }

    void mpiout_int<int, 2>(int t){ mpi_out(MPI_SHORT_INT(t)); }
    void mpiout_int<int, 4>(int t){ mpi_out(MPI_LONG_INT(t)); }
    ...?
    void save(int t){ mpiout_int(t, sizeof(t)); }
    void save(short int t){ mpiout_int(t, sizeof(t)); }
    void save(long int t){ mpiout_int(t, sizeof(t)); }
    ...?

    void mpiout_uint<unsigned int, 2>(unsigned int t){ mpi_out(MPI_SHORT_INT(t)); }
    void mpiout_uint<unsigned int, 4>(unsigned int t){ mpi_out(MPI_LONG_INT(t)); }
    ...?
    void save(unsigned int t){ mpiout_uint(t, sizeof(t)); }
    void save(const unsigned short int t){ mpiout_uint(t, sizeof(t)); }
    void save(const unsigned long int t){ mpiout_uint(t, sizeof(t)); }
    ...?
};
...
I'm not even suggesting this - I'm just explaining why adding "more" to the documentation or spec or whatever would never have occurred to me.
Unfortunately it is not as easy since an arbitrary number of datatypes can be created by the user, e.g. for pairs, structs, arrays, ... . Hence there is not a limited list of types but a general template, and thus no automatic conversion takes place.
Ahh, but these aren't handled in the mpi_?primitive class but rather in the mpi_archive class - assuming it were to follow the pattern in the text and binary archives. Here's what I envisioned when I factored the code in my archives:

class text_archive
    template save_override(...)
        invoke code in oserialize:
            if it's a primitive - redirect (eventually) to text_primitive
            if it's an array - invoke array code
            if it's an enum - invoke enum code
            if it's a pointer - invoke pointer code
            else invoke serialize code - this handles all the types you
                mention above - pairs, structs, etc.
    ...

That is, ONLY types marked primitive and C++ primitives have to be considered here. Since the types in question are marked primitive and convertible to primitives, I believe that ONLY C++ integer types have to be considered in the mpi_?primitive classes. Looking at text_primitive and binary_primitive you can see this. A couple of types needed special handling; they can be filtered out in either the archive class or the primitive class. But most types need no special treatment. A couple do - bool is rendered as a character, char is rendered as a small binary number. But mostly this gets handled transparently. It also handles most of the hassle with different sizes for built-in types.

Robert Ramey
Matthias

At Tue, 27 Jul 2010 09:19:53 -0800, Robert Ramey wrote:
Matthias Troyer wrote:
On 26 Jul 2010, at 18:25, David Abrahams wrote:
I think it's crucially important to _correctly_ identify the cause of this impedance mismatch, and so far, I don't think that has happened.
Here is the issue: if one does not want to implement serialization from scratch one has to derive from boost::archive::detail::common_[io]archive. The *implicit* and undocumented requirements for such a derived archive are to deal with serialization of
- std::string (and optionally std::wstring?)
- fundamental C++ integral, boolean, and floating point types
- an *unspecified* list of "primitive" types
Those primitive types had the semantics of integral types, but no concepts were documented for those types, and the list of types one had to support was not part of the public interface. However, any archive still had to correctly serialize those types. Would you call this missing documentation or missing specification?
First of all - these are serializable types by virtue of the fact that they are convertible to integers and references to integers.
If you are claiming that convertibility to integers and references to integers is enough to satisfy your Serializable concept, I can almost **guarantee** you that the concept is ill-defined, and that using the Boost Concept Check Library properly would prove it. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

David Abrahams wrote:
At Tue, 27 Jul 2010 09:19:53 -0800, Robert Ramey wrote:
First of all - these are serializable types by virtue of the fact that they are convertible to integers and references to integers.
If you are claiming that convertibility to integers and references to integers is enough to satisfy your Serializable concept, I can almost **guarantee** you that the concept is ill-defined, and that using the Boost Concept Check Library properly would prove it.
Hmm - that is in fact my claim. In fact I believe that any type implicitly convertible to a c++ primitive type (type and reference) is a serializable type. An archive class is required to be able to serialize all serializable types - so any archive class has to be able to serialize all c++ primitives. So what am I missing here. BTW - I just checked the documentation and although the above is what I meant, it doesn't quite say this. I can easily fix that. Robert Ramey

At Tue, 27 Jul 2010 11:58:19 -0800, Robert Ramey wrote:
David Abrahams wrote:
At Tue, 27 Jul 2010 09:19:53 -0800, Robert Ramey wrote:
First of all - these are serializable types by virtue of the fact that they are convertible to integers and references to integers.
If you are claiming that convertibility to integers and references to integers is enough to satisfy your Serializable concept, I can almost **guarantee** you that the concept is ill-defined, and that using the Boost Concept Check Library properly would prove it.
Hmm - that is in fact my claim. In fact I believe that any type implicitly convertible to a c++ primitive type (type and reference) is a serializable type.
An archive class is required to be able to serialize all serializable types - so any archive class has to be able to serialize all c++ primitives.
And what is your definition of Serializable (precisely, please)?
So what am I missing here.
try this example, and see how well your library deals with it.

struct X {
    operator short() const { return 0; }
    operator short&() const { return 0; }
    operator long() const { return 0; }
    operator long&() const { return 0; }
};

In concept requirements the use of convertibility almost always causes problems.
BTW - I just checked the documentation and although the above is what I meant - it doesn't quite say this. I can easily fix that.
For what it's worth, based on these discussions (and not a recent look at your docs, admittedly) I _think_ I can identify at least one problem with your specification and your idea of what is a proper implementation detail. Please tell me if I'm wrong: You require archives to handle all primitive types, yet there is a large class of such types for which you say the interface that creates instances, and gets and sets their values, is a private implementation detail. If I have that right, it means there's no reliable way to get their bits into an archive so they can be deserialized with the same values they went in with. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

David Abrahams wrote:
At Tue, 27 Jul 2010 11:58:19 -0800, try this example, and see how well your library deals with it.
struct X {
    operator short() const { return 0; }
    operator short&() const { return 0; }
    operator long() const { return 0; }
    operator long&() const { return 0; }
};
in concept requirements the use of convertibility almost always causes problems.
As written this would work fine. Since it is not a primitive, the default serialization would be to insist upon the existence of a serialize function. So this would only be a problem if the type is marked with the serialization trait "primitive". It so happens it's not very common to do so. Of course if this WAS marked primitive then there would be a compile time error due to the ambiguity. Actually, the convertibility isn't stated in the documentation or concept. It's just that when I made the archive models, convertibility reduced/eliminated most of the code. I just plowed on and finished the job. So I suppose the concept as stated isn't accurate. I did in fact look into using the concept library. I had a few problems understanding it. To get a better idea I looked for other boost libraries which used it and didn't find any. (I took a special look at the iterators library!). At the time C++0x was going to include concepts, so it was more interesting to wait. Concept Traits was abandoned for this reason. There was a lot of discussion about using SFINAE to re-write it. So there were lots and lots of reasons to not invest the required effort. I do see the appeal of such a thing - especially in conjunction with making the documentation - but it always seemed that it wasn't quite ready for prime time.
BTW - I just checked the documentation and although the above is what I meant - it doesn't quite say this. I can easily fix that.
For what it's worth, based on these discussions (and not a recent look at your docs, admittedly) I _think_ I can identify at least one problem with your specification and your idea of what is a proper implementation detail. Please tell me if I'm wrong:
You require archives to handle all primitive types, yet there is a large class of such types for which you say the interface that creates instances, and gets and sets their values, is a private implementation detail.
I haven't needed getters/setters for any serialized types. In fact the whole code base only has maybe two.
If I have that right, it means there's no reliable way to get their bits into an archive so they can be deserialized with the same values they went in with.
I don't think this is an issue - at least it's never come up as one. Robert Ramey

At Tue, 27 Jul 2010 15:31:50 -0400, David Abrahams wrote, and Robert Ramey snipped:
And what is your definition of Serializable (precisely, please)?
So could you please answer that question? At Tue, 27 Jul 2010 13:02:29 -0800, Robert Ramey wrote:
David Abrahams wrote:
At Tue, 27 Jul 2010 11:58:19 -0800, try this example, and see how well your library deals with it.
struct X {
    operator short() const { return 0; }
    operator short&() const { return 0; }
    operator long() const { return 0; }
    operator long&() const { return 0; }
};
in concept requirements the use of convertibility almost always causes problems.
As written this would work fine. Since it is not a primitive, the default serialization would be to insist upon the existence of a serialize function.
Then it wouldn't work fine. It's neither a primitive nor does it have a serialize function. You wrote: I believe that any type implicitly convertible to a c++ primitive type (type and reference) is a serializable type. and X contradicts that. We can go around and around on this until your definition of Serializable is solid, and I'm even willing to do so if that's what it takes to help you get this right.
Actually, the convertibility isn't stated in the documentation or concept. It's just that when I made the archive models, convertibility reduced/eliminated most of the code. I just plowed on and finished the job. So I suppose the concept as stated isn't accurate.
Doesn't surprise me.
I did in fact look into using the concept library. I had a few problems understanding it. To get a better idea I looked for other boost libraries which used it and didn't find any.
Then you didn't look very hard.
(I took a special look at the iterators library!).
The iterators library does in fact use it (though probably not everywhere it should). The Graph library uses it all over the place.
At the time C++0x was going to include concepts, so it was more interesting to wait. Concept Traits was abandoned for this reason. There was a lot of discussion about using SFINAE to re-write it.
There may have been, among people that didn't understand the library, but I think I pretty clearly stated at the time that you can't get SFINAE to do what this library needs to do. SFINAE is all about not causing compilation errors, while BCCL is about intentionally causing them.
So there were lots and lots of reasons to not invest the required effort. I do see the appeal of such a thing - especially in conjunction with making the documentation but it always seemed that it wasn't quite ready for prime time.
The same basic technology has been in libstdc++ (the standard library that ships with GCC) for years and years, so I think you're mistaken.
For what it's worth, based on these discussions (and not a recent look at your docs, admittedly) I _think_ I can identify at least one problem with your specification and your idea of what is a proper implementation detail. Please tell me if I'm wrong:
You require archives to handle all primitive types, yet there is a large class of such types for which you say the interface that creates instances, and gets and sets their values, is a private implementation detail.
I haven't needed getters/setters for any serialized types. In fact the whole code base only has maybe two.
I didn't say anything about getters and setters. I said “the interface that gets and sets their values.” An interface that sets the value might be the assignment operator. An interface that gets the value might be a conversion to int.
If I have that right, it means there's no reliable way to get their bits into an archive so they can be deserialized with the same values they went in with.
I don't think this is an issue - at least it's never come up as one.
I think this is exactly the issue that Matthias faced. If you don't specify how to create a value of any given primitive type, how is he supposed to deserialize it? Is anyone else implementing archives other than you? If not, he's the only serious consumer you have of the archive concept. As the person in control of both sides of that contract, you're not going to notice these kinds of problems if you don't have solid concept definitions and concept checking in place, because you are free to (unintentionally) make changes that subtly alter the contract. This comes down to one thing: you need to decide what your public APIs are, and you need to have tests for all of them that don't make any assumptions beyond what's specified in the API. Maybe it would be easier to achieve if someone else were writing the tests. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

David Abrahams wrote:
At Tue, 27 Jul 2010 15:31:50 -0400, David Abrahams wrote, and Robert Ramey snipped:
And what is your definition of Serializable (precisely, please)?
So could you please answer that question?
At Tue, 27 Jul 2010 13:02:29 -0800, Robert Ramey wrote:
David Abrahams wrote:
At Tue, 27 Jul 2010 11:58:19 -0800, try this example, and see how well your library deals with it.
struct X {
    operator short() const { return 0; }
    operator short&() const { return 0; }
    operator long() const { return 0; }
    operator long&() const { return 0; }
};
in concept requirements the use of convertibility almost always causes problems.
As written this would work fine. Since it is not a primitive, the default serialization would be to insist upon the existence of a serialize function.
Then it wouldn't work fine. It's neither a primitive nor does it have a serialize function. You wrote:
Note that it could have a non-intrusive serialize function. So I guess it would be correct to say that whether the above is serializable would depend upon other information not present in the above example. I don't see any way to verify this via concepts.
I believe that any type implicitly convertible to a c++ primitive type (type and reference) is a serializable type.
and X contradicts that. We can go around and around on this until your definition of Serializable is solid, and I'm even willing to do so if that's what it takes to help you get this right.
Actually, the convertibility isn't stated in the documentation or concept. It's just that when I made the archive models, convertibility reduced/eliminated most of the code. I just plowed on and finished the job. So I suppose the concept as stated isn't accurate.
Doesn't surprise me.
The current documentation doesn't say anything about convertibility. It just happened to be true for the internal types used by the library. It is only this which raises the question as to whether the concept as stated needs to be changed. One could well leave the concept as it is, and note that the archive implementations have this feature for the particular types used internally. At that point it would become an implementation detail relevant only for those who leverage the current implementations. So one would say that the current archives can also handle some types which are not defined as serializable, even though this is not guaranteed by the concepts. And there is precedent for this. shared_ptr is NOT a serializable type as described by the concepts - and never can be. The implemented archives include special code for shared_ptr to work around this and make it serializable anyway. Given the alternatives, I felt this was the best course - even though it muddles somewhat the question of exactly what is serializable. So I think it's accurate to say that the current concepts describe sufficient requirements for serializability but not necessary ones.
I did in fact look into using the concept library. I had a few problems understanding it. To get a better idea I looked for other boost libraries which used it and didn't find any.
Then you didn't look very hard.
(I took a special look at the iterators library!).
The iterators library does in fact use it (though probably not everywhere it should). The Graph library uses it all over the place.
I just looked again. I found ONE file in all of boost which includes boost/concept/requires.hpp. (That was in boost/graph/transitive_reduction.hpp.) I found no such inclusions anywhere else - including in the iterators library. So though I don't doubt that concepts are used throughout Boost, I can't see where the concept library is used.
For what it's worth, based on these discussions (and not a recent look at your docs, admittedly) I _think_ I can identify at least one problem with your specification and your idea of what is a proper implementation detail. Please tell me if I'm wrong:
You require archives to handle all primitive types, yet there is a large class of such types for which you say the interface that creates instances, and gets and sets their values, is a private implementation detail.
I haven't needed getters/setters for any serialized types. In fact the whole code base only has maybe two.
I didn't say anything about getters and setters. I said "the interface that gets and sets their values." An interface that sets the value might be the assignment operator. An interface that gets the value might be a conversion to int.
The documentation refers to primitive C++ types. These are all assignable and a reference can be taken on them. The current documentation says nothing about convertibility, so I think it's correct as it stands.
If I have that right, it means there's no reliable way to get their bits into an archive so they can be deserialized with the same values they went in with.
I don't think this is an issue - at least it's never come up as one.
I think this is exactly the issue that Matthias faced.
I don't think that's the issue that Matthias faced, but he can speak to that if he wants to.
If you don't specify how to create a value of any given primitive type, how is he supposed to deserialize it?
These types (e.g. class_id_type, etc.) are in fact created in the base archive implementation. References to these types are serialized, so the serialization doesn't have to construct them. The reason I made the default constructors private was to detect cases where they were being constructed without a specific value - this would almost certainly be an error. When I made these private I did in fact detect a couple of compile errors which represented potential errors. They were easy to fix and that was that.
Is anyone else implementing archives other than you? If not, he's the only serious consumer you have of the archive concept. As the person in control of both sides of that contract, you're not going to notice these kinds of problems if you don't have solid concept definitions and concept checking in place, because you are free to (unintentionally) make changes that subtly alter the contract.
This is true and admittedly a problem.
This comes down to one thing: you need to decide what your public APIs are, and you need to have tests for all of them that don't make any assumptions beyond what's specified in the API. Maybe it would be easier to achieve if someone else were writing the tests.
Great - any volunteers? There is one issue here that you might have overlooked. There are two "users" here. The main one is the user of any archive already made. If he follows the requirements as stated in the documentation, he is guaranteed that the library will work as advertised. The other "user" is one who makes another archive class. If he follows the concepts as described in the documentation, he's guaranteed that it will work as advertised. Presumably he would start with the trivial_archive example as shown in the documentation. BUT - the documentation doesn't say much about archive semantics. There are several examples of archives in the documentation which implement different semantics. They all model the concepts, but they do different things. However, the most useful archive classes - the ones I included for usage out of the box - implement a lot of the semantics which make the system widely useful: serialization of pointers, etc. It's appealing for someone making a new archive to leverage this implementation - just as Matthias has done. This does include some facilities which go beyond the original concepts. I've included a section in the documentation which describes this implementation, but not in a formal way. In these implementations, I did in fact depend on the fact that some internal types were not primitive - though convertible to primitives. I think Matthias did the same, but I'm not sure. I think Matthias got surprised when I removed default constructability. But he also got surprised when I changed class_id_type from unsigned int to least_16_t, which surprised me since I thought the latter was just a typedef and not a true class. I also never anticipated that anyone would care about the list of internally used types, as I never needed such a list in the archives I had already created.
In any case, making a concept for an archive called "all-encompassing archive", similar to the family that we have, would be quite a bit of work - and out of proportion to its value in my opinion. And suppose I felt that it should not be necessary to provide a comprehensive list of internal types and Matthias did. We'd be back in the same soup. Robert Ramey

Robert Ramey wrote:
I just looked again. I found ONE file in all of boost which includes boost/concept/requires.hpp. (That was in boost/graph/transitive_reduction.hpp.) I found no such inclusions anywhere else - including in the iterators library. So though I don't doubt that concepts are used throughout Boost, I can't see where the concept library is used.
Perhaps you should look for boost/concept_check.hpp. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com IMPORTANT: The information contained in this email and/or its attachments is confidential. If you are not the intended recipient, please notify the sender immediately by reply and immediately delete this message and all its attachments. Any review, use, reproduction, disclosure or dissemination of this message or any attachment by an unintended recipient is strictly prohibited. Neither this message nor any attachment is intended as or should be construed as an offer, solicitation or recommendation to buy or sell any security or other financial instrument. Neither the sender, his or her employer nor any of their respective affiliates makes any warranties as to the completeness or accuracy of any of the information contained herein or that this message or any of its attachments is free of viruses.

Stewart, Robert wrote:
Robert Ramey wrote:
I just looked again. I found ONE file in all of boost which includes boost/concept/requires.hpp. (That was in boost/graph/transitive_reduction.hpp.) I found no such inclusions anywhere else - including in the iterators library. So though I don't doubt that concepts are used throughout Boost, I can't see where the concept library is used.
Perhaps you should look for boost/concept_check.hpp.
lol - Looks like you're correct, how foolish of me to look in the boost/concept library for the appropriate headers. Robert Ramey

At Wed, 28 Jul 2010 10:37:48 -0800, Robert Ramey wrote:
Stewart, Robert wrote:
Robert Ramey wrote:
I just looked again. I found ONE file in all of boost which includes boost/concept/requires.hpp. (That was in boost/graph/transitive_reduction.hpp.) I found no such inclusions anywhere else - including in the iterators library. So though I don't doubt that concepts are used throughout Boost, I can't see where the concept library is used.
Perhaps you should look for boost/concept_check.hpp.
lol - Looks like you're correct, how foolish of me to look in the boost/concept library for the appropriate headers.
Instead of crawling the Boost directory tree for _everything_, first * * * R E A D T H E D O C U M E N T A T I O N * * * This is a time-honored Boost principle. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

David Abrahams wrote:
At Wed, 28 Jul 2010 10:37:48 -0800, Robert Ramey wrote:
Stewart, Robert wrote:
Robert Ramey wrote:
I just looked again. I found ONE file in all of boost which includes boost/concept/requires.hpp. (That was in boost/graph/transitive_reduction.hpp.) I found no such inclusions anywhere else - including in the iterators library. So though I don't doubt that concepts are used throughout Boost, I can't see where the concept library is used.
Perhaps you should look for boost/concept_check.hpp.
lol - Looks like you're correct, how foolish of me to look in the boost/concept library for the appropriate headers.
Instead of crawling the Boost directory tree for _everything_, first
* * * R E A D T H E D O C U M E N T A T I O N * * *
This is a time-honored Boost principle.
aaa - does the documentation include a list of libraries which use concepts? I must have missed that. Robert Ramey

At Wed, 28 Jul 2010 12:01:58 -0800, Robert Ramey wrote:
David Abrahams wrote:
At Wed, 28 Jul 2010 10:37:48 -0800, Robert Ramey wrote:
Stewart, Robert wrote:
Robert Ramey wrote:
I just looked again. I found ONE file in all of boost which includes boost/concept/requires.hpp. (That was in boost/graph/transitive_reduction.hpp.) I found no such inclusions anywhere else - including in the iterators library. So though I don't doubt that concepts are used throughout Boost, I can't see where the concept library is used.
Perhaps you should look for boost/concept_check.hpp.
lol - Looks like you're correct, how foolish of me to look in the boost/concept library for the appropriate headers.
Instead of crawling the Boost directory tree for _everything_, first
* * * R E A D T H E D O C U M E N T A T I O N * * *
This is a time-honored Boost principle.
aaa - does the documentation include a list of libraries which use concepts? I must have missed that.
Not a complete list, but it actually does mention a few. But that's not what I meant: it shows clearly which headers you are supposed to use. Had you read it you'd have been more successful in your search. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Robert, this thread is beginning to ramble. I won't be able to keep up with it if we keep going back and forth like this.
At Tue, 27 Jul 2010 13:02:29 -0800, Robert Ramey wrote:
David Abrahams wrote:
At Tue, 27 Jul 2010 11:58:19 -0800, try this example, and see how well your library deals with it.
struct X {
    operator short() const { return 0; }
    operator short&() const { return 0; }
    operator long() const { return 0; }
    operator long&() const { return 0; }
};
in concept requirements the use of convertibility almost always causes problems.
As written this would work fine. Since it is not a primitive, the default serialization would be to insist upon the existence of a serialize function.
Then it wouldn't work fine. It's neither a primitive nor does it have a serialize function. You wrote:
Note that it could have a non-intrusive serialize function.
But it doesn't.
So I guess it would be correct to say that whether the above is serializable would depend upon other information not present in the above example.
I don't see any way to verify this via concepts.
If you mean with BCCL, it's *trivial* to do.
I believe that any type implicitly convertible to a c++ primitive type (type and reference) is a serializable type.
and X contradicts that. We can go around and around on this until your definition of Serializable is solid, and I'm even willing to do so if that's what it takes to help you get this right.
The offer stands.
Actually, the convertibility isn't stated in the documentation or concept. It's just that when I made the archive models, convertibility reduced/eliminated most of the code. I just plowed on and finished the job. So I suppose the concept as stated isn't accurate.
Doesn't surprise me.
The current documentation doesn't say anything about convertibility. It just happened to be true for the internal types used by the library. It is only this which raises the question as to whether the concept as stated needs to be changed.
I think I disagree with you; there are lots of reasons that the concepts as stated should come into question. The most glaring one is that your documentation says these things:

1. being primitive is sufficient to make a type Serializable
2. a saving archive ar has to support ar << x for all instances x of any Serializable type
3. using serialization traits, any user type can also be designated as "primitive"

but it gives no other clue about how to get a value into or out of an arbitrary primitive type. What that means for an archive implementor is that he is required to support serialization of a (potentially unbounded) set of primitive types for which there is no API that will let him figure out how to appropriately write instances into his archive and read them out again. Even though this is an issue of un-achievable semantics (not syntax), using BCCL would actually uncover this problem because your archives would fail when primitive archetypes were serialized.
And there is precedent for this. shared_ptr is NOT a serializable type as described by the concepts - and never can be.
Perhaps not with those concept definitions, if you're unwilling to put its serialize member function in namespace Boost. But I don't see the relevance anyway.
The implemented archives include special code for shared_ptr to work around this and make it serializable anyway. Given the alternatives, I felt this was the best course - even though it muddles somewhat the question of exactly what is serializable.
So I think it's accurate to say that the current concepts describe sufficient requirements for serializability but not necessary ones.
I don't see how that could possibly be accurate. Sufficient requirements are a superset of the necessary ones. If you described sufficient requirements, I can see no problems, provided those requirements were implementable. Operations constrained by the concepts would use more than absolutely necessary, but models of the concepts would provide more than absolutely necessary. No conflict.
The iterators library does in fact use it (though probably not everywhere it should). The Graph library uses it all over the place.
I just looked again. I found ONE file in all of boost which includes boost/concept/requires.hpp. (That was in boost/graph/transitive_reduction.hpp)
Well, you're looking for the wrong header. I don't know why you thought that particular header was the key to everything, but look for boost/concept* and you'll find boost/concept_check.hpp and boost/concept_archetype.hpp, and you'll also find whole library headers devoted just to defining concept checking classes and archetypes, like boost/graph/distributed/concepts.hpp. This information is all available if you read http://www.boost.org/doc/libs/1_43_0/libs/concept_check/using_concept_check.... and glance quickly at http://www.boost.org/doc/libs/1_43_0/libs/concept_check/reference.htm
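For readers following along, the kind of compile-time check BCCL performs can be approximated without Boost at all. This is a hand-rolled C++11 sketch (is_saveable and NotSerializable are invented for the illustration, with std::ostream standing in for an archive): it probes whether the expression ar << x is well-formed for a given (Archive, T) pair.

```cpp
#include <iostream>
#include <type_traits>
#include <utility>

// Primary template: assume the expression is NOT well-formed.
template <class Archive, class T, class = void>
struct is_saveable : std::false_type {};

// Partial specialization selected only when ar << x compiles.
template <class Archive, class T>
struct is_saveable<Archive, T,
    decltype(void(std::declval<Archive&>() << std::declval<const T&>()))>
    : std::true_type {};

struct NotSerializable {};  // hypothetical type with no operator<<

static_assert(is_saveable<std::ostream, int>::value,
              "int can be written to an ostream");
static_assert(!is_saveable<std::ostream, NotSerializable>::value,
              "no operator<< exists for NotSerializable");
```

BOOST_CONCEPT_ASSERT does essentially this checking, but reports which concept requirement failed rather than a bare substitution failure.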
For what it's worth, based on these discussions (and not a recent look at your docs, admittedly) I _think_ I can identify at least one problem with your specification and your idea of what is a proper implementation detail. Please tell me if I'm wrong:
I've now looked at the docs and confirmed what I said here.
You require archives to handle all primitive types, yet there is a large class of such types for which you say the interface that creates instances, and gets and sets their values, is a private implementation detail.
I haven't needed getters/setters for any serialized types. In fact the whole code base only has maybe two.
I didn't say anything about getters and setters. I said "the interface that gets and sets their values." An interface that sets the value might be the assignment operator. An interface that gets the value might be a conversion to int.
The documentation refers to primitive C++ types. These are all assignable and a reference can be taken to them.
Yes, but it also says that any user-defined type can be made primitive, and the library supplies a whole bunch of library-defined primitive types of its own!
The current documentation says nothing about convertibility so I think it's correct as it stands.
Unless you don't really want to allow people to write archives, then it can't be.
If you don't specify how to create a value of any given primitive type, how is he supposed to deserialize it?
These types (e.g. class_id_type, etc.) are in fact created in the base archive implementation. References to these types are serialized so the serialization doesn't have to construct them.
Forget construction; I purposefully said "create a value." If you pass my deserialize function a reference to T expecting me to fill the referenced object up with some value, but don't tell me about the interface for setting the value of a T (what types can be assigned into T, for example), then I'm up a creek, paddle-less.
Is anyone else implementing archives other than you? If not, he's the only serious consumer you have of the archive concept. As the person in control of both sides of that contract, you're not going to notice these kinds of problems if you don't have solid concept definitions and concept checking in place, because you are free to (unintentionally) make changes that subtly alter the contract.
This is true and admittedly a problem.
This comes down to one thing: you need to decide what your public APIs are, and you need to have tests for all of them that don't make any assumptions beyond what's specified in the API. Maybe it would be easier to achieve if someone else were writing the tests.
Great - any volunteers?
Nail down what you think the concepts actually are, since you've said several times in this thread that you need to make adjustments. Then I will write some tests to reveal their brokenness [almost nobody, including me, gets concepts right without going through this exercise, so don't take it personally that I say they're broken]. I can't guarantee complete coverage but I can almost guarantee that I can reveal some holes. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

On 28 Jul 2010, at 11:50, Robert Ramey wrote:
In these implementations, I did in fact depend on the fact that some internal types were not primitive - though convertible to primitives. I think Matthias did the same but I'm not sure. I think Matthias got surprised when I removed default constructibility.
Yes, that broke Boost.MPI for the Sun compiler.
But he also got surprised when I changed class_id_type from unsigned int to least_16_t which surprised me since I thought the latter was just a typedef and not a true class.
Actually not. The breaking was caused by another change you did at the same time.
I also never anticipated that anyone would care about the list of internally used types as I never needed such a list in the archives I had already created.
In any case, making a concept for an archive called "All encompassing archive" similar to the family that we have would be quite a bit of work - and out of proportion to its value in my opinion. And suppose I felt that it should not be necessary to provide a comprehensive list of internal types and Matthias did. We'd be back in the same soup.
All you need to do is document what a class deriving from common archive has to implement. These are: the save/load functions for a well-defined set of primitive types with a well-defined interface, and save_binary/load_binary. Matthias
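As a rough illustration of the shape Matthias describes, and assuming nothing about the real base classes, a conforming saving archive might look like this (sketch_oarchive and its byte sink are invented for the sketch; a real archive would derive from boost::archive::detail::common_oarchive and cover the full set of primitives):

```cpp
#include <cstddef>
#include <string>

// A class deriving from the common archive implements save() for a
// well-defined set of primitive types plus save_binary().
class sketch_oarchive {
public:
    void save(int v)    { write(&v, sizeof v); }
    void save(double v) { write(&v, sizeof v); }
    void save(const std::string& s)
    {
        save(static_cast<int>(s.size()));   // length prefix, then the bytes
        write(s.data(), s.size());
    }
    void save_binary(const void* p, std::size_t n) { write(p, n); }

    std::size_t bytes_written() const { return buffer.size(); }

private:
    void write(const void* p, std::size_t n)
    {
        buffer.append(static_cast<const char*>(p), n);
    }
    std::string buffer;  // stand-in byte sink; real archives write to a stream
};
```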

Matthias Troyer wrote:
On 28 Jul 2010, at 11:50, Robert Ramey wrote: All you need to do is document what a class deriving from common archive has to implement. These are
- the save/load functions for a well-defined set of primitive types with a well-defined interface
OK - that's not too bad. I'll find a place to put it into the documentation. There are around 10 of them and they're all convertible to some type of C++ integer so it shouldn't be too hard. My list of requirements on these is: convertible to some type of C++ integer; having serialization trait "primitive". Do you want to add default constructible as well? As I said I haven't needed it and think that needing this is likely an error - but I would add it if you think it's necessary.
- and save_binary/load_binary. I think this is already specified by the archive concept.
But he also got surprised when I changed class_id_type from unsigned int to least_16_t which surprised me since I thought the latter was just a typedef and not a true class.
Actually not. The breaking was caused by another change you did at the same time.
Would adding the above information address this as well? Robert Ramey

On 28 Jul 2010, at 14:00, Robert Ramey wrote:
Matthias Troyer wrote:
On 28 Jul 2010, at 11:50, Robert Ramey wrote: All you need to do is document what a class deriving from common archive has to implement. These are
- the save/load functions for a well-defined set of primitive types with a well-defined interface
OK - that's not too bad. I'll find a place to put it into the documentation. There are around 10 of them and they're all convertible to some type of C++ integer so it shouldn't be too hard.
My list of requirements on these is: convertible to some type of C++ integer; having serialization trait "primitive".
If you do that you should specify which type it is - easiest probably with a typedef.
Do you want to add default constructible as well? As I said I haven't needed it and think that needing this is likely an error - but I would add it if you think it's necessary.
The Sun compiler wants it for some reason, although that's a compiler bug.
- and save_binary/load_binary. I think this is already specified by the archive concept.
I think so too, but it might be good stating that this function needs to be implemented as well.
But he also got surprised when I changed class_id_type from unsigned int to least_16_t which surprised me since I thought the latter was just a typedef and not a true class.
Actually not. The breaking was caused by another change you did at the same time.
Would adding the above information address this as well?
Yes. If you add a comment that one should derive from archive::detail::common_archive and not from archive::binary_archive. Matthias

On 26 Jul 2010, at 18:25, David Abrahams wrote:
On Jul 26, 2010, at 4:15 PM, Matthias Troyer wrote:
Careful and consistent application of the boost concept check library would have caught any problem arising from models not satisfying stated concepts and operations relying on more than documented concept requirements. In fact, after this release would be a good time to apply BCCL to both libraries, to avoid such issues in the future.
Dave, the issue is that no concepts were defined for the classes under discussion.
Are you saying that "classes under discussion" (by which I suppose you mean the primitive size type) were assumed by the serialization library to model a particular concept that was never defined?
Indeed. The primitive types were implemented using a "strong typedef" which was designed to make them model most of the concepts that the underlying integral type models. However this has never been explicitly stated. What would be needed is that the concept those types model is explicitly defined - then I can easily design Boost.MPI to conform to that concept. However, if this is declared to be an implementation detail that I am not allowed to use without risking the code being broken at any time, then I have to essentially reimplement Boost.Serialization from scratch.
In 1.44 Robert has changed the implementation of the "strong typedef", greatly reducing the concepts the type models.
I understand all that, but none of that seems to be an example of the serialization library making concept assumptions that were not defined.
Such an assumption would basically always take the form:
template <class T> void some_serialization_component( T x ) { some_operation_on( x ); }
or
template <class T> class some_serialization_component { ... some_operation_on( T ) ... };
where T is not constrained by any concept in documentation, or some_operation_on( x ) is not a requirement of any concept that constrains T.
To cope with that Robert had to rewrite parts of Boost.Serialization, but the changes also lead to the breaking of Boost.MPI and most likely also other archives based on Boost.Serialization. As far as I can see Robert removed the default constructor since he did not need it anymore after changing his archives - but he did not realize that there was other code that might get broken.
All this I understand too. And yet, listening carefully in this thread, I haven't yet heard of any instances of under-documentation of concept requirements (or failure to model stated concepts, for that matter) on the part of Boost.Serialization.
I think it's crucially important to _correctly_ identify the cause of this impedance mismatch, and so far, I don't think that has happened.
Here is one thing that might fit what you are looking for: all the special primitive types had been "strong typedefs", where the strong typedef is documented here: http://www.boost.org/doc/libs/1_43_0/libs/serialization/doc/strong_typedef.h... included in this documentation is the use of the default constructor of such a "strong typedef", but those types can no longer be default constructed. I don't think though that this is the real issue here. The real issue is that Robert has viewed these types as implementation details although they are important for anyone wanting to implement an archive. Matthias

At Tue, 27 Jul 2010 07:48:00 -0600, Matthias Troyer wrote:
I think it's crucially important to _correctly_ identify the cause of this impedance mismatch, and so far, I don't think that has happened.
Here is one thing that might fit what you are looking for: all the special primitive types had been "strong typedefs", where the strong typedef is documented here:
http://www.boost.org/doc/libs/1_43_0/libs/serialization/doc/strong_typedef.h...
included in this documentation is the use of the default constructor of such a "strong typedef", but those types can no longer be default constructed.
I don't think though that this is the real issue here. The real issue is that Robert has viewed these types as implementation details although they are important for anyone wanting to implement an archive.
OK. Then this has nothing to do with concepts—at least, not at this level. It seems to be a simple case of breaking changes to an API that was documented as though it were a public interface. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

On 25 Jul 2010, at 23:56, Robert Ramey wrote:
Matthias Troyer wrote:
I have one more question in addition to my previous comment:
common_oarchive is in namespace archive::detail while basic_binary_oarchive is in the top namespace archive.
Do I understand you correctly that deriving from archive::detail::common_oarchive is safe and not considered depending on implementation details, while deriving from archive::basic_binary_oarchive is not?
I can easily change all the Boost.MPI archives to use archive::detail::common_oarchive where they now use archive::basic_binary_oarchive (although this will not solve the issue we have right now).
I can see where this would be confusing. Let me indicate what I mean of a few of the terms being used. [...]
And the stuff in "detail" has its own "public" and "implementation detail" aspects. For an archive developer who wants to develop an archive with functionality similar to the existing ones, it's not a detail. He wants to know that the public functions aren't going to change.
Indeed - and what I need to know is which of these classes I may derive from safely.
So - back to our problem.
I had thought that the source of the issue was coupling mpi_archive/skeleton to the binary_archive implementation. That's why I thought deriving from common_archive would help.
If I'm wrong about the above then deriving mpi_archive from common_archive won't help - though it would probably be a good idea.
Indeed, it does not help but I still changed it to (hopefully) avoid future issues. It would be good if you could add somewhere to the documentation which of the classes one can safely derive from.
If the only problem is that version_type eliminated some operations that mpi_archive depended on all integer types to have (STRONG_TYPEDEF), this can also be worked out one way or the other without too much difficulty.
If all the above is true - this shouldn't be so hard to address. Given that we've had so much difficulty with this, it's possible that one of the above is not true.
Finally, you've indicated that an archive writer needs to know the list of internal types in the archive and that they'll never change. This would suggest to me that perhaps the documentation needs a separate section describing the "common archive implementation" (text_archive, etc.) as distinct from the other "sample implementations" (trivial archive, simple_log_archive, etc.), with a description of the functionality of these archives.
Basically this supplies the missing "semantics" left undefined by the archive concept: it would list this functionality - pointers, tracking, versioning, etc.
common - this implements the special types used internally and their interface. We can discuss whether these types should have the rich interface permitted by STRONG_TYPEDEF or a narrow one which is useful for catching coding errors.
Yes, what is missing is the list of special types and their interface. In the past they were all STRONG_TYPEDEF, and the semantics of STRONG_TYPEDEF was the same as that of the underlying integral type. That has changed and broken some code. The basic issue is that an archive deriving from detail::common_[io]archive has to support the fundamental C++ types, std::string, and that collection of special types. In order to serialize those special types I need to be able to know something about them! The specific list of concepts they model is less important than that list being stable (as long as serialization can be implemented efficiently). Besides the default constructor it would be good to have a type member specifying the (current) implementation type:

#define BOOST_ARCHIVE_STRONG_TYPEDEF(T, D)   \
    class D : public T {                     \
    public:                                  \
        typedef T implementation_type;       \
        explicit D(const T t) : T(t) {}      \
        D() : T() {}                         \
    };

and also use the implementation_type member in the other special classes. That way you could change those integer types at any time without causing any breakage in Boost.MPI. Right now I have to manually set these types:

BOOST_MPI_DATATYPE(boost::archive::class_id_optional_type, get_mpi_datatype(int_least16_t()), integer);

and if you should change class_id_optional_type to be something else than int_least16_t the code will break again. Matthias
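A self-contained sketch of how the proposed implementation_type member would let an archive pick the wire type generically (strong_typedef_base is a hypothetical stand-in for the class base, since a builtin integer cannot serve as a base class):

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical class wrapper around an integer, playing the role of the
// strong typedef's base class T.
template <class T>
class strong_typedef_base {
public:
    explicit strong_typedef_base(T v = T()) : value(v) {}
    operator T() const { return value; }
private:
    T value;
};

// What BOOST_ARCHIVE_STRONG_TYPEDEF would expand to, plus the nested
// implementation_type Matthias asks for.
class class_id_type : public strong_typedef_base<std::int_least16_t> {
public:
    typedef std::int_least16_t implementation_type;
    explicit class_id_type(implementation_type t)
        : strong_typedef_base<implementation_type>(t) {}
    class_id_type() : strong_typedef_base<implementation_type>() {}
};

// An archive such as Boost.MPI can now derive the wire type from the
// typedef instead of hard-coding int_least16_t:
static_assert(
    std::is_same<class_id_type::implementation_type, std::int_least16_t>::value,
    "wire type tracks the typedef automatically");
```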

Matthias Troyer wrote:
On 25 Jul 2010, at 23:56, Robert Ramey wrote:
Matthias Troyer wrote:
Yes, what is missing is the list of special types and their interface. In the past they were all STRONG_TYPEDEF, and the semantics of STRONG_TYPEDEF was the same as that of the underlying integral type. That has changed and broken some code.
The basic issue is that an archive deriving from detail::common_[io]archive has to support the fundamental C++ types, std::string, and that collection of special types. In order to serialize those special types I need to be able to know something about them! The specific list of concepts they model is less important than that list being stable (as long as serialization can be implemented efficiently). Besides the default constructor it would be good to have a type member specifying the (current) implementation type:

#define BOOST_ARCHIVE_STRONG_TYPEDEF(T, D)   \
    class D : public T {                     \
    public:                                  \
        typedef T implementation_type;       \
        explicit D(const T t) : T(t) {}      \
        D() : T() {}                         \
    };
A month or so ago I had most of these types implemented with STRONG_TYPEDEF. In addition to these I had a few implemented "by hand": tracking_type, class_name_type. So moving a few from the STRONG_TYPEDEF column to the "hand rolled" column didn't raise any red flags. I found that I had to add the following operations to these types:

For T:
    T::base_type  // where base_type is some integer supported by binary_primitive and text_primitive
    convertible to T::base_type &
    convertible to const T::base_type
    operator==
    operator<

This was totally ad hoc - I was just "making things work". Of course STRONG_TYPEDEF supports all of these except base_type, which could easily be added to it at no cost. Soooooo we could define such a concept and that would solve the current dilemma, avoid future problems, and make mpi_archives easier to write and more robust. In principle I see this as an improvement. Of course the devil is in the details. I see the utility of augmenting STRONG_TYPEDEF but I wonder about it: if you have code using T::base_type and T is not one of the types we're using, it won't have this available - compiler error. Wouldn't it be better to specify the implicit requirements as above and just know that T will be converted to what one wants? And sizeof(T) can be applied to both the (now) more elaborate types as well as C++ primitives.
and also use the implementation_type member in the other special classes. That way you could change those integer types at any time without causing any breakage in Boost.MPI. Right now I have to manually set these types:
BOOST_MPI_DATATYPE(boost::archive::class_id_optional_type, get_mpi_datatype(int_least16_t()), integer);
Though it's off topic at this point, I've never understood what this macro is for. Why do these special types need handling different from any other common C++ primitive types like integers? Wasn't it enough for these types to be convertible to integers and references to integers?
and if you should change class_id_optional_type to be something else than int_least16_t the code will break again.
To repeat, I've no problem with the general idea. Let's see some fleshing out of the details. Robert Ramey

On 26 Jul 2010, at 12:41, Robert Ramey wrote:
Matthias Troyer wrote:
On 25 Jul 2010, at 23:56, Robert Ramey wrote:
Matthias Troyer wrote:
Yes, what is missing is the list of special types and their interface. In the past they were all STRONG_TYPEDEF, and the semantics of STRONG_TYPEDEF was the same as that of the underlying integral type. That has changed and broken some code.
The basic issue is that an archive deriving from detail::common_[io]archive has to support the fundamental C++ types, std::string, and that collection of special types. In order to serialize those special types I need to be able to know something about them! The specific list of concepts they model is less important than that list being stable (as long as serialization can be implemented efficiently). Besides the default constructor it would be good to have a type member specifying the (current) implementation type:

#define BOOST_ARCHIVE_STRONG_TYPEDEF(T, D)   \
    class D : public T {                     \
    public:                                  \
        typedef T implementation_type;       \
        explicit D(const T t) : T(t) {}      \
        D() : T() {}                         \
    };
A month or so ago I had most of these types implemented with STRONG_TYPEDEF. In addition to these I had a few implemented "by hand": tracking_type, class_name_type. So moving a few from the STRONG_TYPEDEF column to the "hand rolled" column didn't raise any red flags. I found that I had to add the following operations to these types:

For T:
    T::base_type  // where base_type is some integer supported by binary_primitive and text_primitive
    convertible to T::base_type &
    convertible to const T::base_type
    operator==
    operator<

This was totally ad hoc - I was just "making things work". Of course STRONG_TYPEDEF supports all of these except base_type, which could easily be added to it at no cost. Soooooo we could define such a concept and that would solve the current dilemma, avoid future problems, and make mpi_archives easier to write and more robust. In principle I see this as an improvement. Of course the devil is in the details.
I see the utility of augmenting STRONG_TYPEDEF but I wonder about it: if you have code using T::base_type and T is not one of the types we're using, it won't have this available - compiler error. Wouldn't it be better to specify the implicit requirements as above and just know that T will be converted to what one wants? And sizeof(T) can be applied to both the (now) more elaborate types as well as C++ primitives.
The only problem I see with sizeof(T) is that it cannot tell me about signedness. Otherwise I could hack it from sizeof(T)
and also use the implementation_type member in the other special classes. That way you could change those integer types at any time without causing any breakage in Boost.MPI. Right now I have to manually set these types:
BOOST_MPI_DATATYPE(boost::archive::class_id_optional_type, get_mpi_datatype(int_least16_t()), integer);
Though it's off topic at this point, I've never understood what this macro is for. why do these special types need handling different than any other common c++ primitive types like integers. Wasn't it enough for these types to be convertable to integers and references to integers?
This macro specializes the get_mpi_datatype() function for the type boost::archive::class_id_optional_type and implements it to return the appropriate MPI datatype, which here is the same as that of an int_least16_t. Do you have a way I can convert class_id_optional_type to an integer without knowing what integer type to use? Matthias

At Mon, 26 Jul 2010 12:07:16 -0600, Matthias Troyer wrote:
I see the utility of augmenting STRONG_TYPEDEF but I wonder about it: if you have code using T::base_type and T is not one of the types we're using, it won't have this available - compiler error. Wouldn't it be better to specify the implicit requirements as above and just know that T will be converted to what one wants? And sizeof(T) can be applied to both the (now) more elaborate types as well as C++ primitives.
The only problem I see with sizeof(T) is that it cannot tell me about signedness. Otherwise I could hack it from sizeof(T)
You should get the nested typedef; that is the non-hack solution. That said, I could write you a metafunction to discover signedness if you need it. -- Dave Abrahams BoostPro Computing http://www.boostpro.com
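Such a signedness metafunction is easy to sketch (this is not Boost code, and today std::is_signed does the same job): it exploits the fact that T(-1) < T(0) holds only for signed integral types.

```cpp
// Hand-rolled signedness discovery for an integral type T.
// For unsigned T, T(-1) wraps to the maximum value, so the comparison
// against T(0) is false; for signed T it is true.
template <class T>
struct is_signed_integer {
    static const bool value = T(-1) < T(0);
};

static_assert(is_signed_integer<int>::value, "int is signed");
static_assert(!is_signed_integer<unsigned int>::value, "unsigned is not");
```

A nested typedef remains the more robust route, since this trick only reveals signedness, not the width or exact identity of the underlying type.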

On 27 Jul 2010, at 09:09, David Abrahams wrote:
At Mon, 26 Jul 2010 12:07:16 -0600, Matthias Troyer wrote:
I see the utility of augmenting STRONG_TYPEDEF but I wonder about it: if you have code using T::base_type and T is not one of the types we're using, it won't have this available - compiler error. Wouldn't it be better to specify the implicit requirements as above and just know that T will be converted to what one wants? And sizeof(T) can be applied to both the (now) more elaborate types as well as C++ primitives.
The only problem I see with sizeof(T) is that it cannot tell me about signedness. Otherwise I could hack it from sizeof(T)
You should get the nested typedef; that is the non-hack solution. That said, I could write you a metafunction to discover signedness if you need it.
I can also write a metafunction that will tell me which integral conversion is preferred and from that deduce the type, but that is indeed more prone to compiler bugs or other problems than a nested typedef.

At Sat, 24 Jul 2010 22:12:24 -0800, Robert Ramey wrote:
David Abrahams wrote:
At Fri, 23 Jul 2010 22:30:03 -0800, Robert Ramey wrote:
Matthias Troyer wrote:
On 23 Jul 2010, at 18:11, Robert Ramey wrote:
[snip] Robert Ramey
Let me summarize concisely: no user-defined archive type may use any of the serialize function, pointer handling, or any other aspect of the serialization library since all those are implementation details that you might change at any time.
Naturally I think that's a little harsh.
As far as I can tell there was no criticism, express or implied, in Matthias's statement, so if it sounds harsh perhaps it is because the predicament in which he currently finds himself is harsh.
It's actually demonstrably incorrect - I was just being diplomatic
That may be so, but as far as I could tell at the time it was Matthias' best attempt to interpret what you had been telling him. Therefore, it was how he (mis?)understood his predicament. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

On 26 Jul 2010, at 05:52, David Abrahams wrote:
At Sat, 24 Jul 2010 22:12:24 -0800, Robert Ramey wrote:
David Abrahams wrote:
At Fri, 23 Jul 2010 22:30:03 -0800, Robert Ramey wrote:
Matthias Troyer wrote:
On 23 Jul 2010, at 18:11, Robert Ramey wrote:
[snip] Robert Ramey
Let me summarize concisely: no user-defined archive type may use any of the serialize function, pointer handling, or any other aspect of the serialization library since all those are implementation details that you might change at any time.
Naturally I think that's a little harsh.
As far as I can tell there was no criticism, express or implied, in Matthias's statement, so if it sounds harsh perhaps it is because the predicament in which he currently finds himself is harsh.
It's actually demonstrably incorrect - I was just being diplomatic
That may be so, but as far as I could tell at the time it was Matthias' best attempt to interpret what you had been telling him. Therefore, it was how he (mis?)understood his predicament.
Robert still has not demonstrated how it should be possible, although I think that we are slowly getting to the point where he can define what is part of a stable public interface to make it possible. Matthias

Now, taking a look at the MPI usage of serialization.
I really haven't looked at it enough to understand it, so I may be wrong about this - these are only casual observations.
a) it seems that the "skeleton" idea depends on the size of the data stored in the binary archive being the same as the size of the underlying data type. Up until now that has been true even though there was never any explicit guarantee to that effect. I had to change the behavior in order to extract myself from some other fiasco and this "feature" was no longer true. I think this is where the problem started. It's no one's fault.
Not at all. We don't use binary archives at all.
b) the MPI file sends the class versions over the wire. It doesn't need to do this. If you look at some of the archives there is class_optional_id which is trapped by the archive classes and suppressed both on input and output because that particular archive class doesn't need it. But it's there if someone wants to hook it (like an editing archive). I think MPI might want to do the same thing with version_type.
That's an idea but it does not solve the issue we have.
c) I'm not sure how MPI uses portable binary archive (if at all). Seems like that might be interesting.
It does not, and it would make it VERY inefficient if it did. Please keep in mind that on modern networks memory bandwidth is sometimes comparable to network bandwidth.
d) what is really needed to send data "over the wire" is to be able to suppress tracking at the archive level.
No, we might want to send pointers.
This would permit the same data to be sent over and over and wouldn't presume that the data is constant. So you wouldn't have to create a new archive for each transaction. I've puzzled about how to do this without breaking the archive concept. Turns out it's a little tricky. And there doesn't seem to be much demand for it - but maybe there would be if I did it.
e) this bit of code is what created the issue with the Sun compiler.
The problem comes from this line in boost/mpi/datatype_fwd.hpp:
template<typename T> MPI_Datatype get_mpi_datatype(const T& x = T());
Frankly, it's just plain wrong and should be fixed. You might say that you know it's wrong but it works around this or that template or compiler quirk and it's too hard to fix. I could accept that. But if it's fixable, it should be fixed.
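One possible shape for the fix, sketched with a dummy MPI_Datatype so it stands alone (the real type comes from <mpi.h>, and the actual Boost.MPI code may differ): drop the default argument, which requires T to be default-constructible on compilers that instantiate default arguments eagerly, and provide a separate zero-argument overload so only callers that actually omit the argument need a default constructor.

```cpp
typedef int MPI_Datatype;  // stand-in for the handle type from <mpi.h>

// One-argument form: no default argument, so T need not be
// default-constructible just to call it with an existing value.
template <typename T>
MPI_Datatype get_mpi_datatype(const T&) { return MPI_Datatype(); }

// Zero-argument form: only this overload requires T(), and it is only
// instantiated when someone actually calls it.
template <typename T>
MPI_Datatype get_mpi_datatype() { return get_mpi_datatype(T()); }

// A type with no default constructor can now be passed without trouble:
struct no_default {
    explicit no_default(int) {}
};
```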
I did make the constructors of version_type public. I had made them private to trap errors in code where they were constructed but not initialized. Now errors like this aren't trapped. So I think you should fix this.
So far ALL primitive types had default constructors and this was perfectly legal. Yes, I agree that you never specified what those types were and what their properties were. As I mentioned before, what you say now summarizes as: "use any part of the library at your own risk. If you want to write your own archives, the only safe way is to reimplement Boost.Serialization from scratch."
f) I believe that MPI uses binary_archive_base(?) as a basis. You could have used a higher-level class as a basis. I don't know whether that would have made things easier or harder, but it's worth looking into. The binary_archive is actually very small - only a few hundred lines of code. This could have been cloned and edited. This might or might not have made things more or less intertwined with the other archive classes. This isn't a suggestion - just an observation that it might be worth looking into.
No, again that is not the issue. The issue was the changes in the primitive types that any archive has to support. Matthias

Robert Ramey <ramey <at> rrsd.com> writes:
I made a mistake in release 1.42 and 1.43 which will mean that archives created by these versions won't be readable by subsequent versions of the serialization library. [snip] Until it was discovered that a few type changes had broken the ability to read previous binary_?archives.
OK no problem - just wait until 1.43 comes out. EXCEPT I had neglected to bump the library version # . Doh!!!.
THAT is the real problem.
I've got a solution to fix the problem
Hi Robert, Based upon your messages, it's not clear to me what this "fix" does. Does it make it so that serialization in 1.44 will be able to read binary archives from 1.41 or does it make it so that serialization in 1.44 will be able to read binary archives from 1.42/43? Or somehow both? Is there a trac issue? At my company we were considering upgrading to boost 1.43 but this may cause us to hold off. Thanks, -Ryan

On 7/21/2010 3:38 PM, Ryan Gallagher wrote:
Robert Ramey <ramey <at> rrsd.com> writes:
I made a mistake in release 1.42 and 1.43 which will mean that archives created by these versions won't be readable by subsequent versions of the serialization library. [snip] Until it was discovered that a few type changes had broken the ability to read previous binary_?archives.
OK no problem - just wait until 1.43 comes out. EXCEPT I had neglected to bump the library version # . Doh!!!.
THAT is the real problem.
I've got a solution to fix the problem
Hi Robert,
Based upon your messages, it's not clear to me what this "fix" does. Does it make it so that serialization in 1.44 will be able to read binary archives from 1.41 or does it make it so that serialization in 1.44 will be able to read binary archives from 1.42/43? Or somehow both? Is there a trac issue?
At my company we were considering upgrading to boost 1.43 but this may cause us to hold off. Thanks,
I have the same questions. I also want to know what is the impact of rolling back serialization on release to 1.43. Does this have the potential to break other libraries on release? Robert, I already know your feelings about s11n v1.43 and oil spills, but I have to consider the possibility. -- Eric Niebler BoostPro Computing http://www.boostpro.com

Ryan Gallagher wrote:
Robert Ramey <ramey <at> rrsd.com> writes:
I made a mistake in release 1.42 and 1.43 which will mean that archives created by these versions won't be readable by subsequent versions of the serialization library. [snip] Until it was discovered that a few type changes had broken the ability to read previous binary_?archives.
OK no problem - just wait until 1.43 comes out. EXCEPT I had neglected to bump the library version # . Doh!!!.
THAT is the real problem.
I've got a solution to fix the problem
Hi Robert,
Based upon your messages, it's not clear to me what this "fix" does. Does it make it so that serialization in 1.44 will be able to read binary archives from 1.41 or does it make it so that serialization in 1.44 will be able to read binary archives from 1.42/43? Or somehow both? Is there a trac issue?
I'll recapitulate the history in a little more detail.

a) 1.41 - everything fine.
b) 1.42 - 17 November 2009? Made some fixes to address warning messages. This actually exposed some dependence on undefined behavior, so it was judged a good thing.
c) These changes broke MPI, but this was not noticed at the time. Doug Gregor checked in some changes to the serialization library to address the MPI issue. But this broke compatibility with previous binary_archives.
d) 1.43 released. I spent a fair amount of time reconciling all the above. This does require a change in archives which derive from binary_?archive. I am sorry about that, but I could find no other way to reconcile all the requirements. As far as I could see, the changes should be small and perhaps trivial since MPI doesn't have support previous versions. I don't know, as I haven't looked into it in detail and there are some macros whose function I don't understand.
e) My method of handling breaking changes in serialization formats is to include the archive implementation version number in the archive header so that subsequent code can handle old archives. This was stymied by the fact that in step b) above, I failed to increment this library version # (I was only fixing a few warnings - right?). So I've had to add a program which will fix native binary archives written by versions 1.42 and 1.43. I realise this is a huge pain but I don't see any other way.

Normally I would say don't hold a release for any issue in a particular library. I'm giving myself a break here because the longer I wait, the larger the number of archives that will be created that require the fix. It's like an oil spill - the longer you let it go, the bigger the problem.

I hope that clarifies things.

Robert Ramey

On 21 July 2010 22:00, Robert Ramey <ramey@rrsd.com> wrote:
I failed to increment this library version # -
Would it be a good idea to have a point release with just this fixed? I guess it's such a trivial fix that it wouldn't need as much testing as a normal release and I think it's worth the effort. We should at least add a warning to the 1.43 release notes on the website. If you write it, I'll add it for you.
Normally I would say don't hold a release for any issue in a particular library. I'm giving myself a break here because the longer I wait the larger the number of archives will be created that require the fix. It's like an oil spill - the longer you let it go the bigger the problem.
I agree. This seems like an exceptional situation to me. Daniel

On 7/21/2010 5:00 PM, Robert Ramey wrote:
Ryan Gallagher wrote:
Based upon your messages, it's not clear to me what this "fix" does. Does it make it so that serialization in 1.44 will be able to read binary archives from 1.41 or does it make it so that serialization in 1.44 will be able to read binary archives from 1.42/43? Or somehow both? Is there a trac issue?
I'll recapitulate the history in a little more detail.
a) 1.41 - everything fine b) 1.42 - 17 November 2009? Made some fixes to address warning messages. This actually exposed some dependence on undefined behavior, so it was judged a good thing c) These changes broke MPI, but this was not noticed at the time. Doug Gregor checked in some changes to the serialization library to address the MPI issue. But this broke compatibility with previous binary_archives. d) 1.43 released. I spent a fair amount of time reconciling all the above. This does require a change in archives which derive from binary_?archive. I am sorry about that, but I could find no other way to reconcile all the requirements. As far as I could see, the changes should be small and perhaps trivial since MPI doesn't have support previous versions.
That sentence doesn't parse. I think you're missing a "to" in there. MPI doesn't have "to" support previous versions, presumably because the serialized forms are only traveling across a wire and are never persisted to disk. Is that right? Matthias, is that assumption correct?
I don't know, as I haven't looked into it in detail and there are some macros whose function I don't understand. e) My method of handling breaking changes in serialization formats is to include the archive implementation version number in the archive header so that subsequent code can handle old archives. This was stymied by the fact that in step b) above, I failed to increment this library version # (I was only fixing a few warnings - right?). So I've had to add a program which will fix native binary archives written by versions 1.42 and 1.43. I realise this is a huge pain but I don't see any other way.
Normally I would say don't hold a release for any issue in a particular library. I'm giving myself a break here because the longer I wait the larger the number of archives will be created that require the fix. It's like an oil spill - the longer you let it go the bigger the problem.
I hope that clarifies things.
Not quite there yet. You didn't answer Ryan's questions directly. Let me see if I can:
Based upon your messages, it's not clear to me what this "fix" does.
It makes binary archives distinguishable from their incompatible pre-1.42 variants. It provides a standalone utility to help users fix their broken binary archives from 1.42 and 1.43. It requires users that inherit from a binary archive to manually fix their code. It (does/does not?) help users identify these places in their code.
Does it make it so that serialization in 1.44 will be able to read binary archives from 1.41
Yes(?)
or does it make it so that serialization in 1.44 will be able to read binary archives from 1.42/43?
No.
Or somehow both? Is there a trac issue?
No(?) -- Eric Niebler BoostPro Computing http://www.boostpro.com

Eric Niebler <eric <at> boostpro.com> writes:
On 7/21/2010 5:00 PM, Robert Ramey wrote:
I hope that clarifies things.
Not quite there yet. You didn't answer Ryan's questions directly. Let me see if I can:
Thanks!
Based upon your messages, it's not clear to me what this "fix" does.
It makes binary archives distinguishable from their incompatible pre-1.42 variants. It provides a standalone utility to help users fix their broken binary archives from 1.42 and 1.43.
So this utility adds in the proper library version number to the archive, right?
It requires users that inherit from a binary archive to manually fix their code. It (does/does not?) help users identify these places in their code.
Thanks for the explanation. At least in my case I don't think I'll need to worry about this.
Does it make it so that serialization in 1.44 will be able to read binary archives from 1.41
Yes(?)
Good. =)
or does it make it so that serialization in 1.44 will be able to read binary archives from 1.42/43?
No.
Sure, but if we have run the aforementioned utility to fix the archives from 1.42/43 then 1.44 will be able to read these, right? Alternatively, if we had the required patch applied to 1.43 before writing our archives then 1.44+ would be able to read them. As I am just looking to start using 1.43 I could still patch the source. This patch can be made available, right? Daniel's suggestion of a point release with this patch applied would be great, but a patch is sufficient for me at least.
Or somehow both? Is there a trac issue?
No(?)
Unless one patched 1.43 first. Perhaps we should create a trac issue and put the patch there. It seems that a late-coming "fix" like this should be required to have a trac issue associated with it as well. (I do agree that this is like an oil spill and should be capped asap.) Thanks Eric and Robert! -Ryan

Eric Niebler wrote:
On 7/21/2010 5:00 PM, Robert Ramey wrote:
as far as I could see, the changes should be small and perhaps trivial since MPI doesn't have support previous versions.
That sentence doesn't parse. I think you're missing a "to" in there. MPI doesn't have "to" support previous versions, presumably because the serialized forms are only traveling across a wire and are never persisted to disk. Is that right? Matthias, is that assumption correct?
Correct: MPI doesn't support previous versions because it doesn't have to.
Based upon your messages, it's not clear to me what this "fix" does.
It makes binary archives distinguishable from their incompatible pre-1.42 variants. It provides a standalone utility to help users fix their broken binary archives from 1.42 and 1.43. It requires users that inherit from a binary archive to manually fix their code.
(does/does not?) help users identify these places in their code.
There's no problem in users' code. The problem is that versions 1.42 and 1.43 have the same archive version # in the header as 1.41, even though they have different file formats.
Does it make it so that serialization in 1.44 will be able to read binary archives from 1.41
Yes(?)
This has never been an issue. The problem is reading binary archives created with versions 1.42 and 1.43.
or does it make it so that serialization in 1.44 will be able to read binary archives from 1.42/43?
No.
YES !!!
Or somehow both? Is there a trac issue?
No(?)
no. Robert Ramey

On 7/21/2010 6:32 PM, Robert Ramey wrote:
Eric Niebler wrote:
Based upon your messages, it's not clear to me what this "fix" does.
It makes binary archives distinguishable from their incompatible pre-1.42 variants. It provides a standalone utility to help users fix their broken binary archives from 1.42 and 1.43. It requires users that inherit from a binary archive to manually fix their code. (does/does not?) help users identify these places in their code.
There's no problem in users' code.
You wrote:
I spent a fair amount of time reconciling all the above. This does require a change in archives which derive from binary_?archive. I am sorry about that, but I could find no other way to reconcile all the requirments.
So, does your fix require users to change their archive classes or not? <snip>
or does it make it so that serialization in 1.44 will be able to read binary archives from 1.42/43?
No.
YES !!!
You wrote:
So I've had to add a program which will fix native binary archives writting by versions 1.42 and 1.43. I realise this is a huge pain but I don't see any other way.
If the new serialization version can read binary archives from 1.42/43 without any conversion (YES !!!), what is the purpose of the program you describe above? I'm still confused. -- Eric Niebler BoostPro Computing http://www.boostpro.com

Eric Niebler wrote:
You wrote:
I spent a fair amount of time reconciling all the above. This does require a change in archives which derive from binary_?archive. I am sorry about that, but I could find no other way to reconcile all the requirments.
So, does your fix require users to change their archive classes or not?
or does it make it so that serialization in 1.44 will be able to read binary archives from 1.42/43?
No.
YES !!!
You wrote:
So I've had to add a program which will fix native binary archives writting by versions 1.42 and 1.43. I realise this is a huge pain but I don't see any other way.
If the new serialization version can read binary archives from 1.42/43 without any conversion (YES !!!), what is the purpose of the program you describe above? I'm still confused.
someone wrote:
or does it make it so that serialization in 1.44 will be able to read binary archives from 1.42/43?
No.
I wrote:
YES !!!
To re-phrase: 1.44 will not be able to read native binary archives created with versions 1.42/1.43 because it will confuse them with those written by 1.41. This can be fixed by invoking a program included in the distribution which will assign the correct library version #. Then those files will be readable by version 1.44. Robert Ramey

Robert Ramey wrote:
Eric Niebler wrote:
So, does your fix require users to change their archive classes or not?
I forgot to answer this. Users of the archives included with the library shouldn't have to change a thing. Users who make their own archive classes (like mpi) might have to make a minor change. Robert Ramey

On 7/21/2010 7:46 PM, Robert Ramey wrote:
Robert Ramey wrote:
Eric Niebler wrote:
So, does your fix require users to change their archive classes or not?
I forgot to answer this.
Users of the archives included with the library shouldn't have to change a thing.
Users who make their own archive classes (like mpi) might have to make a minor change.
Thank you. So the answers I posited to Ryan's questions were correct AFAICT. Regarding the minor change that users may have to make to their own archive classes, will the fact that a change needs to be made be flagged at compile time? Is there anything you can do to make faulty archive classes fail to compile? Thanks for your patience and your work on this, -- Eric Niebler BoostPro Computing http://www.boostpro.com

Eric Niebler wrote:
On 7/21/2010 7:46 PM, Robert Ramey wrote:
Robert Ramey wrote:
Eric Niebler wrote:
So, does your fix require users to change their archive classes or not?
I forgot to answer this.
Users of the archives included with the library shouldn't have to change a thing.
Users who make their own archive classes (like mpi) might have to make a minor change.
Thank you. So the answers I posited to Ryan's questions were correct AFAICT.
Regarding the minor change that users may have to make to their own archive classes, will the fact that a change needs to be made be flagged at compile time? Is there anything you can do to make faulty archive classes fail to compile?
The change manifests itself as a failure to compile. Robert Ramey

On 21 Jul 2010, at 14:28, Eric Niebler wrote:
On 7/21/2010 5:00 PM, Robert Ramey wrote:
Ryan Gallagher wrote:
Based upon your messages, it's not clear to me what this "fix" does. Does it make it so that serialization in 1.44 will be able to read binary archives from 1.41 or does it make it so that serialization in 1.44 will be able to read binary archives from 1.42/43? Or somehow both? Is there a trac issue?
I'll recapitulate the history in a little more detail.
a) 1.41 - everything fine b) 1.42 - 17 November 2009? Made some fixes to address warning messages. This actually exposed some dependence on undefined behavior, so it was judged a good thing c) These changes broke MPI, but this was not noticed at the time. Doug Gregor checked in some changes to the serialization library to address the MPI issue. But this broke compatibility with previous binary_archives. d) 1.43 released. I spent a fair amount of time reconciling all the above. This does require a change in archives which derive from binary_?archive. I am sorry about that, but I could find no other way to reconcile all the requirements. As far as I could see, the changes should be small and perhaps trivial since MPI doesn't have support previous versions.
That sentence doesn't parse. I think you're missing a "to" in there. MPI doesn't have "to" support previous versions, presumably because the serialized forms are only traveling across a wire and are never persisted to disk. Is that right? Matthias, is that assumption correct?
Yes

On Wed, Jul 21, 2010 at 01:00:45PM -0800, Robert Ramey wrote:
Ryan Gallagher wrote:
Robert Ramey <ramey <at> rrsd.com> writes:
I made a mistake in release 1.42 and 1.43 which will mean that archives created by these versions won't be readable by subsequent versions of the serialization library.
Is this related to https://svn.boost.org/trac/boost/ticket/3990 ? Thanks, -Steve

On 07/21/2010 11:00 PM, Robert Ramey wrote:
b) 1.42 - 17 November 2009? Made some fixes to address warning messages. This actually exposed some dependence on undefined behavior, so it was judged a good thing
Doesn't serialization have a regression test for that? If not, maybe it's time to make one. Regards Gaetano Mendola

On Mon, Jul 19, 2010 at 9:35 AM, Beman Dawes <bdawes@acm.org> wrote:
It will be Wednesday before I can start pulling the 1.44 beta together...
I'm still swamped with non-boost obligations. It will be Monday at the earliest before I can devote the time to the release that it deserves. --Beman

On 07/19/2010 03:35 PM, Beman Dawes wrote:
It will be Wednesday before I can start pulling the 1.44 beta together. In the meantime, does anyone have any serious issues we need to tackle before the beta?
Ticket 2330 (https://svn.boost.org/trac/boost/ticket/2330) has a proposed patch to solve a showstopper issue where interrupting threads waiting on the same condition does not work. The issue has been there at least since 1.41. Anthony Williams stated that the patch isn't sufficient, but at the same time he didn't specify why. Applying that patch proved to solve the problem I'm experiencing in my application, as well as the submitter's. I would like to have that patch applied. Regards Gaetano Mendola

Gaetano Mendola <mendola@gmail.com> writes:
On 07/19/2010 03:35 PM, Beman Dawes wrote:
It will be Wednesday before I can start pulling the 1.44 beta together. In the meantime, does anyone have any serious issues we need to tackle before the beta?
Ticket 2330 (https://svn.boost.org/trac/boost/ticket/2330) has a proposed patch to solve a showstopper issue where interrupting threads waiting on the same condition does not work. The issue has been there at least since 1.41. Anthony Williams stated that the patch isn't sufficient, but at the same time he didn't specify why. Applying that patch proved to solve the problem I'm experiencing in my application, as well as the submitter's.
The patch substitutes one race condition for other problems. If the patch is applied, one race condition is this:

1. Thread A calls wait()
2. Thread A sets the mutex and condvar in the interruption checker
3. Thread A calls pthread_cond_wait
4. Thread B calls interrupt() on thread A
5. Thread B gets the mutex and condvar for thread A
6. Thread C notifies thread A
7. Thread A wakes from the wait
8. Thread A DESTROYS THE MUTEX AND CONDVAR AS NO LONGER NEEDED
9. Thread B TRIES TO LOCK THE MUTEX AND NOTIFY THE CONDVAR => UNDEFINED BEHAVIOUR

Also there is a potential for deadlock:

1. Thread A locks mutex M
2. Thread A calls wait with a lock on mutex M
3. Thread A sets the mutex and condvar in the interruption checker
4. Thread A calls pthread_cond_wait, which unlocks M
5. Thread B locks mutex M
6. Thread B calls interrupt() on thread A
7. Thread B gets the mutex (which is M) and condvar for thread A
8. Thread B tries to lock the mutex M => deadlock with itself

This deadlock can of course also occur with other threads holding the mutex, if thread B holds a mutex the other thread needs.

I have thought about this problem lots, and decided that you cannot safely interrupt a condition variable wait on POSIX unless you have a mutex tied directly to each condition variable (as with condition_variable_any). I'm therefore not sure how to approach this --- either every boost::condition_variable has an extra pthread_mutex_t inside it, or interruption is limited to waits on boost::condition_variable_any. Anthony -- Author of C++ Concurrency in Action http://www.stdthread.co.uk/book/ just::thread C++0x thread library http://www.stdthread.co.uk Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk 15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK. Company No. 5478976

On 07/26/2010 11:25 AM, Anthony Williams wrote:
I have thought about this problem lots, and decided that you cannot safely interrupt a condition variable wait on POSIX unless you have a mutex tied directly to each condition variable (as with condition_variable_any). I'm therefore not sure how to approach this --- either every boost::condition_variable has an extra pthread_mutex_t inside it, or interruption is limited to waits on boost::condition_variable_any.
I have to look in detail at the boost::thread implementation to help you here. Is pthread_cancel() still a no-go for use in the threads::interrupt implementation? Another source of hang with notify_all (and then with interrupt) is the following usage pattern:

1. Thread A: acquires a lock on mutex Ma
2. Thread A: does a wait on the condition C: C.wait(Ma)
3. Thread B: acquires a lock on mutex Mb
4. Thread B: does a wait on the condition C: C.wait(Mb)

At this point a C.notify_all() doesn't wake both threads; in order to wake both threads, two C.notify_one() calls are needed. Regards Gaetano Mendola

On 26/07/10 14:55, Gaetano Mendola wrote:
On 07/26/2010 11:25 AM, Anthony Williams wrote:
I have thought about this problem lots, and decided that you cannot safely interrupt a condition variable wait on POSIX unless you have a mutex tied directly to each condition variable (as with condition_variable_any). I'm therefore not sure how to approach this --- either every boost::condition_variable has an extra pthread_mutex_t inside it, or interruption is limited to waits on boost::condition_variable_any.
I have to look in detail at the boost::thread implementation to help you here.
Of course.
Is pthread_cancel() still a no-go for use in the threads::interrupt implementation?
pthread_cancel is no-go because it has different semantics and cannot be translated into an exception.
Another source of hang with notify_all (and then with interrupt), is the following usage pattern:
1. Thread A: acquires a lock on mutex Ma
2. Thread A: does a wait on the condition C: C.wait(Ma)
3. Thread B: acquires a lock on mutex Mb
4. Thread B: does a wait on the condition C: C.wait(Mb)
Undefined behaviour --- all concurrent calls to condition_variable::wait must use the same mutex. You can use different mutexes with condition_variable_any, and you can use different mutexes for separate waits, but not for concurrent ones. Anthony -- Author of C++ Concurrency in Action http://www.stdthread.co.uk/book/ just::thread C++0x thread library http://www.stdthread.co.uk Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk 15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK. Company No. 5478976

At Mon, 26 Jul 2010 15:01:40 +0100, Anthony Williams wrote:
Is pthread_cancel() still a no go to be used in threads::interrupt implementation?
pthread_cancel is no-go because it has different semantics and cannot be translated into an exception.
If you mean that its semantics are different because pthread_cancel is intended to be an unstoppable, non-ignorable command, but someone can always do a catch(...){}, I think that's silly (and I've made that clear to the Posix people—I'm not just picking on you). Even in the `C' world there's nothing to stop me from either a) disabling cancellation or b) doing something in a cancellation handler that never terminates (including re-trying whatever operation was cancelled). -- Dave Abrahams BoostPro Computing http://www.boostpro.com

David Abrahams <dave@boostpro.com> writes:
At Mon, 26 Jul 2010 15:01:40 +0100, Anthony Williams wrote:
Is pthread_cancel() still a no go to be used in threads::interrupt implementation?
pthread_cancel is no-go because it has different semantics and cannot be translated into an exception.
If you mean that its semantics are different because pthread_cancel is intended to be an unstoppable, non-ignorable command, but someone can always do a catch(...){}, I think that's silly (and I've made that clear to the Posix people—I'm not just picking on you). Even in the `C' world there's nothing to stop me from either a) disabling cancellation or b) doing something in a cancellation handler that never terminates (including re-trying whatever operation was cancelled).
In principle, I agree. pthread_cancel should be usable as the basis for, and interoperable with boost::thread::interrupt(). However, in practice it is not. I am not aware of any platforms where I can cleanly catch a cancellation from pthread_cancel and translate it into a boost::thread_interrupted exception. If I figure out a way, I'll gladly use it. Anthony -- Author of C++ Concurrency in Action http://www.stdthread.co.uk/book/ just::thread C++0x thread library http://www.stdthread.co.uk Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk 15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK. Company No. 5478976

On 07/26/2010 10:11 PM, Anthony Williams wrote:
However, in practice it is not. I am not aware of any platforms where I can cleanly catch a cancellation from pthread_cancel and translate it into a boost::thread_interrupted exception. If I figure out a way, I'll gladly use it.
Since gcc 4.3 (?) it's possible to do this:

#include <cxxabi.h>

try {
    ...
    ::pthread_mutex_lock(&mutex);
    ::pthread_cond_wait(&condition, &mutex);
} catch ( abi::__forced_unwind& ) {
    std::cout << "thread cancelled" << std::endl;
    throw boost::thread_interrupted();
}

Isn't this what you are looking for? Regards Gaetano Mendola

On 06/08/10 14:55, Gaetano Mendola wrote:
On 07/26/2010 10:11 PM, Anthony Williams wrote:
However, in practice it is not. I am not aware of any platforms where I can cleanly catch a cancellation from pthread_cancel and translate it into a boost::thread_interrupted exception. If I figure out a way, I'll gladly use it.
Since gcc4.3 (?) it's possible to do this:
#include <cxxabi.h>

try {
    ...
    ::pthread_mutex_lock(&mutex);
    ::pthread_cond_wait(&condition, &mutex);
} catch ( abi::__forced_unwind& ) {
    std::cout << "thread cancelled" << std::endl;
    throw boost::thread_interrupted();
}
isn't this what you are looking for ?
On my linux system that aborts the program with "FATAL: exception not rethrown" Anthony -- Author of C++ Concurrency in Action http://www.stdthread.co.uk/book/ just::thread C++0x thread library http://www.stdthread.co.uk Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk 15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK. Company No. 5478976

-----Original Message----- From: boost-bounces@lists.boost.org [mailto:boost-bounces@lists.boost.org] On Behalf Of Anthony Williams Sent: Friday, August 06, 2010 6:16 PM To: Gaetano Mendola Cc: boost@lists.boost.org Subject: Re: [boost] [1.44] Beta progress?
On 06/08/10 14:55, Gaetano Mendola wrote:
On 07/26/2010 10:11 PM, Anthony Williams wrote:
However, in practice it is not. I am not aware of any platforms where I can cleanly catch a cancellation from pthread_cancel and translate it into a boost::thread_interrupted exception. If I figure out a way, I'll gladly use it.
Since gcc4.3 (?) it's possible to do this:
#include <cxxabi.h>

try {
    ...
    ::pthread_mutex_lock(&mutex);
    ::pthread_cond_wait(&condition, &mutex);
} catch ( abi::__forced_unwind& ) {
    std::cout << "thread cancelled" << std::endl;
    throw boost::thread_interrupted();
}
isn't this what you are looking for ?
On my linux system that aborts the program with "FATAL: exception not rethrown"
That is exactly what is supposed to happen. NPTL uses abi::__forced_unwind for cancellation and it has to be rethrown. Regards, Dmitry

On 08/06/2010 04:15 PM, Anthony Williams wrote:
On my linux system that aborts the program with "FATAL: exception not rethrown"
The following code produces:

$ ./a.out
Thread started
interrupted
terminate called after throwing an instance of 'std::runtime_error'
  what():  interrupted
Aborted

#include <cxxabi.h>
#include <cstdlib>
#include <iostream>
#include <pthread.h>
#include <stdexcept>
#include <unistd.h>

static pthread_cond_t condition = PTHREAD_COND_INITIALIZER;
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

static void *foo (void *)
{
    try {
        std::cout << "Thread started" << std::endl;
        ::pthread_mutex_lock(&mutex);
        ::pthread_cond_wait (&condition, &mutex);
    }
    catch (abi::__forced_unwind&) {
        std::cout << "interrupted" << std::endl;
        throw std::runtime_error("interrupted");
    }
    catch (...) {
        std::cout << "..." << std::endl;
        throw;
    }
    return NULL;
}

int main (int argc, char** argv)
{
    pthread_t thread_foo;
    ::pthread_create (&thread_foo, NULL, foo, NULL);
    sleep (1);
    ::pthread_cancel (thread_foo);
    ::pthread_join (thread_foo, NULL);
}

Regards Gaetano Mendola

On 06/08/10 16:45, Gaetano Mendola wrote:
On 08/06/2010 04:15 PM, Anthony Williams wrote:
On my linux system that aborts the program with "FATAL: exception not rethrown"
The following code produces:
$ ./a.out Thread started interrupted terminate called after throwing an instance of 'std::runtime_error' what(): interrupted Aborted
Yes, but if you catch the std::runtime_error then the application *still* aborts.

Anthony
--
Author of C++ Concurrency in Action | http://www.stdthread.co.uk/book/
just::thread C++0x thread library | http://www.stdthread.co.uk
Just Software Solutions Ltd | http://www.justsoftwaresolutions.co.uk
15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK. Company No. 5478976

On 08/06/2010 06:40 PM, Anthony Williams wrote:
On 06/08/10 16:45, Gaetano Mendola wrote:
On 08/06/2010 04:15 PM, Anthony Williams wrote:
On my linux system that aborts the program with "FATAL: exception not rethrown"
The following code produces:
$ ./a.out
Thread started
interrupted
terminate called after throwing an instance of 'std::runtime_error'
  what():  interrupted
Aborted
Yes, but if you catch the std::runtime_error then the application *still* aborts.
Yes, indeed. I tried it myself: even when a std::runtime_error is thrown in place of the original exception, the application still *aborts*.

Regards
Gaetano Mendola

Hi,

2010/7/27 Anthony Williams <anthony.ajw@gmail.com>:
In principle, I agree. pthread_cancel should be usable as the basis for, and interoperable with boost::thread::interrupt().
However, in practice it is not. I am not aware of any platforms where I can cleanly catch a cancellation from pthread_cancel and translate it into a boost::thread_interrupted exception. If I figure out a way, I'll gladly use it.
FYI, Ulrich Drepper (the glibc maintainer) wrote about the pthread_cancel() / C++ exception issue on his blog: http://udrepper.livejournal.com/21541.html

--
Ryo IGARASHI, Ph.D.
rigarash@gmail.com

On 07/26/2010 04:01 PM, Anthony Williams wrote:
Undefined behaviour --- all concurrent calls to condition_variable::wait must use the same mutex. You can use different mutexes with condition_variable_any, and you can use different mutexes for separate waits, but not for concurrent ones.
That pattern was indeed the result of a bug in our application. I was wondering whether it could be spotted with a runtime error (or an assert, for that matter), at least when NDEBUG is not defined: for example, condition_variable::wait could use __sync_bool_compare_and_swap to check whether it is running concurrently with another call that uses a different mutex. I'm actually using that technique to spot when a non-thread-safe class is used concurrently from different threads.

Regards
Gaetano Mendola
participants (15)
- Anthony Williams
- Belcourt, Kenneth
- Beman Dawes
- Daniel James
- David Abrahams
- Dmitry Goncharov
- Eric Niebler
- Gaetano Mendola
- John Maddock
- Matthias Troyer
- Robert Ramey
- Ryan Gallagher
- Ryo IGARASHI
- Steve M. Robbins
- Stewart, Robert