[MPI] Review comments

This isn't a full review - I've just read the documentation and perused the code. It's more like random observations. A lot of this is very particular to the way that the MPI library uses/builds upon serialization, so it may not be of interest to many. I've spent some more time reviewing the serialization currently checked into the head. It's quite hard to follow. Among other things, the MPI library includes the following:

a) an optimization for serialization of std::vector, std::valarray and native C++ arrays in binary archives.

b) A new type of archive (which should be called mpi_?archive) which serializes C++ structures in terms of MPI datatypes. This would complement the archive types that are already included in the package:

i) text - renders C++ structures as a long string of characters - the simplest portable method.
ii) binary - renders C++ structures as native binary data. The fastest method - but non-portable.
iii) xml - renders C++ structures as xml elements - a special case of i) above.

So we would end up with an mpi_archive and optionally mpi_primitive. In the other archives, I separated ?_primitive so this could be shared by both text and xml. In your case it isn't necessary to make an mpi_primitive class - though it might be helpful, and it would certainly be convenient to leverage the established pattern to ease understanding for the casual reader.

c) the "skeleton" idea - which I still haven't totally figured out yet. I believe I would characterize this as an "archive adaptor" which changes the behavior of any archive class to which it is applied. In this way it is similar to the "polymorphic_?archive".

In my view these enhancements are each independent of one another. This is not reflected in the current implementation. I would suggest the following:

a) enhancements to the binary archive be handled as such. We're only talking about specializations for three templates - std::vector, std::valarray and native C++ arrays.
I know these same three are also handled specially for mpi_?archives, but it's still a mistake to combine them. In binary_?archive they are handled one way (load_binary) while in mpi_archive they are handled another way (load_array). I still think this would be best implemented as "enhanced_binary_?archive".

b) mpi_?archive should derive directly from common_?archive like basic_binary_?archive does. The reason I have basic_... is that for xml and text there are separate wide character versions, so I wanted to factor out the commonality. In your case, I don't think that's necessary, so I would expect your hierarchy would look like

class mpi_archive : public common_archive, public interface_archive ...

I doubt it even has to be a template. It would:

1) render the native archive types (class_id, etc.) as small integers - like the binary archive currently does.
2) render C++ primitives (and std::string) as corresponding MPI datatypes.
3) handle the special implementations for C++ native arrays, std::vector and std::valarray.

Note that you've used packed_archive - I would use mpi_archive instead. I think this is a better description of what it is. Really it's only a name change - and "packed archive" is already inside an mpi namespace so it's not a huge issue. BUT I'm wondering if the idea of rendering C++ data structures as MPI primitives should be more orthogonal to the MPI protocol itself. That is, might it not be sometimes convenient to save such serializations to disk? Wouldn't this provide a portable binary format for free? (Lots of people have asked for this but no one has been sufficiently interested to actually invest the required effort.)

4) Shouldn't there be a logical place for other archive types for message passing - how about XDR? I would think it would be a close cousin to MPI archives.

c) The skeleton idea would be

template<class BaseArchive> class skeleton_archive ....???

(I concede I haven't studied this enough.)
This would be coded as an "archive adaptor" (as is the polymorphic archive), as described in a discussion thread some months ago. The concept of the "skeleton" seems very interesting but really orthogonal to any particular type of archive. Perhaps the skeleton idea would be useful for other types of data renderings. By making it an archive adaptor, its facility could be added to any existing archive. Even if not useful anywhere else, it would help comprehensibility and testability to factor it out in this way.

So rather than, or in addition to, an MPI library you would end up with three logically distinct things. Each one can stand on its own. The only "repeated" or shared code might be that which determines when either a binary or mpi optimization can be applied. It's not clear to me whether this criterion applies to both kinds of archives or each one has its own separate criterion. If it's the latter - there's no shared code and we're done. If it's the former, then a separate free-standing concept has to be invented. In the past I've called this "binary serializable" and more lately "magic" (a concession to physicists' fondness for whimsical names).

So depending on this last, the serialization part of the MPI library falls into 3 or 4 independent pieces. If the code were shuffled around to reflect this, it would be much easier to use, test, verify, enhance and understand. Also the skeleton concept might then be applicable to other types of archives. Also the "magic" concept really is a feature of the type and is really part of the ad hoc C++ type reflection which is what serialization traits are.

So, that's my assessment.

Robert Ramey

Matthias is far more qualified than I to provide adequate responses to your questions, but this much I can answer...
Note that you've used packed_archive - I would use mpi_archive instead. I think this is a better description of what it is. Really it's only a name change - and "packed archive" is already inside an mpi namespace so it's not a huge issue. BUT I'm wondering if the idea of rendering C++ data structures as MPI primitives should be more orthogonal to the MPI protocol itself. That is, might it not be sometimes convenient to save such serializations to disk? Wouldn't this provide a portable binary format for free?
Unfortunately, the MPI packed archives do not give us a portable binary format, because there is no published MPI protocol. The MPI packed archives use the MPI calls MPI_Pack and MPI_Unpack to pack a buffer. The only guarantee these calls give is that if you MPI_Pack something, you can transmit it via MPI and then MPI_Unpack it later. The protocol used varies from one MPI implementation to another, and could conceivably vary from one invocation to another. For instance, in a homogeneous environment, MPI_Pack and MPI_Unpack could be implemented as a memcpy(); in a heterogeneous environment, they might use some XDR representation. A really smart MPI might make the decision at run-time, after determining what kind of environment we're running in. So we can't really separate the MPI archives from MPI. They're really very, very specialized. Doug

Hi Robert On Sep 17, 2006, at 7:56 PM, Robert Ramey wrote:
This isn't a full review - I've just read the documentation and perused the code. It's more like random observations. A lot of this is very particular to the way that the MPI library uses/builds upon serialization, so it may not be of interest to many.
I am a bit perplexed by your mail, since it is an identical copy of a private e-mail you sent me two weeks ago, even before the review started. I had replied to your e-mail but never heard back from you. I am thus confused about the purpose of your message, but assume that you want me to reply in public now. I will essentially follow what I wrote in my reply two weeks ago, modified a bit so that list readers who have not followed our private e-mail exchange can follow.
I've spent some more time reviewing the serialization currently checked into the head. It's quite hard to follow. Among other things, the MPI library includes the following:
a) an optimization for serialization of std::vector, std::valarray and native C++ arrays in binary archives.
I assume you mean the array wrapper here? If yes, then this is not restricted to the above types but is also needed for other array-like data structures. Furthermore, this is not part of the MPI library but a general extension to the serialization library that we implemented there last spring, with your consent. The comments of several reviewers, who were initially skeptical about our use of the serialization library in a high performance context but whose concerns vanished when they saw the array optimizations, should show you that it was not only me who needed these optimizations.
b) A new type of archive (which should be called mpi_?archive) which serializes C++ structures in terms of MPI datatypes. This would complement the archive types that are already included in the package.
Do you mean the mpi::packed_archive? We prefer to call it mpi::packed_archive, since it is actually slow and not the preferred way of sending via MPI. It does not make use of MPI data types, and is in fact - as you saw yourself - a variant of the binary_archive, with the main distinction that save_binary cannot be used for fast serialization of array wrappers - a different function is needed, hence the abstraction using save_array.
i) text - renders C++ structures in terms of a long string of characters - the simplest portable method. ii) binary - renders C++ structures as native binary data. The fastest method - but non portable. iii) renders ... as xml elements - a special case of i) above.
So we would end up with an mpi_archive and optionally mpi_primitive. In the other archives, I separated ?_primitive so this could be shared by both text and xml. In your case it isn't necessary to make an mpi_primitive class - though it might be helpful and it would certainly be convenient to leverage on the established pattern to ease understanding for the casual reader.
Actually we already have an MPI primitive class, and the packed archive is - as you saw - just a binary archive with special primitives and a special way of dealing with arrays. If we keep a save_array instead of just using save_binary in the binary archives, then indeed just by specifying the MPI primitives we create an MPI archive - I liked that design of yours once I grasped how you split off the primitives. Note, however, as Doug Gregor pointed out, that the mpi_primitives are not very useful outside an MPI message passing context.
c) the "skeleton" idea - which I still haven't totally figured out yet. I believe I would characterize this as an "archive adaptor" which changes the behavior of any the archive class to which it is applied. In this way it is similar to the "polymorphic_?archive" .
Indeed these are similar wrappers
In my view these enhancements are each independent of one another. This is not reflected in the current implementation.
Have you looked at the code? Actually the skeleton "archive adaptors" do not depend at all on the rest of the MPI library and could easily be factored out in case they are useful in another context. For now, since they are an implementation detail, never occur in the public API, and we do not see another use for them at the moment, we have left them in the detail namespace.
I would suggest the following:
a) enhancements to the binary archive be handled as such. We're only talking about specializations for three templates - std::vector, std::valarray and native C++ arrays. I know these same three are also handled specially for mpi_?archives, but it's still a mistake to combine them. In binary_?archive they are handled one way (load_binary) while in mpi_archive they are handled another way (load_array). I still think this would be best implemented as "enhanced_binary_?archive".
Watch out: there are more such types - multi_array, ublas and MTL vectors and matrices, ... With the array wrapper we have an elegant solution to handle these other types as well. Since we have discussed this topic many times on the list over the past year I will not comment further for now. If you do not like the way we have implemented the array optimizations in the binary archive then we can just roll back the CVS state to the version at the end of May, where we had implemented a separate array-optimized binary archive and none of the MPI archives required changes to any of your archives.
b) mpi_?archive should derive directly from common_?archive like basic_binary_?archive does. The reason I have basic_... is that for xml and text there are separate wide character versions so I wanted to factor out the commonality. In your case, I don't think that's necessary so I would expect your hierarchy would look like class mpi_archive : public common_archive, public interface_archive ...
Do you mean the packed archive? This is actually a binary archive - do you really mean that we should reimplement the functionality of the binary archive rather than reuse what is there?
Note that you've used packed_archive - I would use mpi_archive instead. I think this is a better description of what it is.
I still prefer mpi::packed_archive, since there can also be other MPI archives. One possible addition to speed up things on homogeneous machines might be just an mpi::binary_archive, using a binary buffer.
Really it's only a name change - and "packed archive" is already inside an mpi namespace so it's not a huge issue. BUT I'm wondering if the idea of rendering C++ data structures as MPI primitives should be more orthogonal to the MPI protocol itself. That is, might it not be sometimes convenient to save such serializations to disk? Wouldn't this provide a portable binary format for free? (Lots of people have asked for this but no one has been sufficiently interested to actually invest the required effort.)
As Doug Gregor pointed out this is not possible since the format is implementation-defined, and can change from one execution to another.
4) Shouldn't there be a logical place for other archive types for message passing - how about XDR? I would think it would be a close cousin to MPI archives.
XDR might be used by an implementation or not - these are implementation details and a good MPI implementation is supposed to pick the best format.
c) The skeleton idea would be template<class BaseArchive> class skeleton_archive ....??? (I concede I haven't studied this enough).
Indeed, the skeleton archives could be factored out if anybody sees another use for them. This is an orthogonal piece of code, and we should discuss where it can be useful. One possible application is to visualize data structures without caring about the content, but only about types and pointers. But I don't know if anyone needs this or whether there is another use for these pieces of code. If there is, then we can factor it out of the mpi detail namespace and put it into archive with no essential changes to the code.
The only "repeated" or shared code might be that which determines when either a binary or mpi optimization can be applied. It's not clear to me whether this criteria applies to both kinds of archives ore each one has its own separate criteria. If it's the latter - there's no shared code and we're done. If it's the former, the a separate free standing concept has to be invented. In the past I've called this "binary serializable" and more lately "magic". ( a concession to physicist's fondness for whimsical names).
The set of types for which an array optimization can be done is different for binary, MPI, XDR, ... archives, but a common dispatch mechanism is possible, which is what we have implemented in the array::[io]archive classes. Your "magic" idea (which you have not described to the list yet since it was only in private e-mails) can easily be incorporated into that. Just replace

typedef is_fundamental<mpl::_1> use_array_optimization;

by

typedef is_bitwise_serializable<mpl::_1> use_array_optimization;

or

typedef is_magic<mpl::_1> use_array_optimization;

and you have upgraded to your magic optimization!
So rather than, or in addition to, an MPI library you would end up with three logically distinct things. Each one can stand on its own.
So depending on this last, the serialization part of the MPI library falls into 3 or 4 independent pieces. If the code were shuffled around to reflect this, it would be much easier to use, test, verify, enhance and understand. Also the skeleton concept might then be applicable to other types of archives. Also the "magic" concept really is a feature of the type and is really part of the ad hoc C++ type reflection which is what serialization traits are.
If by three or four logically distinct things you mean

1. the array optimization
2. the skeleton&content archive wrappers
3. the MPI archives
4. the MPI library

then my comments are:

1. is already factored out and in the serialization library. If anything should be done to it, there was the desire to extend array wrappers to strided arrays, which can easily be done without touching anything in the serialization library.

2. is independent of the rest of the proposed Boost.MPI library, but we keep it in detail since we do not see any other use for it at the moment. Once someone could use it we can move it immediately to the serialization library.

3. and 4. are tightly coupled, since the MPI archives do not make any sense outside the Boost.MPI context, and I do not see that splitting this into two separate libraries makes any sense at all. The code itself is written cleanly though, with no part of the MPI archive types depending on any of the communication functions.

Thus I see absolutely no reason at all to shuffle the code around any more, unless you can come up with a reason to move the implementation details of skeleton&content to a public place in the serialization library.

Matthias

Hi, Matthias Troyer wrote:
On Sep 17, 2006, at 7:56 PM, Robert Ramey wrote: [...]
Note that you've used packed_archive - I would use mpi_archive instead. I think this is a better description of what it is.
I still prefer mpi::packed_archive, since there can also be other MPI archives. One possible addition to speed up things on homogeneous machines might be just an mpi::binary_archive, using a binary buffer.
Yes, this is a realistic idea; almost all MPI programs are run on homogeneous clusters anyway. Even in a heterogeneous environment there remains the question of whether one can do better than MPI_Pack/Unpack by using some kind of 'portable' archive (although 'transportable' might be a better word). In principle the answer is definitely yes, since the conversion functions can be inlined. [...]
Really it's only a name change - and "packed archive" is already inside an mpi namespace so it's not a huge issue. BUT I'm wondering if the idea of rendering C++ data structures as MPI primitives should be more orthogonal to the MPI protocol itself. That is, might it not be sometimes convenient to save such serializations to disk? Wouldn't this provide a portable binary format for free? (Lots of people have asked for this but no one has been sufficiently interested to actually invest the required effort.)
As Doug Gregor pointed out this is not possible since the format is implementation-defined, and can change from one execution to another.
This is only true for MPI-1.1. MPI-2 supports multiple data representations and adds the functions MPI_Pack_external and MPI_Unpack_external to convert to/from the "external32" format defined in section 9.5.2 of the MPI-2 standard. The intent of this is to be able to transfer data between MPI implementations. Also, as part of the file I/O interface, MPI-2 also allows user-defined representations so in principle it would be possible to make some kind of adaptor to read a different archive format via the MPI-2 file I/O. Not that MPI file I/O seems to be used much anyway... Cheers, Ian

Hi Ian, On Sep 18, 2006, at 6:35 PM, Ian McCulloch wrote:
Really it's only a name change - and "packed archive" is already inside an mpi namespace so it's not a huge issue. BUT I'm wondering if the idea of rendering C++ data structures as MPI primitives should be more orthogonal to the MPI protocol itself. That is, might it not be sometimes convenient to save such serializations to disk? Wouldn't this provide a portable binary format for free? (Lots of people have asked for this but no one has been sufficiently interested to actually invest the required effort.)
As Doug Gregor pointed out this is not possible since the format is implementation-defined, and can change from one execution to another.
This is only true for MPI-1.1. MPI-2 supports multiple data representations and adds the functions MPI_Pack_external and MPI_Unpack_external to convert to/from the "external32" format defined in section 9.5.2 of the MPI-2 standard. The intent of this is to be able to transfer data between MPI implementations. Also, as part of the file I/O interface, MPI-2 also allows user-defined representations so in principle it would be possible to make some kind of adaptor to read a different archive format via the MPI-2 file I/O. Not that MPI file I/O seems to be used much anyway...
The current library is based on MPI-1.1 only, which is reasonably stable. A future extension of the library to MPI-2 could indeed include the use of MPI-I/O to write portable binary archives. Matthias

Matthias Troyer wrote:
I am a bit perplexed by your mail, since it is an identical copy of a private e-mail you sent me two weeks ago, even before the review started.
I realize this - it's just that I thought someone else might have some other observations to add on the subject.
The comments of several reviewers, who were initially skeptical about our use of the serialization library in a high performance context but whose concerns vanished when they saw the array optimizations, should show you that it was not only me who needed these optimizations.
I don't object to the array optimizations per se; I'm interested in seeing if there's a way to do this that doesn't hard-code coupling between particular pairs of archives and datatypes into the original archive classes. Actually this question applies to the modifications in binary_?archive, so it's a little off topic - but still related.
Watch out that there are more such types: multi_array, ublas and MTL vectors and matrices, ... With the array wrapper we have an elegant solution to handle also these other types. Since we have discussed this topic many times on the list over the past year I will not comment further for now.
I think this is the part I'm still not seeing. The changes to binary_?archive include specializations for std::valarray, std::vector and native C++ arrays. This pattern suggests that for these other data types for which an optimization might exist, more and more will have to be added to the binary archive. And all programs will have to include them even if they don't use them. When I originally suggested the idea of an array wrapper (admittedly not thought out in detail) I envisioned that array.hpp would have the "default" serialization - lowest common denominator - which is there, so far so good. Then for say binary_?archive I expect to see:

On Sep 18, 2006, at 7:30 PM, Robert Ramey wrote:
Matthias Troyer wrote:
I am a bit perplexed by your mail, since it is an identical copy of a private e-mail you sent me two weeks ago, even before the review started.
I realize this - it's just that I thought someone else might have some other observations to add on the subject.
OK, thanks for explaining.
The comments of several reviewers, who were initially skeptical about our use of the serialization library in a high performance context but whose concerns vanished when they saw the array optimizations, should show you that it was not only me who needed these optimizations.
I don't object to the array optimizations per se; I'm interested in seeing if there's a way to do this that doesn't hard-code coupling between particular pairs of archives and datatypes into the original archive classes. Actually this question applies to the modifications in binary_?archive, so it's a little off topic - but still related.
Watch out: there are more such types - multi_array, ublas and MTL vectors and matrices, ... With the array wrapper we have an elegant solution to handle these other types as well. Since we have discussed this topic many times on the list over the past year I will not comment further for now.
I think this is the part I'm still not seeing. The changes to binary_?archive include specializations for std::valarray, std::vector and native C++ arrays.
Can you please show me these specializations? I do not see any, except for a std::vector overload in the base class, which is needed for special reasons and only for this class. valarray and C arrays, as well as other classes like ublas arrays and matrices and multi_array, can be serialized just using the array wrapper and will not need any modification to the archives. Matthias

"Robert Ramey" <ramey@rrsd.com> writes:
(admittedly not thought out in detail) I envisioned that array.hpp would have the "default" serialization - lowest common denominator - which is there, so far so good. Then for say binary_?archive I expect to see:
<snip entire quoted message> -- Dave Abrahams Boost Consulting www.boost-consulting.com

That was a mistake - I accidentally hit the RETURN key rather than the Shift key and my Outlook Express just sent the email. Sooooo Sorry. Robert Ramey David Abrahams wrote:
"Robert Ramey" <ramey@rrsd.com> writes:
(admittedly not thought out in detail) I envisioned that array.hpp would have the "default" serialization - lowest common denominator - which is there, so far so good. Then for say binary_?archive I expect to see:
<snip entire quoted message>

"Robert Ramey" <ramey@rrsd.com> writes:
That was a mistake - I accidentally hit the RETURN key rather than the Shift key and my Outlook Express just sent the email. Sooooo Sorry.
NP. We had 3 or 4 others of those in the past few days, so I felt it necessary to remind people. Thanks, -- Dave Abrahams Boost Consulting www.boost-consulting.com

Matthias Troyer wrote:
I am a bit perplexed by your mail, since it is an identical copy of a private e-mail you sent me two weeks ago, even before the review started.
I realize this - it's just that I thought that someone else might have some other observations to add on the subject.
The comments of several reviewers, who were initially skeptical about our use of the serialization library in a high performance context but whose concerns vanished when they saw the array optimizations, should show you that it was not only me who needs these optimizations.
I don't object to the array optimizations per se; I'm interested in seeing if there's a way to do this that doesn't hard-code coupling between particular pairs of archives and datatypes into the original archive classes. Actually this question applies to the modifications in binary_?archive so it's a little off topic - but still related.
Watch out that there are more such types: multi_array, ublas and MTL vectors and matrices, ... With the array wrapper we have an elegant solution that also handles these other types. Since we have discussed this topic many times on the list over the past year I will not comment further for now.
I think this is the part I'm still not seeing. The changes to binary_?archive include specializations for std::valarray, std::vector and native C++ arrays. This pattern suggests that for these other data types for which an optimization might exist, more and more will have to be added to the binary archive. And all programs will have to include them even if they don't use them.

When I originally suggested the idea of an array wrapper (admittedly not thought out in detail) I envisioned that array.hpp would have the "default" serialization - the lowest common denominator - which is there, so far so good. So then in say boost/serialization/vector.hpp I expect to see a specialization for array like

template<class T>
void array<std::vector<T> >::serialize(binary_iarchive &ar, const unsigned int) const {
    // special stuff for loading data into binary vectors
}

so now programs only have to compile and be aware of the specializations that they are in fact going to use. And each optimization can be compiled, tested, etc. independently. This adds the same three "special cases", just in a different place - so the total work is the same. The only remaining problem is to figure a way to do this through the base class so the optimization gets transmitted to any derivations. I'm not sure this is a huge deal. So far we only have two archives which can exploit these optimizations (binary and now mpi). If it becomes really bothersome, these specializations can forward to a common implementation. With some effort, it might be possible to avoid even this minimal effort. This would entail making the above somewhat more elaborate:

class array<std::vector<T> > {
    template<class Archive> void binary_serialize(...){...}
    template<class Archive> void mpi_serialize(...){...}
    template<class Archive>
    void serialize(Archive &ar, const unsigned int version) const {
        // if Archive is derived from base_binary
        //     binary_serialize(ar, version);
        // else if Archive is derived from base_mpi_archive
        //     mpi_serialize(...)
        // else
        //     array_default<T>::serialize(*this, version)
    }
};

So - still one gets exactly what you want without forcing all users to include and compile specializations/optimizations for all the types you want to add in the future. Note that there could be one set of array wrappers for binary-serializable optimizations and a different set for mpi-optimizable wrappers. This is the motivation for using the array wrapper - to permit the specializations for different types to be orthogonal to the archives and types that benefit from special treatment. I think it's really just the same code you already have - it's just shuffled around so that it doesn't have to be included when you don't need it.
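Robert's sketch of per-archive specializations living alongside the type can be made concrete. The following is a minimal, self-contained mock in plain C++ - no Boost; the names binary_oarchive, text_oarchive and array_wrapper here are stand-ins, not the real library classes. The generic overload renders element by element, while the overload for the binary archive saves the whole block at once:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Mock archives standing in for binary_oarchive and text_oarchive.
struct binary_oarchive { std::vector<unsigned char> buffer; };
struct text_oarchive   { std::string text; };

// Mock array wrapper: a pointer/size pair over contiguous storage.
template<class T>
struct array_wrapper {
    const T* data;
    std::size_t count;
};

// Default ("lowest common denominator") serialization: element by element.
template<class Archive, class T>
void serialize(Archive& ar, const array_wrapper<T>& a) {
    for (std::size_t i = 0; i < a.count; ++i)
        ar.text += std::to_string(a.data[i]) + " ";
}

// Per-archive overload living next to the type, not inside the archive:
// for the binary archive, dump the whole block at once.
template<class T>
void serialize(binary_oarchive& ar, const array_wrapper<T>& a) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(a.data);
    ar.buffer.insert(ar.buffer.end(), p, p + a.count * sizeof(T));
}

std::size_t binary_bytes_written(const std::vector<int>& v) {
    binary_oarchive ar;
    serialize(ar, array_wrapper<int>{v.data(), v.size()});
    return ar.buffer.size();
}

std::string text_rendering(const std::vector<int>& v) {
    text_oarchive ar;
    serialize(ar, array_wrapper<int>{v.data(), v.size()});
    return ar.text;
}
```

The point of this layout is that the binary overload could live next to the vector serialization header, so only programs that actually serialize vectors to binary archives ever compile it.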
As Doug Gregor pointed out this is not possible since the format is implementation-defined, and can change from one execution to another.
OK - I just assumed (wrongly apparently) that an MPI protocol presumed a heterogeneous environment.
The set of types for which an array optimization can be done is different for binary, MPI, XDR, ... archives, but a common dispatch mechanism is possible, which is what we have implemented in the array::[io]archive classes.
And I think that is what I have a problem with. The "common" dispatch as I see it implemented presumes the known optimizable types. When other optimizable types are added, this will have to grow. It seems to me that it is fundamentally not scalable. So personally, I would prefer to add the code to the derived types - but I understand this is my preference. Your "magic" idea (which you have not
described to the list yet since it was only in private e-mails) can easily be incorporated into that. Just replace
typedef is_fundamental<mpl::_1> use_array_optimization;
by
typedef is_bitwise_serializable<mpl::_1> use_array_optimization;
or
typedef is_magic<mpl::_1> use_array_optimization;
and you have upgraded to your magic optimization!
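The substitution Matthias describes can be sketched in modern C++ without MPL - a hypothetical mock, whereas the real archives use an MPL lambda such as is_fundamental<mpl::_1>. Each archive publishes a predicate over value_types, and swapping that predicate is the entire "upgrade":

```cpp
#include <cassert>
#include <type_traits>

// Each mock archive publishes a predicate naming the value_types whose
// contiguous storage it may serialize as one block. Swapping the predicate
// is the entire "upgrade" - nothing else in the archive changes.
struct binary_oarchive {
    template<class T>
    using use_array_optimization = std::is_fundamental<T>;
};

struct enhanced_binary_oarchive {
    template<class T>
    using use_array_optimization = std::is_trivially_copyable<T>;  // the "magic"
};

template<class Archive, class T>
constexpr bool optimizable() {
    return Archive::template use_array_optimization<T>::value;
}

struct point { double x, y; };  // trivially copyable but not fundamental
```

With is_fundamental only built-in types take the fast path; with is_trivially_copyable a plain struct such as point qualifies too, without touching the dispatch machinery.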
I would prefer to find this code as part of the array_wrapper for std::vector rather than as part of the archive class.
If by three or four logically distinct things you mean
1. the array optimization
2. the skeleton&content archive wrappers
3. the MPI archives
4. the MPI library
then my comments are:
1. is already factored out and in the serialization library. If anything should be done to it, there was the desire to extend array wrappers to strided arrays, which can easily be done without touching anything in the serialization library.
Hmmm - what about MTL or ublas - don't these have their own special types for collections? I know boost::multi_array does. Wouldn't these have to be added to the std::valarray and std::vector already in the binary archive?
2. is independent of the rest of the proposed Boost.MPI library but we keep it in detail since we do not see any other use for this at the moment. Once someone could use it we can move it immediately to the serialization library.
OK - the information on the skeleton was a little - uhh - skeletal. I really didn't understand how it's implemented. The relationship - if any - to boost serialization isn't clear from the documentation. I suspect that this will be resolved by amplification in the documentation.
3. and 4. are tightly coupled since the MPI archives do not make any sense outside the Boost.MPI context and I do not see that splitting this into two separate libraries makes any sense at all. The code itself is written cleanly though, with no part of the MPI archive types depending on any of the communication functions.
This may be true - it wasn't obvious to me. By MPI archives I meant your packed_archive, and that seemed to me a thin wrapper around base_binary_archive - which is fine with me. So I suspect my complaint is that the documentation seems to suggest that it's something more elaborate than that.
Thus I see absolutely no reason at all to shuffle the code around anymore, unless you can come up with a reason to move the implementation details of skeleton&content to a public place in the serialization library.
I am intrigued by the skeleton - again, the documentation doesn't really give a good idea of what it does and what else it might be used for. So my complaints really come down to two issues. a) I'm still not convinced you've factored optimizations which can be applied to certain pairs of types and archives in the best way. b) The MPI documentation doesn't make very clear the organization of the disparate pieces. It's a user manual "cookbook", which is fine as far as it goes. But I think it's going to need more explanation of the design itself. As an aside, I'm amazed you haven't gotten any flak for not including more formal documentation - including concepts for the template parameters. Robert Ramey

On Sep 18, 2006, at 8:50 PM, Robert Ramey wrote:
The set of types for which an array optimization can be done is different for binary, MPI, XDR, ... archives, but a common dispatch mechanism is possible, which is what we have implemented in the array::[io]archive classes.
And I think that is what I have a problem with. The "common" dispatch as I see it implemented presumes the known optimizable types. When other optimizable types are added, this will have to grow. It seems to me that it is fundamentally not scalable. So personally, I would prefer to add the code to the derived types - but I understand this is my preference.
No, the "optimizable types" are not the types (like std::vector, std::valarray) for which an array optimization exists, but rather the value_types of the array for which the storage can be optimized. This set depends only on the archive itself, and not on the types, and each archive can have its own lambda expression to determine whether the value_type is optimizable. Adding optimized serialization to e.g. multi_array will only mean that multi_array should use the array wrapper to serialize its data instead of writing a loop over all elements. This simplifies the serialization implementation for this class, and automatically provides optimized serialization for all the types, without any change in the serialization library, nor any change in an archive. This is perfectly scalable in contrast to your idea of having each archive class re-implement the serialization of all optimizable containers. I am a bit confused about your arguments above since it was actually you who suggested the array wrapper as the least intrusive and scalable solution
I would prefer to find this code as part of the array_wrapper for std::vector rather than as part of the archive class.
again, there is no array_wrapper for std::vector, rather the std::vector<T> serialization serializes its data through an array<T> wrapper, as you had proposed.
This would entail making the above somewhat more elaborate
class array<std::vector> {
    template<class Archive> void binary_serialize(...){...}
    template<class Archive> void mpi_serialize(...){...}
    template<class Archive>
    void serialize(Archive &ar, const unsigned int version) const {
        // if Archive is derived from base_binary
        //     binary_serialize(ar, version);
        // else if Archive is derived from base_mpi_archive
        //     mpi_serialize(...)
        // else
        //     array_default<T>::serialize(*this, version)
    }
}
Ouch!!! This is just what I mean by not scalable. We already have five cases now (plain, binary, packed MPI, MPI datatype, skeleton) with two more coming soon (XDR, HDF5). Do you really want each author of a serialization function for an array-like data structure to reimplement an optimization for all these archives?
If by three or four logically distinct things you mean
1. the array optimization
2. the skeleton&content archive wrappers
3. the MPI archives
4. the MPI library
then my comments are:
1. is already factored out and in the serialization library. If anything should be done to it, there was the desire to extend array wrappers to strided arrays, which can easily be done without touching anything in the serialization library.
Hmmm - what about MTL or ublas - don't these have their own special types for collections? I know boost::multi_array does. Wouldn't these have to be added to the std::valarray and std::vector already in the binary archive?
I skipped most of the above because it seems there is a fundamental misunderstanding regarding the role of the array wrapper. The array wrapper, which you had suggested yourself, was introduced to completely decouple array optimizations from specific datatypes. When implementing MTL, ublas, Blitz or other serialization, one just uses an array wrapper to serialize contiguous arrays. An archive can then use either the element-wise default serialization of the array wrapper, or decide to overload it and implement an optimized way - independent of which class the array wrapper came from. Thus, there is no std::vector, std::valarray, ... overload in any of the archives - not in the binary archive nor anywhere else. What you seem to propose, both above and in the longer text I cut, is to instead re-implement the optimized serialization for all these N classes in the M different archive types that can use it (we have M=4 now with the binary, packed MPI, MPI datatype, and skeleton archives, and soon we'll do M+=2 by adding XDR and HDF5 archives). Besides leading to an M*N problem, which the array wrapper was designed to solve, this leads to intrusion problems into all classes that need to be serialized (including multi_array and all others), which is not feasible as we discussed last year.
I am intrigued by the skeleton - again, the documentation doesn't really give a good idea of what it does and what else it might be used for.
The skeleton is just all the types that you treat in the archive classes and not in the primitives, while the content is all that you treat in the primitives. It is just a formalization of your serialization library's implementation details.
So my complaints really come down to two issues.
a) I'm still not convinced you've factored optimizations which can be applied to certain pairs of types and archives in the best way.
That's a separate discussion which we seem to be repeating every few months now. It seems to me from today's discussion that there is a confusion now about the use of the array wrapper, which we use in just the way you originally proposed.
b) The MPI documentation doesn't make very clear the organization of the disparate pieces. It's a user manual "cookbook", which is fine as far as it goes. But I think it's going to need more explanation of the design itself.
Most of the issues you are interested in, such as the use of serialization for the skeleton&content, are implementation details, the important points of which will be explained in a paper that is currently being written. Matthias

Matthias Troyer wrote:
On Sep 18, 2006, at 8:50 PM, Robert Ramey wrote:
Thus, there is no std::vector, std::valarray, ... overload in any of the archives - not in the binary archive nor anywhere else.
Well, we're not seeing the same thing.

template<class Archive>
class basic_binary_oarchive :
    public array::oarchive<Archive>
{ ...

template <class Archive>
class oarchive :
    public archive::detail::common_oarchive<Archive>
{
    typedef archive::detail::common_oarchive<Archive> Base;
public:
    oarchive(unsigned int flags) :
        archive::detail::common_oarchive<Archive>(flags)
    {}

    // save_override for std::vector and serialization::array dispatches to
    // save_optimized with an additional argument.
    //
    // If that argument is of type mpl::true_, an optimized serialization is provided
    // If it is false, we just forward to the default serialization in the base class

    // the default version dispatches to the base class
    template<class T>
    void save_optimized(T const &t, unsigned int version, mpl::false_) {
        Base::save_override(t, version);
    }

    // the optimized implementation for vector uses serialization::array
    template<class ValueType, class Allocator>
    void save_optimized(
        const std::vector<ValueType, Allocator> &t, unsigned int, mpl::true_
    ){
        const serialization::collection_size_type count(t.size());
        * this->This() << BOOST_SERIALIZATION_NVP(count);
        * this->This() << serialization::make_array(
            serialization::detail::get_data(t), t.size());
    }

    // the optimized implementation for serialization::array uses save_array
    template<class ValueType>
    void save_optimized(
        const serialization::array<ValueType> &t, unsigned int version, mpl::true_
    ){
        this->This()->save_array(t, version);
    }

    // to save a vector:
    // if the value type is trivially constructible or an optimized array save exists,
    // then we can use the optimized version
    template<class ValueType, class Allocator>
    void save_override(std::vector<ValueType, Allocator> const &x, unsigned int version) {
        typedef typename mpl::apply1<
            BOOST_DEDUCED_TYPENAME Archive::use_array_optimization
          , BOOST_DEDUCED_TYPENAME remove_const<ValueType>::type
        >::type use_optimized;
        save_optimized(x, version, use_optimized());
    }

    // dispatch saving of arrays to the optimized version where supported
    template<class ValueType>
    void save_override(serialization::array<ValueType> const& x, unsigned int version) {
        typedef typename mpl::apply1<
            BOOST_DEDUCED_TYPENAME Archive::use_array_optimization
          , BOOST_DEDUCED_TYPENAME remove_const<ValueType>::type
        >::type use_optimized;
        save_optimized(x, version, use_optimized());
    }

    // Save everything else in the usual way, forwarding on to the
    // Base class
    template<class T>
    void save_override(T const& x, unsigned BOOST_PFTO int version) {
        Base::save_override(x, static_cast<unsigned int>(version));
    }
};

} } } // end namespace boost::archive::array

which just moves some of the implementation out of binary_oarchive into oarchive. What attracts my attention is:

    // dispatch saving of arrays to the optimized version where supported
    template<class ValueType>
    void save_override(serialization::array<ValueType> const& x, unsigned int version) {

Now these suggest to me that the next person who wants to handle his wrapper specially for certain archives will then want/have to go back into binary_oarchive and/or oarchive to add special handling to HIS wrapper. That is what I mean by not scalable. And:

    template<class ValueType, class Allocator>
    void save_override(std::vector<ValueType, Allocator> const &x, unsigned int version) {

And then there is the special handling for std::vector - which suggests that when one wants to add special handling for multi_array he'll have to do something else.

I realize that my nvp wrapper is similar to the above and perhaps that has led to some confusion. When faced with the problem I said to myself - "self - I know you want to maintain decoupling between types and archives - but this is a special case - live with it". And I listened to myself. But now I see that was a mistake. We only have two types of XML archives and they could have been handled by putting the NVP handling in the most derived class - even though it was some repetition. oh well.
Finally - the usage of inheritance used above strikes me as confusing, misleading, and unnecessary. This may or may not be strictly an aesthetic issue. The only really public/protected functions are the overload for the array wrapper and the default forwarder. These entry point functions are only called explicitly by the derived class. Basically the base class is not being used to express an "IS-A" relationship but rather an "implemented in terms of" relationship, as described in Scott Meyers' Effective C++ item 41. So I would have expected something in binary_oarchive like

template<class T>
void save_override(array<T> & t, const unsigned int version) const {
    // if array is optimizable
    //     save_optimized_array(t, version); // most of oarray in here
    // else
    //     forward to default handling
}

This would have to be added to every archive which supports this, but then we now have to explicitly forward twice anyhow, so I don't see a huge difference.
What you seem to propose, both above and in the longer text I cut, is to instead re-implement the optimized serialization for all these N classes in the M different archive types that can use it (we have M=4 now with the binary, packed MPI, MPI datatype, and skeleton archives, and soon we'll do M+=2 by adding XDR and HDF5 archives.). Besides leading to an M*N problem, which the array wrapper was designed to solve, this leads to intrusion problems into all classes that need to be serialized (including multi_array and all others), which is not feasible as we discussed last year.
I am aware of the desire to avoid the M*N problem - that should be pretty clear from the design of the serialization library itself, which takes great pains to permit any serializable type to be saved/loaded from any archive. The problem is that when I look at the code, it's not clear that this problem is being addressed. And it's not even clear that there is an M*N problem here: binary archives benefit from optimization of types which are "binary serializable"; mpi_archives benefit from optimization of types for which there exists a predefined MPI primitive; etc. There may or may not be overlap here. But it seems to me that we're trying too hard to factor where truly common factors don't exist. So I'm looking at: binary, packed - 3 archives - valarray, vector, C++ native arrays - 3 types - 9 overloads and one special trait - is_binary_serializable; mpi - 1 archive * 3 types - 3 overloads and one special trait - is_mpi_optimizable; and a couple more - but not a HUGE amount. And I don't think this intrudes into the other collection classes. They are already going to be wrapped in array - so they are done. The only issue is what is the best way to specify the table of archive/array-type pairs which benefit from optimized handling. One way is to sprinkle special code for array in different places. Another way is to just bite the bullet and make the table for every archive.
That's a separate discussion which we seem to be repeating every few months now. It seems to me from today's discussion that there is a confusion now about the use of the array wrapper, which we use in just the way you originally proposed.
Hmmm - when I proposed it I had in mind that it would be used differently - as I outlined above. This is not anyone's fault; it just didn't occur to me that it would occur to anyone to use it differently. And truth is, I've really just wanted to keep enhancements/extensions to the library orthogonal to the original concepts. I thought the array wrapper suggestion would accomplish that, so once it was "accepted" I got back to my paying job. Robert Ramey

Robert Ramey wrote:
But now I see that was a mistake. We only have two types of XML archives and they could have been handled by putting the NVP handling in the most derived class - even though it was some repetition. oh well.
Replace the above with: But now I see that was a mistake. We only have two types of XML archives and they could have been handled by making two specializations of

void nvp<T>::serialize(xml_oarchive &ar, ...
and
void nvp<T>::serialize(xml_woarchive &ar, ...

which is what I propose that array do. Robert Ramey

Hi Robert, Since this is a discussion orthogonal to the MPI review, I have renamed the subject. On Sep 19, 2006, at 12:30 AM, Robert Ramey wrote:
Matthias Troyer wrote:
On Sep 18, 2006, at 8:50 PM, Robert Ramey wrote:
Thus, there is no std::vector, std::valarray, ... overload in any of the archives - not in the binary archive nor anywhere else.
Well, we're not seeing the same thing.
template<class Archive> class basic_binary_oarchive : public array::oarchive<Archive> { ...
template <class Archive> class oarchive : public archive::detail::common_oarchive<Archive>
[snip - implementation details - snip]
OK, now I understand your point. Let us recall the original design that we had discussed and that was in the end your suggestion:

1. introduce an array wrapper array<ValueType> to be used in the serialize function of all classes that can profit from array optimizations
2. provide a default serialization for the array wrapper
3. overload the array<ValueType> serialization in those classes that can optimize array serialization
4. to simplify these optimizations and avoid code duplication, provide a wrapper like

template <class Archive> array_oarchive<Archive> : Archive {....}

to "add" array optimization to existing archives.

I hope you do not want to change this design now? We had next implemented just this, and placed it in a separate namespace archive::array, without touching your binary archive. The save_override for array<ValueType> with all the dispatch logic you quoted above was implemented in this wrapper. In addition, we provided, in the archive::array namespace, separated from yours, an array-optimized binary archive using that wrapper. Then, on June 4th you sent me an e-mail, suggesting that I move the array<ValueType> optimization into your binary archive. I quote from your e-mail:

# However, once we introduce the concept of wrapper, we can introduce the
# special wrapper type into binary_archive without changing the requirements
# for other archives. This is the model for xml_?archives. So I think you
# should move your array type to somewhere higher - perhaps as
# basic_binary_?archive. This would eliminate a set of special classes just
# for this. In this way we all get what we want: You get enhanced
# binary_?archive - I keep the minimal set of requirements for all new
# archives.
The easiest way of achieving this without code duplication was to actually provide an array-optimized base class from which all archives using array optimization could derive, and to put all of the array optimization logic into this base class array::oarchive, from which you quoted above. It seems now, however, that you dislike this, and that's why I proposed to just roll back to the state before June 4th, thus not touching any of your archives, and again providing our own version of an array optimized binary archive. You can then implement the array optimization as you feel like for your archives, and we have our mechanism for our archives. Please let me know if I should do this.
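The base-class arrangement described here - the dispatch logic written once, with each derived archive supplying only its fast path and its predicate - can be sketched as a CRTP mock. The names below (array_oarchive, array_ref, counting_archive) are hypothetical, not the archive::array implementation:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

template<class T>
struct array_ref { const T* data; std::size_t count; };

// The dispatch logic lives once in this base; every derived archive
// supplies only its predicate ("optimize") and its fast path (save_array).
template<class Derived>
struct array_oarchive {
    Derived& self() { return static_cast<Derived&>(*this); }

    template<class T>
    void save(const array_ref<T>& a) {
        save_dispatch(a, typename Derived::template optimize<T>::type{});
    }
    template<class T>
    void save_dispatch(const array_ref<T>& a, std::true_type) {
        self().save_array(a);                       // optimized block path
    }
    template<class T>
    void save_dispatch(const array_ref<T>& a, std::false_type) {
        for (std::size_t i = 0; i < a.count; ++i)   // default element loop
            self().save_element(a.data[i]);
    }
};

struct counting_archive : array_oarchive<counting_archive> {
    template<class T> struct optimize { using type = std::is_fundamental<T>; };
    std::size_t block_saves = 0, element_saves = 0;
    template<class T> void save_array(const array_ref<T>&) { ++block_saves; }
    template<class T> void save_element(const T&) { ++element_saves; }
};

struct point { double x, y; };  // not fundamental: takes the element loop

std::size_t blocks_for_ints() {
    int data[4] = {1, 2, 3, 4};
    counting_archive ar;
    ar.save(array_ref<int>{data, 4});
    return ar.block_saves;
}

std::size_t elements_for_points() {
    point data[3] = {};
    counting_archive ar;
    ar.save(array_ref<point>{data, 3});
    return ar.element_saves;
}
```

This is the trade-off the thread is arguing about: the dispatch is written once in the base rather than copy&pasted into every archive, at the price of the base class expressing "implemented in terms of" rather than "IS-A".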
which just moves some of the implementation out of binary_oarchive into oarchive. What attracts my attention is:
// dispatch saving of arrays to the optimized version where supported
template<class ValueType>
void save_override(serialization::array<ValueType> const& x, unsigned int version) {
Now these suggest to me that the next person who wants to handle his wrapper specially for certain archives will then want/have to go back into binary_oarchive and/or oarchive to add special handling to HIS wrapper. That is what I mean by not scalable.
Actually, as you can see from your own mail of June 4th, it was *you* who suggested to move this save_override into the binary archive. If you now have doubts about this idea of yours, then we can again roll back to the CVS state of June 4th.
template<class ValueType, class Allocator>
void save_override(std::vector<ValueType, Allocator> const &x, unsigned int version) {
And then there is the special handling for std::vector - which suggests that when one wants to add special handling for multi_array he'll have to do something else.
No, this will not be needed for multi_array, nor for ublas or Blitz arrays or any other array class which requires default-constructible value types. The std::vector override is only a workaround until we have an is_default_constructible trait in Boost. If your main objection is to that overload, then we can just implement is_default_constructible and remove that workaround.
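A sketch of why the loading side forces the default-constructibility requirement - hypothetical names (binary_iarchive, load_array, load_vector), not the library API. The vector must already own `count` constructed elements before its storage can be overwritten as a single block:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Minimal mock of a binary input archive that can only read whole blocks.
struct binary_iarchive {
    const unsigned char* pos;
    template<class T>
    void load_array(T* dest, std::size_t n) {
        std::memcpy(dest, pos, n * sizeof(T));   // one block read
        pos += n * sizeof(T);
    }
};

// The vector must already hold `count` constructed elements before the
// block read can overwrite its storage - hence the requirement that T be
// default constructible, which an is_default_constructible trait would
// let the dispatch logic check at compile time.
template<class T>
void load_vector(binary_iarchive& ar, std::vector<T>& v, std::size_t count) {
    v.resize(count);                 // default-constructs count elements
    ar.load_array(v.data(), count);  // then fills the storage in place
}

std::vector<int> roundtrip() {
    int src[3] = {7, 8, 9};
    binary_iarchive ar{reinterpret_cast<const unsigned char*>(src)};
    std::vector<int> v;
    load_vector(ar, v, 3);
    return v;
}
```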
Finally - the usage of inheritance used above strikes me as confusing, misleading, and unnecessary. This may or may not be strictly an aesthetic issue. The only really public/protected functions are the overload for the array wrapper and the default forwarder. These entry point functions are only called explicitly by the derived class. Basically the base class is not being used to express an "IS-A" relationship but rather an "implemented in terms of" relationship, as described in Scott Meyers' Effective C++ item 41. So I would have expected something in binary_oarchive like
template<class T>
void save_override(array<T> & t, const unsigned int version) const {
    // if array is optimizable
    //     save_optimized_array(t, version); // most of oarray in here
    // else
    //     forward to default handling
}
This would have to be added to every archive which supports this, but then we now have to explicitly forward twice anyhow, so I don't see a huge difference.
Well, you are suggesting to just copy this same function plus the "// if array is optimizable" dispatch logic into every new archive class that supports array optimization. Isn't inheritance just the mechanism to avoid this copy&pasting of hundreds of lines of identical code? Again, as I said before, if you do not like this in your binary archive, we can easily remove it from there again and you can do your own implementation, while we do ours.
What you seem to propose, both above and in the longer text I cut, is to instead re-implement the optimized serialization for all these N classes in the M different archive types that can use it (we have M=4 now with the binary, packed MPI, MPI datatype, and skeleton archives, and soon we'll do M+=2 by adding XDR and HDF5 archives.). Besides leading to an M*N problem, which the array wrapper was designed to solve, this leads to intrusion problems into all classes that need to be serialized (including multi_array and all others), which is not feasible as we discussed last year.
I am aware of the desire to avoid the M*N problem - that should be pretty clear from the design of the serialization library itself, which takes great pains to permit any serializable type to be saved/loaded from any archive. The problem is that when I look at the code, it's not clear that this problem is being addressed.
And its not even clear that there is an M*N problem here:
binary archives benefit from optimization of types which are "binary serializable"
mpi_archives benefit from optimization of types for which there exists a predefined MPI primitive.
etc.
There may or may not be overlap here. But it seems to me that we're trying too hard to factor where truly common factors don't exist.
So I'm looking at binary, packed - 3 archives - valarray, vector, C++ native arrays - 3 types - 9 overloads and one special trait - is_binary_serializable
No, wait a moment! If by packed you mean a packed MPI archive then is_binary_serializable is not the right trait. Also, where do you get only 3 types? We will have multi_array, ublas matrices and vectors, MTL arrays, Blitz arrays and many more coming. But the array wrapper reduced this down to just one class: array.
mpi - 1 archive * 3 types - 3 overloads and one special trait - is_mpi_optimizable
Again, why 3 overloads? There is only one, for array (and the temporary workaround for vector, which we can remove if you want). So, if we keep the array wrapper then we are halfway there, having reduced the M*N to an M*1 problem. What we have done next is to go further and simplify even this M*1 problem by introducing a base class that implements the dispatch logic for array. I believe that this is what you do not like. Thus we can easily remove it again from your archive, and leave the array optimization for your binary archive to you, while we use our method for the archives that we are writing (packed MPI, MPI datatype, skeleton, XDR, HDF5).
Hmmm - when I proposed it I had in mind that it would be used differently - as I outlined above. This is not anyone's fault; it just didn't occur to me that it would occur to anyone to use it differently.
And truth is, I've really just wanted to keep enhancements/extensions to the library orthogonal to the original concepts. I thought the array wrapper suggestion would accomplish that, so once it was "accepted" I got back to my paying job.
OK, since I also do not get paid for this, and your criticism is based on a change that you yourself proposed on June 4th, my proposal is to stop the discussion right here, go back to the state of June 4th, where our optimizations were completely decoupled from your archives. That way you will be able to implement the array optimization for the binary archive in the way you like best, and we have our own way. Please confirm and I will do that. Matthias

Matthias Troyer wrote:
OK, since I also do not get paid for this, and your criticism is based on a change that you yourself proposed on June 4th, my proposal is to stop the discussion right here, go back to the state of June 4th, where our optimizations were completely decoupled from your archives. That way you will be able to implement the array optimization for the binary archive in the way you like best, and we have our own way. Please confirm and I will do that.
Honestly, I don't remember the state of things on June 4th. And I don't remember the character of my complaints. I presume that going back would just substitute the original complaints for the current ones. I've stated my current reservations. I wanted to make them known in case someone else might share them. It seems that I'm the only one who has these views - a familiar and comfortable position for me. I do appreciate the interest, initiative and effort, and I'm flattered that you found it worthy of this investment of effort. And I AM happy to relinquish responsibility for enhancement, support and maintenance of this portion of the library. You've listened to my concerns and tried to answer them, and I appreciate that. But we'll just have to agree to disagree - I can live with that. So feel free to move forward in accordance with your good judgment. Robert Ramey

"Robert Ramey" <ramey@rrsd.com> writes:
Matthias Troyer wrote:
OK, since I also do not get paid for this, and your criticism is based on a change that you yourself proposed on June 4th, my proposal is to stop the discussion right here, go back to the state of June 4th, where our optimizations were completely decoupled from your archives. That way you will be able to implement the array optimization for the binary archive in the way you like best, and we have our own way. Please confirm and I will do that.
Honestly, I don't remember the state of things on June 4th, and I don't remember the character of my complaints. I presume that going back would just substitute the original complaints for the current ones.
I've stated my current reservations. I wanted to make them known in case someone else might share them. It seems that I'm the only one who has these views - a familiar and comfortable position for me.
For the record, the difficulty is not your iconoclastic viewpoint, which is almost always useful. The problem we're having is that what you're complaining about is a situation created, and of whose merit many of us were convinced, by you.
--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

David Abrahams wrote:
The problem we're having is that what you're complaining about is a situation created, and of whose merit many of us were convinced, by you.
The problem we're having is that there is a disagreement about how to design such an extension so that it isn't a future maintenance headache. Of course, if I make an observation that conflicts with some currently (perhaps widely) held opinion, I suppose one can say I "created a situation". LOL - but in my view the "situation" already exists - I'm just stating what to me is obvious. A perfect example of this is the boost test/release system. When I first made observations on this topic they were roundly dismissed (that's OK), and it was sort of suggested that I was just being contrary for its own sake (not really OK - but I can just ignore it). I think that the current situation and the ideas floating around on this subject make it clear that my original ideas had some merit after all - and that this is coming to be recognized. I can't help wondering how many worthy ideas are NOT posted because the poster is reluctant to subject himself to this kind of characterization. Robert Ramey

On 21.09.2006, at 16:34, David Abrahams wrote:
"Robert Ramey" <ramey@rrsd.com> writes:
Matthias Troyer wrote:
OK, since I also do not get paid for this, and your criticism is based on a change that you yourself proposed on June 4th, my proposal is to stop the discussion right here, go back to the state of June 4th, where our optimizations were completely decoupled from your archives. That way you will be able to implement the array optimization for the binary archive in the way you like best, and we have our own way. Please confirm and I will do that.
Honestly, I don't remember the state of things on June 4th, and I don't remember the character of my complaints. I presume that going back would just substitute the original complaints for the current ones.
I've stated my current reservations. I wanted to make them known in case someone else might share them. It seems that I'm the only one who has these views - a familiar and comfortable position for me.
For the record, the difficulty is not your iconoclastic viewpoint, which is almost always useful. The problem we're having is that what you're complaining about is a situation created, and of whose merit many of us were convinced, by you.
Dave, actually in this case the situation was slightly different. We had made a design which was non-intrusive on the archive classes, and thus had Robert's original binary_archive and our array-optimized wrapper array::binary_archive in a separate namespace. On June 4th Robert invited me to merge them, since at that time he thought it would make sense to move the array optimization into one of the archive base classes. In the meantime it seems that Robert does not like all aspects of that idea, or of our specific implementation, and I thus offered to go back to having an array-optimized binary archive separate from the non-optimized ones. Robert, you did not have any complaints on June 4th, but just asked me to add our optimizations directly to your binary archive. If you feel more comfortable without this, we can undo it at any time, since it will not break any existing archives. If, however, as you write, you feel you can live with what we have done, then we can leave it as it is, and I can certainly take over responsibility for maintaining the array optimizations and any further extensions to them that might be desired. Matthias

Matthias Troyer wrote:
On 21.09.2006, at 16:34, David Abrahams wrote:
Robert, you did not have any complaints on June 4th, but just asked me to add our optimizations directly to your binary archive.
It's entirely possible I might have suggested this without correctly envisioning what the final result would look like. uhhh - that happens to me all the time. I suppose that's why I'm a software developer as opposed to, say ... a diamond cutter or suicide prevention counselor.
If you feel more comfortable without this, we can undo this at any time since it will not break any existing archives. If however, as you write, you feel you can live with what we have done then we can leave it as it is and I can certainly take over responsibility for maintaining the array optimizations and any further extensions to it that might be desired.
That's fine with me - really. My concern is that the maintenance and support of this has been underestimated. Personally I've found the code very hard to follow. However, it's not a big issue for me now that you've offered to do it.

At the risk of picking at a festering sore - here is my (imperfect) recollection of how we got here. The original proposal required adding save_array to all present and future archives. I objected to the idea of expanding the interface just to accommodate a special case. This was resolved with the "array_wrapper", which provided an overridable default suitable for "other" archives. This was a HUGE improvement as far as I was concerned. As things progressed, it became clear that support for "serialization wrappers" - including the new array_wrapper - needed to be generalized somewhat in order to remove the direct dependency of the serialization code on certain specific types (such as nvp, array, and C++ arrays). This you did - another significant improvement.

So now we have a specific wrapper (array_wrapper) handled in a specific archive (binary_archive). This is analogous to the situation in xml_archive, where nvp is handled in xml_archive. (It's not quite the same, as xml_archive without nvp really makes no sense.) To me the current implementation stops short of carrying the wrapper idea to its logical conclusion: having the binary archive code independent of array_wrapper (or any other particular wrapper, for that matter). This is in line with my (mostly successful) campaign to keep archive code independent of particular data types.

I also believe that things have been confused by an evolution in the original idea for save/load array. As I remember, the motivation was that save/load array could be re-implemented differently for binary and/or mpi and/or other archives. Of course we have this now, as the array_wrapper can be implemented differently for each archive.
But now it seems that save/load array are going to be quite different for binary and mpi type archives. I think that at one point there might have been the idea that save/load implementations could be shared between different binary-type archives. But now it seems that binary_archive and mpi and/or other archives will have less in common than originally thought. I think this evolution has led to code which is very hard for me to follow and which will be more work to maintain and support. I can't help but think that if a 10x speedup of binary_archive were considered today (given the wrapper and other changes), the enhancement would be less general and much simpler. Good Luck Robert Ramey

On 22.09.2006, at 00:18, Robert Ramey wrote:
I also believe that things have been confused by an evolution in the original idea for save/load array. As I remember, the motivation was that save/load array could be re-implemented differently for binary and/or mpi and/or other archives. Of course we have this now, as the array_wrapper can be implemented differently for each archive. But now it seems that save/load array are going to be quite different for binary and mpi type archives. I think that at one point there might have been the idea that save/load implementations could be shared between different binary-type archives. But now it seems that binary_archive and mpi and/or other archives will have less in common than originally thought.
Our motivation was actually not that the save/load implementation code could be shared, but the dispatch code that decides for which types the optimized serialization can be used and for which it can't. That code is 50x bigger than the actual save_array/load_array implementations, which are just one- or few-line dispatches to save_binary, MPI_Pack, or the creation of an MPI data type.
I think this evolution has led to code which is very hard for me to follow and which will be more work to maintain and support. I can't help but think that if 10x speed up of binary_archive were considered today, (given the wrapper and other changes), the enhancement would be less general and much simpler.
Sure, if you want to restrict the enhancements to binary archives and primitive data types, a slightly simpler implementation is possible. That solution would not be scalable, however. Matthias
participants (6)
- David Abrahams
- Douglas Gregor
- Ian McCulloch
- Matthias Troyer
- Matthias Troyer
- Robert Ramey