
Matthias Troyer wrote:
I am a bit perplexed by your mail, since it is an identical copy of a private e-mail you sent me two weeks ago, even before the review started.
I realize this - it's just that I thought someone else might have some other observations to add on the subject.
The comments of several reviewers, who were initially skeptical about our use of the serialization library in a high-performance context but whose concerns vanished when they saw the array optimizations, should show you that it was not only me who needs these optimizations.
I don't object to the array optimizations per se; I'm interested in seeing if there's a way to do this that doesn't hard-code coupling between particular pairs of archives and datatypes into the original archive classes. Actually this question applies to the modifications in binary_?archive, so it's a little off topic - but still related.
Watch out: there are more such types - multi_array, ublas and MTL vectors and matrices, ... With the array wrapper we have an elegant solution to handle these other types as well. Since we have discussed this topic many times on the list over the past year I will not comment further for now.
I think this is the part I'm still not seeing. The changes to binary_?archive include specializations for std::valarray, std::vector and native C++ arrays. This pattern suggests that for every other data type for which an optimization might exist, more and more will have to be added to the binary archive. And all programs will have to include them even if they don't use them. When I originally suggested the idea of an array wrapper (admittedly not thought out in detail), I envisioned that array.hpp would have the "default" serialization - the lowest common denominator - which is there; so far so good. Then for, say, binary_?archive I expect to see something along the lines of the sketch below:
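(A rough sketch only - array<T>, save_override and save_binary are placeholder names here, not a definitive interface.)

    // in binary_oarchive: the overload I would expect to find there
    template<class T>
    void save_override(const boost::serialization::array<T> & a, int /* version */)
    {
        // write the whole contiguous block in one shot instead of
        // serializing the elements one at a time
        this->save_binary(a.address(), a.count() * sizeof(T));
    }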
If you do not like the way we have implemented the array optimizations in the binary archive, then we can just roll back the CVS state to the version at the end of May, where we had implemented a separate array-optimized binary archive and none of the MPI archives needed to change any of your archives.
b) mpi_?archive should derive directly from common_?archive like basic_binary_?archive does. The reason I have basic_... is that for xml and text there are separate wide-character versions, so I wanted to factor out the commonality. In your case, I don't think that's necessary, so I would expect your hierarchy to look like class mpi_archive : public common_archive, public interface_archive ...
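Roughly along these lines (a sketch only; the exact base classes and template parameters are indicative, not working code):

    // sketch of the hierarchy I have in mind
    template<class Archive>
    class mpi_oarchive_impl :
        public boost::archive::detail::common_oarchive<Archive>
    {
        // the MPI-specific part: save the primitive types via MPI_Pack;
        // everything else comes from the common archive machinery
    };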
Do you mean the packed archive? This is actually a binary archive - do you really mean that we should reimplement the functionality of the binary archive and not reuse what is there?
Note that you've used packed_archive - I would use mpi_archive instead. I think this is a better description of what it is.
I still prefer mpi::packed_archive, since there can also be other MPI archives. One possible addition to speed things up on homogeneous machines might be just an mpi::binary_archive, using a binary buffer.
Really it's only a name change - and "packed_archive" is already inside an mpi namespace, so it's not a huge issue. BUT I'm wondering if the idea of rendering C++ data structures as MPI primitives should be more orthogonal to the MPI protocol itself. That is, might it not sometimes be convenient to save such serializations to disk? Wouldn't this provide a portable binary format for free? (Lots of people have asked for this, but no one has been sufficiently interested to actually invest the required effort.)
As Doug Gregor pointed out this is not possible since the format is implementation-defined, and can change from one execution to another.
4) Shouldn't there be a logical place for other archive types for message passing - how about XDR? I would think it would be a close cousin to the MPI archives.
XDR might be used by an implementation or not - these are implementation details and a good MPI implementation is supposed to pick the best format.
c) The skeleton idea would be template<class BaseArchive> class skeleton_archive ....??? (I concede I haven't studied this enough).
Indeed, the skeleton archives could be factored out if anybody sees another use for them. This is an orthogonal piece of code, and we should discuss where it can be useful. One possible application is to visualize data structures without caring about the content, only about types and pointers. But I don't know if anyone needs this or if there is another use for these pieces of code. If there is, we can factor it out of the mpi detail namespace and put it into the archive library with no essential changes to the code.
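For context, this is roughly how skeleton&content is meant to be used on the MPI side (a usage sketch assuming the proposed boost::mpi interface; it needs at least two processes):

    #include <boost/mpi.hpp>
    #include <list>

    int main(int argc, char* argv[])
    {
        boost::mpi::environment env(argc, argv);
        boost::mpi::communicator world;
        std::list<double> l(100);
        if (world.rank() == 0) {
            world.send(1, 0, boost::mpi::skeleton(l));     // structure: sizes and pointers
            world.send(1, 1, boost::mpi::get_content(l));  // the values only
        } else if (world.rank() == 1) {
            world.recv(0, 0, boost::mpi::skeleton(l));
            world.recv(0, 1, boost::mpi::get_content(l));
        }
    }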
The only "repeated" or shared code might be that which determines when either a binary or mpi optimization can be applied. It's not clear to me whether this criteria applies to both kinds of archives ore each one has its own separate criteria. If it's the latter - there's no shared code and we're done. If it's the former, the a separate free standing concept has to be invented. In the past I've called this "binary serializable" and more lately "magic". ( a concession to physicist's fondness for whimsical names).
The set of types for which an array optimization can be done is different for binary, MPI, XDR, ... archives, but a common dispatch mechanism is possible, which is what we have implemented in the array::[io]archive classes. Your "magic" idea (which you have not described to the list yet since it was only in private e-mails) can easily be incorporated into that. Just replace
typedef is_fundamental<mpl::_1> use_array_optimization;
by
typedef is_bitwise_serializable<mpl::_1> use_array_optimization;
or
typedef is_magic<mpl::_1> use_array_optimization;
and you have upgraded to your magic optimization!
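For illustration, such a trait could look roughly like this (is_magic and my_point are hypothetical names, used only to make the dispatch concrete):

    #include <boost/mpl/bool.hpp>

    // a user's bitwise-copyable type (hypothetical)
    struct my_point { double x, y, z; };

    namespace boost { namespace serialization {

    // by default no type is "magic"
    template<class T>
    struct is_magic : public boost::mpl::false_ {};

    // the user declares my_point eligible for the block optimization
    template<>
    struct is_magic<my_point> : public boost::mpl::true_ {};

    } } // namespace boost::serialization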
So rather than, or in addition to, an MPI library, you would end up with three logically distinct things. Each one can stand on its own.
So depending on this last point, the serialization part of the MPI library falls into 3 or 4 independent pieces. If the code were shuffled around to reflect this, it would be much easier to use, test, verify, enhance and understand. Also, the skeleton concept might then be applicable to other types of archives. And the "magic" concept is really a feature of the type - part of the ad hoc C++ type reflection that the serialization traits provide.
If by three or four logically distinct things you mean
1. the array optimization
2. the skeleton&content archive wrappers
3. the MPI archives
4. the MPI library
then my comments are:
1. is already factored out into the serialization library. If anything more should be done to it, there was the desire to extend the array wrappers to strided arrays, which can easily be done without touching anything in the serialization library.
2. is independent of the rest of the proposed Boost.MPI library, but we keep it in detail since we do not see any other use for it at the moment. Once someone can use it we can move it immediately to the serialization library.
3. and 4. are tightly coupled since the MPI archives do not make any sense outside the Boost.MPI context and I do not see that splitting this into two separate libraries makes any sense at all. The code itself is written cleanly though, with no part of the MPI archive types depending on any of the communication functions.
Thus I see absolutely no reason at all to shuffle the code around anymore, unless you can come up with a reason to move the implementation details of skeleton&content to a public place in the serialization library.
Matthias