[MPI] Review comments

This isn't a full review - I've just read the documentation and perused the code. It's more like random observations. A lot of this is very particular to the way that the MPI library uses/builds upon serialization, so it may not be of interest to many. I've spent some more time reviewing the serialization currently checked into the head. It's quite hard to follow. Among other things, the MPI library includes the following:

a) an optimization for serialization of std::vector, std::valarray and native C++ arrays in binary archives.

b) A new type of archive (which should be called mpi_?archive) which serializes C++ structures in terms of MPI datatypes. This would complement the archive types that are already included in the package:

i) text - renders C++ structures as a long string of characters - the simplest portable method.
ii) binary - renders C++ structures as native binary data. The fastest method - but non-portable.
iii) xml - renders C++ structures as xml elements - a special case of i) above.

So we would end up with an mpi_archive and optionally mpi_primitive. In the other archives, I separated ?_primitive so this could be shared by both text and xml. In your case it isn't necessary to make an mpi_primitive class - though it might be helpful, and it would certainly be convenient to leverage the established pattern to ease understanding for the casual reader.

c) the "skeleton" idea - which I still haven't totally figured out yet. I believe I would characterize this as an "archive adaptor" which changes the behavior of any archive class to which it is applied. In this way it is similar to the "polymorphic_?archive".

In my view these enhancements are each independent of one another. This is not reflected in the current implementation. I would suggest the following:

a) enhancements to the binary archive be handled as such. We're only talking about specializations for three templates - std::vector, std::valarray and native C++ arrays.
I know these same three are also handled specially for mpi_?archives, but it's still a mistake to combine them. In binary_?archive they are handled one way (load_binary) while in mpi_archive they are handled another way (load_array). I still think this would be best implemented as "enhanced_binary_?archive".

b) mpi_?archive should derive directly from common_?archive like basic_binary_?archive does. The reason I have basic_... is that for xml and text there are separate wide character versions, so I wanted to factor out the commonality. In your case, I don't think that's necessary, so I would expect your hierarchy would look like

class mpi_archive : public common_archive, public interface_archive ...

I doubt it even has to be a template. It would:

1) render the native archive types (class_id, etc.) as small integers - like the binary archive currently does.
2) render C++ primitives (and std::string) as corresponding MPI datatypes.
3) handle the special implementations for C++ native arrays, std::vector and std::valarray.

Note that you've used packed_archive - I would use mpi_archive instead. I think this is a better description of what it is. Really it's only a name change - and "packed archive" is already inside an mpi namespace so it's not a huge issue. BUT I'm wondering if the idea of rendering C++ data structures as MPI primitives should be more orthogonal to the MPI protocol itself. That is, might it not be sometimes convenient to save such serializations to disk? Wouldn't this provide a portable binary format for free? (Lots of people have asked for this but no one has been sufficiently interested to actually invest the required effort.)

4) Shouldn't there be a logical place for other archive types for message passing - how about XDR? I would think it would be a close cousin to MPI archives.

c) The skeleton idea would be

template<class BaseArchive> class skeleton_archive ....???

(I concede I haven't studied this enough.)
This would be coded as an "archive adaptor" (as is the polymorphic archive), as described in a discussion thread some months ago. The concept of the "skeleton" seems very interesting but really orthogonal to any particular type of archive. Perhaps the skeleton idea would be useful for other types of data renderings. By making it an archive adaptor, its facility could be added to any existing archive. Even if not useful anywhere else, it would help comprehensibility and testability to factor it out in this way.

So rather than, or in addition to, an MPI library you would end up with three logically distinct things. Each one can stand on its own. The only "repeated" or shared code might be that which determines when either a binary or mpi optimization can be applied. It's not clear to me whether this criterion applies to both kinds of archives or each one has its own separate criterion. If it's the latter - there's no shared code and we're done. If it's the former, then a separate free-standing concept has to be invented. In the past I've called this "binary serializable" and more lately "magic" (a concession to physicists' fondness for whimsical names).

So depending on this last, the serialization part of the MPI library falls into 3 or 4 independent pieces. If the code were shuffled around to reflect this, it would be much easier to use, test, verify, enhance and understand. Also the skeleton concept might then be applicable to other types of archives. Also the "magic" concept really is a feature of the type and is really part of the ad hoc C++ type reflection which is what serialization traits are.

So, that's my assessment.

Robert Ramey

Matthias is far more qualified than I to provide adequate responses to your questions, but this much I can answer...
Note that you've used packed_archive - I would use mpi_archive instead. I think this is a better description of what it is. Really it's only a name change - and "packed archive" is already inside an mpi namespace so it's not a huge issue. BUT I'm wondering if the idea of rendering C++ data structures as MPI primitives should be more orthogonal to the MPI protocol itself. That is, might it not be sometimes convenient to save such serializations to disk? Wouldn't this provide a portable binary format for free?
Unfortunately, the MPI packed archives do not give us a portable binary format, because there is no published MPI protocol. The MPI packed archives use the MPI calls MPI_Pack and MPI_Unpack to pack a buffer. The only guarantee these calls give is that if you MPI_Pack something, you can transmit it via MPI and then MPI_Unpack it later. The protocol used varies from one MPI implementation to another, and could conceivably vary from one invocation to another. For instance, in a homogeneous environment, MPI_Pack and MPI_Unpack could be implemented as a memcpy(); in a heterogeneous environment, they might use some XDR representation. A really smart MPI might make the decision at run-time, after determining what kind of environment we're running in. So we can't really separate the MPI archives from MPI. They're really very, very specialized. Doug

Hi Robert On Sep 17, 2006, at 7:56 PM, Robert Ramey wrote:
This isn't a full review - I've just read the documentation and perused the code. It's more like random observations. A lot of this is very particular to the way that the MPI library uses/builds upon serialization, so it may not be of interest to many.
I am a bit perplexed by your mail, since it is an identical copy of a private e-mail you sent me two weeks ago, even before the review started. I had replied to your e-mail but never heard back from you. I am thus confused about the purpose of your message, but assume that you want me to reply in public now. I will essentially follow what I wrote in my reply two weeks ago, modified a bit so that list readers who have not followed our private e-mail exchange can follow.
I've spent some more time reviewing the serialization currently checked into the head. It's quite hard to follow. Among other things, the MPI library includes the following:
a) an optimization for serialization of std::vector, std::valarray and native C++ arrays in binary archives.
I assume you mean the array wrapper here? If yes, then this is not restricted to the above types but is also needed for other array-like data structures. Furthermore, this is not part of the MPI library but a general extension to the serialization library that we implemented there last spring, with your consent. The comments of several reviewers, who were initially skeptical about our use of the serialization library in a high performance context but whose concerns vanished when they saw the array optimizations, should show you that it was not only me who needed these optimizations.
b) A new type of archive (which should be called mpi_?archive) which serializes C++ structures in terms of MPI datatypes. This would complement the archive types that are already included in the package.
Do you mean the mpi::packed_archive? We prefer to call it mpi::packed_archive, since it is actually slow and not the preferred way of sending via MPI. It does not make use of MPI data types, and is in fact - as you saw yourself - a variant of the binary_archive, with the main distinction that save_binary cannot be used for fast serialization of array wrappers - a different function is needed, hence the abstraction using save_array.
i) text - renders C++ structures in terms of a long string of characters - the simplest portable method. ii) binary - renders C++ structures as native binary data. The fastest method - but non portable. iii) renders ... as xml elements - a special case of i) above.
So we would end up with an mpi_archive and optionally mpi_primitive. In the other archives, I separated ?_primitive so this could be shared by both text and xml. In your case it isn't necessary to make an mpi_primitive class - though it might be helpful and it would certainly be convenient to leverage on the established pattern to ease understanding for the casual reader.
Actually we already have an MPI primitive class, and the packed archive is - as you saw - just a binary archive with special primitives and a special way of dealing with arrays. If we keep a save_array instead of just using save_binary in the binary archives, then indeed just by specifying the MPI primitives we create an MPI archive - I liked that design of yours once I grasped how you split off the primitives. Note, however, as Doug Gregor pointed out, that the mpi_primitives are not very useful outside an MPI message passing context.
c) the "skeleton" idea - which I still haven't totally figured out yet. I believe I would characterize this as an "archive adaptor" which changes the behavior of any the archive class to which it is applied. In this way it is similar to the "polymorphic_?archive" .
Indeed these are similar wrappers
In my view these enhancements are each independent of one another. This is not reflected in the current implementation.
Have you looked at the code? Actually the skeleton "archive adaptors" do not depend at all on the rest of the MPI library and could easily be factored out in case they are useful in another context. For now, since they are an implementation detail, never occur in the public API, and we do not see another use for them at the moment, we have left them in the detail namespace.
I would suggest the following:
a) enhancements to the binary archive be handled as such. We're only talking about specializations for three templates - std::vector, std::valarray and native C++ arrays. I know these same three are also handled specially for mpi_?archives, but it's still a mistake to combine them. In binary_?archive they are handled one way (load_binary) while in mpi_archive they are handled another way (load_array). I still think this would be best implemented as "enhanced_binary_?archive".
Watch out: there are more such types - multi_array, ublas and MTL vectors and matrices, ... With the array wrapper we have an elegant solution to handle these other types as well. Since we have discussed this topic many times on the list over the past year I will not comment further for now. If you do not like the way we have implemented the array optimizations in the binary archive then we can just roll back the CVS state to the version at the end of May, where we had implemented a separate array-optimized binary archive and none of the MPI archives required changes to any of your archives.
b) mpi_?archive should derive directly from common_?archive like basic_binary_?archive does. The reason I have basic_... is that for xml and text there are separate wide character versions so I wanted to factor out the commonality. In your case, I don't think that's necessary so I would expect your hierarchy would look like class mpi_archive : public common_archive, public interface_archive ...
Do you mean the packed archive? This is actually a binary archive - do you really mean that we should reimplement the functionality of the binary archive rather than reuse what is there?
Note that you've used packed_archive - I would use mpi_archive instead. I think this is a better description of what it is.
I still prefer mpi::packed_archive, since there can also be other MPI archives. One possible addition to speed up things on homogeneous machines might be just an mpi::binary_archive, using a binary buffer.
Really it's only a name change - and "packed archive" is already inside an mpi namespace so it's not a huge issue. BUT I'm wondering if the idea of rendering C++ data structures as MPI primitives should be more orthogonal to the MPI protocol itself. That is, might it not be sometimes convenient to save such serializations to disk? Wouldn't this provide a portable binary format for free? (Lots of people have asked for this but no one has been sufficiently interested to actually invest the required effort.)
As Doug Gregor pointed out this is not possible since the format is implementation-defined, and can change from one execution to another.
4) Shouldn't there be a logical place for other archive types for message passing - how about XDR? I would think it would be a close cousin to MPI archives.
XDR might be used by an implementation or not - these are implementation details and a good MPI implementation is supposed to pick the best format.
c) The skeleton idea would be template<class BaseArchive> class skeleton_archive ....??? (I concede I haven't studied this enough).
Indeed, the skeleton archives could be factored out if anybody sees another use for them. This is an orthogonal piece of code, and we should discuss where it can be useful. One possible application is to visualize data structures without caring about the content, but only about types and pointers. But I don't know if anyone needs this or whether there is another use for these pieces of code. If there is, then we can factor it out of the mpi detail namespace and put it into archive with no essential changes to the code.
The only "repeated" or shared code might be that which determines when either a binary or mpi optimization can be applied. It's not clear to me whether this criteria applies to both kinds of archives ore each one has its own separate criteria. If it's the latter - there's no shared code and we're done. If it's the former, the a separate free standing concept has to be invented. In the past I've called this "binary serializable" and more lately "magic". ( a concession to physicist's fondness for whimsical names).
The set of types for which an array optimization can be done is different for binary, MPI, XDR, ... archives, but a common dispatch mechanism is possible, which is what we have implemented in the array::[io]archive classes. Your "magic" idea (which you have not described to the list yet since it was only in private e-mails) can easily be incorporated into that. Just replace

typedef is_fundamental<mpl::_1> use_array_optimization;

by

typedef is_bitwise_serializable<mpl::_1> use_array_optimization;

or

typedef is_magic<mpl::_1> use_array_optimization;

and you have upgraded to your magic optimization!
So rather than, or in addition to, an MPI library you would end up with three logically distinct things. Each one can stand on its own.
So depending on this last, the serialization part of the MPI library falls into 3 or 4 independent pieces. If the code were shuffled around to reflect this, it would be much easier to use, test, verify, enhance and understand. Also the skeleton concept might then be applicable to other types of archives. Also the "magic" concept really is a feature of the type and is really part of the ad hoc C++ type reflection which is what serialization traits are.
If by three or four logically distinct things you mean

1. the array optimization
2. the skeleton&content archive wrappers
3. the MPI archives
4. the MPI library

then my comments are:

1. is already factored out and in the serialization library. If anything should be done to it, there was the desire to extend array wrappers to strided arrays, which can easily be done without touching anything in the serialization library.

2. is independent of the rest of the proposed Boost.MPI library, but we keep it in detail since we do not see any other use for it at the moment. Once someone could use it we can move it immediately to the serialization library.

3. and 4. are tightly coupled, since the MPI archives do not make any sense outside the Boost.MPI context, and I do not see that splitting this into two separate libraries makes any sense at all. The code itself is written cleanly though, with no part of the MPI archive types depending on any of the communication functions.

Thus I see absolutely no reason at all to shuffle the code around any more, unless you can come up with a reason to move the implementation details of skeleton&content to a public place in the serialization library.

Matthias

Hi, Matthias Troyer wrote:
On Sep 17, 2006, at 7:56 PM, Robert Ramey wrote: [...]
Note that you've used packed_archive - I would use mpi_archive instead. I think this is a better description of what it is.
I still prefer mpi::packed_archive, since there can also be other MPI archives. One possible addition to speed up things on homogeneous machines might be just an mpi::binary_archive, using a binary buffer.
Yes, this is a realistic idea; almost all MPI programs are run on homogeneous clusters anyway. Even in a heterogeneous environment there remains the question of whether one can do better than MPI_Pack/Unpack by using some kind of 'portable' archive (although 'transportable' might be a better word). In principle the answer is definitely yes, since the conversion functions can be inlined. [...]
Really it's only a name change - and "packed archive" is already inside an mpi namespace so it's not a huge issue. BUT I'm wondering if the idea of rendering C++ data structures as MPI primitives should be more orthogonal to the MPI protocol itself. That is, might it not be sometimes convenient to save such serializations to disk? Wouldn't this provide a portable binary format for free? (Lots of people have asked for this but no one has been sufficiently interested to actually invest the required effort.)
As Doug Gregor pointed out this is not possible since the format is implementation-defined, and can change from one execution to another.
This is only true for MPI-1.1. MPI-2 supports multiple data representations and adds the functions MPI_Pack_external and MPI_Unpack_external to convert to/from the "external32" format defined in section 9.5.2 of the MPI-2 standard. The intent of this is to be able to transfer data between MPI implementations. Also, as part of the file I/O interface, MPI-2 also allows user-defined representations so in principle it would be possible to make some kind of adaptor to read a different archive format via the MPI-2 file I/O. Not that MPI file I/O seems to be used much anyway... Cheers, Ian

Hi Ian, On Sep 18, 2006, at 6:35 PM, Ian McCulloch wrote:
Really it's only a name change - and "packed archive" is already inside an mpi namespace so it's not a huge issue. BUT I'm wondering if the idea of rendering C++ data structures as MPI primitives should be more orthogonal to the MPI protocol itself. That is, might it not be sometimes convenient to save such serializations to disk? Wouldn't this provide a portable binary format for free? (Lots of people have asked for this but no one has been sufficiently interested to actually invest the required effort.)
As Doug Gregor pointed out this is not possible since the format is implementation-defined, and can change from one execution to another.
This is only true for MPI-1.1. MPI-2 supports multiple data representations and adds the functions MPI_Pack_external and MPI_Unpack_external to convert to/from the "external32" format defined in section 9.5.2 of the MPI-2 standard. The intent of this is to be able to transfer data between MPI implementations. Also, as part of the file I/O interface, MPI-2 also allows user-defined representations so in principle it would be possible to make some kind of adaptor to read a different archive format via the MPI-2 file I/O. Not that MPI file I/O seems to be used much anyway...
The current library is based on MPI-1.1 only, which is reasonably stable. A future extension of the library to MPI-2 could indeed include the use of MPI-I/O to write portable binary archives. Matthias

Matthias Troyer wrote:
I am a bit perplexed by your mail, since it is an identical copy of a private e-mail you sent me two weeks ago, even before the review started.
I realize this - it's just that I thought someone else might have some other observations to add on the subject.
The comments of several reviewers, who were initially skeptical about our use of the serialization library in a high performance context but whose concerns vanished when they saw the array optimizations, should show you that it was not only me who needed these optimizations.
I don't object to the array optimizations per se; I'm interested in seeing if there's a way to do this that doesn't hard-code coupling between particular pairs of archives and datatypes into the original archive classes. Actually this question applies to the modifications in binary_?archive, so it's a little off topic - but still related.
Watch out that there are more such types: multi_array, ublas and MTL vectors and matrices, ... With the array wrapper we have an elegant solution to handle also these other types. Since we have discussed this topic many times on the list over the past year I will not comment further for now.
I think this is the part I'm still not seeing. The changes to binary_?archive include specializations for std::valarray, std::vector and native C++ arrays. This pattern suggests that for these other data types for which an optimization might exist, more and more will have to be added to the binary archive. And all programs will have to include them even if they don't use them. When I originally suggested the idea of an array wrapper (admittedly not thought out in detail) I envisioned that array.hpp would have the "default" serialization - lowest common denominator - which is there, so far so good. Then for say binary_?archive I expect to see:

On Sep 18, 2006, at 7:30 PM, Robert Ramey wrote:
Matthias Troyer wrote:
I am a bit perplexed by your mail, since it is an identical copy of a private e-mail you sent me two weeks ago, even before the review started.
I realize this - it's just that I thought someone else might have some other observations to add on the subject.
OK, thanks for explaining.
The comments of several reviewers, who were initially skeptical about our use of the serialization library in a high performance context but whose concerns vanished when they saw the array optimizations, should show you that it was not only me who needed these optimizations.
I don't object to the array optimizations per se; I'm interested in seeing if there's a way to do this that doesn't hard-code coupling between particular pairs of archives and datatypes into the original archive classes. Actually this question applies to the modifications in binary_?archive, so it's a little off topic - but still related.
Watch out: there are more such types - multi_array, ublas and MTL vectors and matrices, ... With the array wrapper we have an elegant solution to handle these other types as well. Since we have discussed this topic many times on the list over the past year I will not comment further for now.
I think this is the part I'm still not seeing. The changes to binary_?archive include specializations for std::valarray, std::vector and native C++ arrays.
Can you please show me these specializations? I do not see any, except for a std::vector overload in the base class, which is needed for special reasons and only for this class. valarray and C arrays, as well as other classes like ublas arrays and matrices and multi_array, can be serialized just using the array wrapper and will not need any modification to the archives. Matthias

"Robert Ramey" <ramey@rrsd.com> writes:
(admittedly not thought out in detail) I envisioned that array.hpp would have the "default" serialization - lowest common denominator - which is there, so far so good. Then for say binary_?archive I expect to see:
<snip entire quoted message> -- Dave Abrahams Boost Consulting www.boost-consulting.com

That was a mistake - I accidentally hit the RETURN key rather than the Shift key and my Outlook Express just sent the email. Sooooo Sorry. Robert Ramey David Abrahams wrote:
"Robert Ramey" <ramey@rrsd.com> writes:
(admittedly not thought out in detail) I envisioned that array.hpp would have the "default" serialization - lowest common denominator - which is there, so far so good. Then for say binary_?archive I expect to see:
<snip entire quoted message>

"Robert Ramey" <ramey@rrsd.com> writes:
That was a mistake - I accidentally hit the RETURN key rather than the Shift key and my Outlook Express just sent the email. Sooooo Sorry.
NP. We had 3 or 4 others of those in the past few days, so I felt it necessary to remind people. Thanks, -- Dave Abrahams Boost Consulting www.boost-consulting.com

Matthias Troyer wrote:
I am a bit perplexed by your mail, since it is an identical copy of a private e-mail you sent me two weeks ago, even before the review started.
I realize this - it's just that I thought that someone else might have some other observations to add on the subject.
The comments of several reviewers, who were initially skeptical about our use of the serialization library in a high performance context but whose concerns vanished when they saw the array optimizations, should show you that it was not only me who needs these optimizations.
I don't object to the array optimizations per se; I'm interested in seeing if there's a way to do this that doesn't hard-code coupling between particular pairs of archives and datatypes into the original archive classes. Actually this question applies to the modifications in binary_?archive so it's a little off topic - but still related.
Watch out that there are more such types: multi_array, ublas and MTL vectors and matrices, ... With the array wrapper we have an elegant solution that also handles these other types. Since we have discussed this topic many times on the list over the past year I will not comment further for now.
I think this is the part I'm still not seeing. The changes to binary_?archive include specializations for std::valarray, std::vector and native C++ arrays. This pattern suggests that for these other data types for which an optimization might exist, more and more will have to be added to the binary archive. And all programs will have to include them even if they don't use them.

When I originally suggested the idea of an array wrapper (admittedly not thought out in detail) I envisioned that array.hpp would have the "default" serialization - the lowest common denominator - which is there, so far so good. So then in say boost/serialization/vector.hpp I expect to see a specialization for array like

template<class T>
void array<std::vector<T> >::serialize(binary_iarchive &ar, const unsigned int) const {
    // special stuff for loading data into binary vectors
}

so now programs only have to compile and be aware of the specializations that they are in fact going to use. And each optimization can be compiled, tested, etc. independently. This adds the same three "special cases", just in a different place - so the total work is the same. The only remaining problem is to figure a way to do this through the base class so the optimization gets transmitted to any derivations. I'm not sure this is a huge deal. So far we only have two archives which can exploit these optimizations (binary and now mpi). If it becomes really bothersome, these specializations can forward to a common implementation. With some effort, it might be possible to avoid even this minimal effort. This would entail making the above somewhat more elaborate:

class array<std::vector<T> > {
    template<class Archive> void binary_serialize(...){...}
    template<class Archive> void mpi_serialize(...){...}
    template<class Archive>
    void serialize(Archive &ar, const unsigned int version) const {
        // if Archive is derived from base_binary
        //     binary_serialize(ar, version);
        // else if Archive is derived from base_mpi_archive
        //     mpi_serialize(...)
        // else
        //     array_default<T>::serialize(*this, version)
    }
};

So - still one gets exactly what you want without forcing all users to include and compile specializations/optimizations for all the types you want to add in the future. Note that there could be one set of array wrappers for binary-serializable optimizations and a different set for mpi-optimizable wrappers. This is the motivation for using the array wrapper - to permit the specializations for different types to be orthogonal to the archives and types that benefit from special treatment. I think it's really just the same code you already have - it's just shuffled around so that it doesn't have to be included when you don't need it.
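Robert's sketch of per-archive specializations living alongside the type can be made concrete. The following is a minimal, self-contained mock in plain C++ - no Boost; the names binary_oarchive, text_oarchive and array_wrapper here are stand-ins, not the real library classes. The generic overload renders element by element, while the overload for the binary archive saves the whole block at once:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Mock archives standing in for binary_oarchive and text_oarchive.
struct binary_oarchive { std::vector<unsigned char> buffer; };
struct text_oarchive   { std::string text; };

// Mock array wrapper: a pointer/size pair over contiguous storage.
template<class T>
struct array_wrapper {
    const T* data;
    std::size_t count;
};

// Default ("lowest common denominator") serialization: element by element.
template<class Archive, class T>
void serialize(Archive& ar, const array_wrapper<T>& a) {
    for (std::size_t i = 0; i < a.count; ++i)
        ar.text += std::to_string(a.data[i]) + " ";
}

// Per-archive overload living next to the type, not inside the archive:
// for the binary archive, dump the whole block at once.
template<class T>
void serialize(binary_oarchive& ar, const array_wrapper<T>& a) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(a.data);
    ar.buffer.insert(ar.buffer.end(), p, p + a.count * sizeof(T));
}

std::size_t binary_bytes_written(const std::vector<int>& v) {
    binary_oarchive ar;
    serialize(ar, array_wrapper<int>{v.data(), v.size()});
    return ar.buffer.size();
}

std::string text_rendering(const std::vector<int>& v) {
    text_oarchive ar;
    serialize(ar, array_wrapper<int>{v.data(), v.size()});
    return ar.text;
}
```

The point of this layout is that the binary overload could live next to the vector serialization header, so only programs that actually serialize vectors to binary archives ever compile it.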
As Doug Gregor pointed out this is not possible since the format is implementation-defined, and can change from one execution to another.
OK - I just assumed (wrongly apparently) that an MPI protocol presumed a heterogeneous environment.
The set of types for which an array optimization can be done is different for binary, MPI, XDR, ... archives, but a common dispatch mechanism is possible, which is what we have implemented in the array::[io]archive classes.
And I think that is what I have a problem with. The "common" dispatch as I see it implemented presumes the known optimizable types. When other optimizable types are added, this will have to grow. It seems to me that it is fundamentally not scalable. So personally, I would prefer to add the code to the derived types - but I understand this is my preference. Your "magic" idea (which you have not
described to the list yet since it was only in private e-mails) can easily be incorporated into that. Just replace
typedef is_fundamental<mpl::_1> use_array_optimization;
by
typedef is_bitwise_serializable<mpl::_1> use_array_optimization;
or
typedef is_magic<mpl::_1> use_array_optimization;
and you have upgraded to your magic optimization!
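The substitution Matthias describes can be sketched in modern C++ without MPL - a hypothetical mock, whereas the real archives use an MPL lambda such as is_fundamental<mpl::_1>. Each archive publishes a predicate over value_types, and swapping that predicate is the entire "upgrade":

```cpp
#include <cassert>
#include <type_traits>

// Each mock archive publishes a predicate naming the value_types whose
// contiguous storage it may serialize as one block. Swapping the predicate
// is the entire "upgrade" - nothing else in the archive changes.
struct binary_oarchive {
    template<class T>
    using use_array_optimization = std::is_fundamental<T>;
};

struct enhanced_binary_oarchive {
    template<class T>
    using use_array_optimization = std::is_trivially_copyable<T>;  // the "magic"
};

template<class Archive, class T>
constexpr bool optimizable() {
    return Archive::template use_array_optimization<T>::value;
}

struct point { double x, y; };  // trivially copyable but not fundamental
```

With is_fundamental only built-in types take the fast path; with is_trivially_copyable a plain struct such as point qualifies too, without touching the dispatch machinery.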
I would prefer to find this code as part of the array_wrapper for std::vector rather than as part of the archive class.
If by three or four logically distinct things you mean
1. the array optimization
2. the skeleton&content archive wrappers
3. the MPI archives
4. the MPI library
then my comments are:
1. is already factored out and in the serialization library. If anything should be done to it, there was the desire to extend array wrappers to strided arrays, which can easily be done without touching anything in the serialization library.
Hmmm - what about MTL or ublas - don't these have their own special types for collections? I know boost::multi_array does. Wouldn't these have to be added to the std::valarray and std::vector already in the binary archive?
2. is independent of the rest of the proposed Boost.MPI library but we keep it in detail since we do not see any other use for this at the moment. Once someone could use it we can move it immediately to the serialization library.
OK - the information on the skeleton was a little - uhh - skeletal. I really didn't understand how it's implemented. The relationship - if any - to boost serialization isn't clear from the documentation. I suspect that this will be resolved by amplification in the documentation.
3. and 4. are tightly coupled since the MPI archives do not make any sense outside the Boost.MPI context and I do not see that splitting this into two separate libraries makes any sense at all. The code itself is written cleanly though, with no part of the MPI archive types depending on any of the communication functions.
This may be true - it wasn't obvious to me. By MPI archives I meant your packed_archive, and that seemed to me a thin wrapper around base_binary_archive - which is fine with me. So I suspect my complaint is that the documentation seems to suggest that it's something more elaborate than that.
Thus I see absolutely no reason at all to shuffle the code around anymore, unless you can come up with a reason to move the implementation details of skeleton&content to a public place in the serialization library.
I am intrigued by the skeleton - again, the documentation doesn't really give a good idea of what it does and what else it might be used for. So my complaints really come down to two issues. a) I'm still not convinced you've factored optimizations which can be applied to certain pairs of types and archives in the best way. b) The MPI documentation doesn't make very clear the organization of the disparate pieces. It's a user manual "cookbook", which is fine as far as it goes. But I think it's going to need more explanation of the design itself. As an aside, I'm amazed you haven't gotten any flak for not including more formal documentation - including concepts for the template parameters. Robert Ramey

On Sep 18, 2006, at 8:50 PM, Robert Ramey wrote:
The set of types for which an array optimization can be done is different for binary, MPI, XDR, ... archives, but a common dispatch mechanism is possible, which is what we have implemented in the array::[io]archive classes.
And I think that is what I have a problem with. The "common" dispatch as I see it implemented presumes the known optimizable types. When other optimizable types are added, this will have to grow. It seems to me that it is fundamentally not scalable. So personally, I would prefer to add the code to the derived types - but I understand this is my preference.
No, the "optimizable types" are not the types (like std::vector, std::valarray) for which an array optimization exists, but rather the value_types of the array for which the storage can be optimized. This set depends only on the archive itself, and not on the types, and each archive can have its own lambda expression to determine whether the value_type is optimizable. Adding optimized serialization to e.g. multi_array will only mean that multi_array should use the array wrapper to serialize its data instead of writing a loop over all elements. This simplifies the serialization implementation for this class, and automatically provides optimized serialization for all the types, without any change in the serialization library, nor any change in an archive. This is perfectly scalable in contrast to your idea of having each archive class re-implement the serialization of all optimizable containers. I am a bit confused about your arguments above since it was actually you who suggested the array wrapper as the least intrusive and scalable solution
I would prefer to find this code as part of the array_wrapper for std::vector rather than as part of the archive class.
again, there is no array_wrapper for std::vector, rather the std::vector<T> serialization serializes its data through an array<T> wrapper, as you had proposed.
This would entail making the above somewhat more elaborate
class array<std::vector> {
    template<class Archive> void binary_serialize(...){...}
    template<class Archive> void mpi_serialize(...){...}
    template<class Archive>
    void serialize(Archive &ar, const unsigned int version) const {
        // if Archive is derived from base_binary
        //     binary_serialize(ar, version);
        // else if Archive is derived from base_mpi_archive
        //     mpi_serialize(...)
        // else
        //     array_default<T>::serialize(*this, version)
    }
}
Ouch!!! This is just what I mean by not scalable. We already have five cases now (plain, binary, packed MPI, MPI datatype, skeleton) with two more coming soon (XDR, HDF5). Do you really want each author of a serialization function for an array-like data structure to reimplement an optimization for all these archives?
If by three or four logically distinct things you mean
1. the array optimization
2. the skeleton&content archive wrappers
3. the MPI archives
4. the MPI library
then my comments are:
1. is already factored out and in the serialization library. If anything should be done to it, there was the desire to extend array wrappers to strided arrays, which can easily be done without touching anything in the serialization library.
Hmmm - what about MTL or ublas - don't these have their own special types for collections? I know boost::multi_array does. Wouldn't these have to be added to the std::valarray and std::vector already in the binary archive?
I skipped most of the above because it seems there is a fundamental misunderstanding regarding the role of the array wrapper. The array wrapper, which you had suggested yourself, was introduced to completely decouple array optimizations from specific datatypes. When implementing MTL, ublas, Blitz or other serialization, one just uses an array wrapper to serialize contiguous arrays. An archive can then use either the element-wise default serialization of the array wrapper, or decide to overload it and implement an optimized way - independent of which class the array wrapper came from. Thus, there is no std::vector, std::valarray, ... overload in any of the archives - not in the binary archive nor anywhere else. What you seem to propose, both above and in the longer text I cut, is to instead re-implement the optimized serialization for all these N classes in the M different archive types that can use it (we have M=4 now with the binary, packed MPI, MPI datatype, and skeleton archives, and soon we'll do M+=2 by adding XDR and HDF5 archives). Besides leading to an M*N problem, which the array wrapper was designed to solve, this leads to intrusion problems into all classes that need to be serialized (including multi_array and all others), which is not feasible as we discussed last year.
I am intrigued by the skeleton - again, the documentation doesn't really give a good idea of what it does and what else it might be used for.
The skeleton is just all the types that you treat in the archive classes and not in the primitives, while the content is all that you treat in the primitives. It is just a formalization of your serialization library's implementation details.
So my complaints really come down to two issues.
a) I'm still not convinced you've factored optimizations which can be applied to certain pairs of types and archives in the best way.
That's a separate discussion which we seem to be repeating every few months now. It seems to me from today's discussion that there is a confusion now about the use of the array wrapper, which we use in just the way you originally proposed.
b) The MPI documentation doesn't make very clear the organization of the disparate pieces. It's a user manual "cookbook", which is fine as far as it goes. But I think it's going to need more explanation of the design itself.
Most of the issues you are interested in, such as the use of serialization for the skeleton&content, are implementation details, the important points of which will be explained in a paper that is currently being written. Matthias

Matthias Troyer wrote:
On Sep 18, 2006, at 8:50 PM, Robert Ramey wrote:
Thus, there is no std::vector, std::valarray, ... overload in any of the archives - not in the binary archive nor anywhere else.
Well, we're not seeing the same thing.

template<class Archive>
class basic_binary_oarchive :
    public array::oarchive<Archive>
{ ...

template <class Archive>
class oarchive :
    public archive::detail::common_oarchive<Archive>
{
    typedef archive::detail::common_oarchive<Archive> Base;
public:
    oarchive(unsigned int flags) :
        archive::detail::common_oarchive<Archive>(flags)
    {}

    // save_override for std::vector and serialization::array dispatches to
    // save_optimized with an additional argument.
    //
    // If that argument is of type mpl::true_, an optimized serialization is provided
    // If it is false, we just forward to the default serialization in the base class

    // the default version dispatches to the base class
    template<class T>
    void save_optimized(T const &t, unsigned int version, mpl::false_) {
        Base::save_override(t, version);
    }

    // the optimized implementation for vector uses serialization::array
    template<class ValueType, class Allocator>
    void save_optimized(
        const std::vector<ValueType, Allocator> &t, unsigned int, mpl::true_
    ){
        const serialization::collection_size_type count(t.size());
        * this->This() << BOOST_SERIALIZATION_NVP(count);
        * this->This() << serialization::make_array(
            serialization::detail::get_data(t), t.size());
    }

    // the optimized implementation for serialization::array uses save_array
    template<class ValueType>
    void save_optimized(
        const serialization::array<ValueType> &t, unsigned int version, mpl::true_
    ){
        this->This()->save_array(t, version);
    }

    // to save a vector:
    // if the value type is trivially constructible or an optimized array save exists,
    // then we can use the optimized version
    template<class ValueType, class Allocator>
    void save_override(std::vector<ValueType, Allocator> const &x, unsigned int version) {
        typedef typename mpl::apply1<
            BOOST_DEDUCED_TYPENAME Archive::use_array_optimization
          , BOOST_DEDUCED_TYPENAME remove_const<ValueType>::type
        >::type use_optimized;
        save_optimized(x, version, use_optimized());
    }

    // dispatch saving of arrays to the optimized version where supported
    template<class ValueType>
    void save_override(serialization::array<ValueType> const& x, unsigned int version) {
        typedef typename mpl::apply1<
            BOOST_DEDUCED_TYPENAME Archive::use_array_optimization
          , BOOST_DEDUCED_TYPENAME remove_const<ValueType>::type
        >::type use_optimized;
        save_optimized(x, version, use_optimized());
    }

    // Save everything else in the usual way, forwarding on to the
    // Base class
    template<class T>
    void save_override(T const& x, unsigned BOOST_PFTO int version) {
        Base::save_override(x, static_cast<unsigned int>(version));
    }
};

} } } // end namespace boost::archive::array

which just moves some of the implementation out of binary_oarchive into oarchive. What attracts my attention is:

    // dispatch saving of arrays to the optimized version where supported
    template<class ValueType>
    void save_override(serialization::array<ValueType> const& x, unsigned int version) {

Now these suggest to me that the next person who wants to handle his wrapper specially for certain archives will then want/have to go back into binary_oarchive and/or oarchive to add special handling to HIS wrapper. That is what I mean by not scalable. And:

    template<class ValueType, class Allocator>
    void save_override(std::vector<ValueType, Allocator> const &x, unsigned int version) {

And then there is the special handling for std::vector - which suggests that when one wants to add special handling for multi_array he'll have to do something else.

I realize that my nvp wrapper is similar to the above and perhaps that has led to some confusion. When faced with the problem I said to myself - "self - I know you want to maintain decoupling between types and archives - but this is a special case - live with it". And I listened to myself. But now I see that was a mistake. We only have two types of XML archives and they could have been handled by putting the NVP handling in the most derived class - even though it was some repetition. oh well.
Finally - the usage of inheritance used above strikes me as confusing, misleading, and unnecessary. This may or may not be strictly an aesthetic issue. The only really public/protected functions are the overload for the array wrapper and the default forwarder. These entry point functions are only called explicitly by the derived class. Basically the base class is not being used to express an "IS-A" relationship but rather an "implemented in terms of" relationship, as described in Scott Meyers' Effective C++ item 41. So I would have expected something in binary_oarchive like

template<class T>
void save_override(array<T> & t, const unsigned int version) const {
    // if array is optimizable
    //     save_optimized_array(t, version); // most of oarray in here
    // else
    //     forward to default handling
}

This would have to be added to every archive which supports this, but then we now have to explicitly forward twice anyhow, so I don't see a huge difference.
What you seem to propose, both above and in the longer text I cut, is to instead re-implement the optimized serialization for all these N classes in the M different archive types that can use it (we have M=4 now with the binary, packed MPI, MPI datatype, and skeleton archives, and soon we'll do M+=2 by adding XDR and HDF5 archives.). Besides leading to an M*N problem, which the array wrapper was designed to solve, this leads to intrusion problems into all classes that need to be serialized (including multi_array and all others), which is not feasible as we discussed last year.
I am aware of the desire to avoid the M*N problem - that should be pretty clear from the design of the serialization library itself, which takes great pains to permit any serializable type to be saved/loaded from any archive. The problem is that when I look at the code, it's not clear that this problem is being addressed. And it's not even clear that there is an M*N problem here: binary archives benefit from optimization of types which are "binary serializable"; mpi_archives benefit from optimization of types for which there exists a predefined MPI primitive; etc. There may or may not be overlap here. But it seems to me that we're trying too hard to factor where truly common factors don't exist. So I'm looking at: binary, packed - 3 archives - valarray, vector, C++ native arrays - 3 types - 9 overloads and one special trait - is_binary_serializable; mpi - 1 archive * 3 types - 3 overloads and one special trait - is_mpi_optimizable; and a couple more - but not a HUGE amount. And I don't think this intrudes into the other collection classes. They are already going to be wrapped in array - so they are done. The only issue is what is the best way to specify the table of archive/array-type pairs which benefit from optimized handling. One way is to sprinkle special code for array in different places. Another way is to just bite the bullet and make the table for every archive.
That's a separate discussion which we seem to be repeating every few months now. It seems to me from today's discussion that there is a confusion now about the use of the array wrapper, which we use in just the way you originally proposed.
Hmmm - when I proposed it I had in mind that it would be used differently - as I outlined above. This is not anyone's fault; it just didn't occur to me that it would occur to anyone to use it differently. And truth is, I've really just wanted to keep enhancements/extensions to the library orthogonal to the original concepts. I thought the array wrapper suggestion would accomplish that, so once it was "accepted" I got back to my paying job. Robert Ramey

Robert Ramey wrote:
But now I see that was a mistake. We only have two types of XML archives and they could have been handled by putting the NVP handling in the most derived class - even though it was some repetition. oh well.
Replace the above with: But now I see that was a mistake. We only have two types of XML archives and they could have been handled by making two specializations of

void nvp<T>::serialize(xml_oarchive &ar, ...
and
void nvp<T>::serialize(xml_woarchive &ar, ...

which is what I propose that array do. Robert Ramey

Hi Robert, Since this is a discussion orthogonal to the MPI review, I have renamed the subject. On Sep 19, 2006, at 12:30 AM, Robert Ramey wrote:
Matthias Troyer wrote:
On Sep 18, 2006, at 8:50 PM, Robert Ramey wrote:
Thus, there is no std::vector, std::valarray, ... overload in any of the archives - not in the binary archive nor anywhere else.
Well, we're not seeing the same thing.
template<class Archive> class basic_binary_oarchive : public array::oarchive<Archive> { ...
template <class Archive> class oarchive : public archive::detail::common_oarchive<Archive>
[snip - implementation details - snip]
OK, now I understand your point. Let us recall the original design that we had discussed and that was in the end your suggestion:

1. introduce an array wrapper array<ValueType> to be used in the serialize function of all classes that can profit from array optimizations
2. provide a default serialization for the array wrapper
3. overload the array<ValueType> serialization in those classes that can optimize array serialization
4. to simplify these optimizations and avoid code duplication, provide a wrapper like

template <class Archive> array_oarchive<Archive> : Archive {....}

to "add" array optimization to existing archives.

I hope you do not want to change this design now? We had next implemented just this, and placed it in a separate namespace archive::array, without touching your binary archive. The save_override for array<ValueType> with all the dispatch logic you quoted above was implemented in this wrapper. In addition, we provided, in the archive::array namespace, separated from yours, an array-optimized binary archive using that wrapper. Then, on June 4th you sent me an e-mail, suggesting that I move the array<ValueType> optimization into your binary archive. I quote from your e-mail:

# However, once we introduce the concept of wrapper, we can introduce the
# special wrapper type into binary_archive without changing the requirements
# for other archives. This is the model for xml_?archives. So I think you
# should move your array type to somewhere higher - perhaps as
# basic_binary_?archive. This would eliminate a set of special classes just
# for this. In this way we all get what we want: You get enhanced
# binary_?archive - I keep the minimal set of requirements for all new
# archives.
The easiest way of achieving this without code duplication was to actually provide an array-optimized base class from which all archives using array optimization could derive, and to put all of the array optimization logic into this base class array::oarchive, from which you quoted above. It seems now, however, that you dislike this, and that's why I proposed to just roll back to the state before June 4th, thus not touching any of your archives, and again providing our own version of an array optimized binary archive. You can then implement the array optimization as you feel like for your archives, and we have our mechanism for our archives. Please let me know if I should do this.
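The base-class arrangement described here - the dispatch logic written once, with each derived archive supplying only its fast path and its predicate - can be sketched as a CRTP mock. The names below (array_oarchive, array_ref, counting_archive) are hypothetical, not the archive::array implementation:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

template<class T>
struct array_ref { const T* data; std::size_t count; };

// The dispatch logic lives once in this base; every derived archive
// supplies only its predicate ("optimize") and its fast path (save_array).
template<class Derived>
struct array_oarchive {
    Derived& self() { return static_cast<Derived&>(*this); }

    template<class T>
    void save(const array_ref<T>& a) {
        save_dispatch(a, typename Derived::template optimize<T>::type{});
    }
    template<class T>
    void save_dispatch(const array_ref<T>& a, std::true_type) {
        self().save_array(a);                       // optimized block path
    }
    template<class T>
    void save_dispatch(const array_ref<T>& a, std::false_type) {
        for (std::size_t i = 0; i < a.count; ++i)   // default element loop
            self().save_element(a.data[i]);
    }
};

struct counting_archive : array_oarchive<counting_archive> {
    template<class T> struct optimize { using type = std::is_fundamental<T>; };
    std::size_t block_saves = 0, element_saves = 0;
    template<class T> void save_array(const array_ref<T>&) { ++block_saves; }
    template<class T> void save_element(const T&) { ++element_saves; }
};

struct point { double x, y; };  // not fundamental: takes the element loop

std::size_t blocks_for_ints() {
    int data[4] = {1, 2, 3, 4};
    counting_archive ar;
    ar.save(array_ref<int>{data, 4});
    return ar.block_saves;
}

std::size_t elements_for_points() {
    point data[3] = {};
    counting_archive ar;
    ar.save(array_ref<point>{data, 3});
    return ar.element_saves;
}
```

This is the trade-off the thread is arguing about: the dispatch is written once in the base rather than copy&pasted into every archive, at the price of the base class expressing "implemented in terms of" rather than "IS-A".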
which just moves some of the implementation out of binary_oarchive into oarchive. What attracts my attention is:
// dispatch saving of arrays to the optimized version where supported
template<class ValueType>
void save_override(serialization::array<ValueType> const& x, unsigned int version) {
Now these suggest to me that the next person who wants to handle his wrapper specially for certain archives will then want/have to go back into binary_oarchive and/or oarchive to add special handling to HIS wrapper. That is what I mean by not scalable.
Actually, as you can see from your own mail of June 4th, it was *you* who suggested to move this save_override into the binary archive. If you now have doubts about this idea of yours, then we can again roll back to the CVS state of June 4th.
template<class ValueType, class Allocator>
void save_override(std::vector<ValueType, Allocator> const &x, unsigned int version) {
And then there is the special handling for std::vector - which suggests that when one wants to add special handling for multi_array he'll have to do something else.
No, this will not be needed for multi_array, nor for ublas or Blitz arrays or any other array class which requires default-constructible value types. The std::vector override is only a workaround until we have an is_default_constructible trait in Boost. If your main objection is to that overload, then we can just implement is_default_constructible and remove that workaround.
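A sketch of why the loading side forces the default-constructibility requirement - hypothetical names (binary_iarchive, load_array, load_vector), not the library API. The vector must already own `count` constructed elements before its storage can be overwritten as a single block:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Minimal mock of a binary input archive that can only read whole blocks.
struct binary_iarchive {
    const unsigned char* pos;
    template<class T>
    void load_array(T* dest, std::size_t n) {
        std::memcpy(dest, pos, n * sizeof(T));   // one block read
        pos += n * sizeof(T);
    }
};

// The vector must already hold `count` constructed elements before the
// block read can overwrite its storage - hence the requirement that T be
// default constructible, which an is_default_constructible trait would
// let the dispatch logic check at compile time.
template<class T>
void load_vector(binary_iarchive& ar, std::vector<T>& v, std::size_t count) {
    v.resize(count);                 // default-constructs count elements
    ar.load_array(v.data(), count);  // then fills the storage in place
}

std::vector<int> roundtrip() {
    int src[3] = {7, 8, 9};
    binary_iarchive ar{reinterpret_cast<const unsigned char*>(src)};
    std::vector<int> v;
    load_vector(ar, v, 3);
    return v;
}
```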
Finally - the usage of inheritance used above strikes me as confusing, misleading, and unnecessary. This may or may not be strictly an aesthetic issue. The only really public/protected functions are the overload for the array wrapper and the default forwarder. These entry point functions are only called explicitly by the derived class. Basically the base class is not being used to express an "IS-A" relationship but rather an "implemented in terms of" relationship, as described in Scott Meyers' Effective C++ item 41. So I would have expected something in binary_oarchive like
template<class T>
void save_override(array<T> & t, const unsigned int version) const {
    // if array is optimizable
    //     save_optimized_array(t, version); // most of oarray in here
    // else
    //     forward to default handling
}
This would have to be added to every archive which supports this, but then we now have to explicitly forward twice anyhow, so I don't see a huge difference.
Well, you are suggesting to just copy this same function plus the "// if array is optimizable" dispatch logic into every new archive class that supports array optimization. Isn't inheritance just the mechanism to avoid this copy&pasting of hundreds of lines of identical code? Again, as I said before, if you do not like this in your binary archive, we can easily remove it from there again and you can do your own implementation, while we do ours.
What you seem to propose, both above and in the longer text I cut, is to instead re-implement the optimized serialization for all these N classes in the M different archive types that can use it (we have M=4 now with the binary, packed MPI, MPI datatype, and skeleton archives, and soon we'll do M+=2 by adding XDR and HDF5 archives.). Besides leading to an M*N problem, which the array wrapper was designed to solve, this leads to intrusion problems into all classes that need to be serialized (including multi_array and all others), which is not feasible as we discussed last year.
I am aware of the desire to avoid the M*N problem - that should be pretty clear from the design of the serialization library itself, which takes great pains to permit any serializable type to be saved/loaded from any archive. The problem is that when I look at the code, it's not clear that this problem is being addressed.
And its not even clear that there is an M*N problem here:
binary archives benefit from optimization of types which are "binary serializable"
mpi_archives benefit from optimization of types for which there exists a predefined MPI primitive.
etc.
There may or may not be overlap here. But it seems to me that we're trying too hard to factor where truly common factors don't exist.
So I'm looking at binary, packed - 3 archives - valarray, vector, C++ native arrays - 3 types - 9 overloads and one special trait - is_binary_serializable
No, wait a moment! If by packed you mean a packed MPI archive then is_binary_serializable is not the right trait. Also, where do you get only 3 types? We will have multi_array, ublas matrices and vectors, MTL arrays, Blitz arrays and many more coming. But the array wrapper reduced this down to just one class: array.
mpi - 1 archive * 3 types - 3 overloads and one special trait - is_mpi_optimizable
Again, why 3 overloads? There is only one, for array (and the temporary workaround for vector, which we can remove if you want). So, if we keep the array wrapper then we are halfway there, having reduced the M*N to an M*1 problem. What we have done next is to go further and simplify even this M*1 problem by introducing a base class that implements the dispatch logic for array. I believe that this is what you do not like. Thus we can easily remove it again from your archive, and leave the array optimization for your binary archive to you, while we use our method for the archives that we are writing (packed MPI, MPI datatype, skeleton, XDR, HDF5).
Hmmm - when I proposed it I had in mind that it would be used differently - as I outlined above. This is not anyone's fault; it just didn't occur to me that it would occur to anyone to use it differently.
And truth is, I've really just wanted to keep enhancements/extensions to the library orthogonal to the original concepts. I thought the array wrapper suggestion would accomplish that, so once it was "accepted" I got back to my paying job.
OK, since I also do not get paid for this, and your criticism is based on a change that you yourself proposed on June 4th, my proposal is to stop the discussion right here, go back to the state of June 4th, where our optimizations were completely decoupled from your archives. That way you will be able to implement the array optimization for the binary archive in the way you like best, and we have our own way. Please confirm and I will do that. Matthias

Matthias Troyer wrote:
OK, since I also do not get paid for this, and your criticism is based on a change that you yourself proposed on June 4th, my proposal is to stop the discussion right here, go back to the state of June 4th, where our optimizations were completely decoupled from your archives. That way you will be able to implement the array optimization for the binary archive in the way you like best, and we have our own way. Please confirm and I will do that.
Honestly, I don't remember the state of things on June 4th. And I don't remember the character of my complaints. I presume that going back would just substitute the original complaints for the current ones. I've stated my current reservations. I wanted to make them known in case someone else might share them. It seems that I'm the only one who has these views - a familiar and comfortable position for me. I do appreciate the interest, initiative and effort, and I'm flattered that you found it worthy of this investment of effort. And I AM happy to relinquish responsibility for enhancement, support and maintenance of this portion of the library. You've listened to my concerns and tried to answer them, and I appreciate that. But we'll just have to agree to disagree - I can live with that. So feel free to move forward in accordance with your good judgment. Robert Ramey

"Robert Ramey" <ramey@rrsd.com> writes:
Matthias Troyer wrote:
OK, since I also do not get paid for this, and your criticism is based on a change that you yourself proposed on June 4th, my proposal is to stop the discussion right here, go back to the state of June 4th, where our optimizations were completely decoupled from your archives. That way you will be able to implement the array optimization for the binary archive in the way you like best, and we have our own way. Please confirm and I will do that.
Honestly, I don't remember the state of things on June 4th, and I don't remember the character of my complaints. I presume that going back would just substitute the original complaints for the current ones.
I've stated my current reservations. I wanted to make them known in case someone else might share them. It seems that I'm the only one who has these views - a familiar and comfortable position for me.
For the record, the difficulty is not your iconoclastic viewpoint, which is almost always useful. The problem we're having is that what you're complaining about is a situation created, and of whose merit many of us were convinced, by you.
--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

David Abrahams wrote:
The problem we're having is that what you're complaining about is a situation created, and of whose merit many of us were convinced, by you.
The problem we're having is that there is a disagreement about how to design such an extension so that it isn't a future maintenance headache. Of course, if I make an observation that conflicts with some currently (perhaps widely) held opinion, I suppose one can say I "created a situation". LOL - but in my view the "situation" already exists - I'm just stating what to me is obvious. A perfect example of this is the boost test/release system. When I first made observations on this topic they were roundly dismissed (that's OK), and it was sort of suggested that I was just being contrary for its own sake (not really OK - but I can just ignore it). I think that the current situation and the ideas floating around on this subject make it clear that my original ideas had some merit after all - and that this is coming to be recognized. I can't help wondering how many worthy ideas are NOT posted because the poster is reluctant to subject himself to this kind of characterization. Robert Ramey

On 21.09.2006, at 16:34, David Abrahams wrote:
"Robert Ramey" <ramey@rrsd.com> writes:
Matthias Troyer wrote:
OK, since I also do not get paid for this, and your criticism is based on a change that you yourself proposed on June 4th, my proposal is to stop the discussion right here, go back to the state of June 4th, where our optimizations were completely decoupled from your archives. That way you will be able to implement the array optimization for the binary archive in the way you like best, and we have our own way. Please confirm and I will do that.
Honestly, I don't remember the state of things on June 4th, and I don't remember the character of my complaints. I presume that going back would just substitute the original complaints for the current ones.
I've stated my current reservations. I wanted to make them known in case someone else might share them. It seems that I'm the only one who has these views - a familiar and comfortable position for me.
For the record, the difficulty is not your iconoclastic viewpoint, which is almost always useful. The problem we're having is that what you're complaining about is a situation created, and of whose merit many of us were convinced, by you.
Dave, actually in this case the situation was slightly different. We had made a design which was non-intrusive on the archive classes, and thus had Robert's original binary_archive and our array-optimized wrapper array::binary_archive in a separate namespace. On June 4th Robert invited me to merge them, since at that time he thought it would make sense to move the array optimization into one of the archive base classes. In the meantime it seems that Robert does not like all aspects of that idea, or of our specific implementation, and I thus offered to go back to having an array-optimized binary archive separate from the non-optimized ones. Robert, you did not have any complaints on June 4th, but just asked me to add our optimizations directly to your binary archive. If you feel more comfortable without this, we can undo it at any time, since it will not break any existing archives. If, however, as you write, you feel you can live with what we have done, then we can leave it as it is, and I can certainly take over responsibility for maintaining the array optimizations and any further extensions to them that might be desired. Matthias

Matthias Troyer wrote:
On 21.09.2006, at 16:34, David Abrahams wrote:
Robert, you did not have any complaints on June 4th, but just asked me to add our optimizations directly to your binary archive.
It's entirely possible I might have suggested this without correctly envisioning what the final result would look like. uhhh - that happens to me all the time. I suppose that's why I'm a software developer as opposed to, say ... a diamond cutter or suicide prevention counselor.
If you feel more comfortable without this, we can undo this at any time since it will not break any existing archives. If however, as you write, you feel you can live with what we have done then we can leave it as it is and I can certainly take over responsibility for maintaining the array optimizations and any further extensions to it that might be desired.
That's fine with me - really. My concern is that the maintenance and support of this has been underestimated. Personally I've found the code very hard to follow. However, it's not a big issue for me now that you've offered to do it.

At the risk of picking at a festering sore - here is my (imperfect) recollection of how we got here. The original proposal required adding save_array to all present and future archives. I objected to the idea of expanding the interface just to accommodate a special case. This was resolved with the "array_wrapper", which provided an overridable default suitable for "other" archives. This was a HUGE improvement as far as I was concerned. As things progressed, it became clear that support for "serialization wrappers" - including the new array_wrapper - needed to be generalized somewhat in order to remove the direct dependency of the serialization code on certain specific types (such as nvp, array, and C++ arrays). This you did - another significant improvement.

So now we have a specific wrapper (array_wrapper) handled in a specific archive (binary_archive). This is analogous to the situation in xml_archive, where nvp is handled in xml_archive. (It's not quite the same, as xml_archive without nvp really makes no sense.) To me the current implementation stops short of carrying the wrapper idea to its logical conclusion: having the binary archive code independent of array_wrapper (or any other particular wrapper, for that matter). This is in line with my (mostly successful) campaign to keep archive code independent of particular data types.

I also believe that things have been confused by an evolution in the original idea for save/load array. As I remember, the motivation was that save/load array could be re-implemented differently for binary and/or mpi and/or other archives. Of course we have this now, as the array_wrapper can be implemented differently for each archive.
But now it seems that save/load array are going to be quite different for binary and mpi type archives. I think that at one point there might have been the idea that save/load implementations could be shared between different binary-type archives. But now it seems that binary_archive and mpi and/or other archives will have less in common than originally thought. I think this evolution has led to code which is very hard for me to follow and which will be more work to maintain and support. I can't help but think that if a 10x speedup of binary_archive were considered today (given the wrapper and other changes), the enhancement would be less general and much simpler. Good Luck Robert Ramey

On 22.09.2006, at 00:18, Robert Ramey wrote:
I also believe that things have been confused by an evolution in the original idea for save/load array. As I remember, the motivation was that save/load array could be re-implemented differently for binary and/or mpi and/or other archives. Of course we have this now, as the array_wrapper can be implemented differently for each archive. But now it seems that save/load array are going to be quite different for binary and mpi type archives. I think that at one point there might have been the idea that save/load implementations could be shared between different binary-type archives. But now it seems that binary_archive and mpi and/or other archives will have less in common than originally thought.
Our motivation was actually not that the save/load implementation code could be shared, but the dispatch code that decides for which types the optimized serialization can be used and for which it can't. That code is 50x bigger than the actual save_array/load_array implementations, which are just one- or few-line dispatches to save_binary, MPI_Pack, or the creation of an MPI data type.
I think this evolution has led to code which is very hard for me to follow and which will be more work to maintain and support. I can't help but think that if 10x speed up of binary_archive were considered today, (given the wrapper and other changes), the enhancement would be less general and much simpler.
Sure, if you want to restrict the enhancements to binary archives and primitive data types, a slightly simpler implementation is possible. That solution would not be scalable, however. Matthias
participants (6)
- David Abrahams
- Douglas Gregor
- Ian McCulloch
- Matthias Troyer
- Matthias Troyer
- Robert Ramey