
David Abrahams wrote:
"Robert Ramey" <ramey@rrsd.com> writes:
,----
For many archive formats and common datatypes there exist APIs that can quickly read or write contiguous sequences of those types all at once (**). Reading or writing such a sequence by separately reading or writing each element (as the serialization library currently does) can be an order of magnitude more expensive.
`----
I have no problem with the above.
We want to be able to capitalize on the existence of those APIs, and to do that we need a "hook" that will be used whenever a contiguous sequence is going to be (de)serialized. No such hook exists in Boost.Serialization.
Whether or not such a hook is necessary is the crux of the issue. I consider the submission a use case for archive creation and/or extension. As far as I could tell, that particular one didn't require any new hooks in the library. Maybe the next iteration will be different - but that's how I see it now.
(**) Note that this capability is not necessarily tied to bitwise serialization or the use of a binary representation.
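To make the premise above concrete, here is a minimal sketch of the two ways a contiguous sequence can reach the output. It is not Boost.Serialization code: `toy_binary_sink`, `save_each` and `save_block` are hypothetical names invented for illustration, and the sketch assumes a C++11 compiler and trivially copyable element types. The sink counts its low-level write calls, standing in for whatever per-call cost a real archive or transport would have.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical toy sink (not a Boost.Serialization class).
// It appends raw bytes and counts how many write calls were issued.
struct toy_binary_sink {
    std::string bytes;
    std::size_t write_calls = 0;

    void write(const void* p, std::size_t n) {
        bytes.append(static_cast<const char*>(p), n);
        ++write_calls;
    }
};

// Element-by-element: one write call per element of the sequence.
template <class T>
void save_each(toy_binary_sink& out, const std::vector<T>& v) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "sketch only handles trivially copyable types");
    for (const T& x : v) {
        out.write(&x, sizeof(T));
    }
}

// Bulk: a single write call covering the whole contiguous block.
template <class T>
void save_block(toy_binary_sink& out, const std::vector<T>& v) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "sketch only handles trivially copyable types");
    if (!v.empty()) {
        out.write(v.data(), v.size() * sizeof(T));
    }
}
```

Both functions produce the same bytes; they differ only in how many calls reach the sink. Whether the bulk path needs a hook in the library core or can live entirely in an archive class is the question under discussion, and this sketch takes no position on it.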
The Design
==========

We've attempted to use programming idioms and terminology found in the existing serialization library wherever possible, so that it's easy for you to read and understand, and you won't be distracted by minor stylistic differences.
Thanks for your consideration. I realize it's an extra burden to make it easier for me to read and understand, and I appreciate the effort.
In the messages to follow, the word "array" will normally mean a contiguous sequence of instances of a single datatype, and not a C++ builtin array type of the form T[N]. I'll try to be explicit when I intend to describe builtin arrays.
Let me explain one place where our difference lies. The serialization library is basically three pieces:

a) Serialization specifications for each data type to be serialized (the serialize functions), which are independent of the archive. That is, these specifications depend only upon the requirements of the Saving Archive or Loading Archive concepts.

b) Archive classes which implement the Archive concept for different file formats. These archive classes have common implementation features factored out into common modules. Due to "practical" considerations like whether something should be pre-compiled in the library, whether it is dependent on a user's application type, minimization of code bloat, etc., this common implementation code might be included in one of the base classes or in the file i/oserializer.hpp. (The code in i/oserializer.hpp would normally be in one of the base classes, but I believe template metaprogramming considerations related to less-conforming compilers ruled that out.) These "common code" modules are designed to hold code applicable to all archives.

c) Finally, the escape hatch: those serialization implementations which have to depend on the combination of archive type and data type. The most obvious case is name-value pairs - nvp. nvp has its own default serialization which just serializes the value part. Within xml archives this is overridden with a special version for that archive type.

This is the model by which I have always envisioned the library being extended. It is only in this way that the library can be extended without becoming geometrically more complicated as time goes on.

I realize that this design and, more importantly, its motivation might not be all that apparent from the documentation on archive implementation. Sorry about that. As time goes on I would hope that this can be improved. But maybe this explains my reluctance to maintain parts of the library beyond the reach of those making other archives. This forms my main objection to the proposal.
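The three pieces described above can be sketched in toy form. This is an illustration under stated assumptions, not the library's actual code: the namespace `toy`, the classes `common_oarchive`, `text_oarchive` and `xml_oarchive`, the members `save_override` and `put`, and the type `point` are all hypothetical stand-ins, and the sketch assumes a C++11 compiler. It shows an archive-independent save function (a), common code shared by all archives (b), and one archive overriding the default handling of a name-value pair (c).

```cpp
#include <sstream>
#include <string>

namespace toy {

// Hypothetical stand-in for a name-value pair wrapper.
template <class T>
struct nvp {
    const char* name;
    const T& value;
};

// (b) Common code shared by every archive, written as a CRTP base.
template <class Derived>
class common_oarchive {
public:
    template <class T>
    Derived& operator<<(const T& t) {
        self().save_override(t);
        return self();
    }
    // Default for any type: call its archive-independent save function.
    template <class T>
    void save_override(const T& t) { save(self(), t); }
    // Default for name-value pairs: ignore the name, save the value.
    template <class T>
    void save_override(const nvp<T>& p) { self() << p.value; }
    // Primitives are handed to the concrete archive's format.
    void save_override(int x) { self().put(x); }

    std::string str() const { return os_.str(); }

protected:
    std::ostringstream os_;

private:
    Derived& self() { return static_cast<Derived&>(*this); }
};

// A plain text archive: relies entirely on the common defaults.
class text_oarchive : public common_oarchive<text_oarchive> {
public:
    void put(int x) { os_ << x << ' '; }
};

// (c) The escape hatch: this archive overrides nvp handling only.
class xml_oarchive : public common_oarchive<xml_oarchive> {
public:
    using common_oarchive<xml_oarchive>::save_override;
    void put(int x) { os_ << x; }

    template <class T>
    void save_override(const nvp<T>& p) {
        os_ << '<' << p.name << '>';
        *this << p.value;
        os_ << "</" << p.name << '>';
    }
};

// (a) Serialization of a user type, written once for any archive.
struct point {
    int x;
    int y;
};

template <class Archive>
void save(Archive& ar, const point& p) {
    ar << nvp<int>{"x", p.x} << nvp<int>{"y", p.y};
}

}  // namespace toy
```

Saving the same `point` through both toy archives uses the same `save` function; only `xml_oarchive` emits the names, because only it replaces the default for that archive/type combination. Nothing in the common base had to change to add that behaviour, which is the property argued for above.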
Of course I have/had lots of other objections to it and probably would have a lot more if I spent more time looking into it. I suspect that the job of making a portable binary archive is much harder than it first appears. Making it so that it can exploit opportunities to be much faster while still being just as "monkey-proof" is harder still. I didn't pursue this as I really don't want to discourage these kinds of efforts, and they are (or should be) orthogonal to the library as it is currently implemented.

If they can be implemented without altering the core - then I have no problem. If someone believes that modifying the core is unavoidable, then either he or I have made some sort of mistake and it will have to be resolved. If they don't really have to alter the core, but the archive author thinks it would make his job easier - then we have a problem.

I get a suggestion about once a month to modify the core of the library for this or that reason. Aside from bugs, it usually boils down to the suggester looking at the code and seeing - "Oh, I could fix this right there!" - without considering all the repercussions and without considering the alternatives. (As you might guess, this is what I believe happened in this case.) Another common occurrence is the attempt to use the serialization system to accomplish some end for which it is not suited. A typical idea is to use it to implement some externally defined file format.

I know I drag my feet, I know it drives people crazy, but I truly believe that the success of the library is due in no small part to my reluctance to add in any more than is absolutely necessary.

So, I look forward to seeing progress on the following:

a) Better handling of special optimization opportunities which obtain for certain combinations of data types and archives. Hopefully, an elegant implementation will serve as a model for other people's pet additions.

b) A portable binary implementation suitable for such things as MPI messages.
I also expect these to take some time and hope they can be subjected to the boost "process" of public criticism and refinement. This will take more time but result in a better product. Hopefully, it will be less stressful as well - though I doubt it.

I really am trying to wind down my involvement in the serialization library. I do want to spend some more time on execution profiling and performance tweaks. I would like to see the documentation improved on how to do things like you and Matthias are attempting to do. The current documentation does have a section titled "case studies", which seems to me a handy place to put examples of this nature and at the same time show users how to exploit any "add-in" functionality.

Good luck on this.

Robert Ramey