
"Robert Ramey" <ramey@rrsd.com> writes:
David Abrahams wrote:
"Robert Ramey" <ramey@rrsd.com> writes:
,----
For many archive formats and common datatypes there exist APIs that can quickly read or write contiguous sequences of those types all at once (**). Reading or writing such a sequence by separately reading or writing each element (as the serialization library currently does) can be an order of magnitude more expensive.
`----
I have no problem with the above.
We want to be able to capitalize on the existence of those APIs, and to do that we need a "hook" that will be used whenever a contiguous sequence is going to be (de)serialized. No such hook exists in Boost.Serialization.
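To make the per-element versus all-at-once distinction concrete, here is a minimal, self-contained sketch. It deliberately does not use Boost.Serialization: `counting_archive`, `save_each`, `save_block`, and `compare_paths` are hypothetical names invented for this illustration, and the "archive" is just a byte buffer that counts how many write calls it receives. The block path assumes the element type is trivially copyable.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical toy archive (not a Boost.Serialization class): appends raw
// bytes to a buffer and counts how many write calls were needed.
struct counting_archive {
    std::vector<char> buffer;
    std::size_t write_calls = 0;

    void write(const void* p, std::size_t n) {
        const char* c = static_cast<const char*>(p);
        buffer.insert(buffer.end(), c, c + n);
        ++write_calls;
    }
};

// Element-by-element path: one write call per element.
template <class T>
void save_each(counting_archive& ar, const T* first, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i)
        ar.write(first + i, sizeof(T));
}

// The kind of "hook" under discussion: one write call for the whole
// contiguous sequence. Only valid for trivially copyable element types.
template <class T>
void save_block(counting_archive& ar, const T* first, std::size_t count) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "block writes assume a trivially copyable element type");
    ar.write(first, count * sizeof(T));
}

struct path_counts {
    std::size_t per_element;  // write calls used by save_each
    std::size_t block;        // write calls used by save_block
    bool same_bytes;          // whether both paths produced identical output
};

// Serializes the same n doubles through both paths and reports the result.
path_counts compare_paths(std::size_t n) {
    std::vector<double> v(n, 1.5);
    counting_archive a, b;
    save_each(a, v.data(), v.size());
    save_block(b, v.data(), v.size());
    return {a.write_calls, b.write_calls, a.buffer == b.buffer};
}
```

Calling `compare_paths(1000)` reports 1000 write calls for the per-element path and a single call for the block path, with identical bytes produced. The sketch only counts calls; how much time a real archive saves depends on what each call costs in that format.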
Whether or not such a hook is necessary is the crux of the issue.
Yes. Or more precisely, whether the consequences of not having the hook in the serialization library itself are bad enough to warrant creating it there. I will discuss those consequences after I present our new design, which adds the hook, but only in our own extensions -- essentially a library built on top of the current serialization library without modifying it.
I consider the submission a use case for archive creation and/or extension.
I don't understand what you're trying to say. I presume by "the submission" you mean Matthias' proposed changes to your library. But I don't understand what you mean about it being a "use case."
As far as I could tell, that particular one didn't require any new hooks in the library.
Functionally speaking, that is correct. You /can/ do fast serialization of contiguous arrays without changing the library. You don't even have to write a whole new serialization library.
Maybe the next iteration will be different - but that's how I see it now.
There are some negative consequences of creating the hooks outside Boost.Serialization. Once you understand them, I'm pretty sure you will think they are significant. Whether they will be significant enough to induce you to make changes in Boost.Serialization is of course an open question.
Let me explain one place where our difference lies.
Having read everything that follows, I don't see any explanation of a "place where our difference lies." The parts I understand (most of it) sound like "motherhood and apple pie" -- good, common sense that's hard to disagree with. Is it a thought that was never finished? Would you care to try to put it more succinctly?
The serialization library is basically three pieces:
a) serialization specifications for each data type to be serialized (serialize functions), which are independent of the archive. That is, these specifications depend only upon the requirements of the Saving Archive or Loading Archive concepts.
b) archive classes which implement the Archive concept for different file formats. These archive classes have common implementation features factored out into common modules. Due to "practical" considerations like whether something should be pre-compiled in the library, whether it is dependent on a user's application type, minimization of code bloat, etc., this common implementation code might be included in one of the base classes or in the file i/oserializer.hpp. (The code in i/oserializer.hpp would normally be in one of the base classes, but I believe template metaprogramming considerations related to less-conforming compilers ruled that out.) These "common code" modules are designed to hold code applicable to all archives.
c) Finally, the escape hatch: those serialization implementations which have to depend on the combination of archive type and datatype. The most obvious case is name-value pairs - nvp. nvp has its own default serialization which just serializes the value part. Within xml archives this is overridden with a special version for that archive type. This is the model by which I have always envisioned that the library be extended. It is only in this way that the library can be extended without becoming geometrically more complicated as time goes on.
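The three pieces can be sketched in a few dozen lines of standalone C++. This is an illustration of the pattern only, not the library's actual code: `named_value`, `make_named`, `archive_base`, `plain_archive`, `tagged_archive`, `point`, and `to_text` are all hypothetical names, and the tagged output merely stands in for what an XML archive might do with a name-value pair.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical wrapper in the spirit of nvp: a name plus a reference to a value.
template <class T>
struct named_value {
    const char* name;
    const T& value;
};

template <class T>
named_value<T> make_named(const char* name, const T& value) {
    return {name, value};
}

// (b) Common code shared by all archives in this sketch (CRTP base).
template <class Derived>
class archive_base {
public:
    explicit archive_base(std::ostream& os) : os_(os) {}

    template <class T>
    Derived& operator&(const T& t) { save(t); return self(); }

    // Primitive values: just stream them.
    template <class T>
    void save(const T& t) { os_ << t; }

    // Wrapper values: dispatch through the derived archive.
    template <class T>
    void save(const named_value<T>& nv) { self().save_named(nv); }

    // Default wrapper handling: ignore the name, save only the value.
    template <class T>
    void save_named(const named_value<T>& nv) { save(nv.value); os_ << ' '; }

protected:
    Derived& self() { return static_cast<Derived&>(*this); }
    std::ostream& os_;
};

// An archive that is happy with the default wrapper handling.
class plain_archive : public archive_base<plain_archive> {
public:
    explicit plain_archive(std::ostream& os) : archive_base<plain_archive>(os) {}
};

// (c) The escape hatch: this archive overrides handling for one wrapper type.
class tagged_archive : public archive_base<tagged_archive> {
public:
    explicit tagged_archive(std::ostream& os) : archive_base<tagged_archive>(os) {}

    template <class T>
    void save_named(const named_value<T>& nv) {
        os_ << '<' << nv.name << '>';
        save(nv.value);
        os_ << "</" << nv.name << "> ";
    }
};

// (a) Archive-independent serialization of a user type.
struct point { int x; int y; };

template <class Archive>
void serialize(Archive& ar, const point& p) {
    ar & make_named("x", p.x) & make_named("y", p.y);
}

// Helper for the sketch: serialize a point with the given archive type.
template <class Archive>
std::string to_text(const point& p) {
    std::ostringstream os;
    Archive ar(os);
    serialize(ar, p);
    return os.str();
}
```

With these definitions, `to_text<plain_archive>(point{1, 2})` yields `1 2 ` and `to_text<tagged_archive>(point{1, 2})` yields `<x>1</x> <y>2</y> `. The `serialize` function for `point` is identical in both cases; only the archive's treatment of the wrapper differs.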
I realize that this design and, more importantly, its motivation, might not be all that apparent from the documentation on archive implementation. Sorry about that.
No, it's perfectly clear what you're trying to do once you study the library implementation. Your design philosophy makes good sense AFAICT. I am a bit surprised to hear you state flatly that there is only one way to extend the library that can ever work. How can you possibly know you've considered every possibility? I don't have the same confidence, even about problems I've studied for years.
As time goes on I would hope that this can be improved. But maybe this explains my reluctance to maintain parts of the library beyond the reach of those making other archives.
Other archives? Beyond reach? I don't understand what you're saying here.
This forms my main objection to the proposal.
Sorry, I don't have any clue what you are referring to. Regardless, we are going to start from new code that doesn't change any part of Boost.Serialization, so if possible, it might be better to try to forget about what you've seen before.
Of course I have/had lots of other objections to it and probably would have a lot more if I spent more time looking into it.
Fortunately, you won't have to. We're going to present new code.
I suspect that the job of making a portable binary archive is much harder than it first appears.
Actually it's almost trivial (I did it over 10 years ago), but I don't know what that has to do with what we're trying to accomplish.
Making it so that it can exploit opportunities to be much faster while still being just as "monkey-proof" is even harder still.
The speedups we're proposing don't have anything in particular to do with portable binary archives.
I didn't pursue this as I really don't want to discourage these kinds of efforts and they are (or should be) orthogonal to the library as it is currently implemented. If they can be implemented without altering the core - then I have no problem. If someone believes that modifying the core is unavoidable, then either he or I have made some sort of mistake and it will have to be resolved.
It's not unavoidable; as I've said before, it just has consequences that we don't like, and we think you probably won't like either. If you can hang on until we've presented what we think is the best design that avoids altering the core, then we can look at the consequences. Once you understand them, if you still don't want to make any changes and you're willing to accept the consequences, we're not going to press the issue any further.
If they don't really have to alter the core, but the archive author thinks it would make his job easier - then we have a problem.
Let me be very clear about this, at least:
,----
| Ease of archive implementation is unrelated to the motivation for
| requesting core changes.
`----
I hope that allays at least one of your concerns.
I get a suggestion about once a month to modify the core of the library for this or that reason. Aside from bugs, it usually boils down to the suggester looking at the code and seeing - "Oh I could fix this right there!" without considering all the repercussions and without considering the alternatives. (As you might guess, this is what I believe happened in this case).
Actually Matthias' considerations went much deeper than you give him credit for. In my opinion, he just failed to communicate his rationale properly, and since the details of his code seemed to you to violate basic principles of your design, I'm sure it was all the more difficult for you to understand the problems he is trying to avoid. Working from new code that (I hope!) won't cause you any alarm, it might be easier to understand the rationale.
Another common occurrence is the attempt to use the serialization system to accomplish some end for which it is not suited. A typical idea is to use it to implement some externally defined file format. I know I drag my feet, I know it drives people crazy, but I truly believe that the success of the library is due in no small part to my reluctance to add in any more than is absolutely necessary.
Understood. It might be a good idea for you to clearly define the intended scope of the library. What criteria distinguish an appropriate application from an inappropriate one? I'm interested in hearing your intention as the library author, rather than something like "an appropriate application is one that works well with the library as it is currently specified and/or implemented." Depending on your answer, we might indeed be barking up the wrong tree.
So, I look forward to seeing progress on the following:
a) better handling of special optimization opportunities which obtain for certain combinations of data-types and archives. Hopefully, an elegant implementation will serve as a model for other people's pet additions.
I hope we'll be able to show you something elegant very soon.
b) A portable binary implementation suitable for such things as MPI messages.
Portable binary archives and MPI have little relationship to one another. You don't flatten your data into a portable format, ship it in an MPI message that is just a sequence of bytes, and then deserialize. MPI handles portability internally.
I also expect these to take some time and hope they can be subjected to the boost "process" of public criticism and refinement. This will take more time but result in a better product. Hopefully, it will be less stressful as well - though I doubt it.
I really am trying to wind down my involvement in the serialization library.
That's a bit alarming, actually. Have you got someone else lined up to maintain it? It's important to us and to many others that the library has a future. Without the involvement of the original author, that would be in doubt.
I do want to spend some more time on execution profiling and performance tweaks.
I would like to see the documentation improved on how to do things like you and Matthias are attempting to do. The current documentation does have a section titled "case studies" which seems to me a handy place to put examples of this nature and at the same time show users how to exploit any "add-in" functionality.
Good luck on this
Thanks.
--
Dave Abrahams
Boost Consulting
www.boost-consulting.com