Re: [boost] [serialization] performance comparison

B"H Robert Ramey writes:
Our experiments with 1.3 revealed that the single biggest time consumer with the binary archive was the stream i/o. Version 1.34 is re-implemented in terms of std::streambuf rather than stream i/o. It is significantly faster because of this.
I hope to test with 1.34 and report what I find. I decided to go with 1.33.1 as it is packaged and easier to use. Does the 1.34 binary_oarchive take a streambuf instead of an ofstream? Is this documented somewhere?
If someone has a situation where performance is a concern I would recommend increasing the buffer size to a larger amount, on the order of 1 MB. This can be done using library facilities. That is, serialization depends upon standard streambuf i/o and if performance is a big consideration, one should configure the standard streambuf accordingly.
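As a rough sketch of the buffer-size suggestion above, using only the standard library: the function name write_with_large_buffer, its parameters, and the 1 MB default are assumptions made for illustration, not anything taken from Boost.Serialization. The standard leaves the effect of pubsetbuf on a filebuf implementation-defined, so the call is made before open(), which is the ordering common implementations honor.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: writes `data` to `path` through a std::filebuf that
// has been handed a caller-supplied buffer of `buffer_bytes` bytes.
// Returns true if every byte was accepted and the file closed cleanly.
bool write_with_large_buffer(const std::string& path,
                             const std::vector<char>& data,
                             std::size_t buffer_bytes = 1u << 20) {
    assert(buffer_bytes > 0);

    // Declared before the filebuf so the buffer outlives it.
    std::vector<char> buffer(buffer_bytes);

    std::filebuf fb;
    // Request the larger buffer before opening; whether it is honored is
    // implementation-defined, but the write below is correct either way.
    fb.pubsetbuf(buffer.data(), static_cast<std::streamsize>(buffer.size()));

    if (!fb.open(path.c_str(),
                 std::ios::out | std::ios::binary | std::ios::trunc))
        return false;

    const std::streamsize n = static_cast<std::streamsize>(data.size());
    const bool wrote_all = fb.sputn(data.data(), n) == n;

    // close() flushes the buffer and returns a null pointer on failure.
    const bool closed = fb.close() != nullptr;
    return wrote_all && closed;
}
```

A filebuf configured this way could then be handed to whatever code does the actual writing; any speedup depends on the platform and should be measured rather than assumed.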
In cases like your test case, there is still the opportunity to specialize the binary archives to handle these cases faster. Indeed, it's quite simple to use the facilities of the library to implement special handlers for the types which can benefit from them. So a realistic test/comparison would have to consider that this is what a real user would want to do when confronted with this kind of situation.
However, this was deemed "not good enough" in some quarters. So specializations were added to the default 1.35 version of the binary archives to speed up exactly these special cases: large collections of primitives.
I agree with those quarters and have supported this approach for over a year now.
The upshot is that, for large collections of primitives like chars and ints, our tests show that the current version in the head (1.35) will be approximately 10 times faster than the 1.33 you have used as a basis for comparison. (Hmmm, actually we used arrays and vectors so we don't have a complete comparison.) Anyway, I would guess that the current version of the library and any other recently created serialization library would be comparable.
I read a little bit about that and it applied to arrays and vectors. I don't think it touched B.Ser's list, set or deque handling. I would describe comparing marshalling performance of a vector in B.Ser to a list in EE as an invalid comparison. Same with comparing a vector in EE to a list in B.Ser.
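To illustrate why a specialization of this kind helps arrays and vectors but does not carry over to list, set or deque, here is a minimal sketch in standard C++. The names save_elementwise and save_block are hypothetical, and this is not the code that was added to Boost.Serialization; it only shows the general idea that contiguous storage of a primitive type can be written with a single streambuf call, while node-based containers have to be walked element by element.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <sstream>
#include <streambuf>
#include <type_traits>
#include <vector>

// Generic path (hypothetical): one streambuf call per element.
// Works for any container of trivially copyable elements, including list.
template <class Container>
void save_elementwise(std::streambuf& sb, const Container& c) {
    for (const auto& x : c)
        sb.sputn(reinterpret_cast<const char*>(&x),
                 static_cast<std::streamsize>(sizeof x));
}

// Fast path (hypothetical): a vector of an arithmetic type is contiguous,
// so the whole payload goes out in a single call. bool is excluded because
// std::vector<bool> is a packed specialization without data().
template <class T>
typename std::enable_if<std::is_arithmetic<T>::value &&
                        !std::is_same<T, bool>::value>::type
save_block(std::streambuf& sb, const std::vector<T>& v) {
    if (!v.empty())
        sb.sputn(reinterpret_cast<const char*>(v.data()),
                 static_cast<std::streamsize>(v.size() * sizeof(T)));
}
```

Both paths produce the same bytes for a vector of ints; the block path simply makes far fewer calls. A std::list has no contiguous storage to hand over, so only the element-wise path applies to it.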
So I don't think this is a big issue. Actually, I never thought it was a big issue, since large collections of primitives are not the most common application of the library and the library could (and can) accommodate special cases with the facilities built into the library itself. What I think/thought about this didn't really matter though, as the library was easily extended by the interested parties to accommodate those who wanted to invest the effort.
The ratios I mentioned hold for smaller collections of ints as well. If the containers have 10,000 ints, the test results are not noticeably different. These may not be the most common application, but the C++ middleware/serialization frameworks that I'm familiar with support container classes. From my perspective the tests I chose are basic things you have to have in order to support more complicated things.

Brian Wood
www.webEbenezer.net

brass goowy wrote:
B"H Robert Ramey writes:
Our experiments with 1.3 revealed that the single biggest time consumer with the binary archive was the stream i/o. Version 1.34 is re-implemented in terms of std::streambuf rather than stream i/o. It is significantly faster because of this.
I hope to test with 1.34 and report what I find.
I'll be curious how much difference still exists.
I decided to go with 1.33.1 as it is packaged and easier to use.
I'm not critical of that decision, since that's the one that's out there. My point is that I believe that efforts to address these issues have been undertaken and are in fact ongoing.
Does the 1.34 binary_oarchive take a streambuf instead of an ofstream? Is this documented somewhere?
It can take a streambuf or an ofstream. If passed a stream, the streambuf is used directly. This preserves the common archive interface while gaining the benefit of avoiding the stream i/o operators.

I double checked the CVS using the browser interface. 1.34 indeed includes updated documentation and code which supports the streambuf interface. It doesn't say much, because whether binary archives were implemented in terms of stream operators or streambuf calls was considered an implementation detail. The only difference from a user standpoint is that one can create and use a streambuf without having to create a stream itself. In our tests we found that avoiding stream i/o and using streambuf calls directly decreased the time required for vectors and arrays of primitives by a factor of 4 (if I remember correctly) when used with a large buffer.
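The pattern described here, accepting either a stream or a streambuf and always writing through the streambuf, can be sketched in standard C++ as follows. save_raw is a hypothetical name used for illustration; it is not the binary_oarchive interface, whose exact constructor signatures should be checked against the 1.34 documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <streambuf>

// Hypothetical helper: writes n raw bytes straight to a streambuf,
// bypassing the ostream layer (no sentry, no formatting, no state flags).
bool save_raw(std::streambuf& sb, const void* p, std::size_t n) {
    const std::streamsize want = static_cast<std::streamsize>(n);
    return sb.sputn(static_cast<const char*>(p), want) == want;
}

// Convenience overload: accept a stream, but use its streambuf directly.
// Because the stream object is bypassed, its error state is not updated;
// the caller has to look at the return value instead.
bool save_raw(std::ostream& os, const void* p, std::size_t n) {
    assert(os.rdbuf() != nullptr);
    return save_raw(*os.rdbuf(), p, n);
}
```

With this shape, code that already holds an ofstream keeps working unchanged, while code that only needs raw output can construct a filebuf or stringbuf and skip the stream object entirely.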
I agree with those quarters and have supported this approach for over a year now.
What I think/thought about this didn't really matter though, as the library was easily extended by the interested parties to accommodate those who wanted to invest the effort.
The ratios I mentioned hold for smaller collections of ints as well. If the containers have 10,000 ints, the test results are not noticeably different.
I would hope that the 1.34 version would show different results.
These may not be the most common application, but the C++ middleware/serialization frameworks that I'm familiar with support container classes. From my perspective the tests I chose are basic things you have to have in order to support more complicated things.
I took a cursory look at the www.ebenezer.net and the documentation. If you want I can include a pointer to it in the introduction to the serialization library, as I have with other libraries which have been brought to my attention.

Robert Ramey
participants (2): brass goowy, Robert Ramey