Comparison of serialization results

I've recently updated a page -- http://webEbenezer.net/comparison.html -- comparing results obtained using Boost Serialization (1.48) and the C++ Middleware Writer (1.13) on both Linux and Windows 7.

There's an archive that has all the test files mentioned on that page here -- http://webEbenezer.net/comp/tests.tar.bz2 . In order to build the tests, you would need to build the Boost Serialization library and an Ebenezer library yourself. There's an archive on this page that has the Ebenezer library -- http://webEbenezer.net/build_integration.html .

To summarize the results, the Boost versions range from 1.1 to 3 times slower than the Ebenezer versions. And the sizes of the Boost executables range from portly to obese compared to functionally equivalent Ebenezer executables.

Comments on how to improve the tests/process/etc. are welcome.

Shalom
Brian Wood
Ebenezer Enterprises
http://webEbenezer.net

on Tue Dec 27 2011, Brian Wood <woodbrian77-AT-gmail.com> wrote:
I've recently updated a page -- http://webEbenezer.net/comparison.html comparing results obtained using Boost Serialization (1.48) and the C++ Middleware Writer (1.13) on both Linux and Windows7.
There's an archive that has all the test files mentioned on that page here -- http://webEbenezer.net/comp/tests.tar.bz2 . In order to build the tests, you would need to build the Boost Serialization library and an Ebenezer library yourself. There's an archive on this page that has the Ebenezer library -- http://webEbenezer.net/build_integration.html .
To summarize the results, the Boost versions range from 1.1 to 3 times slower than the Ebenezer versions. And the sizes of the Boost executables range from portly to obese compared to functionally equivalent Ebenezer executables.
Comments on how to improve the tests/process/etc. are welcome.
A few:

1. The claim that you provide "full generation of marshalling functions" is a little hard to swallow. In principle, it's impossible to know based on a class declaration what to serialize, and how to serialize it.

2. If you are going to compare yourself to Boost.Serialization, it would be a good idea to replicate all of that library's tests with your own system, to demonstrate that you cover the same expressive range. It's usually very easy to build something that beats benchmarks of a more-general system.

3. That "our approach writes marshalling functions based on the content of the types involved" does not explain how you get past the private member access barrier. The only legal technique I've seen for that is at https://gist.github.com/1528856, and Boost.Serialization should probably have support for that approach.

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
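For readers unfamiliar with the technique referred to in point 3, here is a minimal sketch of the general idea. It is not the code from the Gist, and the names `Account`, `AccountBalance`, `private_access` and `read_balance` are hypothetical, chosen only for illustration. It relies on the rule that the usual access checks are not applied to names used in an explicit template instantiation.

```cpp
// Hypothetical third-party class that we are not free to modify.
class Account {
public:
    explicit Account(int b) : balance_(b) {}
private:
    int balance_;
};

// Hypothetical tag type naming one private member. It declares the
// function that will later hand out the pointer-to-member, so that
// the function can be found by argument-dependent lookup.
struct AccountBalance {
    typedef int Account::*type;
    friend type get(AccountBalance);
};

// Instantiating this template defines get(Tag) to return Member.
template <class Tag, typename Tag::type Member>
struct private_access {
    friend typename Tag::type get(Tag) { return Member; }
};

// Explicit instantiation: naming &Account::balance_ here is permitted
// even though the member is private.
template struct private_access<AccountBalance, &Account::balance_>;

// Reads the private member without any change to Account's declaration.
int read_balance(const Account& a) {
    return a.*get(AccountBalance());
}
```

With these definitions, `read_balance(Account(42))` yields the stored value. A serialization library could, in principle, use the same mechanism to reach members of a class whose declaration cannot be altered.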

Dave Abrahams wrote:
on Tue Dec 27 2011, Brian Wood <woodbrian77-AT-gmail.com> wrote:
This first came up several years ago and I spent a little time looking at it. I was sort of intrigued in that the usage of TMP driven inline code should be optimal for cases where tracking wasn't involved. And in these test cases there isn't any tracking. So I looked a little at the tests on the page.

a) First, I couldn't compile them so I couldn't repeat the results. Something was missing (I forget what now).

b) So I inspected the code. My conclusion was that the time being used was in the i/o itself rather than in the serialization. Boost serialization relies on streambuf to do its i/o and the time is sensitive to things like buffer sizes etc. It seems that Brian's code uses some other method (SendBuffer, etc.) whose source isn't visible to me.

c) Brian did point out some cases where boost serialization could be improved, particularly using hints in std collections. These were incorporated. I also implemented the binary i/o in terms of streambuf rather than stream, which was a noticeable improvement.

d) I also implemented some performance tests which are included in the library and run with bjam. I didn't really flesh these out since there wasn't that much interest. Running performance tests in the boost test suite would have been a problem so the applicability was limited.

e) So, given that the timings aren't that far apart, and lacking any other information, I'm inclined to attribute any discrepancies to the way i/o is handled in the different systems.

If one had nothing else to do he could make a replacement for streambuf which focused on fast i/o, perhaps at the expense of portability. In particular, an asynchronous memory mapped i/o library might be an interesting addition to boost. And that would be cool to plug into the serialization library.

Finally, I think that the scale and breadth of Brian's library is much, much smaller than that of the boost serialization library.
Then there is the fact that the distribution model is totally different than that of boost and open source libraries in general. I just don't see how these packages/approaches could be considered comparable in any way.
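The two changes mentioned in (c) can be illustrated with a small sketch. This is not Boost.Serialization's actual implementation; the functions `save_binary`, `load_binary`, `save_set` and `load_set` are hypothetical names, and the on-disk layout (a count followed by raw elements) is an assumption made for the example. It shows raw bytes going directly through `std::streambuf` instead of the `std::ostream` layer, and a sorted container being rebuilt with a position hint.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <sstream>
#include <streambuf>

// Write raw bytes straight to the stream buffer, bypassing the
// per-call overhead of the std::ostream interface.
void save_binary(std::streambuf& sb, const void* data, std::size_t n) {
    std::streamsize written =
        sb.sputn(static_cast<const char*>(data), static_cast<std::streamsize>(n));
    assert(written == static_cast<std::streamsize>(n));
    (void)written;
}

// Read raw bytes straight from the stream buffer.
void load_binary(std::streambuf& sb, void* data, std::size_t n) {
    std::streamsize got =
        sb.sgetn(static_cast<char*>(data), static_cast<std::streamsize>(n));
    assert(got == static_cast<std::streamsize>(n));
    (void)got;
}

// Assumed layout: element count, then the elements in sorted order.
void save_set(std::streambuf& sb, const std::set<int>& s) {
    std::size_t count = s.size();
    save_binary(sb, &count, sizeof count);
    for (std::set<int>::const_iterator it = s.begin(); it != s.end(); ++it) {
        int v = *it;
        save_binary(sb, &v, sizeof v);
    }
}

std::set<int> load_set(std::streambuf& sb) {
    std::size_t count = 0;
    load_binary(sb, &count, sizeof count);
    std::set<int> s;
    for (std::size_t i = 0; i < count; ++i) {
        int v = 0;
        load_binary(sb, &v, sizeof v);
        // Elements arrive in ascending order, so each one belongs just
        // before end(). Passing that position as a hint lets the
        // container skip the usual logarithmic search (C++11 rules).
        s.insert(s.end(), v);
    }
    return s;
}
```

A round trip through a `std::stringbuf` (save a set, then load it back) should return an equal set. Whether either change matters in practice depends on the buffer implementation and the container sizes, which is the point made in (b) and (e).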
The only legal technique I've seen for that is at https://gist.github.com/1528856, and Boost.Serialization should probably have support for that approach.
I looked at this - extremely clever - I expect nothing less from you! Actually, too clever in my opinion. Basically we have private: which means "don't let anyone outside of here access this!". This is pretty clear. Then we add: friend boost::serialization::access, which means "make an exception to the above for boost::serialization!". Which is also pretty clear. But your method would let me get my hands on someone else's private members without their giving permission. If this were to happen to me, I don't think I'd like it. Users have the option of doing this if they want to avoid having to use the friend declaration in some special cases. So I would be skeptical that something like this would be a good thing to include in the serialization library.

Robert Ramey

on Wed Dec 28 2011, "Robert Ramey" <ramey-AT-rrsd.com> wrote:
The only legal technique I've seen for that is at https://gist.github.com/1528856, and Boost.Serialization should probably have support for that approach.
I looked at this - extremely clever - I expect nothing less from you!
It's not my cleverness; it comes from http://bloglitb.blogspot.com/2010/07/access-to-private-members-thats-easy.ht... (as noted in the Gist title).
I had to spend quite some time analyzing that code to figure out what it was doing and why it worked, so I wanted to publish a version with comments.
Actually, too clever in my opinion. Basically we have private: which means "don't let anyone outside of here access this !". This is pretty clear. Then we add: friend boost::serialization::access "which means - make an exception to the above for boost::serialization!". Which is also pretty clear. But your method would let me get my hands on someone else's private members without their giving permission. If this were to happen to me, I don't think I'd like it.
Meh, I don't care. As noted in the comments to http://bloglitb.blogspot.com/2010/07/access-to-private-members-thats-easy.ht..., the point of private is to prevent accidental misuse, not to be some kind of inviolable firewall.
Users have the option of doing this if they want to avoid having to use the friend declaration in some special cases.
Obviously, if you can alter the class declaration to make boost::serialization::access a friend, that's preferable. The point would be to provide the means to serialize even when altering the class declaration was not an option. This ability to gain access non-intrusively might be useful for Boost.Python as well (though usually the problem there is access to a base class virtual function that may well be private, and this technique can only get you access to a virtual function's final override).
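The limitation mentioned for virtual functions can be seen in a short sketch. The classes `Shape` and `Triangle`, the tag `ShapeSides`, and the helpers `private_access` and `call_sides` are hypothetical names used only for illustration. A pointer to a private virtual member function can be obtained through explicit instantiation, but calling through it still dispatches virtually, so it reaches the final override rather than the base class version.

```cpp
// Hypothetical hierarchy with a private virtual function.
class Shape {
public:
    virtual ~Shape() {}
private:
    virtual int sides() const { return 0; }
};

class Triangle : public Shape {
private:
    virtual int sides() const { return 3; }
};

// Hypothetical tag naming the private member function Shape::sides.
struct ShapeSides {
    typedef int (Shape::*type)() const;
    friend type get(ShapeSides);
};

// Instantiating this template defines get(Tag) to return Member.
template <class Tag, typename Tag::type Member>
struct private_access {
    friend typename Tag::type get(Tag) { return Member; }
};

// Access checks do not apply to names used in an explicit instantiation.
template struct private_access<ShapeSides, &Shape::sides>;

// Calls through the pointer-to-member. The call is dispatched
// virtually, so the dynamic type of s decides which body runs.
int call_sides(const Shape& s) {
    return (s.*get(ShapeSides()))();
}
```

Given a `Triangle`, `call_sides` runs `Triangle::sides`, even though the pointer was formed from `&Shape::sides`. There is no way, through this pointer, to force a non-virtual call to the `Shape` version on a `Triangle` object.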
So I would be skeptical that something like this would be a good thing to include in the serialization library.
You're the boss.

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com

Dave Abrahams wrote:
on Wed Dec 28 2011, "Robert Ramey" <ramey-AT-rrsd.com> wrote:
The only legal technique I've seen for that is at https://gist.github.com/1528856, and Boost.Serialization should probably have support for that approach.
I looked at this - extremely clever - I expect nothing less from you!
It's not my cleverness; it comes from http://bloglitb.blogspot.com/2010/07/access-to-private-members-thats-easy.ht... (as noted in the Gist title)
I had to spend quite some time analyzing that code to figure out what it was doing and why it worked, so I wanted to publish a version with comments.
Actually, too clever in my opinion. Basically we have private: which means "don't let anyone outside of here access this !". This is pretty clear. Then we add: friend boost::serialization::access "which means - make an exception to the above for boost::serialization!". Which is also pretty clear. But your method would let me get my hands on someone else's private members without their giving permission. If this were to happen to me, I don't think I'd like it.
Meh, I don't care. As noted in the comments to http://bloglitb.blogspot.com/2010/07/access-to-private-members-thats-easy.html, the point of private is to prevent accidental misuse, not to be some kind of inviolable firewall.
Users have the option of doing this if they want to avoid having to use the friend declaration in some special cases.
Obviously, if you can alter the class declaration to make boost::serialization::access a friend, that's preferable. The point would be to provide the means to serialize even when altering the class declaration was not an option.
This ability to gain access non-intrusively might be useful for Boost.Python as well (though usually the problem there is access to a base class virtual function that may well be private, and this technique can only get you access to a virtual function's final override).
So I would be skeptical that something like this would be a good thing to include in the serialization library.
You're the boss.
I'd like to offer a dissenting opinion. The use case for this is where you need to add serialization to some existing code which you are not free to modify. I would find this very useful.

on Thu Dec 29 2011, Neal Becker <ndbecker2-AT-gmail.com> wrote:
Dave Abrahams wrote:
on Wed Dec 28 2011, "Robert Ramey" <ramey-AT-rrsd.com> wrote:
So I would be skeptical that something like this would be a good thing to include in the serialization library.
You're the boss.
I'd like to offer a dissenting opinion. The use case for this is where you need to add serialization to some existing code which you are not free to modify. I would find this very useful.
Me too, potentially... but as Robert points out you can always implement the technique yourself. In fact, maybe it should be a separate Boost library.

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com

Dave Abrahams wrote:
on Thu Dec 29 2011, Neal Becker <ndbecker2-AT-gmail.com> wrote:
Dave Abrahams wrote:
on Wed Dec 28 2011, "Robert Ramey" <ramey-AT-rrsd.com> wrote:
So I would be skeptical that something like this would be a good thing to include in the serialization library.
You're the boss.
I'd like to offer a dissenting opinion. The use case for this is where you need to add serialization to some existing code which you are not free to modify. I would find this very useful.
Me too, potentially... but as Robert points out you can always implement the technique yourself. In fact, maybe it should be a separate Boost library.
Exactly. That would be the right way to go about it. Note that the serialization library has a number of "sublibraries" which I needed at the time but weren't available - some of them are now. Their testing, documentation, and usage are independent of the main library itself. Actually I believe that boost has a large number of such components. I sometimes find them and cut them out into a separate module for inclusion in my own code to solve some problem.

Robert Ramey
participants (4)
- Brian Wood
- Dave Abrahams
- Neal Becker
- Robert Ramey