
Matthias Troyer wrote:
Suggestions
===========
a) Do more work in finding the speed bottlenecks. Run a profiler. Make a buffer based non-stream based archive and re-run your tests.
I have attached a benchmark for such an archive class and ran benchmarks for std::vector<char> serialization. Here are the numbers (using gcc-4 on a Powerbook G4):
I've taken a look at your benchmark.cpp. First of all, it's very nice and simple, and it shows an understanding of how the primitive i/o is isolated from the archives that use it. It's a step in the right direction. But I see some problems. The usage of std::vector<char> isn't what I would expect for an output buffer. You aren't using this in your own archives, are you?

Here are my timing results on my Windows XP system with a 2.4 GHz Pentium.

With your original program I get, for value_type set to char:

  Time using serialization library: 9.454
  Size is 100000004
  Time using direct calls to save in a loop: 8.844
  Size is 100000000
  Time using direct call to save_array: 0.266
  Size is 100000000

and for value_type set to double:

  Time using serialization library: 1.281
  Size is 100000004
  Time using direct calls to save in a loop: 1.218
  Size is 100000000
  Time using direct call to save_array: 0.266
  Size is 100000000

I modified the program to use a simple buffer output closer to what I would expect to use if I were going to make a primitive buffer output. BTW - that would be a very nice addition. This would be much faster than using strstream as is being used now.

Here are the results with the program modified in this way. For value_type set to char I get:

  Time using serialization library: 0.797
  Size is 100000004
  Time using direct calls to save in a loop: 0.297 (1)
  Size is 100000000
  Time using direct call to save_array: 0.203
  Size is 100000000

and for value_type set to double I get:

  Time using serialization library: 0.109 (3)
  Size is 100000004
  Time using direct calls to save in a loop: 0.078 (2)
  Size is 100000000
  Time using direct call to save_array: 0.25
  Size is 100000000

a) The usage of save_array does not have a huge effect on performance. It IS measurable. In the best case it seems to save about 1/3 of the time over using a loop of saves. (1)

b) In the worst case, it's even slower than a loop of saves!!! (2), and even slower than the raw serialization system. (3)

c) The overhead of the serialization library isn't too bad. It does show up when doing 100M characters one by one, but generally it doesn't seem to be a big issue.

In my view, it does support my contention that implementing save_array - regardless of how it is in fact implemented - represents a premature optimization. I suspect that the net benefit in the kind of scenario you envision using it will be very small.

Obviously, this test raises more questions than it answers, and I think it should be pursued further. Another thing I would like to see is a version of the test applied to C++ arrays. My interest is to isolate bottlenecks in the serialization library from those in the STL libraries.

Robert Ramey

[attachment: test_zmisc.cpp]
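
For readers without the attachment: the modified program itself is not reproduced here, so the following is only a rough sketch of what a "simple buffer output" primitive of the kind described above might look like. Everything in it is an assumption made for illustration: the class name buffer_oprimitive, the fixed preallocated capacity, and the raw byte copy via std::memcpy are hypothetical and are not taken from test_zmisc.cpp or from the serialization library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <type_traits>

// Hypothetical buffer-backed output primitive (illustration only).
// Storage is allocated once up front, so saving never reallocates.
class buffer_oprimitive {
public:
    explicit buffer_oprimitive(std::size_t capacity)
        : buf_(new char[capacity]), capacity_(capacity), pos_(0) {}

    // Save a single value by copying its bytes to the current position.
    template <class T>
    void save(const T& t) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "only raw-copyable types are supported");
        assert(pos_ + sizeof(T) <= capacity_);
        std::memcpy(buf_.get() + pos_, &t, sizeof(T));
        pos_ += sizeof(T);
    }

    // Save n contiguous values with one copy instead of n calls to save.
    template <class T>
    void save_array(const T* p, std::size_t n) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "only raw-copyable types are supported");
        const std::size_t bytes = n * sizeof(T);
        assert(pos_ + bytes <= capacity_);
        if (bytes != 0) std::memcpy(buf_.get() + pos_, p, bytes);
        pos_ += bytes;
    }

    std::size_t size() const { return pos_; }      // bytes written so far
    const char* data() const { return buf_.get(); }

private:
    std::unique_ptr<char[]> buf_;
    std::size_t capacity_;
    std::size_t pos_;
};
```

With a primitive like this, a loop of save calls and a single save_array call write identical bytes; the only difference is one copy call per element versus one per array. That is the difference the "direct calls to save in a loop" and "direct call to save_array" timings compare.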