
I've reviewed the profile and found it interesting. Have you tried binary_?archive? You would find it much, much faster in this case, for a variety of reasons. To maintain portability of text files, the library has to manipulate each character sent. This takes a lot of time, and it adds up. You might experiment with creating a temporary array, wrapping it in a binary_object, and sending it that way. But still, the very fastest will be to use binary_?archive.

Robert Ramey

Vjekoslav Brajkovic wrote:
Hi,
I am using the serialization library in my project and it has been functioning perfectly thus far. However, when I scaled up the usage requirements, I hit a very odd problem.
Let me explain the use case. The serialization library is used in a DFS framework, handling large data structures (in terms of size, not complexity) such as std::vector<char> of size 1MB and above. The framework consists of two major components: a Chunkserver (server) and a Client. Files are chunked, wrapped in a class, serialized and sent over the wire. The same thing is done on the server side, but in reverse order. The actual binary data is stored in a vector (previously, I tried using a string instead, but I had some issues with it and Robert suggested using an alternative container).
When I was depositing large files to the Chunkserver, disk utilization was almost non-existent, whereas the CPU was maxed out. It is important to realize that this problem occurred only on the server side, not on the client. Upon further investigation using gprof, I concluded that the bottleneck was in the serialization library (it may also be the case that I am misusing it). According to the profiler, over 97% of the CPU time was spent in a single function. Profiler results can be found at this address:
http://www.cs.washington.edu/homes/balkan/gprof.txt
For reference, the signature of that function is: boost::serialization::serialize_adl< boost::archive::text_iarchive, std::vector...> and I am using a text archive. As I mentioned before, this issue only occurs on the server side.
I would appreciate it if anybody could explain why this is happening and, more importantly, how to circumvent the issue.
Thank you!
Vjeko
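
To illustrate the point in the reply about per-character work, here is a minimal standard-library sketch. It is not Boost's implementation, and the function names (encode_text, decode_text, encode_binary, decode_binary) are hypothetical. It only contrasts a text-style encoding, which formats and parses every element of a std::vector<char>, with a binary-style encoding, which copies the bytes in one call. With Boost itself, the corresponding change would be swapping text_oarchive/text_iarchive for binary_oarchive/binary_iarchive.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Text-style encoding (illustrative only): the element count, then every
// byte formatted as a decimal number preceded by a space. The stream does
// formatting work for each element.
std::string encode_text(const std::vector<char>& data) {
    std::ostringstream os;
    os << data.size();
    for (char c : data) {
        os << ' ' << static_cast<int>(c);
    }
    return os.str();
}

// Reverse of encode_text: parses one number per element.
std::vector<char> decode_text(const std::string& encoded) {
    std::istringstream is(encoded);
    std::size_t n = 0;
    is >> n;
    std::vector<char> data;
    data.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        int v = 0;
        is >> v;
        data.push_back(static_cast<char>(v));
    }
    return data;
}

// Binary-style encoding (illustrative only): a native size header followed
// by the raw bytes, written with a single call. The layout depends on the
// platform's std::size_t, so it is not portable between machines.
std::string encode_binary(const std::vector<char>& data) {
    std::ostringstream os(std::ios::binary);
    const std::size_t n = data.size();
    os.write(reinterpret_cast<const char*>(&n), sizeof n);
    os.write(data.data(), static_cast<std::streamsize>(n));
    return os.str();
}

// Reverse of encode_binary: reads the header, then all bytes in one call.
// A sketch only: it trusts the size header and does no error checking.
std::vector<char> decode_binary(const std::string& encoded) {
    std::istringstream is(encoded, std::ios::binary);
    std::size_t n = 0;
    is.read(reinterpret_cast<char*>(&n), sizeof n);
    std::vector<char> data(n);
    is.read(data.data(), static_cast<std::streamsize>(n));
    return data;
}
```

For a 1MB chunk, the text-style functions perform on the order of a million formatting or parsing operations, while the binary-style functions perform two bulk copies. The trade-off is the one mentioned in the reply: the text form is portable, the raw binary form is not.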