
On Sun, 14 Sep 2008, Robert Ramey wrote:
I've reviewed the profile and found it interesting.
Have you tried binary_archive? You would find it much, much faster in this case, for a variety of reasons.
After modifying the code to use binary_archive, everything runs as expected. The reason I posted this email was to make sure that this is not a bug within the library.
To maintain portability of text files, the library has to manipulate each character sent. This takes a lot of time and it adds up. You might experiment
I see. That explains everything.
with creating a temporary array, wrapping it in binary_object and sending it that way. But still, the very fastest will be to use binary_?archive.
I was not aware of binary_object. Thanks for pointing this out. I will consult the documentation.
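To make the cost difference concrete, here is a minimal sketch using only the C++ standard library. It is not how Boost.Serialization is implemented; the function names (encode_text, decode_text, encode_binary, decode_binary, check_round_trip) are hypothetical and exist only for this illustration. The text-style pair formats and parses every byte individually, roughly the kind of per-character work a portable text archive has to do, while the binary-style pair copies the whole buffer in one bulk operation.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Text-style encoding (illustrative only): the element count, then every
// byte written as a decimal token preceded by a space. Each element pays
// for formatting on write and parsing on read.
std::string encode_text(const std::vector<char>& data) {
    std::ostringstream os;
    os << data.size();
    for (char c : data) {
        os << ' ' << static_cast<int>(c);
    }
    return os.str();
}

std::vector<char> decode_text(const std::string& encoded) {
    std::istringstream is(encoded);
    std::size_t count = 0;
    is >> count;
    std::vector<char> data;
    data.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        int value = 0;
        is >> value;  // one parse per byte of payload
        data.push_back(static_cast<char>(value));
    }
    return data;
}

// Binary-style encoding (illustrative only): the buffer is copied as-is,
// with no per-element formatting or parsing.
std::string encode_binary(const std::vector<char>& data) {
    return std::string(data.begin(), data.end());
}

std::vector<char> decode_binary(const std::string& encoded) {
    return std::vector<char>(encoded.begin(), encoded.end());
}

// Verifies that both encodings reproduce the original buffer.
void check_round_trip(const std::vector<char>& data) {
    assert(decode_text(encode_text(data)) == data);
    assert(decode_binary(encode_binary(data)) == data);
}
```

Calling check_round_trip on a 1MB std::vector<char> and timing the two decode functions separately should show the same pattern as the profile: the per-element text path dominates, and it is also larger on the wire, since each byte becomes at least two characters.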
Robert Ramey
Thanks a bunch for helping out on such short notice. I really appreciate it. Best! ;) -vjeko
Vjekoslav Brajkovic wrote:
Hi,
I am using the serialization library in my project and it has been functioning perfectly thus far. However, when I scaled up the usage requirements, I hit a very odd problem.
Let me explain the use case. The serialization library is used in a DFS framework, handling large data structures (in terms of size, not complexity) such as std::vector<char> of size 1MB and above. The framework consists of two major components: a Chunkserver (server) and a Client. Files are chunked, wrapped in a class, serialized and sent over the wire. The same thing is done on the server side, but in reverse order. The actual binary data is stored in a vector (previously, I tried using a string instead, but I had some issues with it and Robert suggested using an alternative container).
When I was depositing large files to the Chunkserver, disk utilization was almost non-existent, whereas the CPU was maxed out. It is important to realize that this problem occurred only on the server side, not on the client. Upon further investigation using gprof, I concluded that the bottleneck was in the serialization library (it may also be the case that I am misusing it). According to the profiler, over 97% of the CPU time was spent in a single function. Profiler results can be found at this address:
http://www.cs.washington.edu/homes/balkan/gprof.txt
For reference, the signature of that function is: boost::serialization::serialize_adl< boost::archive::text_iarchive, std::vector...> and I am using a text archive. As I mentioned before, this issue only occurs on the server side.
I would appreciate it if anybody could explain why this is happening and, more importantly, how to circumvent the issue.
Thank you!
Vjeko
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users