Performance of boost::serialization

Hi! I'm having performance issues with boost::serialization. First, a short description of what my data looks like: I serialize a container of boost::shared_ptr<Base> into which I put polymorphic objects (no very deep hierarchy, ~2 layers). The objects are ~40 bytes each. The container has 100 elements in my test case.

In the test application I can only serialize that container ~40 times per second. That's too slow for me, because the data is used for network communication. What I've found is this thread: http://old.nabble.com/-Serialization--Speedding-up-client-server-communicati... But it seems that I have to recreate the archive in my case, because the data can get sent, changed, and sent again. The documentation says that this behaviour can be disabled by disabling tracking, but that doesn't work for me because I use pointers.

Robert Ramey also mentioned that implementing a lightweight version of stringstream could improve the performance. I wanted to look at that, but it seems that I'm missing something completely:

    class mystreambuf : public std::streambuf {
    public:
        virtual std::streamsize xsputn(const char* _Ptr, std::streamsize _Count) {
        }
    };

So this stream buffer does nothing, right? But somehow serialization throughput is even worse with that implementation (compared to stringstream). I've never implemented a stream buffer, so help would be nice. What are the requirements of boost::serialization for the stream buffer?

Also, generally, it would be nice if you could share your opinion on whether the performance of boost::serialization can be tuned to serialize data for network communication. Is it only possible without polymorphism/shared_ptr? Any help is welcome! Thanks

Daniel Herb wrote:
a) Polymorphic archives are measurably slower than the non-polymorphic ones.
b) Which archive class are you using? For maximum speed use binary_archive.
c) It is a current feature of the library that you have to reconstruct the archive class for each usage. This costs a lot of time. In the future this may be addressed.

For now I would suggest:
a) don't use a polymorphic archive
b) make a special class just for data transmission:
   i) don't use pointers,
   ii) mark the class NOT_TRACKED,
   iii) make sure all its non-primitive members are not tracked
c) open the archive just once and invoke ar << multiple times
d) make your own stream buffer implementation which sends the raw data down the pipe
e) performance-test your implementation with the gcc profiler or a similar MSVC tool

Robert Ramey

Quoting Robert Ramey:
For now I would suggest
d) Make your own stream buffer implementation which sends the raw data down the pipe.
you might find some inspiration for this here: https://svn.boost.org/svn/boost/sandbox/transaction/boost/transact/archive.h... https://svn.boost.org/svn/boost/sandbox/transaction/boost/transact/char_arch... It isn't a streambuf implementation but a Boost.Serialization archive that can write to any output iterator with a "char" value type, e.g.

    serialization_oarchive<
        char_oarchive<
            std::back_insert_iterator< std::vector<char> > > >
by writing an output iterator to a fixed-size network buffer you should be able to efficiently send your network message in chunks.

Thanks for the answers! I've tested some other cases based on your suggestions and now have the following situation:
- I create package classes which are derived from base classes; these hold non-pointer objects
- The packages split the objects into categories
- On the lowest level I have a list of shared_ptr<Package>
- That way I still keep some dynamism without wasting too much performance

I guess that's at least roughly what Robert suggested. The problem I have with this is the following: my application uses shared_ptr to hold most of the data. So if I want to create a package, I would have to copy the data first before I can put it into the container within the package. Am I right, or have I missed some part of the idea?

Stefan Strasser-2 wrote:

Daniel Herb wrote:
I think you've got it. As you can see, it's basically a way to use the right subset of serialization functionality. To refine the idea, I thought about this a tiny bit more, and here would be another thing to try. I don't remember if it's the same as I suggested before:

    struct my_data {
        ...
        shared_ptr<T1> m_t1;
        ...
        template<class Archive>
        void serialize(...  // the normal way
    };

    // this makes sure that new values are resent each time
    BOOST_CLASS_TRACKING(T1, boost::serialization::track_never)
    ...

    struct my_transmission_wrapper {
        // don't put any pointers here !!!
        const T1 & m_t1;
        ...
        my_transmission_wrapper(my_data & d) : m_t1(*d.m_t1), ... {}
    };

sender:

    binary_ostream_buf bs;   // created connection to other machine
    binary_oarchive boa(bs);
    my_data d;
    my_transmission_wrapper tw(d);
    boa << d;   // first transmission creates all the shared pointers
                // at the other end of the system
    loop {
        // update d with new values even though archive is not re-created
        boa << tw;  // serialize again with the new values
    }

receiver:

    binary_istream_buf bs;   // created connection to other machine
    binary_iarchive bia(bs);
    my_data d;
    bia >> d;   // first transmission creates all the shared pointers
                // at the other end of the system
    my_transmission_wrapper tw(d);
    loop {
        bia >> tw;  // deserialize again with the new values
        // d has new values
        do_work(&d);
    }

Good luck with this
Robert Ramey
participants (3)
- Daniel Herb
- Robert Ramey
- Stefan Strasser