[Serialization] Improving performance

Hi all, in one of my projects I have a lot of types (>1000) to be serialized using a pointer to a single base class. At some point we found the serialization/deserialzation time to be O(N*M), where N is the number of types and M the number of classes in the derivation hierarchy. Wondering why this is so significant I started digging and measuring. I found the type information registry used for the void_upcast() to be the culprit. It's a plain std::vector<const void_caster *> ($BOOST_ROOT_1_37/libs/serialization/src/void_cast.cpp:37) which is searched sequentially a lot (once for each derived/base pair for each serialization call). Moreover, this vector isn't even kept sorted. it = std::find_if( s.begin(), s.end(), void_cast_detail::match(& ca) ); ($BOOST_ROOT_1_37/libs/serialization/src/void_cast.cpp:180). Changing this to be a std::set improves the picture significantly! What's the reasoning behind using a std::vector<> instead of a std::set<> or a similar indexed structure? Regards Hartmut

on Tue Jan 20 2009, "Hartmut Kaiser" <hartmut.kaiser-AT-gmail.com> wrote:
Hi all,
in one of my projects I have a lot of types (>1000) to be serialized using a pointer to a single base class. At some point we found the serialization/deserialzation time to be O(N*M), where N is the number of types and M the number of classes in the derivation hierarchy.
Wondering why this is so significant I started digging and measuring. I found the type information registry used for the void_upcast() to be the culprit. It's a plain std::vector<const void_caster *> ($BOOST_ROOT_1_37/libs/serialization/src/void_cast.cpp:37) which is searched sequentially a lot (once for each derived/base pair for each serialization call). Moreover, this vector isn't even kept sorted.
it = std::find_if( s.begin(), s.end(), void_cast_detail::match(& ca) );
($BOOST_ROOT_1_37/libs/serialization/src/void_cast.cpp:180). Changing this to be a std::set improves the picture significantly!
What's the reasoning behind using a std::vector<> instead of a std::set<> or a similar indexed structure?
I have a highly optimized component in Boost.Python that I *believe* is doing the same job. Perhaps we should factor it out and share? -- Dave Abrahams BoostPro Computing http://www.boostpro.com

David Abrahams skrev:
on Tue Jan 20 2009, "Hartmut Kaiser" <hartmut.kaiser-AT-gmail.com> wrote:
it = std::find_if( s.begin(), s.end(), void_cast_detail::match(& ca) );
($BOOST_ROOT_1_37/libs/serialization/src/void_cast.cpp:180). Changing this to be a std::set improves the picture significantly!
What's the reasoning behind using a std::vector<> instead of a std::set<> or a similar indexed structure?
I have a highly optimized component in Boost.Python that I *believe* is doing the same job. Perhaps we should factor it out and share?
What about using boost::unordered_set or boost::interprocess::flat_set? -Thorsten

David Abrahams wrote:
on Tue Jan 20 2009, "Hartmut Kaiser" <hartmut.kaiser-AT-gmail.com> wrote:
Hi all,
in one of my projects I have a lot of types (>1000) to be serialized using a pointer to a single base class. At some point we found the serialization/deserialzation time to be O(N*M), where N is the number of types and M the number of classes in the derivation hierarchy.
Wondering why this is so significant I started digging and measuring. I found the type information registry used for the void_upcast() to be the culprit. It's a plain std::vector<const void_caster *> ($BOOST_ROOT_1_37/libs/serialization/src/void_cast.cpp:37) which is searched sequentially a lot (once for each derived/base pair for each serialization call). Moreover, this vector isn't even kept sorted.
it = std::find_if( s.begin(), s.end(), void_cast_detail::match(& ca) );
($BOOST_ROOT_1_37/libs/serialization/src/void_cast.cpp:180). Changing this to be a std::set improves the picture significantly!
What's the reasoning behind using a std::vector<> instead of a std::set<> or a similar indexed structure?
LOL - the reason is that in my wildest imagination I never concieved that one would ever have more than a few derived types, much less that the serialization library would ever have to deal with something like this. I'm sure it's not tooooo hard to address this if you want to post a fix I would be happy to look at it. And while your at it. Perhaps you might want to take a look into the directory boost/libs/serialization/performance/... This directory includes all the infrastructure to generate profiling tables with the gcc compiler. It works with bjam, simple shell script and maybe library_test in order to implement this. I've tested with cygwin under windows XP. It includes the a set of performance tests which generate profiles. The tests aren't very good - I just copied over a few from the ../test directory to test the performance infrastructure. After getting this far I got distracted by other stuff so there it stands. If anyone were to add a performance_test for void_cast and review what I've got I would be most greatful. If someone want's a reasonable size Summer of Code 09 project I would also very much like to see this happen. Robert Ramey
I have a highly optimized component in Boost.Python that I *believe* is doing the same job. Perhaps we should factor it out and share?
participants (4)
-
David Abrahams
-
Hartmut Kaiser
-
Robert Ramey
-
Thorsten Ottosen