
on Tue Jan 20 2009, "Hartmut Kaiser" <hartmut.kaiser-AT-gmail.com> wrote:
Hi all,
in one of my projects I have a large number of types (>1000) that are serialized through a pointer to a single base class. At some point we found the serialization/deserialization time to be O(N*M), where N is the number of types and M is the number of classes in the derivation hierarchy.
Wondering why this was so significant, I started digging and measuring. I found the type-information registry used for void_upcast() to be the culprit. It's a plain std::vector<const void_caster *> ($BOOST_ROOT_1_37/libs/serialization/src/void_cast.cpp:37) which is searched sequentially very often (once for each derived/base pair on each serialization call). Moreover, this vector isn't even kept sorted:
it = std::find_if( s.begin(), s.end(), void_cast_detail::match(& ca) );
($BOOST_ROOT_1_37/libs/serialization/src/void_cast.cpp:180). Changing this to a std::set improves the picture significantly!
What's the reasoning behind using a std::vector<> instead of a std::set<> or a similar indexed structure?
I have a highly optimized component in Boost.Python that I *believe* is doing the same job. Perhaps we should factor it out and share?

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com