
Hi there,

I've been looking at speed issues loading archives for the last few days and came across something that puzzles me. We are using binary archives for storing relatively large data sets (> 100 MB) comprised largely of std::vectors of POD types. This was causing huge inefficiencies, as each element was being serialised separately, so I derived a new archive type with specialisations for std::vector load/save that dispatch on "podness": if the element type is POD, the entire vector is block written/read in one call. This gave us a massive speedup, so it could be useful to include it by default in the provided binary archive; the reaction of a number of our developers, on first looking at vtune before this optimisation, was that Boost.Serialization was just rubbish because it was so slow. (Let me know if you want to see the implementation; it's only about 100 lines or so.)

The next slowest thing, as revealed by vtune, is the call to basic_iarchive_impl::register_type, and in particular this code:

    cobject_type co(cobject_info_set.size(), bis);
    std::pair<cobject_info_set_type::const_iterator, bool> result =
        cobject_info_set.insert(co);

with the call to extended_type_info_typeid_0::less_than, caused by the std::set::insert, consuming nearly 8% of the time needed to load the archive. Tracing this through revealed that there seems to be some effort to optimise this comparison via extended_type_info::type_info_key, which is used by type_info_key_cmp (extended_type_info.cpp) to give an early out. But this line in type_info_key_cmp:

    if(lhs.type_info_key == rhs.type_info_key)
        return 0;

always compares true (a breakpoint set after it never gets hit), which causes operator<(const extended_type_info &lhs, const extended_type_info &rhs) to eventually call type_info::before for the ordering information, and that is where the slowness comes from. The pointer comparison always succeeds because type_info_key is just a pointer to class static data declared in extended_type_info_typeid_0.
The comment attached to the member in extended_type_info states:

    // used to uniquely identify the type of class derived from this one
    // so that different derivations of this class can be simultaneously
    // included in implementation of sets and maps.
    const char * type_info_key;

but I cannot see how this will ever differ between any two types. I get the impression that it is intended to be different for all types, and that the test is just relying on the key's address being unique per specialisation of extended_type_info_typeid, but I may be well off the mark. Can anyone (Robert?) clarify this for me, please?

thanks

Martin

ps. This is using the vc7.1 compiler.
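For reference, the std::vector specialisation described at the top of the post can be sketched in isolation. This is not Boost.Serialization's actual archive interface — BinarySink, save_binary, and the function names here are stand-ins — but it shows the dispatch-on-podness idea: trivially copyable element types get a single block write, everything else keeps the per-element path.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Minimal stand-in for a binary output archive: appends raw bytes to a buffer.
struct BinarySink {
    std::vector<unsigned char> bytes;
    void save_binary(const void* p, std::size_t n) {
        const unsigned char* b = static_cast<const unsigned char*>(p);
        bytes.insert(bytes.end(), b, b + n);
    }
};

// Element-wise path: one write per element (the slow case for large vectors).
// In a real archive this loop would invoke each element's own serialize();
// raw bytes are used here only to keep the sketch self-contained.
template <class T>
void save_elementwise(BinarySink& ar, const std::vector<T>& v) {
    std::uint64_t count = v.size();
    ar.save_binary(&count, sizeof count);
    for (const T& e : v)
        ar.save_binary(&e, sizeof e);
}

// Block path: a single contiguous write of the whole payload.
template <class T>
void save_block(BinarySink& ar, const std::vector<T>& v) {
    std::uint64_t count = v.size();
    ar.save_binary(&count, sizeof count);
    if (count)
        ar.save_binary(v.data(), count * sizeof(T));
}

// Dispatch on "podness": trivially copyable types take the block path.
template <class T>
void save_vector(BinarySink& ar, const std::vector<T>& v) {
    if constexpr (std::is_trivially_copyable_v<T>)
        save_block(ar, v);
    else
        save_elementwise(ar, v);
}
```

For POD element types both paths produce byte-identical output, so the block path is a pure speed win; the corresponding load side would resize the vector and do one block read into v.data().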
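To illustrate what the type_info_key early-out appears to be aiming at (my reading of the intent, not Boost's actual code): each derivation of extended_type_info would carry a pointer to its own class-static key, so records from different derivations can be ordered cheaply on the key alone, and only records from the same derivation fall through to the expensive type_info::before call. With only the typeid-based derivation in play, every record shares the same key address, so the early-out never fires — which matches the breakpoint observation above. A reduced, hypothetical model:

```cpp
#include <cstring>
#include <typeinfo>

// Reduced model of extended_type_info: type_info_key points at class-static
// data of the *derivation*, so it is identical for every instance created
// by the same derived class.
struct ETIBase {
    const char* type_info_key;     // one address per derived class
    const std::type_info* ti;      // the wrapped std::type_info
    ETIBase(const char* key, const std::type_info& t)
        : type_info_key(key), ti(&t) {}
};

// Comparison mirroring the intended early-out: different keys are ordered
// cheaply by the key string itself; identical keys (same derivation) must
// fall through to the slower type_info::before ordering.
bool eti_less(const ETIBase& lhs, const ETIBase& rhs) {
    if (lhs.type_info_key != rhs.type_info_key)
        return std::strcmp(lhs.type_info_key, rhs.type_info_key) < 0;
    return lhs.ti->before(*rhs.ti) != 0;   // slow path: same derivation
}

// With a single derivation, every instance shares this one key, so
// eti_less always takes the slow path -- the situation described above.
const char typeid_key[] = "extended_type_info_typeid";
```

In this model the key only discriminates between *derivations* of the base class, never between the types a single derivation describes, which would explain why the comparison in type_info_key_cmp always succeeds when everything goes through extended_type_info_typeid.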

This was with 1.32. Having just had a look at CVS head, a lot of this seems to have changed significantly, so I'll try to port over to that and recheck some of this.

cheers

Martin