[serialization] baffled by very long archive load time
I have been developing and maintaining a large computational statistics package (http://www.astro.umass.edu/BIE). It is an MPI application that uses boost::serialization to implement persistence. After recently adding a new class, I notice that the archive load times have gone from seconds to over an hour!! The binary archive size is 30MB. Regression tests suggest that the resumed state is correct despite the very long load time. The package is implemented in multiple shared libraries and classes are exported as suggested in the documentation. I have noticed that I need to compile with "-Ox -fno-inline-small-functions" when x=2,3. I'm not sure if this is significant. The system is Debian/GNU Linux, 2.2GHz Opteron 64-bit. The behavior is the same with Boost 1.37, 1.38 and 1.42. The new class has many nested STL container instances, e.g. vector< vector<float> > . . . but very few archived pointers. I've not been able to isolate the cause in test cases (e.g. making very large instances of STL nested containers; these are deserialized in under a second). Any thoughts on things to track down? Thanks!
-- Martin Weinberg Phone: (413) 545-3821 Dept. of Astronomy FAX: (413) 545-4223 530 Graduate Research Tower weinberg@astro.umass.edu University of Massachusetts Amherst, MA 01003-4525
AMDG Martin Weinberg wrote:
Any thoughts on things to track down?
Can you build with profiling (-pg)? In Christ, Steven Watanabe
Quoting Martin Weinberg:
The system is Debian/GNU Linux
Any thoughts on things to track down?
Use valgrind --tool=callgrind and kcachegrind; Debian has both. An hour for 30 MB must be the result of non-linear complexity. I guess that the error must be somewhere in your serialize() code, since Boost.Serialization is at most logarithmic in the number of tracked pointers, and you've said there are very few. But even if there were many, I don't think 30 MB of tracked pointers would result in an hour.
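One cheap way to check the non-linearity suspicion before profiling is a doubling experiment: time the load for n objects and again for 2n, then look at the ratio. The sketch below is only an illustration; `growth_exponent` and `seconds_for` are hypothetical helper names and are not part of Boost or of the application discussed here.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <functional>

// Hypothetical helper: estimate k in cost ~ n^k from the cost measured at
// problem size n and again at size 2n. A value near 1 suggests linear
// behavior; a value near 2 suggests quadratic behavior.
double growth_exponent(double cost_n, double cost_2n) {
    assert(cost_n > 0.0 && cost_2n > 0.0);
    return std::log2(cost_2n / cost_n);
}

// Hypothetical helper: wall-clock seconds taken by a single call of work().
double seconds_for(const std::function<void()>& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

In use, one would pass `seconds_for` a lambda that loads an archive holding n objects, repeat with 2n, and hand both timings to `growth_exponent`. For scale, 30 MB in one hour is under 9 kB per second (30 * 1024 * 1024 / 3600 is about 8738 bytes per second), far below what plain disk reads deliver.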
I have noticed that I need to compile with "-Ox -fno-inline-small-functions" when x=2,3. I'm not sure if this is significant.
I would expect this to be extremely significant. Why do you need to do this? I have used gcc profiling to investigate bottlenecks in the library. You might look at the directory ../libs/serialization/performance to see how to do this. Robert Ramey
On Mon, Mar 29, 2010 at 11:09:42AM -0800, Robert Ramey wrote:
I have noticed that I need to compile with "-Ox -fno-inline-small-functions" when x=2,3. I'm not sure if this is significant.
I would expect this to be extremely significant. Why do you need to do this? I have used gcc profiling to investigate bottlenecks in the library. You might look at the directory ../libs/serialization/performance to see how to do this.
Thanks to all who replied! I'll try compiling with profiling. Meanwhile, in direct answer to the question above, if I compile without suppressing inline-small-function optimization, I get these errors (the same assertion, printed four times):
cli: /usr/local/boost_1_38_0/include/boost-1_38/boost/archive/impl/archive_pointer_oserializer.ipp:64: static const boost::archive::detail::basic_pointer_oserializer* boost::archive::detail::archive_pointer_oserializer<Archive>::find(const boost::serialization::extended_type_info&) [with Archive = boost::archive::polymorphic_oarchive]: Assertion `it != boost::serialization::singleton< oserializer_map<Archive> >::get_const_instance().end()' failed.
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
-- Martin Weinberg
Martin Weinberg wrote:
Thanks to all who replied! I'll try compiling with profiling. Meanwhile, in direct answer to the question above, if I compile without suppressing inline-small-function optimization, I get these errors:
The most effective thing you could do is to investigate the cause of these errors. Presumably they are due to the new class you added. Also, polymorphic archives are measurably slower than the non-polymorphic versions. So if performance is important to you, then consider changing. I realize that this will result in more instantiated code, but the execution will be faster. Finally, you might consider moving to the most recent version of the library. It's much harder to help someone who's using an older package. Robert Ramey
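The trade-off mentioned above (more instantiated code, faster execution) comes from how the archive type reaches the save and load code. As a rough, Boost-free illustration: a templated save function is instantiated once per concrete archive type and its calls can be inlined, while a function written against an abstract archive interface is compiled once and pays a virtual call per primitive. Every name below (`AbstractOut`, `CountingOut`, `save_template`, `save_virtual`) is hypothetical and only mimics the shape of the two approaches; none of it is Boost.Serialization code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical abstract archive: one virtual call per primitive written.
struct AbstractOut {
    virtual ~AbstractOut() = default;
    virtual void put(float value) = 0;
};

// Hypothetical concrete archive that just tallies what it is given.
struct CountingOut final : AbstractOut {
    std::size_t count = 0;
    double sum = 0.0;
    void put(float value) override { ++count; sum += value; }
};

// Static dispatch: instantiated separately for every Archive type used,
// so the compiler sees the concrete put() and may inline it.
template <class Archive>
void save_template(Archive& ar, const std::vector<std::vector<float>>& data) {
    for (const auto& row : data)
        for (float x : row) ar.put(x);
}

// Dynamic dispatch: compiled once and usable with any AbstractOut, but
// every put() goes through the virtual table.
void save_virtual(AbstractOut& ar, const std::vector<std::vector<float>>& data) {
    for (const auto& row : data)
        for (float x : row) ar.put(x);
}
```

Both functions write the same values; only the dispatch mechanism differs. Virtual dispatch of this kind usually costs a modest constant factor per call.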
On Mon, Mar 29, 2010 at 12:58:38PM -0800, Robert Ramey wrote:
The most effective thing you could do is to investigate the cause of these errors. Presumably they are due to the new class you added. Also, polymorphic archives are measurably slower than the non-polymorphic versions. So if performance is important to you, then consider changing. I realize that this will result in more instantiated code, but the execution will be faster.
Finally, you might consider moving to the most recent version of the library. It's much harder to help someone who's using an older package.
Thanks for the comments. Yes, I've tried to find the cause of the optimization errors but without much success (which is why I posted here). I've tried compiling with "-g -finline-small-functions" but this combination does not fail. Sigh. BTW, the "-O1" level is fine as well (as previously mentioned). If you have any specific suggestions on the optimization issue, I'd be happy to try them. I've also tried long valgrind runs but (excepting the usual system library cruft) it is valgrind clean.
Regarding versions, I've tried Boost 1.37, 1.38 and 1.42 and all exhibit the same behavior. I'm happy to stick to 1.42.
Regarding runtime, I understand, in principle, the overhead incurred by dynamic binding. But the polymorphic approach is very convenient (and clever) and I don't care about serialization performance within reason (e.g. 10 sec or even 5 minutes) since it is not the runtime bottleneck. But a 1 hour archive load for a 32MB archive makes no sense to me. No doubt it is my own code that is at fault, but I'm not having success on my own at diagnosing the boost::serialization-generated clues. My gprof run with boost and the application compiled "-g -pg" is still going . . .
-- Martin Weinberg
Thanks to all for the suggestions. I did end up using valgrind --tool=callgrind and kcachegrind. Compiling with -pg was not feasible. Anyway, I found that most of the time (80%) was spent in reset_object_address(). Eliminating the polymorphic archives decreased the load time by a factor of 1000! I believe I found the issue requiring -fno-inline-small-functions as well. So at this point everything is working with "-O3". I'm grateful for the help.
-- Martin Weinberg
On 30/03/2010 11:58 PM, Martin Weinberg wrote:
I believe I found the issue requiring -fno-inline-small-functions as well. So at this point everything is working with "-O3". I'm grateful for the help.
Please share :) -- Sohail Somani http://uint32t.blogspot.com
Well, false alarm. Maybe not _completely_ solved. I made sure that all implementations emitted objects, e.g. by moving null constructors to the source file and including asm("") in the body. And this helped! So I conclude that (perhaps appropriately) gcc's inline-small-functions optimization is removing null-body functions. But: (1) it is difficult to track all of these down in a large code base; and (2) I believe that this does not cover all the cases of that optimization. So in the end, I'm stuck with "-O3 -fno-inline-small-functions", which does seem to give me working code and no obvious performance hit. Any thoughts??
-- Martin Weinberg
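The workaround described above (an empty constructor moved out of the header, with asm("") in its body) can be sketched in a single file as follows. This assumes GCC or a compatible compiler, and the class `Model` and the function `check_model` are hypothetical stand-ins; in a real package the constructor definition would live in a source file of the shared library rather than next to the class declaration. Whether this is sufficient in every case is, as the message says, uncertain.

```cpp
#include <cassert>

// Hypothetical class standing in for one of the serialized types.
class Model {
public:
    Model();  // declared only: no inline empty body in the header
    int size() const { return size_; }

private:
    int size_ = 0;
};

// Out-of-line definition, as it would appear in the library's source file.
// The empty asm statement (a GNU extension) gives the otherwise empty
// body a statement that the optimizer does not discard.
Model::Model() {
    asm("");
}

// Small self-check: a default-constructed Model starts out empty.
void check_model() {
    Model m;
    assert(m.size() == 0);
}
```

The design point is only that the constructor has exactly one definition in one translation unit, so an object-file symbol exists for it regardless of the inlining settings used elsewhere.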
Martin Weinberg wrote:
Thanks to all for the suggestions. I did end up using valgrind --tool=callgrind and kcachegrind. Compiling with -pg was not feasible.
Anyway, I found that all the time (80%) was spent in reset_object_address(). Eliminating the polymorphic archives decreased the load time by a factor of 1000!
I would be interested if you could make a small test case which demonstrates this. Robert Ramey
I tried for some time to generate a small working example but failed. The class I added instantiates a number of nested STL vector containers. I tried all sorts of combinations, large and small. My timing tests showed (sigh) that the polymorphic archives were always faster than the non-polymorphic ones (binary and xml). So, sorry to report, I'm not sure what is causing the behavior I observe in my application. Frustrating. I'd like to understand this.
-- Martin Weinberg
Participants (5):
- Martin Weinberg
- Robert Ramey
- Sohail Somani
- Steven Watanabe
- strasser@uni-bremen.de