
Martin Slater wrote:
Just to reiterate my experience of profiling boost::serialization loading under VTune (I cannot recommend this enough for serious performance analysis): using one of our real data files for testing (over 100 MB, with thousands of individual object instances), the major bottleneck was strcmp, called from the type_id comparison that looks up the per-type information used for tracking and such. This outweighed everything else by a wide margin. I can dig up the details if you're interested, and at some point I will need to look at this again and try to optimise it. This was using 1.32.0, so take the above with a grain of salt *if* this area has changed significantly.
A couple of observations. I believe Intel processors have hardware instructions that implement strcmp, and that compilers support them. So even if strcmp is the bottleneck, I wouldn't expect it to show up in the profiler unless some sort of inlining were turned off. Or maybe the VTune profiler has special provision for these cases somewhere.

I did check to verify that the strcmp in the type-id lookup has been removed. Instead, we just make sure there is only one instance of a particular extended_type_info record, so that we can simply compare the addresses. There are still some optimizations to be implemented, but I can't predict how much they will speed anything up.

Robert Ramey
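
To illustrate the change described above (replacing a string comparison with an address comparison once each type has exactly one record), here is a minimal sketch. It is not the Boost.Serialization implementation: `type_record`, `record_for`, `same_type_by_name`, and `same_type_by_address` are hypothetical names invented for this example, and it assumes a single executable with no shared-library boundaries, where a function-local static gives one instance per type.

```cpp
#include <cassert>
#include <cstring>
#include <typeinfo>

// Hypothetical stand-in for a per-type record (NOT Boost's extended_type_info).
struct type_record {
    const char* key;  // a name identifying the type
};

// One record per T: the function-local static is constructed once, so every
// call for the same T returns a reference to the same object.
// Assumption: a single binary, no duplicate instantiations across DLLs.
template <class T>
const type_record& record_for() {
    static const type_record instance{typeid(T).name()};
    return instance;
}

// Name-based comparison: walks both strings on every lookup.
inline bool same_type_by_name(const type_record& a, const type_record& b) {
    return std::strcmp(a.key, b.key) == 0;
}

// Address-based comparison: valid only because each record is unique.
inline bool same_type_by_address(const type_record& a, const type_record& b) {
    return &a == &b;
}

// Two illustrative types.
struct Foo {};
struct Bar {};

// Small usage example: both comparisons agree for the same type,
// and distinct types have distinct records.
inline void self_check() {
    assert(same_type_by_name(record_for<Foo>(), record_for<Foo>()));
    assert(same_type_by_address(record_for<Foo>(), record_for<Foo>()));
    assert(!same_type_by_address(record_for<Foo>(), record_for<Bar>()));
}
```

The address comparison is a single pointer test regardless of how long the type names are, which is why it can remove strcmp from a hot lookup path. How much it helps in practice would still need to be measured with a profiler.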