
Slawomir Lisznianski wrote: [...]
During tests involving manipulations of shared_ptr variables similar to those performed by algorithms on container elements – their copying and removal – a significant overhead was measured. Care was taken to ensure that the timings included neither the creation nor the destruction of the objects managed by the shared_ptrs.
One way to customize shared_ptr without breaking existing code would be to introduce an optional template parameter specifying the locking policy.
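As a rough illustration of the optional locking-policy parameter proposed above, here is a minimal sketch. All names in it (counted_ptr, single_threaded_policy, multi_threaded_policy) are hypothetical and not part of Boost. It assumes a C++11 compiler for std::atomic, and it defaults to the thread-safe policy so that code which does not name a policy keeps its current behavior.

```cpp
#include <atomic>

// Hypothetical policy: plain counter, no synchronization.
struct single_threaded_policy {
    using counter_type = long;
    static void increment(counter_type& c) { ++c; }
    static long decrement(counter_type& c) { return --c; }
    static long load(const counter_type& c) { return c; }
};

// Hypothetical policy: atomic counter, safe to share across threads.
struct multi_threaded_policy {
    using counter_type = std::atomic<long>;
    static void increment(counter_type& c) {
        c.fetch_add(1, std::memory_order_relaxed);
    }
    static long decrement(counter_type& c) {
        // fetch_sub returns the previous value, so subtract one more.
        return c.fetch_sub(1, std::memory_order_acq_rel) - 1;
    }
    static long load(const counter_type& c) {
        return c.load(std::memory_order_acquire);
    }
};

// Simplified reference-counted pointer; the second parameter is optional.
template <class T, class LockingPolicy = multi_threaded_policy>
class counted_ptr {
    struct control_block {
        typename LockingPolicy::counter_type count;
        T* object;
        explicit control_block(T* p) : count(1), object(p) {}
    };
    control_block* cb_;

    // Drop one reference; destroy the object when the last one goes away.
    void release() {
        if (cb_ != nullptr && LockingPolicy::decrement(cb_->count) == 0) {
            delete cb_->object;
            delete cb_;
        }
    }

public:
    counted_ptr() : cb_(nullptr) {}
    explicit counted_ptr(T* p) : cb_(new control_block(p)) {}
    counted_ptr(const counted_ptr& r) : cb_(r.cb_) {
        if (cb_ != nullptr) LockingPolicy::increment(cb_->count);
    }
    counted_ptr& operator=(const counted_ptr& r) {
        // Increment first so that self-assignment stays safe.
        if (r.cb_ != nullptr) LockingPolicy::increment(r.cb_->count);
        release();
        cb_ = r.cb_;
        return *this;
    }
    ~counted_ptr() { release(); }

    T* get() const { return cb_ != nullptr ? cb_->object : nullptr; }
    long use_count() const {
        return cb_ != nullptr ? LockingPolicy::load(cb_->count) : 0;
    }
};
```

With this shape, counted_ptr<int> keeps the synchronized count, while counted_ptr<int, single_threaded_policy> opts out of it. One consequence is that the two instantiations are distinct, unrelated types.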
Attached you will find the test code, which was run on a win32 platform and produced the following results on an Intel Pentium 4 2.66 GHz single-processor machine. Each line represents a separate test iteration, so averages can be calculated:
[...]
    std::ostream& _M_out;
    LARGE_INTEGER _M_startEvent;
    LARGE_INTEGER _M_endEvent;
    LARGE_INTEGER _M_frequency;
Please note that identifiers that start with _M (an underscore followed by an uppercase letter) are reserved for the implementation, as are identifiers containing a double underscore.
};
void run()
{
    boost::shared_ptr<int> ptrA__(new int(0)), ptrB__, ptrC__;
    Timer timer__(std::cout);
    for (int i=0; i<4000000; ++i)
    {
        ptrB__ = ptrA__;
        ptrC__ = ptrB__;
    }
}
Thank you for the test. I was able to confirm your results on an AMD Athlon 1.4. However, you have to agree that your test code isn't very realistic, or to be precise, it's very unrealistic. ;-)

I was able to cut both single- and multithreaded times to 20ms by replacing shared_count::operator= as shown below:

    shared_count & operator= (shared_count const & r) // nothrow
    {
        sp_counted_base * tmp = r.pi_;

        if(tmp != pi_)
        {
            if(tmp != 0) tmp->add_ref_copy();
            if(pi_ != 0) pi_->release();
            pi_ = tmp;
        }

        return *this;
    }

That's because you are measuring a tight cycle of no-ops. While it would be trivial to modify the test to avoid this particular optimization, I'd appreciate it if you can produce a test sample that is derived from a real code base that uses shared_ptr extensively.

That said, your test, when rerun with the "next release shared_count" (proof of concept available at http://www.pdimov.com/cpp/shared_count_x86_exp2.hpp ) produces

    BOOST_HAS_THREADS is: TRUE
    Elapsed time: 254150 microseconds
    Elapsed time: 225076 microseconds
    Elapsed time: 225000 microseconds
    Elapsed time: 224875 microseconds
    Elapsed time: 226152 microseconds
    Elapsed time: 224947 microseconds
    Elapsed time: 225070 microseconds
    Elapsed time: 229008 microseconds
    Elapsed time: 227057 microseconds
    Elapsed time: 224856 microseconds
    Press any key to continue

I find this (~3x instead of 10x) slightly less alarming. ;-)
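To make the "tight cycle of no-ops" remark concrete, the sketch below counts reference-count updates for the same ptrB = ptrA; ptrC = ptrB loop, with and without the tmp != pi_ check. The types counted_base and handle are hypothetical stand-ins for sp_counted_base and shared_count. The unchecked variant assumes an assignment that always increments the source and releases the target. Nothing is ever destroyed here, since the counter lives on the stack.

```cpp
#include <cstddef>

// Hypothetical stand-in for sp_counted_base that records every update.
struct counted_base {
    long use_count = 1;
    std::size_t updates = 0;  // number of add_ref_copy/release calls
    void add_ref_copy() { ++use_count; ++updates; }
    void release() { --use_count; ++updates; }
};

// Hypothetical stand-in for shared_count: just the control-block pointer.
struct handle {
    counted_base* pi_ = nullptr;
};

// Assumed "before" behavior: always touch both counts.
void assign_unchecked(handle& lhs, const handle& rhs) {
    counted_base* tmp = rhs.pi_;
    if (tmp != nullptr) tmp->add_ref_copy();
    if (lhs.pi_ != nullptr) lhs.pi_->release();
    lhs.pi_ = tmp;
}

// Same logic as the replacement operator= above: skip when both sides
// already share the same control block.
void assign_checked(handle& lhs, const handle& rhs) {
    counted_base* tmp = rhs.pi_;
    if (tmp != lhs.pi_) {
        if (tmp != nullptr) tmp->add_ref_copy();
        if (lhs.pi_ != nullptr) lhs.pi_->release();
        lhs.pi_ = tmp;
    }
}

// Runs the benchmark loop and returns how many count updates it caused.
std::size_t count_updates(bool checked, int iterations) {
    counted_base block;
    handle a, b, c;
    a.pi_ = &block;
    for (int i = 0; i < iterations; ++i) {
        if (checked) {
            assign_checked(b, a);
            assign_checked(c, b);
        } else {
            assign_unchecked(b, a);
            assign_unchecked(c, b);
        }
    }
    return block.updates;
}
```

In this sketch the checked loop performs 2 updates in total, both in the first iteration, no matter how long it runs. The unchecked loop performs 4n - 2 updates for n iterations, which is 15,999,998 for the 4,000,000 iterations of the test. Every update after the first iteration leaves the count unchanged on balance, which is why the check removes nearly all of the measured work.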