Hi all,

I've written a small program for estimating the mean from a series of random numbers. The random numbers are generated with and without boost::thread. The simulation shows me that with boost::thread you only get a small performance gain.

So my question is, how much performance can you get for simulation with boost::thread?

With best regards,
Kim Tang

My code is shown below:

#include <iostream>
#include <fstream>
#include <ctime>            // std::time
#include <climits>
#include <boost/timer.hpp>
#include <boost/random/linear_congruential.hpp>
#include <boost/random/uniform_int.hpp>
#include <boost/random/uniform_real.hpp>
#include <boost/random/variate_generator.hpp>
#include <boost/random/normal_distribution.hpp>
#include <boost/thread/thread.hpp>

// Sun CC doesn't handle boost::iterator_adaptor yet
#if !defined(__SUNPRO_CC) || (__SUNPRO_CC > 0x530)
#include <boost/generator_iterator.hpp>
#endif

#ifdef BOOST_NO_STDC_NAMESPACE
namespace std {
  using ::time;
}
#endif

typedef boost::minstd_rand base_generator_type;
typedef boost::variate_generator<base_generator_type&, boost::normal_distribution<> > NormalRandomNumber;

class SimClass
{
public:
  SimClass(double& data, NormalRandomNumber& generator, std::size_t number)
    : data_(data), generator_(generator), number_(number) {};

  void operator()()
  {
    data_ = 0.0;
    for (std::size_t i = 0; i < number_; i++)
      data_ += generator_();
  };

private:
  double& data_;
  NormalRandomNumber generator_;
  std::size_t number_;
};

int main()
{
  double mean = 0.0;
  const std::size_t NUM_OF_THREADS = 4;
  const std::size_t NUM_CALCS = 10000000;
  const std::size_t NSIMULATION = NUM_CALCS * NUM_OF_THREADS;

  base_generator_type generator(42u);

  std::cout << NSIMULATION << " samples of a normal distribution :\n";

  boost::normal_distribution<> norm_dist(0, 1);
  NormalRandomNumber normalRN(generator, norm_dist);

  std::cout.setf(std::ios::fixed);

  // You can now retrieve random numbers from that distribution by means
  // of a STL Generator interface, i.e. calling the generator as a zero-
  // argument function.
  boost::timer cTime;
  for (int i = 0; i < NSIMULATION; i++)
    mean += normalRN();
  std::cout << "estimated mean is: " << mean / NSIMULATION << '\n';
  std::cout << "time to elapsed:" << cTime.elapsed() << "\n";

  std::cout << NSIMULATION << " samples of a normal distribution with " << NUM_OF_THREADS << " threads:\n";
  cTime.restart();
  double results[NUM_OF_THREADS];
  boost::thread_group thrds;
  for (int i = 0; i < NUM_OF_THREADS; ++i)
    thrds.create_thread(SimClass(results[i], normalRN, NUM_CALCS));
  thrds.join_all();

  double result = 0.0;
  for (std::size_t i = 0; i < NUM_OF_THREADS; ++i)
    result += results[i];
  std::cout << "estimated mean is: " << result / NSIMULATION << "\n";
  std::cout << "time to elapsed:" << cTime.elapsed() << "\n";

  return 0;
}
Kim Kuen tang <kuentang@vodafone.de> writes:
> I've written a small program for estimating the mean from a series of random numbers. The random numbers are generated with and without boost::thread.
> The simulation shows me that with boost::thread you only get a small performance gain.
The problem is not with boost::thread, but with false sharing. The results[] array fits in a single cache line, and each thread writes to its own element through data_ on every loop iteration, so that cache line has to bounce between the cores with each write. If you change SimClass::operator() to accumulate into a local variable and store only the final result in data_, you should get a much better speedup.
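The suggestion above can be sketched roughly as follows. This is a minimal illustration, not the code from the thread: it assumes C++11 and swaps in std::thread and <random> for the Boost equivalents so that it builds without Boost, and the names sum_samples and parallel_mean are hypothetical. It also gives every thread its own engine, which goes beyond the advice above; in the posted code all SimClass copies hold a reference to the same base generator. The seeds 42 + t are a simplification and do not give statistically independent streams.

```cpp
#include <cstddef>
#include <functional>
#include <random>
#include <thread>
#include <vector>

// Sum `count` standard-normal samples. The running total lives in a local
// variable, so the hot loop never writes to memory shared with other threads.
void sum_samples(double& out, unsigned seed, std::size_t count) {
    std::minstd_rand engine(seed);                 // private engine (assumption)
    std::normal_distribution<double> dist(0.0, 1.0);
    double local = 0.0;
    for (std::size_t i = 0; i < count; ++i)
        local += dist(engine);
    out = local;                                   // one write to the shared array
}

// Estimate the mean with `num_threads` workers drawing `per_thread` samples each.
double parallel_mean(std::size_t num_threads, std::size_t per_thread) {
    std::vector<double> results(num_threads, 0.0);
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < num_threads; ++t)
        workers.emplace_back(sum_samples, std::ref(results[t]),
                             static_cast<unsigned>(42 + t), per_thread);
    for (std::thread& w : workers)
        w.join();
    double total = 0.0;
    for (double r : results)
        total += r;
    return total / static_cast<double>(num_threads * per_thread);
}
```

Calling parallel_mean(4, 10000000) corresponds to the thread count and sample count in the posted program. The results elements still share a cache line here, but each one is now written only once per thread, so the line no longer bounces during the loop.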
> So my question is, how much performance can you get for simulation with boost::thread?
If you've got too many threads, and too much synchronization or false sharing, then you can get a performance drop. If you do things right, you can get almost an N-fold performance gain, where N is the number of hardware threads in your machine (typically the number of cores per processor times the number of processors).

Anthony
-- 
Anthony Williams | Just Software Solutions Ltd
Custom Software Development | http://www.justsoftwaresolutions.co.uk
Registered in England, Company Number 5478976.
Registered Office: 15 Carrallack Mews, St Just, Cornwall, TR19 7UL
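As a rough illustration of picking N from the machine rather than hard-coding 4, the sketch below queries the number of hardware threads and divides the samples evenly. It assumes C++11 and uses std::thread::hardware_concurrency(); boost::thread offers a static hardware_concurrency() as well. The helper names worker_count and split_work are hypothetical and do not come from the thread.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Number of worker threads to launch. hardware_concurrency() is only a hint
// and may return 0 when the value cannot be determined, so fall back to 1.
std::size_t worker_count() {
    unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? 1 : n;
}

// Split `total` samples into `parts` chunks whose sizes differ by at most one.
// `parts` must be greater than zero.
std::vector<std::size_t> split_work(std::size_t total, std::size_t parts) {
    std::vector<std::size_t> chunks(parts, total / parts);
    for (std::size_t i = 0; i < total % parts; ++i)
        ++chunks[i];                               // hand out the remainder
    return chunks;
}
```

With split_work(NSIMULATION, worker_count()), each thread gets one chunk size, and the chunk sizes add up to the original sample count whatever the core count is.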
participants (2)
- Anthony Williams
- Kim Kuen tang