Kim Kuen tang <kuentang@vodafone.de> writes:
I've written a small program that estimates the mean of a series of random numbers. The random numbers are generated both with and without boost::thread.
The simulation shows that with boost::thread you get only a small performance gain.
The problem is not with boost::thread, but with false sharing. The results[] array is all in a single cache line, so that cache line has to bounce between the threads on every write. If you change the SimClass operator() to accumulate into a local variable and store only the final result in data_, you should see a much bigger performance improvement.
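As a rough illustration of that change, here is a minimal sketch. Your original code isn't reproduced here, so this SimClass is only a hypothetical stand-in: the constructor arguments, the seed handling and the estimate_mean() helper are my assumptions, and I've used std::thread rather than boost::thread so the sketch needs nothing beyond the standard library. The point is only that the loop touches a local variable and the shared results[] slot is written once per thread.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <thread>
#include <vector>

// Hypothetical stand-in for the poster's SimClass (the real one is not shown).
// Each instance sums `samples` uniform random numbers in [0, 1).
class SimClass {
public:
    SimClass(double& data, std::size_t samples, unsigned seed)
        : data_(data), samples_(samples), seed_(seed) {}

    void operator()() {
        std::mt19937 rng(seed_);
        std::uniform_real_distribution<double> dist(0.0, 1.0);
        double local_sum = 0.0;  // private to this thread, not in the shared cache line
        for (std::size_t i = 0; i < samples_; ++i)
            local_sum += dist(rng);
        data_ = local_sum;       // one single write to the shared slot at the end
    }

private:
    double& data_;               // refers to this thread's element of results[]
    std::size_t samples_;
    unsigned seed_;
};

// Assumed helper: runs num_threads simulations and averages all samples.
double estimate_mean(std::size_t samples_per_thread, unsigned num_threads,
                     unsigned seed) {
    assert(num_threads > 0);
    // Adjacent doubles very likely share a cache line, but each is now
    // written only once, so the line no longer bounces inside the hot loop.
    std::vector<double> results(num_threads, 0.0);
    std::vector<std::thread> threads;
    threads.reserve(num_threads);
    for (unsigned t = 0; t < num_threads; ++t)
        threads.emplace_back(SimClass(results[t], samples_per_thread, seed + t));
    for (auto& th : threads)
        th.join();

    double total = 0.0;
    for (double r : results)
        total += r;
    return total / (static_cast<double>(samples_per_thread) * num_threads);
}
```

Calling something like estimate_mean(1000000, 4, 42) should return a value close to 0.5. The slow variant would be the same loop with `data_ += dist(rng);` in place of the local accumulation, which writes to the shared cache line on every iteration.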
So my question is, how much performance can you get for simulation with boost::thread?
If you've got too many threads, and too much synchronization or false sharing, then you can get a performance drop. If you do things right, you can get almost an N-fold performance gain, where N is the number of hardware threads in your machine (typically number of cores per processor * number of processors).

Anthony
--
Anthony Williams | Just Software Solutions Ltd
Custom Software Development | http://www.justsoftwaresolutions.co.uk
Registered in England, Company Number 5478976.
Registered Office: 15 Carrallack Mews, St Just, Cornwall, TR19 7UL