
I have been testing thread performance on Linux and Mac. My Linux system has two dual-core processors and my Mac has one dual-core processor. Both are Intel chips. For the code snippet given below, the execution time should ideally decrease as the number of threads increases. However, I observe the opposite trend. For example, compiling with -O3 on my Linux desktop produces the following timings:

1 thread:  0.66 sec
2 threads: 0.9 sec
3 threads: 1.2 sec
4 threads: 1.4 sec

I do not have a lot of experience with threads and was wondering whether this result surprises anyone?

James

-----
#include <boost/thread.hpp>
#include <cassert>
#include <iostream>
#include <vector>
#include <time.h>

struct MyStruct
{
  explicit MyStruct(const int i) : tag(i) {}

  // Busy work: repeatedly fill a small vector.
  void operator()() const
  {
    const int n = 100;
    std::vector<int> nums(n,0);
    for( int j=0; j<1000000; ++j )
      for( int i=0; i<n; ++i )
        nums[i] = i+tag;
  }

private:
  int tag;
};

int main()
{
  using namespace std;

  const int nTasks   = 12;
  const int nThreads = 4;

  assert( nTasks%nThreads == 0 );
  assert( nThreads <= nTasks );

  cout << "Executing " << nTasks << " tasks using " << nThreads << " threads." << endl;

  const clock_t t1 = clock();

  // Run the tasks in batches of nThreads; the inner loop advances itask.
  for( int itask=0; itask<nTasks; ){
    boost::thread_group threads;
    for( int i=0; i<nThreads; ++i ){
      threads.create_thread( MyStruct(itask++ + 100) );
    }
    threads.join_all();
  }

  const clock_t t2 = clock();
  cout << "time: " << double(t2-t1)/CLOCKS_PER_SEC << endl;
}
-----
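
P.S. In case it matters, here is a sketch of how I could time elapsed (wall-clock) time instead of using clock(); wall_seconds is just a helper name I made up, and I have not actually run the tests with it yet. It uses POSIX gettimeofday(), which should be available on both machines:

-----
#include <sys/time.h>

// Hypothetical helper (not part of the snippet above): current
// wall-clock time in seconds, via POSIX gettimeofday().
double wall_seconds()
{
  timeval tv;
  gettimeofday( &tv, 0 );
  return tv.tv_sec + 1e-6*tv.tv_usec;
}
-----

With this, t1 and t2 in main() would become doubles returned by wall_seconds() and the reported time would be t2-t1.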