Using boost::thread to keep all cores busy

I have a new 2-core CPU and I've been experimenting with boost::thread, trying to keep both cores busy independently of each other. Say I have a queue of 4 tasks to do. Each task will be handled by one thread. Some tasks will take longer to complete than others, so I can't use thread_group with 2 threads at a time and join_all(), as one core may finish before the other and sit idle until the second thread completes. I use boost::thread::hardware_concurrency() to determine how many CPU cores I have available.

My problem is *knowing* when a thread/CPU core finishes a task and is available for another task. Here is some sample code. I commented what I think would be an approach. Any suggestions or hints?

    #include <boost/bind.hpp>
    #include <boost/thread.hpp>
    #include <iostream>
    #include <queue>
    #include <string>

    std::queue<std::string> string_queue()
    {
        std::queue<std::string> sq;
        sq.push("AAA");
        sq.push("BBB");
        sq.push("CCC");
        sq.push("DDD");
        return sq;
    }

    void worker(std::string& S)
    {
        std::cout << S << std::endl;
    }

    int main()
    {
        std::cout << "Number of Cores: " << boost::thread::hardware_concurrency() << std::endl;
        std::queue<std::string> strings = string_queue();
        while ( !strings.empty() )
        {
            // if ( a CPU core is available )
            // {
            boost::thread t(boost::bind(&worker, strings.front()));
            t.join();
            strings.pop();
            // }
            // else
            // {
            //     wait a while and check if a CPU core is available again.
            // }
        }
        return 0;
    }

On 07/26/2010 03:00 PM, Internet Retard wrote:
I have a new 2 core CPU and I've been experimenting with boost::threads by attempting to keep both cores busy independent of each other.
Using multiple threads in your application does not guarantee that both of your cores will be used. Most likely they will be, but it is not 100% certain unless:
1) The threads do not interfere with each other.
2) You set a CPU affinity for each of your threads.
Say I have a queue of 4 tasks to do. Each task will be handled by one thread. Some tasks will take longer to complete than others, so I can't use thread_group with 2 threads at a time and join_all(), as one core may finish before the other and sit idle until the second thread completes. I use boost::thread::hardware_concurrency() to determine how many CPU cores I have available. My problem is *knowing* when a thread/CPU core finishes a task and is available for another task. Here is some sample code. I commented what I think would be an approach. Any suggestions or hints?
What you can do is use the following pattern:

1) Insert your jobs into a thread-safe container (or write a wrapper around a std one).
2) Launch *all* your threads, giving each of them a reference to that container.
3) Each thread then runs this main loop (pseudocode):

    Job* job = NULL;
    while( job = container.extract_job() )
    {
        work on job;
    }

Regards
Gaetano Mendola
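As a rough illustration of the pattern above, here is a minimal sketch of a wrapper around std::queue together with the worker loop from the pseudocode. It uses std::mutex from C++11 in place of the Boost equivalents so that it compiles without Boost. The names `Job`, `JobQueue`, `push`, and `worker_loop` are hypothetical and not from this thread; only `extract_job` echoes the pseudocode, and returning a null pointer for "queue empty" is one possible choice among several.

```cpp
#include <memory>
#include <mutex>
#include <queue>
#include <string>

// Hypothetical job type for illustration; the thread does not define one.
typedef std::string Job;

// Hypothetical thread-safe wrapper around std::queue.
class JobQueue {
public:
    void push(const Job& job) {
        std::lock_guard<std::mutex> lock(mutex_);
        jobs_.push(job);
    }

    // Returns the next job, or a null pointer when the queue is empty.
    // The emptiness check and the pop happen under the same lock, so two
    // threads cannot both claim the same job.
    std::unique_ptr<Job> extract_job() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (jobs_.empty()) return std::unique_ptr<Job>();
        std::unique_ptr<Job> job(new Job(jobs_.front()));
        jobs_.pop();
        return job;
    }

private:
    std::mutex mutex_;
    std::queue<Job> jobs_;
};

// Worker body: keep pulling jobs until extract_job() returns null.
// Returns how many jobs this worker handled.
int worker_loop(JobQueue& queue) {
    int handled = 0;
    while (std::unique_ptr<Job> job = queue.extract_job()) {
        // "work on job" would go here, using *job
        ++handled;
    }
    return handled;
}
```

Each thread would be started with `worker_loop` and a reference to the same `JobQueue`; a thread that finishes a short job simply comes back for the next one, so no core waits on the other.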

Using multiple threads in your application does not guarantee that both of your cores will be used. Most likely they will be, but it is not 100% certain unless: 1) The threads do not interfere with each other. 2) You set a CPU affinity for each of your threads.
Yes. The threads will not share data and will not interact in any way.
what you can do is the following pattern:
1) Insert your jobs in a thread safe container (or do a wrapper around an std one). 2) Launch *all* your threads giving them a reference to the above container.
All at once? What if I have millions of tasks rather than a few? I want to be working on 2 tasks all the time. If my queue is not empty, then as a thread ends, it sees there are more tasks in the queue and gets one.
3) Each thread then runs this main loop (pseudocode): Job* job = NULL; while( job = container.extract_job() ) { work on job; }

On 7/26/2010 9:37 AM, Internet Retard wrote:
All at once? What if I have millions of tasks rather than a few? I want to be working on 2 tasks all the time. If my queue is not empty, then as a thread ends, it sees there are more tasks in the queue and gets one.
You only launch *two* tasks. They keep going back to the queue to get new work when they finish their current item.

You only launch *two* tasks. They keep going back to the queue to get new work when they finish their current item.

---

Perhaps something like this then:

    while ( !jobs.empty() )
    {
        if ( thread_count < cores )
        {
            boost::thread t(boost::bind(&worker, jobs.front()));
            jobs.pop();
            t.join();
        }
    }

What is the correct way to determine the number of running/busy threads (thread_count)?

What is the correct way to determine the number of running/busy threads (thread_count)?

I think you need to go buy a basic book on threaded programming. This list is not that book.

----

Not sure of the reason for that comment. Mine seems a logical approach and is almost working. My only other experience with threading was with Python, which has a similar function called threading.active_count(). So my logic (while maybe wrong) is to:

1. Get the number of cores using boost::thread::hardware_concurrency().
2. Fill a std::queue container with jobs for the threads.
3. While the container is not empty and there are fewer active threads than boost::thread::hardware_concurrency(), start a new thread.

Again, not sure why that would deserve a condescending response; it seems logical.

On 7/26/2010 11:10 AM, Internet Retard wrote:
1. Get the number of cores using boost::thread::hardware_concurrency()
Correct.
2. Fill a std::queue container with jobs for the threads.
Correct.
3. While the container is not empty and there are less active threads than boost::thread::hardware_concurrency(), then start a new thread.
Incorrect.

In the main routine: just start N threads, where N is the number retrieved from (1).

Inside those threads: loop, pulling a job from the queue and processing it. Return when the queue is empty.

Inside the main routine, meanwhile, just call join() on each thread object. When all threads are joined, you are done.

There's no point in continually creating and destroying threads (which is expensive), since you only ever need 2.
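The start-N-then-join structure described above might look roughly like the sketch below. It uses std::thread from C++11 rather than boost::thread so that it builds without Boost; the helper name `process_all`, its signature, and the fallback of 2 threads when the core count is unknown are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: drains `jobs` with a fixed set of worker threads
// and returns how many jobs were taken from the queue.
std::size_t process_all(std::queue<std::string> jobs,
                        const std::function<void(const std::string&)>& work) {
    // hardware_concurrency() may return 0 when the count is unknown;
    // falling back to 2 is an arbitrary choice for this sketch.
    unsigned n = std::thread::hardware_concurrency();
    if (n == 0) n = 2;

    std::mutex mutex;            // guards `jobs` and `taken`
    std::size_t taken = 0;

    auto worker = [&]() {
        for (;;) {
            std::string job;
            {
                std::lock_guard<std::mutex> lock(mutex);
                if (jobs.empty()) return;   // queue drained: thread exits
                job = jobs.front();
                jobs.pop();
                ++taken;
            }
            work(job);   // run the job outside the lock
        }
    };

    // Start N threads once, then wait for all of them to finish.
    std::vector<std::thread> threads;
    for (unsigned i = 0; i < n; ++i) threads.emplace_back(worker);
    for (std::thread& t : threads) t.join();
    return taken;
}
```

The threads are created once and reused for every job, which avoids the repeated create/destroy cost mentioned above. The main routine never needs to know how many threads are busy at a given moment; it only waits in join().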

On 07/26/2010 05:38 PM, Internet Retard wrote:
Perhaps something like this then:
    while ( !jobs.empty() )
    {
        if ( thread_count < cores )
        {
            boost::thread t(boost::bind(&worker, jobs.front()));
            jobs.pop();
            t.join();
        }
    }
This approach is incorrect. Consider what can happen when, for example, the jobs container holds only its last job: thread A performs the jobs.empty() check and it passes; before thread A executes jobs.front(), another thread B performs the jobs.empty() check and it passes as well; then thread A calls jobs.front(), and thread B calls jobs.front() too, so both threads end up with the same job.

That's why I suggested doing:

    while( job = jobs.front() )
    {
        work on job
    }

where jobs.front() then has to return a null pointer, throw an exception, or otherwise signal that the queue is empty.

And as others suggested, instead of using this list as a reference for learning multi-threaded programming, buy a good book about concurrent programming. Believe me, multi-threading is very subtle.

Regards
Gaetano Mendola

This approach is incorrect. Consider what can happen when, for example, the jobs container holds only its last job:
thread A performs the jobs.empty() check and it passes; before thread A executes jobs.front(), another thread B performs the jobs.empty() check and it passes as well; then thread A calls jobs.front(), and thread B calls jobs.front() too.
Thanks Gaetano and others for the helpful comments. I have working thread safe code now using boost threads. I learned a lot from your feedback. You are right, threads are subtle and require careful thought and attention.

On Mon, Jul 26, 2010 at 8:41 AM, Eric J. Holtman
On 7/26/2010 9:37 AM, Internet Retard wrote:
All at once? What if I have millions of tasks rather than a few? I want to be working on 2 tasks all the time. If my queue is not empty, then as a thread ends, it sees there are more tasks in the queue and gets one.
You only launch *two* tasks. They keep going back to the queue to get new work when they finish their current item.
You might want to look at the threadpool project: http://threadpool.sourceforge.net/ This is built on boost::thread and has a very nice implementation for queuing work for threads. It has various types of queues (priority, fifo, etc).
participants (4)
- Eric J. Holtman
- Gaetano Mendola
- Internet Retard
- James C. Sutherland