boost multithread use all cores in parallel ?

Hi,
How can I make sure that a multithreaded C++ program is run by ALL cores on a multicore server, so that the computing is done in parallel physically (not logically)?
I have a multicore server, which has 24 CPUs, each of which has 6 cores.
It is an Intel Xeon X5650 2.67GHz:
cpu cores : 6 (support 6 threads)
cpu MHz : 1596.000
In total, I have 24 * 6 = 144 cores.
I designed a multithreaded C++ program with boost/thread.
How can I make sure that my program runs on all 144 cores?
Any help is really appreciated.
thanks

That is the job of the OS thread scheduler. But your program has to have at least 144 threads to utilise all 144 cores.
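A minimal sketch of that advice: ask the runtime how many hardware threads it reports, start one worker per hardware thread, and leave core placement to the OS scheduler. This uses std::thread from C++11 instead of boost::thread, which is an assumption made only to keep the example self-contained. The function name parallel_sum and the workload are hypothetical.

```cpp
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical workload: sum the integers in [0, n) using one worker per
// hardware thread. The OS scheduler decides which core runs which worker.
unsigned long long parallel_sum(std::size_t n) {
    // hardware_concurrency() may return 0 when the value is not known.
    unsigned workers = std::thread::hardware_concurrency();
    if (workers == 0) workers = 1;

    std::vector<unsigned long long> partial(workers, 0);
    std::vector<std::thread> threads;
    threads.reserve(workers);

    for (unsigned w = 0; w < workers; ++w) {
        threads.emplace_back([w, workers, n, &partial] {
            // Each worker takes a strided slice and writes only its own slot,
            // so no locking is needed.
            unsigned long long local = 0;
            for (std::size_t i = w; i < n; i += workers) local += i;
            partial[w] = local;
        });
    }
    for (auto& t : threads) t.join();

    return std::accumulate(partial.begin(), partial.end(), 0ULL);
}
```

Calling parallel_sum(1000) returns 499500 regardless of how many workers are started. Printing std::thread::hardware_concurrency() on the target machine is also a quick way to check how many hardware threads the program can actually see.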

thanks for your reply.
My program needs to run many (about 10,000+) computing tasks.
Each task's run time is very very short (< 0.1 second or even less).
At each iteration, all these tasks are run in parallel. Some of them need to exchange some data (it is very small) and then go on.
I want to keep all 144 cores as busy as possible so that my program can be done as fast as possible.
So, I want to associate each task with a distinct thread and schedule as many threads as possible. Also, try to balance the workload among these cores.
How can I do that from a programming point of view?
Any help is really appreciated.
thanks
From: boost.lists@gmail.com Date: Sun, 14 Aug 2011 20:29:13 +0300 To: boost-users@lists.boost.org Subject: Re: [Boost-users] boost multithread use all cores in parallel ?
_______________________________________________
Boost-users mailing list
Boost-users@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users

On 08/14/2011 07:38 PM, Jack Bryan wrote:
thanks for your reply. [...snip...] So, I want to associate each task with a distinct thread and schedule as many threads as possible. Also, try to balance the workload among these cores.
How can I do that from a programming point of view?
Hi,
I doubt that you can do this kind of thing at the application (programming) level. Ensuring that all cores are used as well as possible is the responsibility of the operating system, in the person of its scheduler. This is because only the OS can be aware of everything running on the system besides your program. Thus I believe the answer to your question is no. AFAIK the kernel scheduler in Linux does a good job, even if I have never had the occasion to use it on that many cores :)
--
Leo Cacciari
Aliae nationes servitutem pati possunt populi romani est propria libertas

Thanks.
Are there Linux commands that can show the utilization of all cores?
From the "top" command on Linux, I only know the CPU utilization percentage, but I do not know how many cores are used in parallel.
Any help is really appreciated.
thanks
Date: Sun, 14 Aug 2011 20:00:55 +0200 From: leo.cacciari@gmail.com To: boost-users@lists.boost.org Subject: Re: [Boost-users] boost multithread use all cores in parallel ?

Hi,
On my netbook running Ubuntu NBR, there is an app called "System Monitor" which shows CPU History, Memory and Swap History, and Network History. I usually invoke it from the GNOME GUI. Here is the command from bash:
ian@Hawking:~$ gnome-system-monitor &
HTH,
Ian
--
ACCU - Professionalism in programming - http://www.accu.org/

On 14/08/11 02:04 PM, Jack Bryan wrote:
Thanks
Are there Linux commands that can show the utilization of all cores?
From the "top" command on Linux, I only know the CPU utilization percentage, but I do not know how many cores are used in parallel.
Any help is really appreciated.
thanks
Hi,
Use "htop". It shows each core separately.
Arash
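For reading the same per-core numbers from inside a program, Linux exposes cumulative per-core counters in /proc/stat, one "cpuN" line per core. The sketch below parses text in that format and returns the non-idle fraction for each core. The function name busy_fraction_per_core is hypothetical, and treating idle plus iowait as "idle" is an assumption of this sketch.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Parses text in /proc/stat format and returns, for each "cpuN" line, the
// fraction of the counted time during which that core was not idle.
std::vector<double> busy_fraction_per_core(std::istream& in) {
    std::vector<double> result;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string label;
        fields >> label;
        // Keep "cpu0", "cpu1", ...; skip the aggregate "cpu" line and others.
        if (label.size() <= 3 || label.compare(0, 3, "cpu") != 0) continue;

        unsigned long long value = 0, total = 0, idle = 0;
        // Columns: user nice system idle iowait irq softirq steal guest guest_nice
        for (int column = 0; column < 8 && fields >> value; ++column) {
            total += value;  // guest columns are skipped: already part of user/nice
            if (column == 3 || column == 4) idle += value;  // idle + iowait
        }
        result.push_back(total == 0
                             ? 0.0
                             : static_cast<double>(total - idle) / total);
    }
    return result;
}
```

To use it on a live system, pass a std::ifstream opened on /proc/stat. The counters are cumulative since boot, so the current load of each core comes from taking two samples a short time apart and computing the fraction from the differences.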

On Sun, Aug 14, 2011 at 10:38 AM, Jack Bryan wrote:
thanks for your reply.
My program needs to run many (about 10,000+) computing tasks.
Each task's run time is very very short (< 0.1 second or even less).
At each iteration, all these tasks are run in parallel. Some of them need to exchange some data (it is very small) and then go on.
I want to keep all 144 cores as busy as possible so that my program can be done as fast as possible.
So, I want to associate each task with a distinct thread and schedule as many threads as possible. Also, try to balance the workload among these cores.
I don't think Boost has anything that will facilitate efficient use of this hardware. You may wish to take a look at Intel's Threading Building Blocks as a starting point, but I'm not certain if even that is built to scale to 144 cores.
--
Cory Nelson
http://int64.org

On Aug 14, 2011, at 1:38 PM, Jack Bryan wrote:
My program needs to run many (about 10,000+) computing tasks.
Each task's run time is very very short (< 0.1 second or even less).
At each iteration, all these tasks are run in parallel. Some of them need to exchange some data (it is very small) and then go on.
I want to keep all 144 cores as busy as possible so that my program can be done as fast as possible.
So, I want to associate each task with a distinct thread and schedule as many threads as possible. Also, try to balance the workload among these cores.
How can I do that from a programming point of view?
A "thread pool" is the usual solution. You might want to try this one (based on boost): http://threadpool.sourceforge.net/
I haven't tried it myself.
If you mean the tasks need to exchange data *with each other*, then that gets more complicated and you might need to look at MPI or something similar. However, if you mean they just block for I/O, then you might be able to run more threads than cores.
Cheers
Gordon
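To illustrate the thread pool idea, here is a minimal sketch in which a fixed set of worker threads pulls short tasks from a shared queue, so the work is balanced across threads without creating one thread per task. It is not the library linked above: it uses only the C++11 standard library, and the names ThreadPool, submit and run_small_tasks are hypothetical.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Minimal fixed-size thread pool: workers take tasks from a shared queue.
class ThreadPool {
public:
    explicit ThreadPool(std::size_t workers) {
        if (workers == 0) workers = 1;
        for (std::size_t i = 0; i < workers; ++i)
            threads_.emplace_back([this] { run(); });
    }

    // Lets the workers drain the queue, then joins them.
    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        ready_.notify_all();
        for (auto& t : threads_) t.join();
    }

    ThreadPool(const ThreadPool&) = delete;
    ThreadPool& operator=(const ThreadPool&) = delete;

    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        ready_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to do
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so other workers can dequeue
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> threads_;  // declared last: started after the rest
};

// Hypothetical driver: run `count` tiny tasks and report how many completed.
std::size_t run_small_tasks(std::size_t count, std::size_t workers) {
    std::atomic<std::size_t> done{0};
    {
        ThreadPool pool(workers);
        for (std::size_t i = 0; i < count; ++i)
            pool.submit([&done] { done.fetch_add(1, std::memory_order_relaxed); });
    }  // the pool destructor waits for every queued task
    return done.load();
}
```

With 10,000 very short tasks, a single locked queue can itself become the bottleneck on a large machine, which is one reason libraries such as TBB use per-thread queues with work stealing. Submitting tasks in batches is a simple way to reduce contention in a sketch like this one.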

----- Original message -----
On Aug 14, 2011, at 1:38 PM, Jack Bryan wrote:
My program needs to run many (about 10,000+) computing tasks.
I want to keep all 144 cores as busy as possible so that my program can be done as fast as possible.
How can I do that from a programming point of view?
A "thread pool" is the usual solution. You might want to try this one (based on boost):
I've tried it some time ago and wasn't too happy with it. I have to admit I didn't try really hard, but I usually had cores unused. I'm not sure if this was a problem of the threadpool or of my code, but anyway, from browsing their CVS repository on SF the project looks pretty dead, since the last change was two years ago.
With that number of cores it might be really worth trying Intel TBB. For my part, I'm only working on Windows and with normal desktop CPUs, so I'm currently using Microsoft's Concurrency Runtime and Parallel Patterns Library, which work well enough for my tasks.
Norbert

On Aug 15, 2011, at 5:23 AM, Norbert Wenzel wrote:
I've tried it some time ago and wasn't too happy with it. I have to admit I didn't try really hard, but I usually had cores unused. I'm not sure if this was a problem of the threadpool or of my code, but anyway, from browsing their CVS repository on SF the project looks pretty dead, since the last change was two years ago.
Hmm, thanks for the report. I must admit I've seen not-great performance from Boost.Thread synchronization too.
Just to steer this back toward Boost topics, note there is also Oliver Kowalke's Task library in the works. Since it uses Context, it should be lighter weight.
Sounds like TBB is the way to go for now though.
Cheers
Gordon

GCC's OpenMP implementation, plus a boost::mutex on shared variables,
worked great for me (8 cores fully loaded).
If your code relies on STL algorithms, that could be a way to go.
Nil
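A minimal sketch of that approach, assuming GCC with the -fopenmp flag: the loop is split across threads by OpenMP and a mutex guards the shared variable. It uses std::mutex rather than boost::mutex to stay self-contained, and the function name count_even is hypothetical. Without -fopenmp the pragmas are ignored and the code runs serially with the same result.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical example: count the even values in `data`. OpenMP divides the
// loop iterations among threads; the shared total is protected by a mutex.
std::size_t count_even(const std::vector<int>& data) {
    std::size_t total = 0;   // shared between threads
    std::mutex total_mutex;  // guards `total`

    #pragma omp parallel
    {
        std::size_t local = 0;  // private to each thread
        #pragma omp for nowait
        for (long i = 0; i < static_cast<long>(data.size()); ++i) {
            if (data[static_cast<std::size_t>(i)] % 2 == 0) ++local;
        }
        // Lock once per thread rather than once per element.
        std::lock_guard<std::mutex> lock(total_mutex);
        total += local;
    }
    return total;
}
```

Accumulating into a thread-private counter and locking only once per thread keeps the mutex off the hot path. For a plain sum like this one, an OpenMP reduction clause would avoid the mutex entirely.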
On Mon, Aug 15, 2011 at 7:27 PM, Gordon Woodhull wrote:
Sounds like TBB is the way to go for now though.

I'll second Cory's advice to look at Intel's TBB library; it seems like
you're already mostly in a position to make use of it, if you're
already thinking at the level of tasks. The Boost thread library
doesn't really provide the constructs for doing efficient parallel
programming, in my experience.
On Sun, Aug 14, 2011 at 10:13 AM, Jack Bryan wrote:
How can I make sure that my program runs on all 144 cores?
participants (10)
- Arash Abghari
- Cory Nelson
- Gordon Woodhull
- Ian Bruntlett
- Igor R
- Jack Bryan
- Leo Cacciari
- Nil Geisweiller
- Norbert Wenzel
- Oliver Seiler