General design question about threading/concurrency
Hello, I am writing a program that needs to run N simulations with N >10 000. Obviously, I cannot instantiate N threads at the same time but I cannot run the simulations sequentially either. Can anyone please advise me how to design my program using the boost threading library and concepts please? Thanks in advance, Julien.
On 24 Oct 2010, at 22:44, Julien Martin wrote:
Hello, I am writing a program that needs to run N simulations with N >10 000. Obviously, I cannot instantiate N threads at the same time but I cannot run the simulations sequentially either. Can anyone please advise me how to design my program using the boost threading library and concepts please?
Hi Julien, if you have an 8-core machine you could, e.g., create 8 threads and always keep 8 simulations running, each to completion. If the simulations are independent I would rather suggest that you use Boost.MPI and distribute the work across a cluster. A 10000-core cluster would then allow you to run all 10000 simulations simultaneously. We routinely do that with several thousand simulations.
Matthias
I'm not aware of any task management stuff that comes with the boost
libraries. boost::thread is pretty much a straightforward wrapper to
lower-level threading concepts.
I would usually build up some kind of task queue/stack that is either
locking or concurrent and non-locking that manages the tasks. Then
you can start M threads where M is the number of cores on your
machine. These threads can then "steal" the work from the task queue,
and/or add new tasks to the task queue until the tasks are finished.
TBB (Intel Threading Building Blocks) has some of this already in place
to make it easier to write these kinds of programs, but I typically
roll my own and use boost threads directly, as I have seen better
performance that way.
Brian
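The scheme Brian describes can be sketched roughly as follows. This is only a minimal illustration under some assumptions: it uses std::thread and std::atomic from the standard library (which closely mirror boost::thread), run_simulation is a hypothetical stand-in for the real simulation, and because all tasks are known up front the "task queue" is reduced to a shared atomic counter that workers claim indices from.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for one independent simulation; the result
// depends only on the task index.
double run_simulation(std::size_t index) {
    return static_cast<double>(index) * 0.5;
}

// Runs n_tasks simulations on n_workers threads. Each worker repeatedly
// claims the next unclaimed task index until none remain.
std::vector<double> run_all(std::size_t n_tasks, unsigned n_workers) {
    assert(n_workers > 0);
    std::vector<double> results(n_tasks);
    std::atomic<std::size_t> next(0);  // index of the next task to hand out

    auto worker = [&]() {
        for (;;) {
            std::size_t i = next.fetch_add(1);
            if (i >= n_tasks) break;         // all tasks have been claimed
            results[i] = run_simulation(i);  // each slot is written by one thread only
        }
    };

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < n_workers; ++w) pool.emplace_back(worker);
    for (auto& t : pool) t.join();
    return results;
}
```

A call such as run_all(10000, std::thread::hardware_concurrency()) would start one worker per core (hardware_concurrency may return 0, so a fallback is needed). If tasks can create new tasks, the counter would have to be replaced by a real locking or lock-free queue.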
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
I'm not aware of any task management stuff that comes with the boost libraries. boost::thread is pretty much a straightforward wrapper to lower-level threading concepts.
I would usually build up some kind of task queue/stack that is either locking or concurrent and non-locking that manages the tasks.
FWIW, Boost.Thread now has futures: http://www.boost.org/doc/libs/1_44_0/doc/html/thread/synchronization.html#th... and there's ASIO, which already provides the kind of queue that you're talking about.
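As a rough illustration of the futures idea, here is a sketch that splits the simulations into a few batches, runs each batch asynchronously, and collects the per-batch results through futures. It assumes the standard-library std::async and std::future rather than the Boost.Thread classes, and run_simulation, run_batch and mean_result are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical result of one simulation.
double run_simulation(std::size_t index) {
    return (index % 2 == 0) ? 1.0 : 3.0;
}

// Sums the results of simulations [begin, end).
double run_batch(std::size_t begin, std::size_t end) {
    double sum = 0.0;
    for (std::size_t i = begin; i < end; ++i) sum += run_simulation(i);
    return sum;
}

// Splits n_sims into n_batches contiguous batches, runs each batch on its
// own thread, and returns the mean result over all simulations.
double mean_result(std::size_t n_sims, std::size_t n_batches) {
    assert(n_sims > 0 && n_batches > 0);
    std::vector<std::future<double>> pending;
    for (std::size_t b = 0; b < n_batches; ++b) {
        std::size_t begin = n_sims * b / n_batches;
        std::size_t end = n_sims * (b + 1) / n_batches;
        pending.push_back(std::async(std::launch::async, run_batch, begin, end));
    }
    double total = 0.0;
    for (auto& f : pending) total += f.get();  // blocks until that batch is done
    return total / static_cast<double>(n_sims);
}
```

The number of batches would normally be chosen close to the number of cores, so that only a handful of threads exist at any time regardless of N.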
I am writing a program that needs to run N simulations with N >10 000.
Obviously, I cannot instantiate N threads at the same time but I cannot run the simulations sequentially either. Can anyone please advise me how to design my program using the boost threading library and concepts please?
Try boost::asio, it has the concept of a server and of reusable worker threads.
Hi Julien,
On Oct 24, 2010, at 2:44 PM, Julien Martin wrote:
I am writing a program that needs to run N simulations with N >10 000.
This sounds a lot like Monte Carlo. Are each of the N simulations essentially independent of each other, until perhaps all N complete and then some additional processing occurs? If so, then instead of using threads,
Obviously, I cannot instantiate N threads at the same time but I cannot run the simulations sequentially either. Can anyone please advise me how to design my program using the boost threading library and concepts please?
I'd use MPI if you can, but I don't know enough about your application domain. -- Noel
Thanks all for your replies,
Yes. It is a Monte-Carlo simulation I am trying to build up with all
simulations being independent of each other. I am going to look into all the
boost libraries you advised especially the MPI one.
I'll post further questions if required.
Regards,
Julien.
Hello Noel,
Yes, I am indeed running a Monte Carlo simulation which simulates N paths
and then, when all are finished, takes action based upon the results.
What class, method or concepts should I look for in MPI please?
Thanks,
J.
Hi Julien,
On Oct 25, 2010, at 4:55 AM, Julien Martin wrote:
Yes, I am indeed running a monte carlo simulation which simulates N paths and then, when all are finished that takes action based upon the results. What class, method or concepts should I look for in MPI please?
So are you new to MPI and threading? If you don't have at least some experience with parallel code development you could be jumping into the deep end (but don't let me discourage you)! Here are some questions you might consider before adopting a particular approach.
How long does each independent simulation run (seconds, minutes, hours)? Is there a throughput or turnaround time requirement? What kinds of hardware are you going to run on (SMP dual, quad, hex)? Do you have access to other compute machines on the network where you'll run? If distributed machines are available, can you ssh into them, are the machines homogeneous or heterogeneous, and do they have shared file systems?
-- Noel
To Noel,
I use a single 4-core machine for now but could experiment with a cluster using my
second machine. The two machines are heterogeneous, with no shared file system.
I used parallel_for with success but don't know yet how to tune TBB. I am
going to read the TBB docs.
I am going to have a detailed look at the code you provided too.
To Matthias,
What you say about clustering is interesting. I might try that actually.
Thanks,
J.
On Sunday, October 24, 2010 6:08 PM, Belcourt, Kenneth wrote:
On Oct 24, 2010, at 2:44 PM, Julien Martin wrote:
I am writing a program that needs to run N simulations with N >10 000.
This sounds a lot like Monte Carlo. Are each of the N simulations essentially independent of each other, until perhaps all N complete and then some additional processing occurs? If so, then instead of using threads,
Obviously, I cannot instantiate N threads at the same time but I cannot run the simulations sequentially either. Can anyone please advise me how to design my program using the boost threading library and concepts please?
I'd use MPI if you can, but I don't know enough about your application domain.
I know this isn't a boost solution, but have you looked into Intel's Threading Building Blocks? (found at http://www.threadingbuildingblocks.org/ )
If you stay on the same computer (i.e. a plain x86 PC with multiple cores or multiple processors), I would recommend Intel's TBB, which splits up tasks automatically and tries to keep the processors busy. It would be nice if Boost / standard C++ had such a parallelism library, because it is a useful addition to using threads directly.
Thanks gast128,
I just installed TBB. Can you please tell me what to look for (which class,
method or concept) in TBB? It is pretty vast...
J.
Julien Martin writes:
Thanks gast128, I just installed TBB. Can you please tell me what to look for (which class, method or concept) in TBB? It is pretty vast...
Hello, tbb::parallel_for is probably what you will be interested in if the individual tasks are truly independent.
Ok. I will try that!
Thank you,
J.
Hi Julien,
On Oct 25, 2010, at 4:53 AM, Julien Martin wrote:
Thanks gast128, I just installed TBB. Can you please tell me what to look for (which class, method or concept) in TBB? It is pretty vast...
Start with the Intel TBB book, it's quite good and walks you through the product in fairly detailed fashion. In our application we do the following in a top-level routine (like main):
    empty_task *root = 0;
    task_scheduler_init *init = 0;
    if (1 < n_tasks) {
      // start the TBB scheduler and create the root task
      init = new task_scheduler_init();
      root = new (tbb::task::allocate_root()) empty_task;
      // n_tasks children, plus one
      root->set_ref_count(n_tasks+1);
    }
We're telling the TBB task scheduler how many child tasks we expect to create (so n_tasks is the number of SMP cores). Next we round up the number of realizations we're going to run so that the load balances equally across the system. The number of epistemic realizations is your N.
    if (1 < n_tasks && n_epistemic_realizations % n_tasks) {
      // round N up to the next multiple of n_tasks
      n_epistemic_realizations += n_tasks - (n_epistemic_realizations % n_tasks);
    }
Our epistemic class inherits from tbb::task (it is allocated with allocate_child()), so this code constructs a task for each core.
    // realizations handled by each task
    unsigned int n = n_epistemic_realizations / n_tasks;
    epistemic* ep = new (root->allocate_child()) epistemic(n);
For an 8-core blade (n_tasks = 8) with 10^4 realizations (N = 10000), n is 1250. Note that depending on how long your discrete simulations run, you may eventually want to use hybrid parallelism (MPI to leverage resources on your network and threads for each local SMP machine), though you can also use just straight MPI to manage both within-box and cross-box parallelism.
-- Noel
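The load-balancing arithmetic in Noel's message can be checked in isolation. The sketch below wraps the same round-up rule in two small functions; the function names are hypothetical and not part of TBB or of Noel's application.

```cpp
#include <cassert>

// Rounds n_realizations up to the next multiple of n_tasks. The value is
// left unchanged if it is already a multiple or if there is only one task.
unsigned int round_up_to_multiple(unsigned int n_realizations, unsigned int n_tasks) {
    assert(n_tasks > 0);
    if (1 < n_tasks && n_realizations % n_tasks) {
        n_realizations += n_tasks - (n_realizations % n_tasks);
    }
    return n_realizations;
}

// Number of realizations each task runs after rounding up.
unsigned int realizations_per_task(unsigned int n_realizations, unsigned int n_tasks) {
    return round_up_to_multiple(n_realizations, n_tasks) / n_tasks;
}
```

With N = 10000 and 8 tasks this gives 1250 per task, matching the figure above; with N = 10001 the total is rounded up to 10008, so each task runs 1251 and a few extra realizations are computed.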
participants (9)
- Andrew Holden
- Belcourt, K. Noel
- Belcourt, Kenneth
- Brian Budge
- gast128
- Igor R
- Julien Martin
- Matthias Troyer
- Ray Burkholder