graph theoretical problem
Dear All, does the boost graph library provide a solution to the problem I'm trying to solve: I want to distribute a set of tasks to multiple CPUs/Threads. The tasks have predecessor and successor dependencies. There are no loops. More often than not, a task consists of only very few operations. So if I brute force program this, the resulting code is busy only with scheduling -- the real code does not show up in the profiling results. I think this can be solved by collapsing some tasks into a single task to minimize scheduling/communication. I've got a cost function associated with every task. Is this some well known algorithm which is already available in boost graph? Kind Regards Peter Foelsche
On Fri, 2 Aug 2013, Peter Foelsche wrote:
Dear All,
does the boost graph library provide a solution to the problem I'm trying to solve:
I want to distribute a set of tasks to multiple CPUs/Threads. The tasks have predecessor and successor dependencies. There are no loops. More often than not, a task consists of only very few operations. So if I brute force program this, the resulting code is busy only with scheduling -- the real code does not show up in the profiling results. I think this can be solved by collapsing some tasks into a single task to minimize scheduling/communication. I've got a cost function associated with every task.
Is this some well known algorithm which is already available in boost graph?
How important is a good solution compared to doing the scheduling quickly? One option you may have is work stealing, which is designed for dynamic tasks being spread across threads; one implementation that allows dependencies is PFunc (https://projects.coin-or.org/PFunc). An optimal solution will be hard to find, since the problem (with arbirary dependencies) is NP-complete even for scheduling onto one thread. Do you have further restrictions on the dependencies (at most one predecessor, at most one successor, etc.)? -- Jeremiah Willcock
On Mon, 5 Aug 2013, Jeremiah Willcock wrote: thanks!
How important is a good solution compared to doing the scheduling
quickly? One option you may have is work stealing, which is designed
there will be many executions of the optimized graph -- of course the optimization must complete in a reasonable time
for dynamic tasks being spread across threads; one implementation that allows dependencies is PFunc (https://projects.coin-or.org/PFunc). An optimal solution will be hard to find, since the problem (with arbirary dependencies) is NP-complete even for scheduling onto one thread. Do you have further restrictions on the dependencies (at most one predecessor, at most one successor, etc.)?
no -- there might be zero or more (including more than one) predecessors or successors for every task Peter
On Mon, Aug 5, 2013 at 10:20 AM, Peter Foelsche
On Mon, 5 Aug 2013, Jeremiah Willcock wrote:
thanks!
How important is a good solution compared to doing the scheduling
quickly? One option you may have is work stealing, which is designed
there will be many executions of the optimized graph -- of course the optimization must complete in a reasonable time
for dynamic tasks being spread across threads; one implementation that allows dependencies is PFunc (https://projects.coin-or.org/PFunc). An optimal solution will be hard to find, since the problem (with arbirary dependencies) is NP-complete even for scheduling onto one thread. Do you have further restrictions on the dependencies (at most one predecessor, at most one successor, etc.)?
no -- there might be zero or more (including more than one) predecessors or successors for every task
Have you already tried the following? 0. spawn threads. all threads grab data off of a queue as it becomes available. I recommend boost::lockfree::queue for this (with some minor caveats) 1. topological sort 2. for each dependency level, submit all work for the level to the queue and wait for that work to be completed before moving to the next dependency level I understand that there will be overhead for very tiny tasks. A possible further optimization is to group items within a single dependency level and allow each queued task to actually be several of these smaller tasks. Brian
On Mon, Aug 5, 2013, Brian Budge wrote:
1. topological sort
Thanks! Just had this idea: Once you do a topological sort, one can combine tasks, which are behind each other but do not depend on each other. For these tasks, the local sorting order does not matter. One can look at each of these tasks with the intention of putting them into separate threads. If the cost of one of such tasks is not sufficient, one could try to combine it with one of its dependent tasks which only depend on the task currently being considered. If by combining tasks, one gets tasks, which are sufficiently expensive, they can be distributed on separate threads, otherwise, one needs to consider combining them with the tasks at the current level of the sorting order. Is this it?! Peter
On Aug 14, 2013 4:18 PM, "Peter Foelsche"
On Mon, Aug 5, 2013, Brian Budge wrote:
1. topological sort
Thanks!
Just had this idea:
Once you do a topological sort, one can combine tasks, which are behind
For these tasks, the local sorting order does not matter. One can look at each of these tasks with the intention of putting them into separate threads. If the cost of one of such tasks is not sufficient, one could try to combine it with one of its dependent tasks which only depend on the task currently being considered. If by combining tasks, one gets tasks, which are sufficiently expensive,
each other but do not depend on each other. they can be distributed on separate threads, otherwise, one needs to consider combining them with the tasks at the current level of the sorting order.
Is this it?! Peter
Pretty much. Also anything within that dependency level can be multithreaded. It is a greedy algorithm and not optimal, but I would bet it works pretty darned well. Brian
participants (3)
-
Brian Budge
-
Jeremiah Willcock
-
Peter Foelsche