Re: [boost] [BGL] Multithread problem inserting edges and exploring the same graph

21 Sep 2009

      On Sep 21, 2009, at 10:59 AM, Cosimo Calabrese wrote:
...
...
Here are some ideas:
Do you need all of the properties in the explorer threads? maybe you
can construct a somewhat cheap copy of the graph of just vertices,
edges and weights?
Unfortunately no, my graph has 5 million of vertices and 10 million  
of edges, and the properties RAM cost is very lower confronted to  
the graph structure. The graph structure takes approximately the  
same RAM quantity of the graph + properties.
Moreover I've tried to make a graph copy, but it takes about 30  
seconds on my machine (Pentium D 2.8GHz, 3Gb RAM, WinXP); it's too  
much confronted to the Dijkstra's time (about 6-7 seconds).
...
When each explorer thread runs an instance of Dijkstra you might
consider a parallel version of Dijkstra [1,2] working on disjoint
graph regions? That might allow you to cut the graph into pieces and
assigning a 'local' version of the graph to each explorer thread and
joining them later on?
These arguments are relatively new to me... do you think that's the  
way?
I've take a look to PBGL and MTGL:
https://software.sandia.gov/trac/mtgl/
http://www.osl.iu.edu/research/pbgl/
Maybe I can use them to resolve my problem?
Having written significant portions of one and used/hacked on the  
other I can say with some certainty that neither one of them will  
solve your problems out-of-the-box (at least not in the way I  
propose).  Firstly, the problem seems somewhat ill defined so I've got  
a few questions to nail down the particulars:

1. When does the shortest paths solution have to be correct (i.e.  
match the graph exactly)?
2. Are there any constraints as to when your 'integrator' thread can  
call add_edge() (this relates to 1. pretty directly)

Now the solution I propose, since you have a monotonically increasing  
edge set you can actually compute an incremental solution to the SSSP  
problem rather than completely re-running Dijkstra.  Basically you can  
patch up the SSSP solution when you do the edge addition in O(N) time  
(expected time should be much lower but depends on parameters of your  
graph that I don't have so I won't try to explain in detail).  Either  
way this is either somewhat better, or much better than O(N log N) to  
re-run Dijkstra.  All you do is push the new weight onto the Dijkstra  
queue and run Dijkstra as normal.

The key question is does your SSSP solution have to be exact?  If not  
you can simply spin off the 'patching up the SSSP' work to the  
'Explorer' thread and get a solution that's reasonably likely to be  
accurate, otherwise it should be faster than completely re-running  
Dijkstra to patch up the solution.  If you really do need concurrent  
access to the adjacency list then I'd just wrap a readers-writers lock  
around each vertex's adjacencies.

I'm actually working on incremental algorithms in parallel and will  
likely end up adding some sequential counterparts as well.  Dijkstra  
is a pretty simple one so if you get a hold of me with the particulars  
of your problem and it fits with the stuff I was going to do anyway I  
may be able to write what you need fairly quickly.

On the 'partitioning/distributing the graph' note, if you want to go  
that route there's a lot of support for it in PBGL.  You'd probably  
want to rip out all the MPI stuff and put in a shared memory process  
group.  You could run MPI over SM, but if the preliminary data I've  
run on the matter is any indication performance will be abysmal and  
memory consumption will go way up.  Also, you're still going to have  
to deal with the concurrent access to the adjacency-list issue.  I've  
adopted the general rule that partitioning data structures is only  
appropriate when 1. you don't have a shared address space, or 2. you  
have hideous contention and no other way to alleviate it.  That's just  
my $0.02 though, and subject to change at any time.

Cheers,
Nick
...
Thanks,
Cosimo Calabrese.
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Re: [boost] [BGL] Multithread problem inserting edges and exploring the same graph

Nick Edmonds