[BGL] Performance improvement minimum spanning tree

A while ago I had a problem with the MST algorithm in boost. Essentially profiling reviled that the MST algorithm was a bottleneck for the computation I was performing. Thus I had to investigate the performance bottlenecks in boost mst and with this mail I want to investigate the potential for improving current boost implementation. What is the proposed way of getting the prim algorithm updated in boost? First let me argue that there is room for improvement. I have made a test program that constructs 1000 random complete graphs with 1000 nodes and the execution time is as follows actually implementing jarniks algorithm prim_minimum_spanning_tree 47.585s current prim (djikstra wrapper) 3m2638s (338% slower) kruskal_minimum_spanning_tree 40m57s (5063% slower) The 47.585s performance is achieved by just implementing Jarnik/Prims algorithm using the relaxed_heap (that is already in boost, but code is available if necessary). As a side effect it seems to be a good test for the relaxed_heap, given that it revealed problems with the Fibonacci heap[1]. 1. http://lists.boost.org/boost-users/2006/07/20876.php I also want to take the opportunity to ask about the meaning of the minimum spanning tree algorithm as such, since efficient implementations in the future could be helped a lot by knowing exactly what the problem that should be solved is. Given the hardness to understand the actual generality of the MST problem the current implementation tries to solve, I find it difficult to see how efficient algorithms can be heuristically selected. I.e. can you use num_edges(g) as an indicator of sparseness? -- Magnus Ekdahl

On Aug 24, 2006, at 4:49 AM, Magnus Ekdahl wrote:
A while ago I had a problem with the MST algorithm in boost. Essentially profiling reviled that the MST algorithm was a bottleneck for the computation I was performing. Thus I had to investigate the performance bottlenecks in boost mst and with this mail I want to investigate the potential for improving current boost implementation.
What is the proposed way of getting the prim algorithm updated in boost?
Just sent a note to the Boost mailing list, as you've done.
First let me argue that there is room for improvement. I have made a test program that constructs 1000 random complete graphs with 1000 nodes and the execution time is as follows
actually implementing jarniks algorithm prim_minimum_spanning_tree 47.585s
current prim (djikstra wrapper) 3m2638s (338% slower)
kruskal_minimum_spanning_tree 40m57s (5063% slower)
Those times seem... horrible. What compiler are you using? With which optimization flags? Can you post the sample code, so we can see what is going wrong?
I also want to take the opportunity to ask about the meaning of the minimum spanning tree algorithm as such, since efficient implementations in the future could be helped a lot by knowing exactly what the problem that should be solved is.
Wikipedia has a decent definition: http://en.wikipedia.org/wiki/Minimum_spanning_tree
Given the hardness to understand the actual generality of the MST problem the current implementation tries to solve, I find it difficult to see how efficient algorithms can be heuristically selected. I.e. can you use num_edges(g) as an indicator of sparseness?
We would need to run some benchmarks to determine what heuristic to use to dynamically select the most efficient algorithm. We haven't really done this yet for the BGL, but we should. MST is one good example; all-pairs shortest-paths is another good example, because, e.g., the Floyd-Warshall algorithm is much better for dense graphs while Johnson's algorithm is much better for sparse graphs. I imagine that we could use some kind of threshold for num_edges(g)/ (num_vertices(g)*num_vertices(g)) to decide whether a graph is sparse or dense. Doug

Douglas Gregor wrote:
On Aug 24, 2006, at 4:49 AM, Magnus Ekdahl wrote:
First let me argue that there is room for improvement. I have made a test program that constructs 1000 random complete graphs with 1000 nodes and the execution time is as follows
actually implementing jarniks algorithm prim_minimum_spanning_tree 47.585s
current prim (djikstra wrapper) 3m2638s (338% slower)
kruskal_minimum_spanning_tree 40m57s (5063% slower)
Those times seem... horrible. What compiler are you using?
g++ (GCC) 4.0.4 20060630 (prerelease) (Debian 4.0.3-5)
With which optimization flags?
g++ -O3 -g0 -mtune=pentium4
Can you post the sample code, so we can see what is going wrong?
attached :)
I also want to take the opportunity to ask about the meaning of the minimum spanning tree algorithm as such, since efficient implementations in the future could be helped a lot by knowing exactly what the problem that should be solved is.
Wikipedia has a decent definition:
Nice, the key here is undirected graph, which in the boost context would mean no parallel edges and self loops I guess.
Given the hardness to understand the actual generality of the MST problem the current implementation tries to solve, I find it difficult to see how efficient algorithms can be heuristically selected. I.e. can you use num_edges(g) as an indicator of sparseness?
We would need to run some benchmarks to determine what heuristic to use to dynamically select the most efficient algorithm. We haven't really done this yet for the BGL, but we should. MST is one good example; all-pairs shortest-paths is another good example, because, e.g., the Floyd-Warshall algorithm is much better for dense graphs while Johnson's algorithm is much better for sparse graphs. I imagine that we could use some kind of threshold for num_edges(g)/ (num_vertices(g)*num_vertices(g)) to decide whether a graph is sparse or dense.
I'll take this into account. For MST the relaxed_heap is surprisingly fast though, so it will be an issue when something faster is found. My original intension was to submit an O(|E|) array implementation for the complete graph case. But it is for current test cases slower than prim/relaxed so it have to be improved first (if at all possible). -- Magnus Ekdahl #ifdef HAVE_CONFIG_H #include <config.h> #endif #include <iostream> #include <cstdlib> #include<string> #include<list> #include <iostream> #include "mst.h" #include <boost/graph/adjacency_list.hpp> #include <boost/graph/kruskal_min_spanning_tree.hpp> #include <boost/graph/prim_minimum_spanning_tree.hpp> #include <iostream> #include <vector> #include <boost/graph/random.hpp> #include <boost/random/mersenne_twister.hpp> #include <algorithm> #include <boost/pending/fibonacci_heap.hpp> #include <boost/graph/graph_utility.hpp> #include <boost/pending/indirect_cmp.hpp> #include <boost/pending/relaxed_heap.hpp> using namespace std; using namespace boost; typedef adjacency_list < vecS, vecS, undirectedS, no_property, property < edge_weight_t, double > > Graph; typedef graph_traits < Graph >::edge_descriptor Edge; typedef graph_traits < Graph >::vertex_descriptor Vertex; typedef std::pair<int, int> E; void initGraph(Graph& g,const unsigned size) { for(unsigned i=0;i<size;i++) { for(unsigned j=i+1;j<size;j++) add_edge(i, j, g); } } void randGraph(Graph& g, const unsigned size) { property_map<Graph, edge_weight_t>::type weightmap = get(edge_weight, g); graph_traits<Graph>::edge_iterator ei, eiend; for (boost::tie(ei, eiend) = edges(g); ei != eiend; ++ei) { weightmap[*ei] = rand(); } } int main(int argc, char* argv[]) { srand(0); const unsigned NR_TESTS=1000; const unsigned size=1000; vector < Edge > spanning_tree_mwst; Graph g(size); std::vector < graph_traits < Graph >::vertex_descriptor > p(num_vertices(g)); initGraph(g,size); cout << "Generating randum graphs with weights from 0 to "<< RAND_MAX << " of size "<< size <<"\n"; for(unsigned i=0;i<NR_TESTS;i++) { randGraph(g,size); //prim_minimum_spanning_tree2(g,std::back_inserter(spanning_tree_mwst)); prim_minimum_spanning_tree(g,&p[0]); spanning_tree_mwst.clear(); } } /*************************************************************************** * Copyright (C) 2006 by Magnus Ekdahl * * maguno@ludd.luth.se * * * * Permission is hereby granted, free of charge, to any person obtaining * * a copy of this software and associated documentation files (the * * "Software"), to deal in the Software without restriction, including * * without limitation the rights to use, copy, modify, merge, publish, * * distribute, sublicense, and/or sell copies of the Software, and to * * permit persons to whom the Software is furnished to do so, subject to * * the following conditions: * * * * The above copyright notice and this permission notice shall be * * included in all copies or substantial portions of the Software. * * * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. * * IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR * * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, * * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR * * OTHER DEALINGS IN THE SOFTWARE. * ***************************************************************************/ #ifndef __MST_HH__ #define __MST_HH__ #include <map> #include <functional> #include <boost/graph/adjacency_list.hpp> #include <boost/pending/indirect_cmp.hpp> #include <boost/pending/relaxed_heap.hpp> #include <iostream> namespace boost { namespace detail { // This is Prim's algorithm to calculate the Minimum Spanning Tree // for an undirected graph with weighted edges. template <class Graph, class OutputIterator, class Weight> inline void prim_mst_impl(const Graph& G, OutputIterator spanning_tree_edges,Weight weight) { function_requires<VertexListGraphConcept<Graph> >(); typedef typename graph_traits<Graph>::edge_descriptor Edge; function_requires<ReadablePropertyMapConcept<Weight, Edge> >(); typedef typename property_traits<Weight>::value_type W_value; function_requires<ComparableConcept<W_value> >(); if (num_vertices(G) <= 1) return; // Nothing to do in this case typedef typename graph_traits<Graph>::vertices_size_type size_type; typedef typename boost::property_map<Graph, vertex_index_t>::type IndexMap; IndexMap index = get(vertex_index, G); const size_type size = num_vertices(G); W_value* the_weights = new W_value[size]; Edge* the_edges = new Edge[size]; bool* vertices_in_mwst = new bool[size]; bool* discovered_vertices = new bool[size]; typedef indirect_cmp<W_value*,std::less<W_value> > ICmp; ICmp cmp(&the_weights[0], std::less<W_value>()); relaxed_heap<size_type, ICmp> Q(size, cmp); Q.push(0); while(Q.empty()!=true) { const size_type u=Q.top(); Q.pop(); vertices_in_mwst[u]=true; typename graph_traits<Graph>::out_edge_iterator ei, eiend; for (boost::tie(ei,eiend)=out_edges(u,G); ei != eiend; ++ei) { const size_type v = index[target(*ei,G)]; if (vertices_in_mwst[v]==false) { W_value w = get(edge_weight, G,*ei); if(discovered_vertices[v]==false) { discovered_vertices[v]=true; the_weights[v]=w; the_edges[v]=*ei; Q.push(v); } else if(w<the_weights[v]) { the_weights[v]=w; the_edges[v]=*ei; Q.update(v); } } } } for(size_type i=1;i<size;++i) *spanning_tree_edges++ = the_edges[i]; delete the_weights; delete the_edges; delete vertices_in_mwst; delete discovered_vertices; } } template <class VertexListGraph, class PredecessorMap> inline void prim_minimum_spanning_tree2 (const VertexListGraph& g, PredecessorMap p_map) { detail::prim_mst_impl(g, p_map,get(edge_weight, g)); } } #endif
participants (2)
-
Douglas Gregor
-
Magnus Ekdahl