Problems with Reverse Cuthill-Mckee?
Hello, I'm a fairly new Boost user, so please forgive me if I'm completely out of whack here. I'm presently using Boost's reverse Cuthill-Mckee ordering to reduce the bandwidth of some very large symmetric, positive definite sparse matrices that I'm investigating. Generally, the cuthill_mckee_ordering() has worked quite well, although a short while ago, I began noticing that it occasionally gave an invalid ordering (i.e., the permutations spit out by the algorithm would have multiple vertexes all the same, which, when used to permute my matrices, would destroy my problem). It took me a while to figure out what was going on, but I've finally distilled a small enough example that exhibits the problem. Consider the following matrix: Matrix: 6x6 [ [3.66667,0,-0.5,0,0,0], [0,2.66667,0,-0.222222,0,0], [-0.5,0,1.66667,0,0,-0.222222], [0,-0.222222,0,3.33333,-2.22222,0], [0,0,0,-2.22222,4.66667,0], [0,0,-0.222222,0,0,4.66667] ] Using fairly straightforward code, I loop through the non-zeros, and add edges to an undirected Boost graph (basically the same as the cuthill_mckee_ordering.cpp example in the docs). Here is the graph that gets created for the above problem, in Graphviz format: graph G { 0; 1; 2; 3; 4; 5; 0--0 ; 0--2 ; 1--1 ; 1--3 ; 2--2 ; 2--5 ; 3--3 ; 3--4 ; 4--4 ; 5--5 ; } And, if I run this example using the cuthill_mckee_ordering technique (or if I modify cuthill_mckee_ordering.cpp to use these vertices and edges), I get the following permutation (using the pseudo-peripheral heuristic): Inv_perm: 0 0 0 0 2 5 You can see, from the Graphviz output, that this graph has two disconnected sets of vertices, which is where I think the problem lies. It may be a problem in my understanding of how to setup the Boost graph, or perhaps a problem in the reordering technique. You'll notice that, in my example, I've eliminated the duplicate edges (such as 2-5 being the same as 5-2) -- I've tried adding them in, too, but it hasn't changed the result. I've noticed that, if one of my matrices produces a graph in which there are no disconnected sets of vertices, the RCM ordering works fine. It's only when the graph is disconnected in this fashion that the ordering fails. I've also noticed that, when using this kind of disconnected graph, if I sometimes select a random starting vertex (instead of the pseudo-peripheral pair heuristic), the cuthill_mckee_ordering segfaults. I've attached some sample code below that seems to exhibit the problem. I'm certainly open to the possibility that it's a problem in my understanding of how to use the library. Also, my Boost version is: 1.30.2 (Redhat RPM) Any thoughts would be appreciated. regards, Kris ============================================================== (borrowed and slightly modified from cuthill_mckee_ordering.cpp) #include <boost/config.hpp> #include <vector> #include <iostream> #include <boost/graph/adjacency_list.hpp> #include <boost/graph/cuthill_mckee_ordering.hpp> #include <boost/graph/properties.hpp> #include <boost/graph/bandwidth.hpp> int main( int, char *[] ) { using namespace boost; using namespace std; typedef adjacency_list < vecS, vecS, undirectedS, property < vertex_color_t, default_color_type, property < vertex_degree_t, int > > > Graph; typedef graph_traits < Graph >::vertex_descriptor Vertex; typedef graph_traits < Graph >::vertices_size_type size_type; typedef std::pair < std::size_t, std::size_t > Pair; //graph G {0;1;2;3;4;5;0--0 ;0--2 ;1--1 ;1--3 ;2--2 ;2--5 ;3--3 ;3--4 ;4--4 ;5--5 ;} Pair edges[] = { Pair( 0, 0 ), Pair( 0, 2 ), Pair( 1, 1 ), Pair( 1, 3 ), Pair( 2, 2 ), Pair( 2, 5 ), Pair( 3, 3 ), Pair( 3, 4 ), Pair( 4, 4 ), Pair( 5, 5 )}; Graph G( 6 ); for( int i = 0; i < 10; ++i ) { std::cout << edges[i].first << ", " << edges[i].second << std::endl; add_edge( edges[i].first, edges[i].second, G ); } graph_traits < Graph >::vertex_iterator ui, ui_end; property_map < Graph, vertex_degree_t >::type deg = get( vertex_degree, G ); for( boost::tie( ui, ui_end ) = vertices( G ); ui != ui_end; ++ui ) { deg[*ui] = degree( *ui, G ); } property_map < Graph, vertex_index_t >::type index_map = get( vertex_index, G ); std::cout << "original bandwidth: " << bandwidth( G ) << std::endl; std::vector < Vertex > inv_perm( num_vertices( G ) ); std::vector < size_type > perm( num_vertices( G ) ); // reverse cuthill_mckee_ordering cuthill_mckee_ordering( G, inv_perm.rbegin(), get( vertex_color, G ), make_degree_map( G ) ); cout << "Reverse Cuthill-McKee ordering:" << endl; cout << " "; for( std::vector < Vertex >::const_iterator i = inv_perm.begin(); i != inv_perm.end(); ++i ) { cout << index_map[*i] << " "; } cout << endl; for( size_type c = 0; c != inv_perm.size(); ++c ) { perm[index_map[inv_perm[c]]] = c; } std::cout << " bandwidth: " << bandwidth( G, make_iterator_property_map( &perm[0], index_map, perm[0] ) ) << std::endl; return 0; }
Hello, After more digging into this matter, I'm fairly sure that the problem lies in the fact that disjoint sets of nodes aren't looped over in cuthill_mckee_ordering.hpp. Consequently, the out-edges of the nodes are examined, but if the out-edges never connect to another set of nodes, then only a portion of each disjointed graph is ever considered for reordering. This can be somewhat disastrous, because the ordering results will vary immensely depending on which starting vertex was used (and, by extension, which disjoint set was reordered). This seems equivalent to only scanning through one of the sets of level sets in the RCM algorithm. After a very quick glance at the minimal_degree_ordering.hpp and sloan_ordering.hpp code, I suspect that they also won't work quite right with disjoint graphs. I don't know if this is intended behaviour or not, as the documentation didn't really seem to hint toward how these kinds of graphs should be handled. In the meantime, I started to hack together a solution for my problem, but have found it to be rather "unpretty". (Basically, I use the Boost connected_components function to find disconnected components, then I loop through each disconnected component, find a good pseudo-peripheral starting vertex, and do the reordering just for the vertices in that component. I'm not even really sure that this is the best way to fix the RCM code.) Anyway, given my lack of experience with Boost, my hack is very ugly, but I thought I'd let ppl know where I think the problem lies, just in case they were having similar issues. regards, Kris On Sun, 2004-01-11 at 11:28, Kris Vorwerk wrote:
Generally, the cuthill_mckee_ordering() has worked quite well, although a short while ago, I began noticing that it occasionally gave an invalid ordering (i.e., the permutations spit out by the algorithm would have multiple vertexes all the same, which, when used to permute my matrices, would destroy my problem). [snip]
Hi Kris, I wish I had time to look at your problem and solution... alas I'm swamped with work on iterator adaptor stuff for the C++ standard TR. -Jeremy ---------------------------------------------------------------------- Jeremy Siek http://php.indiana.edu/~jsiek/ Ph.D. Student, Indiana Univ. B'ton email: jsiek@osl.iu.edu C++ Booster (http://www.boost.org) office phone: (812) 856-1820 ----------------------------------------------------------------------
I wish I had time to look at your problem and solution... alas I'm swamped with work on iterator adaptor stuff for the C++ standard TR.
No prob -- I totally understand :) (In fact, I'm swamped with trying to get my grad school stuff working, which is why I've resorted to hacking together code to make Boost work for me :) The following is rough and not very elegant, but it seems to work, I believe. (Well, it sort of works. I will explain what I mean after the code, below.) // BEGIN CODE SNIPPET ------------------------------------------- // This is just the code that I changed... (The detail code // remains the same.) namespace boost { // Reverse Cuthill-McKee algorithm with a given starting Vertex. // // This algorithm requires user to provide a starting vertex to // compute RCM ordering. template <class Graph, class OutputIterator, class ColorMap, class DegreeMap> OutputIterator cuthill_mckee_ordering(Graph& g, std::deque< typename graph_traits<Graph>::vertex_descriptor > vertex_queue, OutputIterator inverse_permutation, ColorMap color, DegreeMap degree) { typedef typename property_traits<DegreeMap>::value_type DS; typedef typename property_traits<ColorMap>::value_type ColorValue; typedef color_traits<ColorValue> Color; typedef typename graph_traits<Graph>::vertex_descriptor Vertex; typename graph_traits<Graph>::vertex_iterator ui, ui_end; for (tie(ui, ui_end) = vertices(g); ui != ui_end; ++ui) put(color, *ui, Color::white()); typedef indirect_cmp<DegreeMap, std::greater<DS> > Compare; Compare comp(degree); boost::queue<Vertex> bfs_queue; std::priority_queue<Vertex, std::vector<Vertex>, Compare> degree_queue(comp); Vertex u, v; // The vertex_queue contains the starting vertices that we will consider in // our RCM ordering. There should be one starting vertex for each disjoint // forest in the graph. while( !vertex_queue.empty() ) { Vertex s = vertex_queue.front(); vertex_queue.pop_front(); // Like BFS, except the adjacent vertices are visited in increasing // order of degree. put(color, s, Color::gray()); bfs_queue.push(s); while (! bfs_queue.empty()) { u = bfs_queue.top(); bfs_queue.pop(); *inverse_permutation++ = u; typename graph_traits<Graph>::out_edge_iterator ei, ei_end; for (tie(ei, ei_end) = out_edges(u, g); ei != ei_end; ++ei) { v = target(*ei, g); if (get(color, v) == Color::white()) { put(color, v, Color::gray()); degree_queue.push(v); } } while (!degree_queue.empty()) { v = degree_queue.top(); degree_queue.pop(); bfs_queue.push(v); } put(color, u, Color::black()); } // while } return inverse_permutation; } template < class Graph, class OutputIterator, class Color, class Degree > inline OutputIterator cuthill_mckee_ordering(Graph& G, OutputIterator inverse_permutation, Color color, Degree degree) { typedef typename boost::graph_traits<Graph>::vertex_descriptor Vertex; typedef typename boost::graph_traits<Graph>::vertex_iterator VerIter; VerIter ri = vertices(G).first; Vertex r = *ri; Vertex s; // Check the number of connected components in G. std::vector<int> c( num_vertices( G ) ); std::deque<Vertex> vertex_queue; int num = boost::connected_components(G, &c[0]); // Common case: we only have one set. if( num <= 1 ) { s = find_starting_node(G, r, color, degree); vertex_queue.push_front( s ); } else { // We seem to have more than one disjoint set. So, find good // starting nodes within each of the subgraphs, and then add // these starting nodes to our vertex queue. int num_considered = 0; std::vector<int> sets_considered( num ); std::fill( sets_considered.begin(), sets_considered.end(), 0 ); // Sanity for( unsigned int i = 0; i < c.size(); ++i ) { // If it's the first time we've considered this set, // then find a good pseudo peripheral node for it. // Otherwise, keep going until we've considered all of // the sets. if( sets_considered[c[i]] == 0 ) { ++num_considered; s = find_starting_node(G, i, color, degree); assert( c[s] == c[i] ); vertex_queue.push_back( s ); if( num_considered >= num ) { break; } } ++(sets_considered[c[i]]); } } return cuthill_mckee_ordering(G, vertex_queue, inverse_permutation, color, degree); } } // namespace boost // END CODE SNIPPET -------------------------------------------- Okay, so this code basically identifies disconnected components in the graph, and loops over the components. The pseudoperipheral pair code doesn't need to be changed. And, these changes seem to give good reorderings. I'm sure that the code can be cleaned up, so this is really just an example of how I've tackled the problem... One caveat that I'd like to point out, however, and what I casually reference above, is that in making this change, I think that I discovered another potentially more serious problem which occurs when I feed in quite a large graph derived from one of my matrix problems. (The graph has about 272000 vertices -- a graphviz format file representing it takes about 1.8MB ;). When I feed this fairly large graph into connected_components(), as you know, connected_components() does a DFS, but somewhere along the line, it segfaults in the adjacency_list<> code while it's DFSing through the graph. (I've run it through 'ddd', and have identified where the segfault occurs, but it will take me a while to figure out why this is happening.) I'm not sure how much time I have to debug this problem, so I think that I will just have to modify the cuthill_mckee_ordering to work on my MTL sparse matrix instead of relying on Boost graph calls. If possible, I will try to write some short sample code that recreates the segfault. (Then, since I'm not sure if this mailing list can handle compressed attachments, I'll put the sample graph data on my website or something so that developers can take a look at it.) I will post a description of the bug in a new message, as soon as I'm able to better describe why the bug's happening. (I mean, who knows, maybe I'm incompetent and building an invalid graph or something.) -kris
participants (2)
-
Jeremy Siek
-
Kris Vorwerk