[BGL] Threshold for betweenness_centrality_clustering()

Hi, I would like to use boost::betweenness_centrality_clustering() to search a graph for specific clusters. The documentation talks about a "Done" object, which I assume is a functor, similar to the predicates used in filtered_graph(). So in order to decide whether the algorithm should stop the clustering, it probably needs a threshold to decide if it should further iterate and remove edges to decompose the graph. My questions now are: - Does someone perhaps have an example for the usage of this algorithm? I only found documentation on "betweenness.centrality.clustering {RBGL}" for Gnu R, which doesn't bring me very far. - Is there some sort of rule-of-thumb to start with for finding this termination threshold or if there is some more-or-less methodical way of finding out a reasonable threshold? Best, Cedric

On Thu, 20 Jan 2011, Cedric Laczny wrote:
Hi,
I would like to use boost::betweenness_centrality_clustering() to search a graph for specific clusters. The documentation talks about a "Done" object, which I assume is a functor, similar to the predicates used in filtered_graph(). So in order to decide whether the algorithm should stop the clustering, it probably needs a threshold to decide if it should further iterate and remove edges to decompose the graph.
My questions now are: - Does someone perhaps have an example for the usage of this algorithm? I only found documentation on "betweenness.centrality.clustering {RBGL}" for Gnu R, which doesn't bring me very far. - Is there some sort of rule-of-thumb to start with for finding this termination threshold or if there is some more-or-less methodical way of finding out a reasonable threshold?
I do not know too much about the algorithm myself, but would the documentation at <URL:http://www.boost.org/doc/libs/1_45_0/libs/graph/doc/bc_clustering.html> help you? -- Jeremiah Willcock

On Thursday, 20. January 2011 19:36:05 Jeremiah Willcock wrote:
On Thu, 20 Jan 2011, Cedric Laczny wrote:
Hi,
I would like to use boost::betweenness_centrality_clustering() to search a graph for specific clusters. The documentation talks about a "Done" object, which I assume is a functor, similar to the predicates used in filtered_graph(). So in order to decide whether the algorithm should stop the clustering, it probably needs a threshold to decide if it should further iterate and remove edges to decompose the graph.
My questions now are: - Does someone perhaps have an example for the usage of this algorithm? I only found documentation on "betweenness.centrality.clustering {RBGL}" for Gnu R, which doesn't bring me very far. - Is there some sort of rule-of-thumb to start with for finding this termination threshold or if there is some more-or-less methodical way of finding out a reasonable threshold?
I do not know too much about the algorithm myself, but would the documentation at <URL:http://www.boost.org/doc/libs/1_45_0/libs/graph/doc/bc_clustering.html
help you?
Nice idea, but unfortunately not. I thoroughly read this, but besides the general characteristics of this calculation, there was no "rule-of-thumb" to be found. I came up with the idea of simply letting it run once one my network and then get the "highest" results to see what they would look like. Of course this is not very agile, error-prone (as I expereienced myself) and highly static. But that's probably simply a general problem of these "supervised" approaches. However, I have come up with some difficulties in working out how to use the algorithm. First, the implementation seems to need an edge_index as long as no external EdgeCentralityMap is provided. This should be definitely a thing to include in the documentation, especially as, for clustering, this map is not mandatory, oppositely to brandes_betweenness_centrality(). This is indeed problematic if the majority of graphs are based on unique edges, thus setS. I don't know if people more often use vecS or other containers. In this context I would like to provide the solution I used to create the EdgeCentralityMap when using an OutEdgeList different from vecS: #include <boost/graph/iteration_macros.hpp> typedef std::map<edge_descriptor, int> MyEdgeIndexMap; MyEdgeIndexMap my_edge_index; typedef associative_property_map< MyEdgeIndexMap > EdgeIndexMap; EdgeIndexMap edge_index(my_edge_index); int i = 0; BGL_FORALL_EDGES(edge, tmp_g, graph_type) { my_edge_index.insert(std::pair< edge_descriptor, int >( edge, i)); ++i; } // Define EdgeCentralityMap std::vector< double > e_centrality_vec(num_edges(tmp_g), 0.0); // Create the external property map iterator_property_map< std::vector< double >::iterator, EdgeIndexMap > e_centrality_map(e_centrality_vec.begin(), edge_index); Maybe there are simpler solutions. Second, it was really confusing that the EdgeCentralityMap is based on absolute centrality-values, while the termination-criterion ("Done") defaults to relative by normalising the edge-centrality values. Especially for large graphs, it can be quite frustrating to realize that the chosen threshold was > 1 and thus "per se" bigger and the algorithm will directly fail, but not after having computed the betweenness-centrality one time. One might then also wonder why it took its time but the resulting graph was the same as the original. But I think this is rather a problem in understanding and might be clarified by updating the documentation. To summarize my experience with the bc-clustering, I was actually a little bit disappointed. This might very well be as this clustering technique is not directly suitable for my needs, also because my graphs are not really small (20k vertices, 300k edges). It would be nice to see more clustering techniques directly in the BGL. I do know that implementations of these techniques are probably not the easiest tasks and probably the need is also rather low, as otherwise they maybe would have already been integrated into BGL. Nevertheless, thank you for your help.
-- Jeremiah Willcock
Best, Cedric

Hi, Does Boost have B++ trees implemented? If not, could I get a recommendation for an open source impl? Thanks so much, Anant Rao The information contained in this email message and its attachments is intended only for the private and confidential use of the recipient(s) named above, unless the sender expressly agrees otherwise. Transmission of email over the Internet is not a secure communications medium. If you are requesting or have requested the transmittal of personal data, as defined in applicable privacy laws by means of email or in an attachment to email, you must select a more secure alternate means of transmittal that supports your obligations to protect such personal data. If the reader of this message is not the intended recipient and/or you have received this email in error, you must take no action based on the information in this email and you are hereby notified that any dissemination, misuse or copying or disclosure of this communication is strictly prohibited. If you have received this communication in error, please notify us immediately by email and delete the original message.

On Thu, Jan 20, 2011 at 11:26 AM, Rao, Anant <Anant.Rao@ironmountain.com> wrote:
Hi,
Does Boost have B++ trees implemented? If not, could I get a recommendation for an open source impl?
I'll assume you mean a B+tree. Beman started some work on one and it's pretty far along but still missing some things. See https://github.com/Beman/Boost-Btree -- Cory Nelson http://int64.org
participants (4)
-
Cedric Laczny
-
Cory Nelson
-
Jeremiah Willcock
-
Rao, Anant