[graph] Is there a way to detect circles in graphs using boost?
Hi, I would like to know if there is an easy/straight-forward way to get all circles of a graph using the boost::graph facilities? And if so where to look or how to do? Thanks a lot, Martin
Martin Mann wrote:
Hi,
I would like to know if there is an easy/straight-forward way to get all circles of a graph using the boost::graph facilities?
And if so where to look or how to do?
Thanks a lot,
Martin
Are you sure you want to find out _all_ the cycles in the graph or something else? Finding all cycles is NP-hard, because it implies finding the Hamilton tour and for complete graphs for example the amount of cycles is exponential. Most probably you don't need all the cycles and something in P suffices for you. What is the problem you want to solve? If you _really_ want everything (for very small graphs), then you must either cook up your own code or do stuff slowly. A very simple algorithm for this stuff is in Liu, Wang: A new way to enumerate cycles in graph. Basically you extend paths from every node and check if some of them creates a cycle. A quick and dirty approach would run separate DFS from every node and use the visitor to check if one of the neighbours has already been visited (except the one we came from). niko
Hi Niko, thanks for your answer!
Martin Mann wrote:
Hi,
I would like to know if there is an easy/straight-forward way to get all circles of a graph using the boost::graph facilities?
And if so where to look or how to do?
Thanks a lot,
Martin
Are you sure you want to find out _all_ the cycles in the graph or something else? Finding all cycles is NP-hard, because it implies finding the Hamilton tour and for complete graphs for example the amount of cycles is exponential.
Right. Currently, I want to detect all (simple) cycles (without repeated nodes) in a graph, but the graphs size is small (less than a hundred nodes) with low connectivity. Therefore I hope to achieve reasonable results even with exhaustive solutions.
Most probably you don't need all the cycles and something in P suffices for you. What is the problem you want to solve?
Maybe you are right. What I am currently working on is an easy transformation of graphs describing molecules into the so called SMILES string representation. This includes the detection and handling of cycles so I came upon that problem. But I have to check if I "only" need to break all cycles that is much easier than enumeration of all of them. If the only breaking is neccessary, I would perform DFS with a customized visitor to detect a cycle via a back_edge. Than remove the edge to break the cycle and repeat the procedure until all cycles are broken. Just as a first brute force idea. Maybe there are better ways to do so.
If you _really_ want everything (for very small graphs), then you must either cook up your own code or do stuff slowly. A very simple algorithm for this stuff is in Liu, Wang: A new way to enumerate cycles in graph. Basically you extend paths from every node and check if some of them creates a cycle.
Yes, I found the paper yesterday night but have not checked about the details. Another approach I found was by Horton. It is the initial step (to enumerate all simple cycles) to construct the optimal cycle basis of a graph. As a sketch it works like: for all edges (u,v) { for all nodes x != u,v { p1 = shortest_path(x,u); p2 = shortest_path(x,v); if (intersection(p1,p2) == x) { report_cycle(p1 + invert(p2)); } } } I like this method due to its simplicity and the possibility to reuse e.g. BFS for shortest_path. So the implementation will stay very small. Furthermore, one can add some tweaks like only nodes with degree > 1 are checked etc. But if somebody knows a better approach or has some templated function for boost::graphs, POST IT! ;)
A quick and dirty approach would run separate DFS from every node and use the visitor to check if one of the neighbours has already been visited (except the one we came from).
Is this approach ensuring to find all cycles only once? Think I will find some cycles several times or am I just wrong? But I will check that! Thanks! So far, Martin
Martin Mann wrote:
Hi Niko,
thanks for your answer!
Martin Mann wrote:
Hi,
I would like to know if there is an easy/straight-forward way to get all circles of a graph using the boost::graph facilities?
And if so where to look or how to do?
Thanks a lot,
Martin
Are you sure you want to find out _all_ the cycles in the graph or something else? Finding all cycles is NP-hard, because it implies finding the Hamilton tour and for complete graphs for example the amount of cycles is exponential.
Right. Currently, I want to detect all (simple) cycles (without repeated nodes) in a graph, but the graphs size is small (less than a hundred nodes) with low connectivity. Therefore I hope to achieve reasonable results even with exhaustive solutions.
Most probably you don't need all the cycles and something in P suffices for you. What is the problem you want to solve?
Maybe you are right. What I am currently working on is an easy transformation of graphs describing molecules into the so called SMILES string representation. This includes the detection and handling of cycles so I came upon that problem. But I have to check if I "only" need to break all cycles that is much easier than enumeration of all of them. If the only breaking is neccessary, I would perform DFS with a customized visitor to detect a cycle via a back_edge. Than remove the edge to break the cycle and repeat the procedure until all cycles are broken. Just as a first brute force idea. Maybe there are better ways to do so.
If you _really_ want everything (for very small graphs), then you must either cook up your own code or do stuff slowly. A very simple algorithm for this stuff is in Liu, Wang: A new way to enumerate cycles in graph. Basically you extend paths from every node and check if some of them creates a cycle.
Yes, I found the paper yesterday night but have not checked about the details.
Another approach I found was by Horton. It is the initial step (to enumerate all simple cycles) to construct the optimal cycle basis of a graph. As a sketch it works like:
for all edges (u,v) { for all nodes x != u,v { p1 = shortest_path(x,u); p2 = shortest_path(x,v); if (intersection(p1,p2) == x) { report_cycle(p1 + invert(p2)); } } }
I like this method due to its simplicity and the possibility to reuse e.g. BFS for shortest_path. So the implementation will stay very small. Furthermore, one can add some tweaks like only nodes with degree > 1 are checked etc.
But if somebody knows a better approach or has some templated function for boost::graphs, POST IT! ;)
A quick and dirty approach would run separate DFS from every node and use the visitor to check if one of the neighbours has already been visited (except the one we came from).
Is this approach ensuring to find all cycles only once? Think I will find some cycles several times or am I just wrong? But I will check that! Thanks!
Are you absolutely sure that a simple spanning tree wouldn't be enough? At least to me it seems that you need a spanning tree and some housecleaning after that. This would be a very simple and quick task. I wouldn't dare to try enumeration of cycles for graphs more than ten nodes, it can so easily explode and give billions of cycles, the complexity is actually much more than exponential. I checked some info on SMILES, it looks like 1. Remove hydrogens 2. Create some spanning tree 3. Traverse the tree to mark up broken cycles niko
Hi Niko, thanks very much for your search on the SMILES problem. And yes you are absolutely right that for SMILES encoding no full cycle enumeration is needed. Actually a DFS is sufficient and David showed how to do. Independently, I could leave the problem of writing an algorithm for full cycle enumeration. :) Today, I have implemented the Horton algorithm of my last posting using BFS and a customized bfs_visitor. It is able to identify all cycles (at least for the small examples I tried) but as expected one has to worry about multiple identifications of the same cycles. But this is easy to solve by "normalizing" the cycles to start with the smallest index. At least it is working and I have learned some algorithmic things and BGL usage! ;) Next I want to implement your DFS suggestion and do a small comparison of the performance of both approaches. Think one can tweak this approach too if only cycle closing back edges to indices lower than the start vertex are used. This might increase performance. I will have a look. :) Finally, I might check the paper you mentioned (downloaded already) to see if a fast BGL implementation can be done and how it performs in comparison to the first two. We will see how much time I will spent on that! ;) So thanks again for the dedicated help! Martin Niko Vuokko wrote:
Are you absolutely sure that a simple spanning tree wouldn't be enough? At least to me it seems that you need a spanning tree and some housecleaning after that. This would be a very simple and quick task.
I wouldn't dare to try enumeration of cycles for graphs more than ten nodes, it can so easily explode and give billions of cycles, the complexity is actually much more than exponential.
I checked some info on SMILES, it looks like 1. Remove hydrogens 2. Create some spanning tree 3. Traverse the tree to mark up broken cycles
niko
Martin Mann wrote:
Maybe you are right. What I am currently working on is an easy transformation of graphs describing molecules into the so called SMILES string representation. This includes the detection and handling of cycles so I came upon that problem. But I have to check if I "only" need to break all cycles that is much easier than enumeration of all of them. If the only breaking is neccessary, I would perform DFS with a customized visitor to detect a cycle via a back_edge. Than remove the edge to break the cycle and repeat the procedure until all cycles are broken. Just as a first brute force idea. Maybe there are better ways to do so.
Hi Martin, To generate SMILES strings, you don't need to enumerate the cycles. For an undirected graph, all you have to do is a depth first search. Every back edge (that is not also a tree edge) is the closing of a cycle. You won't actually need to remove any edges since the depth first search generates a tree (of the tree edges) which is exactly what a SMILES string is. The depth first search also gives a list of the cycles that were "broken" by the set difference of the back edges and the tree edges. The trickiest part that I found was the need to add the cycle labels to previously visited atom labels. I ended up doing two depth first searches: the first was to generate the list of pairs of verticies that were part of "broken" cycles, and the second was to actually generate the string. David
Hi David, thanks a lot for the detailed help on the SMILES conversion! Great. Seems that the encoding is straight forward in BGL and easy to do. I will have a look at that soon and think it wont be a big deal. Do you have any code fragment for the encoding at hand that I might use as a guide or that can be directly be reused? If not no problem at all! You helped already a lot! Another thing that I am facing as soon as the SMILES encoding is working is the generation of "canonical SMILES" such that one gets a unique encoding of the graph independently of the node the DFS is started from. Do you have any solution for that too or worked on that? So once again, many thanks for your help, Martin David Walthall wrote:
Hi Martin,
To generate SMILES strings, you don't need to enumerate the cycles. For an undirected graph, all you have to do is a depth first search. Every back edge (that is not also a tree edge) is the closing of a cycle. You won't actually need to remove any edges since the depth first search generates a tree (of the tree edges) which is exactly what a SMILES string is. The depth first search also gives a list of the cycles that were "broken" by the set difference of the back edges and the tree edges. The trickiest part that I found was the need to add the cycle labels to previously visited atom labels. I ended up doing two depth first searches: the first was to generate the list of pairs of verticies that were part of "broken" cycles, and the second was to actually generate the string.
David
participants (3)
-
David Walthall
-
Martin Mann
-
Niko Vuokko