
On Thu, 15 Jul 2010, Eric Wolf wrote:
Dear Jeremiah,
thanks for the fast response.
1) I'll give some general clarifications on the algorithm, then 2) I want to talk (read/write) about how to implement it and your very valueable suggestions.
1) First of all some clarification (i'm not sure you need them though :) )
Jeremiah Willcock <jewillco@osl.iu.edu> writes:
On Wed, 14 Jul 2010, Eric Wolf wrote:
Jeremiah Willcock <jewillco@osl.iu.edu> writes:
Why are you computing the transitive closure explicitly when it is not part of the algorithm's documented output? It there a reason that you can't use the SCC structure (and the connections between them) directly to compute the reduction without storing the full closure? Is there a reason you're not using the current transitive_closure algorithm in BGL?
An already computed transitive closure can't be used, as there is a need to compute it in all cases, as some times it must be checked, whether an edge is already in the to be computed closure.
I don't understand what you're saying, and it seems to be contradicted by your next part about using Vladimir's transitive_closure code.
I will modify transitive_closure from Vladimir Prus and use successor sets. This will give better time complexity and better memory usage.
So the transitive closure is stored in the successor sets in what I imagine to be the next version of transitive_reduction.hpp. No explicit transitive closure needed any more.
That's just a transitive closure stored as an adjacency list, though, right? Why not just use that rather than an adjacency_matrix, or make it configurable by the user (since looking up whether edges are in the closure is faster with a matrix when the closure is dense).
I cite some text, which would came later in your email:
If I understand correctly, though, you always need to make some kind of transitive closure graph internally, even if the user doesn't want it;
Correct. Thats what I will use successor sets for in my next imagined version.
OK.
I don't remember if that was just for the condensation graph or if it was a transitive closure of the whole input graph.
Just for the condensation graph.
That's what I thought. BTW, did you see <boost/graph/create_condensation_graph.hpp>? Is anything in there useful to you?
You also need an explicit condensation graph as temporary storage for doing the reduction, too, or am I mistaken?
I'm not sure about this. The transitive_reduction of the condensation graph is only touched to insert edges. It is not read in the “core“ algorithm. It is needed, to construct the transitive_reduction of the original graph I think. (If the reduction of the original graph is stored as an explicit graph, or if I write vertex pairs to an output iterator isn't important though.)
So you don't need to read back any of the transitive reduction of the condensation graph as part of creating it?
2) Now I answer the other points in your response:
OK. Please try to contribute that as a separate algorithm so other people can use it as well.
Ehh ... I don't understand, what you mean. A separate algorithm to the previous version? Who would want the previous version (of transitive_reduction)?
I think that was referring to the successor-set-based transitive closure algorithm, but check my previous email to be sure.
Why are there functions that do the transitive closure as well as the reduction? Is having both of them a common use case? It appears that there is a lot of redundant code between the reduction-with-closure and reduction-only functions.
It's the same algorithm. The algorithm for computing the closure and the reduction is essentially the same. So you could have the other one for free if you generate one of them. I doubt that it is a common use case, I admit, but I thought one should have the possibility to get both in one run, if its for free. So the reduction-with-closure and reduction-only functions are essentially duplicates of each other.
OK. Is there a way to factor out the common code?
** long lines below **
Several ways to accomplish that come to my mind, but I'm unsure which route to go and ask you for some advice.
The closure and reduction will always have the same number of vertices as the original graph, right?
yes
They just add or remove some edges, if I understand correctly.
yes
What if you didn't output a graph at all, just a list of edges (pairs of vertices, since the edge may not exist in the original graph)? Then a user could easily ignore the list, or feed it back into a graph class's constructor to create a result graph. You could then use named parameters and make the default be to ignore the outputs.
That's a great idea. But I couldn't get the grasp of bgl named parameters until now. I thought bgl would use the boost::parameter library, but in my perception the documentation and what's done in named_parameters.hpp don't fit.
I think BGL predates Boost.Parameter. In new algorithms, what we do is convert the BGL named parameter structure to a Boost.Parameter one and then use Boost.Parameter (with a bunch of BGL-specific wrapper functions) to work with it. Look at <boost/graph/random_spanning_tree.hpp> and the trunk/1.44 version of <boost/graph/astar_search.hpp> for examples.
If I understand correctly, though, you always need to make some kind of transitive closure graph internally, even if the user doesn't want it; I don't remember if that was just for the condensation graph or if it was a transitive closure of the whole input graph. You also need an explicit condensation graph as temporary storage for doing the reduction, too, or am I mistaken?
The reduction could be easily written to an output iterator I imagine, but the computation of the transitive closure of the condensation graph must be done in all cases and stored somehow. Now in the successor sets.
OK.
So here are some other options I've thought of; if some of the stuff I wrote in this paragraph is wrong, that might change which ones are possible.
1 (easy). Accept an Output Iterator that will get the reduction as a list of edges, built using the vertices from the original graph. Do the same for the closure. Use named parameters to make both of these outputs ignored by default, but they can also be used to create new graphs.
Even it the reduction is not used or wanted by the user, it still would be computed from the reduction of the condensation graph. But maybe it is an accepable annoyance?
I thought you could save some time by computing only the closure; isn't a reduction on even the condensation graph extra work to compute? Remember, too, you can probably assume that code like: if (!boost::is_same<ReductionOutput, dummy_iterator>::value) { compute reduction } would be handled well by compilers.
2 (harder, but not because of template tricks). Create two semi-implicit graphs, one for the closure and one for the reduction; return both from your algorithm. In this case, your graphs would only store a map from each vertex to its component (i.e., vertex in the condensation graph), which vertex is the representative of its component (for reductions), and the condensation graph (either its closure or reduction). You would then need to do the reduction even if the user doesn't need it, but you would not need to store the reduction or closure of anything but the condensation graph. You could also just have two variants of the algorithm, one that creates the condensation graph and its transitive closure, and another that creates both the condensation graph, its closure, and its reduction. That will likely save you code still, and named parameters can be used to determine in one function whether the reduction is necessary to compute (and this can probably be done with a run-time if statement on a property of a template argument, without needing any fancy metaprogramming beyond type_traits).
And a two functions to construct the reduction and the closure from the cond.gr.rd. and the cond.gr.cl. and the mappings. Yeah, thats a possible way.
I don't mean two functions. Here's what I had in mind in more detail: Assume: - component == SCC - G is the graph - .+ is the transitive closure operator - .- is the transitive closure/reduction operator - C is the condensation graph - comp(v) maps from a vertex in G to its corresponding component in C - next(v) maps from a vertex in G to another vertex in the same component (creating a loop in each component) - rep(v) is a bit flag that is true for one vertex in G in each component Now, as far as I understand, an edge (v, w) is in G+ iff (comp(v), comp(w)) is an edge in C+. An edge (v, w) is in G- iff either w = next(v) or rep(v) and rep(w) and (comp(v), comp(w)) is an edge in C-. You can define a graph type based on those computations. For the closure, you would store C+ and comp() explicitly, and then compute things like out_edges(v, G+) based on those using the formula above. For the reduction, you would store C-, comp(), next(), and rep(), then use the formulas to compute out_edges(v, G-). The other graph operations like vertices() would just forward to the original G. Using that kind of approach, you can create graph objects (things that model the graph concepts) that can be used to get the transitive closure and reduction of G while only storing information about the closure and reduction of C and a couple of other property maps on G.
Your options:
1) Define a Graph type which absorbs all add_vertex and add_edge operations like /dev/null in unix, then write one implementation function like ...
I don't like this option because I don't think the reduction can be removed completely, since I believe you need an explicit graph for the reduction when you are computing (you use already filled-in parts of it to create other parts, right?).
Nope. The reduction of the condensation graph is only written, not read in the core algorithm. It gets read if one tries to build the transitive reduction of the whole graph.
But I don't like that option, too. It is ugly.
OK. When do you need to read it back to produce the closure of the full graph? Don't you just need equivalents of the comp(), next(), and rep() maps that I talked about above, plus a way to emit the edges of the reduction of the condensation graph?
2) Use boost::enable_if like that: (In the following ... denotes an ellipsis of zero or more parameters, not varargs.) ...
This option is feasible and probably easier than you think it is; named parameters will take care of the dispatching, so you probably won't need enable_if.
If you think it is feasible, then I would like to take that route. One remedy still remains in my perception. If the user doesn't want the transitive reduction, an unused local variable remains in the main implementation function ...
Can that be removed somehow by fancy template stuff?
Something simple like the if statement I showed above should be enough to deal with that issue. If not, it's pretty easy to use mpl::if_ to remove the reduction code more completely.
What's your favorite choice?
I like the implicit graph idea because of its elegance and reduction in output size, but it is more work to implement (the formulas I put down above are fairly simple but turning them into iterators is more work; if you want to go this way, I can give more detailed pseudocode closer to what you actually need to write to model BGL concepts). Otherwise, either streaming edges to an output iterator (my option 1) or one of your options is reasonable. Your option 1 is probably equivalent to your option 2, since you can special-case the code based on your black-hole graph type and so not actually do the graph insertions when it is used; remember too that named parameters can be used so the user doesn't actually see the black-hole graph. -- Jeremiah Willcock