Re: [boost] [SoC] BGL project idea

16 Jul 2010

      On Thu, 15 Jul 2010, Eric Wolf wrote:
...
Dear Jeremiah,
thanks for the fast response.
1) I'll give some general clarifications on the algorithm, then
2) I want to talk (read/write) about how to implement it and your
  very valueable suggestions.
1) First of all some clarification (i'm not sure you need
them though :) )
Jeremiah Willcock <jewillco@osl.iu.edu> writes:
...
On Wed, 14 Jul 2010, Eric Wolf wrote:
...
Jeremiah Willcock <jewillco@osl.iu.edu> writes:
...
...
...
Why are you computing the transitive closure explicitly when it is not
part of the algorithm's documented output?  It there a reason that you
can't use the SCC structure (and the connections between them)
directly to compute the reduction without storing the full closure?
Is there a reason you're not using the current transitive_closure
algorithm in BGL?
An already computed transitive closure can't be used, as there is a need
to compute it in all cases, as some times it must be checked, whether an
edge is already in the to be computed closure.
I don't understand what you're saying, and it seems to be contradicted by 
your next part about using Vladimir's transitive_closure code.
...
...
...
I will modify transitive_closure from Vladimir Prus and use successor 
sets. This will give better time complexity and better memory usage.
So the transitive closure is stored in the successor sets in what I 
imagine to be the next version of transitive_reduction.hpp. No explicit 
transitive closure needed any more.
That's just a transitive closure stored as an adjacency list, though, 
right?  Why not just use that rather than an adjacency_matrix, or make it 
configurable by the user (since looking up whether edges are in the 
closure is faster with a matrix when the closure is dense).
...
I cite some text, which would came later in your email:
...
If I understand correctly, though, you always need
to make some kind of transitive closure graph internally, even if the
user doesn't want it;
Correct. Thats what I will use successor sets for in my next imagined
version.
OK.
...
...
I don't remember if that was just for the condensation graph or if it
was a transitive closure of the whole input graph.
Just for the condensation graph.
That's what I thought.  BTW, did you see 
<boost/graph/create_condensation_graph.hpp>?  Is anything in there useful 
to you?
...
...
You also need an explicit condensation graph as temporary storage for
doing the reduction, too, or am I mistaken?
I'm not sure about this. The transitive_reduction of the condensation
graph is only touched to insert edges. It is not read in the “core“
algorithm.  It is needed, to construct the transitive_reduction of the
original graph I think. (If the reduction of the original graph is
stored as an explicit graph, or if I write vertex pairs to an output
iterator isn't important though.)
So you don't need to read back any of the transitive reduction of the 
condensation graph as part of creating it?
...
2) Now I answer the other points in your response:
...
OK.  Please try to contribute that as a separate algorithm so other
people can use it as well.
Ehh ... I don't understand, what you mean. A separate algorithm to the
previous version? Who would want the previous version (of
transitive_reduction)?
I think that was referring to the successor-set-based transitive closure 
algorithm, but check my previous email to be sure.
...
...
...
...
...
...
Why are there functions that do the transitive closure as well as the
reduction?  Is having both of them a common use case?  It appears that
there is a lot of redundant code between the reduction-with-closure
and reduction-only functions.
It's the same algorithm. The algorithm for computing the closure and the
reduction is essentially the same. So you could have the other one for
free if you generate one of them. I doubt that it is a common use case,
I admit, but I thought one should have the possibility to get both in
one run, if its for free. So the reduction-with-closure and
reduction-only functions are essentially duplicates of each other.
OK.  Is there a way to factor out the common code?
** long lines below **
Several ways to accomplish that come to my mind, but I'm unsure which
route to go and ask you for some advice.
The closure and reduction will always have the same number of vertices
as the original graph, right?
yes
...
They just add or remove some edges, if I understand correctly.
yes
...
What if you didn't output a graph at all, just a list of edges (pairs
of vertices, since the edge may not exist in the original graph)?
Then a user could easily ignore the list, or feed it back into a graph
class's constructor to create a result graph.  You could then use
named parameters and make the default be to ignore the outputs.
That's a great idea. But I couldn't get the grasp of bgl named
parameters until now. I thought bgl would use the boost::parameter
library, but in my perception the documentation and what's done in
named_parameters.hpp don't fit.
I think BGL predates Boost.Parameter.  In new algorithms, what we do is 
convert the BGL named parameter structure to a Boost.Parameter one and 
then use Boost.Parameter (with a bunch of BGL-specific wrapper functions) 
to work with it.  Look at <boost/graph/random_spanning_tree.hpp> and the 
trunk/1.44 version of <boost/graph/astar_search.hpp> for examples.
...
...
If I understand correctly, though, you always need to make some kind
of transitive closure graph internally, even if the user doesn't want
it; I don't remember if that was just for the condensation graph or if
it was a transitive closure of the whole input graph.  You also need
an explicit condensation graph as temporary storage for doing the
reduction, too, or am I mistaken?
The reduction could be easily written to an output iterator I imagine,
but the computation of the transitive closure of the condensation graph
must be done in all cases and stored somehow. Now in the successor sets.
OK.
...
...
So here are some other options I've thought of; if some of the stuff I
wrote in this paragraph is wrong, that might change which ones are
possible.
1 (easy). Accept an Output Iterator that will get the reduction as a
list of edges, built using the vertices from the original graph.  Do
the same for the closure.  Use named parameters to make both of these
outputs ignored by default, but they can also be used to create new
graphs.
Even it the reduction is not used or wanted by the user, it still would
be computed from the reduction of the condensation graph. But maybe it
is an accepable annoyance?
I thought you could save some time by computing only the closure; isn't a 
reduction on even the condensation graph extra work to compute?  Remember, 
too, you can probably assume that code like:

if (!boost::is_same<ReductionOutput, dummy_iterator>::value) {
   compute reduction
}

would be handled well by compilers.
...
...
2 (harder, but not because of template tricks). Create two
semi-implicit graphs, one for the closure and one for the reduction;
return both from your algorithm.  In this case, your graphs would only
store a map from each vertex to its component (i.e., vertex in the
condensation graph), which vertex is the representative of its
component (for reductions), and the condensation graph (either its
closure or reduction).  You would then need to do the reduction even
if the user doesn't need it, but you would not need to store the
reduction or closure of anything but the condensation graph.  You
could also just have two variants of the algorithm, one that creates
the condensation graph and its transitive closure, and another that
creates both the condensation graph, its closure, and its reduction.
That will likely save you code still, and named parameters can be used
to determine in one function whether the reduction is necessary to
compute (and this can probably be done with a run-time if statement on
a property of a template argument, without needing any fancy
metaprogramming beyond type_traits).
And a two functions to construct the reduction and the closure from
the cond.gr.rd. and the cond.gr.cl. and the mappings. Yeah, thats a
possible way.
I don't mean two functions.  Here's what I had in mind in more detail:

Assume:
- component == SCC
- G is the graph
- .+ is the transitive closure operator
- .- is the transitive closure/reduction operator
- C is the condensation graph
- comp(v) maps from a vertex in G to its corresponding component in C
- next(v) maps from a vertex in G to another vertex in the same component 
(creating a loop in each component)
- rep(v) is a bit flag that is true for one vertex in G in each component

Now, as far as I understand, an edge (v, w) is in G+ iff (comp(v), 
comp(w)) is an edge in C+.  An edge (v, w) is in G- iff either w = next(v) 
or rep(v) and rep(w) and (comp(v), comp(w)) is an edge in C-.  You can 
define a graph type based on those computations.  For the closure, you 
would store C+ and comp() explicitly, and then compute things like 
out_edges(v, G+) based on those using the formula above.  For the 
reduction, you would store C-, comp(), next(), and rep(), then use the 
formulas to compute out_edges(v, G-).  The other graph operations like 
vertices() would just forward to the original G.  Using that kind of 
approach, you can create graph objects (things that model the graph 
concepts) that can be used to get the transitive closure and reduction of 
G while only storing information about the closure and reduction of C and 
a couple of other property maps on G.
...
...
Your options:
...
1) Define a Graph type which absorbs all add_vertex and add_edge
operations like /dev/null in unix, then write one implementation
function like ...
I don't like this option because I don't think the reduction can be
removed completely, since I believe you need an explicit graph for the
reduction when you are computing (you use already filled-in parts of
it to create other parts, right?).
Nope. The reduction of the condensation graph is only written, not read
in the core algorithm. It gets read if one tries to build the transitive
reduction of the whole graph.
But I don't like that option, too. It is ugly.
OK.  When do you need to read it back to produce the closure of the full 
graph?  Don't you just need equivalents of the comp(), next(), and rep() 
maps that I talked about above, plus a way to emit the edges of the 
reduction of the condensation graph?
...
...
...
2) Use boost::enable_if like that: (In the following ... denotes an
ellipsis of zero or more parameters, not varargs.) ...
This option is feasible and probably easier than you think it is;
named parameters will take care of the dispatching, so you probably
won't need enable_if.
If you think it is feasible, then I would like to take that route. One
remedy still remains in my perception. If the user doesn't want the
transitive reduction, an unused local variable remains in the main
implementation function ...
Can that be removed somehow by fancy template stuff?
Something simple like the if statement I showed above should be enough to 
deal with that issue.  If not, it's pretty easy to use mpl::if_ to remove 
the reduction code more completely.
...
What's your favorite choice?
I like the implicit graph idea because of its elegance and reduction in 
output size, but it is more work to implement (the formulas I put down 
above are fairly simple but turning them into iterators is more work; if 
you want to go this way, I can give more detailed pseudocode closer to 
what you actually need to write to model BGL concepts).  Otherwise, either 
streaming edges to an output iterator (my option 1) or one of your options 
is reasonable.  Your option 1 is probably equivalent to your option 2, 
since you can special-case the code based on your black-hole graph type 
and so not actually do the graph insertions when it is used; remember too 
that named parameters can be used so the user doesn't actually see the 
black-hole graph.

-- Jeremiah Willcock