
On Wed, 13 Nov 2013, Sensei wrote:
On 11/11/13 14:34, Jeremiah Willcock wrote:
OK, so you can iterate them, just not store them. One approach you might consider is to:
1. Create an iterator range containing all possible (source, target) pairs in sorted order.
2. Apply a boost::filter_iterator or similar that computes your predicate.
3. Use one of the edges_are_sorted_t constructors to do a single copy from the filtered iterator range; those constructors only make one pass over the range for a directed graph, so the fact that filter_iterator recomputes the predicate for each pass over the sequence is not a problem.
Actually, I must admit I'm quite new to boost :(
You are suggesting me to take a look to boost.range, and to boost.iterator libraries, am I correct?
You don't need Boost.Range -- you can just create your own range (which is just a pair of iterators) and use Boost.Iterator's filter_iterator on top of it. Boost.Range (or Oven, a third-party library that extends it -- see the comprehension range in http://p-stade.sourceforge.net/oven/doc/html/index.html) might make it easier to iterate through all vertex pairs and Boost.Range includes its own filtering functionality so you don't need to work with Boost.Iterator directly.
Then, after creating these magical objects (I assume they're like functors that will be computed at runtime), the graph constructor will do the rest.
Did I interpret this correctly?
Yes.
Since this is a N^2 problem, and I must apply a function f(vert1, vert2) in order to create or not an edge, I was in the process of parallelizing the edge construction. Note that computations are completely independent from one node to another.
This would be a great thing, as you can imagine. I read that BGL does not support concurrency, so if you have any suggestion, I would greatly appreciate that :)
The version you are working with is the sequential version. There is a Parallel Boost Graph Library (in Boost) plus a new version of it that we will likely be releasing soon (outside Boost), but that's not necessary for you. If you need parallel generation, it might make the most sense to use the source and target vector constructor, building each of those by concatenating per-thread vectors (or using MPI_Gatherv if you are using distributed memory).
Yes, but this is C++03 code, so I didn't have move construction to work with. Thus, you pass in your own vectors by reference and they are given back empty.
Thanks for your pointer. However, I feel something is weird in my code: the targets get deallocated, while sources remain the same.
Yes -- I forgot that it may not necessarily clear things. Basically, what the code is doing is sorting the two (or three) vectors together in-place, then compressing the source vector into a new vector. It keeps the target and property ones for itself, but doesn't need the original source vector (but it modifies it from what you passed in, so you can't rely on it staying the same).
The first two lines are cool, from zero to num_verts, but the last one puzzles me. Inspecting the RAM allocation, I have 66 MB allocated before reserving my vectors, 81 MB after generating the fake edges, and 89 after creating the graph.
Am I doing something wrong with my code?
That's about right -- with the settings you used, compressed_sparse_row_graph will take 8 * (num_vertices + num_edges + 1) (+ the sizes of your properties) bytes of memory, plus some constant overhead. A coordinate-format pair of vectors will be 16 * num_edges, but if you clear/shrink-to-fit your vectors after construction, you should get the rest of that storage back. -- Jeremiah Willcock