[Graph] Static allocation of nodes and edges
Dear all, I need to optimize my code with regards to both memory and time. The problem is simple: pre-allocating the adjacency matrix. The graph may have a huge number of nodes N, but the number of edges K per node is limited. While N can be in the order of one hundred million, K is at most 10. The graph will be directed, so no symmetry is needed. Do you know if I can pre-allocate such an adjacency matrix? If not, I suppose I must use an adjacency list in order to minimize memory consumption, am I right? Thanks!
On Fri, 8 Nov 2013, Sensei wrote:
Dear all,
I need to optimize my code with regards to both memory and time. The problem is simple: pre-allocating the adjacency matrix.
The graph may have a huge number of nodes N, but the number of edges K per node is limited. While N can be in the order of one hundred million, K is at most 10. The graph will be directed, so no symmetry is needed.
Do you know if I can pre-allocate such an adjacency matrix?
If not, I suppose I must use an adjacency list in order to minimize memory consumption, am I right?
If you have so few edges per vertex, you definitely don't want to use an adjacency matrix. If your graph will be read-only (in terms of structure, not properties), use compressed_sparse_row_graph; be sure to change the vertex and edge count sizes to uint32_t if you know your graph size will fit into that. If you are going to be doing modifications, use adjacency_list instead. Adjacency matrices use N^2 storage for N vertices, regardless of the number of edges, so they are unlikely to fit into memory for your problem size. -- Jeremiah Willcock
On 11/8/13, 3:15pm, Jeremiah Willcock wrote:
If you have so few edges per vertex, you definitely don't want to use an adjacency matrix. If your graph will be read-only (in terms of structure, not properties), use compressed_sparse_row_graph; be sure to change the vertex and edge count sizes to uint32_t if you know your graph size will fit into that. If you are going to be doing modifications, use adjacency_list instead. Adjacency matrices use N^2 storage for N vertices, regardless of the number of edges, so they are unlikely to fit into memory for your problem size.
Actually, I would create at run-time a (huge) graph, and after that it will be read-only. The construction will have a constant nnumber of nodes (so I can use this in the constructor), while edges must be calculated at run-time, and I don't see any facility for doing this. So I think I'm out of luck: I'm bound to the adjacency_list. Am I right? Cheers & Thanks!
On Fri, 8 Nov 2013, Sensei wrote:
On 11/8/13, 3:15pm, Jeremiah Willcock wrote:
If you have so few edges per vertex, you definitely don't want to use an adjacency matrix. If your graph will be read-only (in terms of structure, not properties), use compressed_sparse_row_graph; be sure to change the vertex and edge count sizes to uint32_t if you know your graph size will fit into that. If you are going to be doing modifications, use adjacency_list instead. Adjacency matrices use N^2 storage for N vertices, regardless of the number of edges, so they are unlikely to fit into memory for your problem size.
Actually, I would create at run-time a (huge) graph, and after that it will be read-only. The construction will have a constant nnumber of nodes (so I can use this in the constructor), while edges must be calculated at run-time, and I don't see any facility for doing this.
You can pass in a sequence (iterator range) of edges and the compressed_sparse_row_graph code will automatically count and sort them. Can you get that as a single entity (or a way to compute the edges), at least? -- Jeremiah Willcock
On Friday, November 8, 2013, Jeremiah Willcock wrote:
You can pass in a sequence (iterator range) of edges and the compressed_sparse_row_graph code will automatically count and sort them. Can you get that as a single entity (or a way to compute the edges), at least?
Well, yes. Edges are computationally intensive, but can be computed. I'd like to avoid multiple copies of nodes/edges, since the graph can easily occupy 10 gb of ram. I have the list of nodes in a std container, for completeness of information. The edges are calculated by a functional on node pairs (ie, I need n^2 operations). I could create a list of pairs of nodes for edges, but I think it would defeat the purpose of using a graph, at least, if creating the graph involves copying all edges. If there were a sort of "moving semantics" with this respect, it would help a lot. Cheers!
On Fri, 8 Nov 2013, Sensei wrote:
On Friday, November 8, 2013, Jeremiah Willcock wrote: You can pass in a sequence (iterator range) of edges and the compressed_sparse_row_graph code will automatically count and sort them. Can you get that as a single entity (or a way to compute the edges), at least?
Well, yes. Edges are computationally intensive, but can be computed. I'd like to avoid multiple copies of nodes/edges, since the graph can easily occupy 10 gb of ram.
OK.
I have the list of nodes in a std container, for completeness of information. The edges are calculated by a functional on node pairs (ie, I need n^2 operations).
Even to find just n*k edges?
I could create a list of pairs of nodes for edges, but I think it would defeat the purpose of using a graph, at least, if creating the graph involves copying all edges. If there were a sort of "moving semantics" with this respect, it would help a lot.
If you can get aligned vectors of edge sources, targets, and edge properties (if you have them), there is a constructor that effectively moves out of those. -- Jeremiah Willcock
On 08/11/13 22:13, Jeremiah Willcock wrote:
I have the list of nodes in a std container, for completeness of information. The edges are calculated by a functional on node pairs (ie, I need n^2 operations).
Even to find just n*k edges?
Unfortunately, it derives from a NP-hard problem. If I could find a heuristics, I will use it, but right now I'm just with a not-so-cool n^2 computations.
I could create a list of pairs of nodes for edges, but I think it would defeat the purpose of using a graph, at least, if creating the graph involves copying all edges. If there were a sort of "moving semantics" with this respect, it would help a lot.
If you can get aligned vectors of edge sources, targets, and edge properties (if you have them), there is a constructor that effectively moves out of those.
You mean these two?
// In-place unsorted edge list constructors (directed only)
template<typename InputIterator>
compressed_sparse_row_graph(construct_inplace_from_sources_and_targets_t,
std::vector
On Mon, 11 Nov 2013, Sensei wrote:
On 08/11/13 22:13, Jeremiah Willcock wrote:
I have the list of nodes in a std container, for completeness of information. The edges are calculated by a functional on node pairs (ie, I need n^2 operations).
Even to find just n*k edges?
Unfortunately, it derives from a NP-hard problem. If I could find a heuristics, I will use it, but right now I'm just with a not-so-cool n^2 computations.
OK, so you can iterate them, just not store them. One approach you might consider is to: 1. Create an iterator range containing all possible (source, target) pairs in sorted order. 2. Apply a boost::filter_iterator or similar that computes your predicate. 3. Use one of the edges_are_sorted_t constructors to do a single copy from the filtered iterator range; those constructors only make one pass over the range for a directed graph, so the fact that filter_iterator recomputes the predicate for each pass over the sequence is not a problem.
I could create a list of pairs of nodes for edges, but I think it would defeat the purpose of using a graph, at least, if creating the graph involves copying all edges. If there were a sort of "moving semantics" with this respect, it would help a lot.
If you can get aligned vectors of edge sources, targets, and edge properties (if you have them), there is a constructor that effectively moves out of those.
You mean these two?
Yes, those two.
// In-place unsorted edge list constructors (directed only) template<typename InputIterator> compressed_sparse_row_graph(construct_inplace_from_sources_and_targets_t, std::vector
& sources, std::vector & targets, vertices_size_type numverts, const GraphProperty& prop = GraphProperty()); template<typename InputIterator> compressed_sparse_row_graph(construct_inplace_from_sources_and_targets_t, std::vector
& sources, std::vector & targets, std::vector<EdgeProperty>& edge_props, vertices_size_type numverts, const GraphProperty& prop = GraphProperty()); So, you say I would be able to create two temporary vectors of src/dest and those constructors won't create a copy? Isn't the move constructor actually defined as constructor(type &&v)? I might be missing something here...
Yes, but this is C++03 code, so I didn't have move construction to work with. Thus, you pass in your own vectors by reference and they are given back empty. -- Jeremiah Willcock
On 11/11/13 14:34, Jeremiah Willcock wrote:
OK, so you can iterate them, just not store them. One approach you might consider is to:
1. Create an iterator range containing all possible (source, target) pairs in sorted order.
2. Apply a boost::filter_iterator or similar that computes your predicate.
3. Use one of the edges_are_sorted_t constructors to do a single copy from the filtered iterator range; those constructors only make one pass over the range for a directed graph, so the fact that filter_iterator recomputes the predicate for each pass over the sequence is not a problem.
Actually, I must admit I'm quite new to boost :( You are suggesting me to take a look to boost.range, and to boost.iterator libraries, am I correct? Then, after creating these magical objects (I assume they're like functors that will be computed at runtime), the graph constructor will do the rest. Did I interpret this correctly? Since this is a N^2 problem, and I must apply a function f(vert1, vert2) in order to create or not an edge, I was in the process of parallelizing the edge construction. Note that computations are completely independent from one node to another. This would be a great thing, as you can imagine. I read that BGL does not support concurrency, so if you have any suggestion, I would greatly appreciate that :)
Yes, but this is C++03 code, so I didn't have move construction to work with. Thus, you pass in your own vectors by reference and they are given back empty.
Thanks for your pointer. However, I feel something is weird in my code:
the targets get deallocated, while sources remain the same. This is my
testing code, just for fun:
// IN THE HEADER:
// typedef boost::compressed_sparse_row_graphboost::directedS
boost_graph;
long num_verts = 1000000;
std::vectorstd::size_t orig, dest;
orig.reserve(num_verts * 16);
dest.reserve(num_verts * 16);
std::cout << "reserved " << num_verts * 16 << std::endl;
std::cout << "orig " << orig.size() << " dest " << dest.size() <<
std::endl;
std::default_random_engine generator;
std::uniform_int_distributionstd::size_t distribution(0, num_verts);
auto rng = std::bind(distribution, generator);
for (long i = 0; i < num_verts; i++)
{
std::size_t s = rng();
std::size_t e = rng() ;
orig.push_back(s);
dest.push_back(e);
//std::cout << s << " - " << e << std::endl;
}
std::cout << "orig " << orig.size() << " dest " << dest.size() <<
std::endl;
graph_ = std::unique_ptr
On Wed, 13 Nov 2013, Sensei wrote:
On 11/11/13 14:34, Jeremiah Willcock wrote:
OK, so you can iterate them, just not store them. One approach you might consider is to:
1. Create an iterator range containing all possible (source, target) pairs in sorted order.
2. Apply a boost::filter_iterator or similar that computes your predicate.
3. Use one of the edges_are_sorted_t constructors to do a single copy from the filtered iterator range; those constructors only make one pass over the range for a directed graph, so the fact that filter_iterator recomputes the predicate for each pass over the sequence is not a problem.
Actually, I must admit I'm quite new to boost :(
You are suggesting me to take a look to boost.range, and to boost.iterator libraries, am I correct?
You don't need Boost.Range -- you can just create your own range (which is just a pair of iterators) and use Boost.Iterator's filter_iterator on top of it. Boost.Range (or Oven, a third-party library that extends it -- see the comprehension range in http://p-stade.sourceforge.net/oven/doc/html/index.html) might make it easier to iterate through all vertex pairs and Boost.Range includes its own filtering functionality so you don't need to work with Boost.Iterator directly.
Then, after creating these magical objects (I assume they're like functors that will be computed at runtime), the graph constructor will do the rest.
Did I interpret this correctly?
Yes.
Since this is a N^2 problem, and I must apply a function f(vert1, vert2) in order to create or not an edge, I was in the process of parallelizing the edge construction. Note that computations are completely independent from one node to another.
This would be a great thing, as you can imagine. I read that BGL does not support concurrency, so if you have any suggestion, I would greatly appreciate that :)
The version you are working with is the sequential version. There is a Parallel Boost Graph Library (in Boost) plus a new version of it that we will likely be releasing soon (outside Boost), but that's not necessary for you. If you need parallel generation, it might make the most sense to use the source and target vector constructor, building each of those by concatenating per-thread vectors (or using MPI_Gatherv if you are using distributed memory).
Yes, but this is C++03 code, so I didn't have move construction to work with. Thus, you pass in your own vectors by reference and they are given back empty.
Thanks for your pointer. However, I feel something is weird in my code: the targets get deallocated, while sources remain the same.
Yes -- I forgot that it may not necessarily clear things. Basically, what the code is doing is sorting the two (or three) vectors together in-place, then compressing the source vector into a new vector. It keeps the target and property ones for itself, but doesn't need the original source vector (but it modifies it from what you passed in, so you can't rely on it staying the same).
The first two lines are cool, from zero to num_verts, but the last one puzzles me. Inspecting the RAM allocation, I have 66 MB allocated before reserving my vectors, 81 MB after generating the fake edges, and 89 after creating the graph.
Am I doing something wrong with my code?
That's about right -- with the settings you used, compressed_sparse_row_graph will take 8 * (num_vertices + num_edges + 1) (+ the sizes of your properties) bytes of memory, plus some constant overhead. A coordinate-format pair of vectors will be 16 * num_edges, but if you clear/shrink-to-fit your vectors after construction, you should get the rest of that storage back. -- Jeremiah Willcock
participants (2)
-
Jeremiah Willcock
-
Sensei