On Fri, 27 Sep 2013, Dominique Devienne wrote:
I'm sorting a set of SQL tables with foreign-key constraints. I'm currently using BGL's topological_sort for that, but struggled quite a bit because: * I didn't manage to adapt my own data-structure, so ended-up sorting indices to an array of my entries, and generated the edges as indice pairs on in this array (like the doc example). * Discovered via unit testing topological_sort didn't like vertices with no associated edges (a "free standing" table, with no relationships), so I'm "manually" removing those. * Discovered via unit testing topological_sort didn't like "duplicate" edges (a table with several columns referencing the same parent table), so I'm also manually removing those too.
I confess to not trying to fully understand the BGL, nor to have read the book. Given what I was trying to do, as outlined above, was using topological_sort really the best tool for the job? What are the alternatives, if any?
For my own education, why "singleton" vertices and duplicate edges make topological_sort break?
What behavior do you get from them? What is broken in those two cases?
But what brings me here today is to ask about a "stable" topo-sort. When I have several vertices which are all "unrelated" and thus sort at the same "level", topological_sort does not preserve the relative order of those "sibling" vertices, and I'd like something fully deterministic that I can count on.
It is probably deterministic (looking at the code), assuming your graph data structure preserves edge order.
What are my options? Does BGL provide something like this?
As a concrete example, given tables A, B, C, with A depending on both B, C, topological_sort gives me the C, B, A order (or adding the "levels" (C, B), A), while I'd like B, C, A, i.e. preserving the relative order of B, C.
Is the order of outgoing edges always backwards from what you want (I suspect it is)? If so, reversing the order when putting the edges into the graph would be an easy fix to the issue.
Thanks for any insight. --DD
PS: What about getting a topo-sort, but which also gives you the "levels", so that you can schedule those independent tasks (on each level) in parallel for example? Does BGL provide something like this? I've done it in the past with the help of the Sedgewick book, but I see a single topo-sort in BGL.
The right way to do that, most likely, is to use a proper multi-processor scheduling algorithm (not straight topological sort, which only works ideally for one processor) to assign tables to processors. Work-stealing is likely to behave well, also, is simple, and can handle the case where task times are unpredictable. -- Jeremiah Willcock