Re: [Boost-users] All orderings of boost::graph::topological_sort()
Excuse me for the double-post: my previous post had a bad subject line... Thank you very much for your prompt reply, Stephen. On Fri, Feb 22, 2008 at 12:51, Stephen Nuchia wrote:
For any significant-sized input the number of orderings will be too large to do anything useful with.
At the moment my input consists of about 32 vertices, although I'd like not to count on it to remain this way. I can probably assume the number of nodes will remain in the order of magnitude of hundreds or less. I'm talking about a schedule generating program for an undergraduate student. It generates a schedule for a semester for a student, given some constraints (time of day, courses done already and so on) and the course dependency (this course must not be taken before that one and so on). I would very much like to support other complexities such as some courses may be taken while a certain dependent is being taken, while others must be taken before any of their dependents may be taken. But I leave such complexities as worries for future versions. For the time being, all I want is to be able to generate all the different schedules (i.e., all various topo sorts). Indeed, something like std::next_permutation() would make a lot of sense. It would be fast and flexible. My next task would then be to find an algorithm to quickly estimate the number of different orderings. :)
For an input small enough that the set of feasible orderings will fit in memory I would simply generate all permutations and screen for feasibility rather than attempting to create an enumeration algorithm.
I don't follow. Can I change the ordering of vertices without ruining the graph's integrity? How? Would this affect the toposort output?
Create a comparability matrix and use it to prune the permutation tree, perhaps? N^2 bits doesn't sound prohibitive.
I don't follow...
This has probably been studied but I'm at work so I'll speculate. For larger inputs with many constraints it may be worthwhile to create an enumeration algorithm.
For purposes of empirical testing of build-ordering systems it might be sufficient to go into the algorithm and randomize the selection processes whenever it makes arbitrary choices. This may or may not be easily achieved by randomly pre-sorting the input, depending on the internals of the algorithm. If the sort is stable that will work.
That is the reason for my query on my OP in the first place. I don't quite understand how manipulate the input vertices of the depth_first_search(). For what I understand, depth_first_search() works on a set of start vertices (which have no incoming edges), and for each of these the algorith dives in to the depth of the vertex's tree. Permuting the order of these vertices would give me advance me a lot, but I don't know how to do that.
If there is enough interest to justify adding an efficient deterministic enumerator for poset orders I'll write one. Graph-theoretical algorithms are a fun way to spend the weekend :-)
I personally think the topological_sort() algorithm begs for an algorithm that would allow iterating over all unique orderings. As I stated above, I think something of the sort of std::next_permutation() would make a lot of sense, especially because graph algorithms, and finding all different toposorts in particular, tend to be expensive operations. Simply asking for one ordering at a time gives a lot of flexibility to the user with the added benefit of efficiency and integration with the STL's way of doing things. -- Matan
I'm talking about a schedule generating program for an undergraduate student. It generates a schedule for a semester for a student, given some constraints (time of day, courses done already and so on) and the
I think you'll find the number of feasible schedules to be much larger than you have time to enumerate in many cases. Whether for build scheduling, course scheduling, or analysis of protocols and distributed algorithms, the basic topo sort setup can be applied to finding feasible parallel schedules (without resource constraints) by converting all non-atomic events into begin < end pairs. There may be a trick for extending it to resource constraints (e.g. number of CPUs) but I can't think of one right now. The variations on this theme are well studied; scheduling is an ancient application of automatic data processing machinery. I suggest we leave off discussing applications (interesting though they are) and think about whether and how the existing library might be extended or adapted to the task of enumerating feasible orderings of a poset.
For the time being, all I want is to be able to generate all the different schedules (i.e., all various topo sorts). Indeed, something like std::next_permutation() would make a lot of sense. It would be fast and flexible. My next task would then be to find an algorithm to quickly estimate the number of different orderings. :)
You seem to be convinced that your scheduling problem can be represented as a topological sorting problem. I am skeptical of this but, as mentioned above, the application is not really the point. Heuristics probably exist, I can think of a few that might have some value. There are N! permutations on N elements and every independent edge cuts the space in half. If you can find a set of k independent single-edge subgraphs you know there are no more than N!/2^k feasible permutations. Wherever you have mutually incomparable chains you can compute the number of feasible interleavings. I'm not sure but I suspect computing the number exactly will require the same machinery as enumerating them. If we're interested in truncated permutations (or general projections of permutations) counting gets much harder.
For an input small enough that the set of feasible orderings will fit in memory I would simply generate all permutations and screen for feasibility rather than attempting to create an enumeration algorithm.
I don't follow. Can I change the ordering of vertices without ruining the graph's integrity? How? Would this affect the toposort output?
I'd have to look at the representation details and I haven't yet. In general, the abstract structure of the input to any algorithm is encoded in some physical way and that physical encoding may contain information beyond that in the abstract data structure it represents. The example to which I was alluding is from conventional sorting. When the key fields of two input records are equivalent under the rules of a sort, the algorithm has to choose which will be appear first in the output. The choice may be arbitrary -- either random or effectively random -- or it may be based on the input in some direct way. One choice is to use non-key fields to extend the key. Another is to use the positions of the records in the input stream to extend the keys. A sort that implements this last policy is said to be "stable" because it keeps records with equivalent keys in their original order. I don't know if it is standard terminology but I would call a topological sort stable if it kept incomparable elements in their original order -- assuming that has meaning. The input to a topological sort has to identify the elements of the poset by some runtime representation: an object pointer, a number, a string, whatever. And the data specifying the poset has to be organized in the computer's storage somehow; there is usually some flexibility in this arrangement. If you inspect the core of the topological sort algorithm you will find that manipulating these abstractly irrelevant details of the physical representation of the input will influence the "free" choices it makes when the output is not fully determined by the abstract input.
Create a comparability matrix and use it to prune the permutation tree, perhaps? N^2 bits doesn't sound prohibitive.
I don't follow...
That's probably not standard terminology. Create a square matrix with one bit per entry, indexed on rows and columns by poset element identifiers. Mark a 1 in Mij if Vi <= Vj (say) is in the input. Then iterate adding the transitive implications of the input constraints until M contains an explicit representation of the partial order relation. Details left as an exercise involving bitwise-or and looking it up in a book. Now if you use the standard permutation generator you can devise an algorithm to screen its output for consistency with the ordering relation using the matrix. But you could also pass the matrix in to the permutation generator and it could save a lot of time by not generating BACD... and BADC... if it knows that A<=B is in the relation.
depth_first_search(). For what I understand, depth_first_search() works on a set of start vertices (which have no incoming edges), and for each of these the algorith dives in to the depth of the vertex's
The abstract algorithm works on a set. Any concrete realization of the algorithm operates on representations of sets. Control the input and you control the output :-)
I personally think the topological_sort() algorithm begs for an algorithm that would allow iterating over all unique orderings. As I
My biases say that in interesting cases you don't have time to iterate over them. But that just reveals a difference in what we find interesting. Any set you can enumerate before the heat death of the universe is uninteresting. ;-) In contrast, the set of "traces" (feasible interleavings of atomic events) for a distributed reactive system is quite interesting. The set of 4-threaded build schedules for a software system with 15000 source files is fairly interesting. A variation on the permutation enumerator that takes some kind of feasibility information and uses it to prune the tree (the logical tree of outputs that it is virtually traversing) makes sense to me. A partial order specification would be the obvious parameter and might be logically equivalent to any other reasonable parameterization but I'd need to think about that. It might be feasible to design the parameterization interface a bit more generally and admit things like resource constraints that are hard to encode in the poset abstraction alone. And if we're going to implement tree pruning we should provide an interface for a figure-of-merit function so we can also prune out feasible but unprofitable branches. If all you want is a solution to your problem as stated, code it in Prolog. The interactive interpreter will keep on generating sequences as long as you tell it "no" in response to each one. The only reason to add this to the library would be if it would be used to advantage in a variety of C++ programs. I can think of a whole raft of good uses for a constrained combination/permutation generator with general tree pruning. On the other hand, a whole lot of scheduling and optimization code already exists and it would be wise to inventory that before we start hacking. -steve
participants (2)
-
Matan Nassau
-
Stephen Nuchia