[gsoc] Interest in BGL v2?

Hello all,

My name is Michael and, if there is enough interest from the mailing list, I will be submitting an application for GSoC 2011 to work on BGL v2. I have been working, as time permits, on a project called Origin (http://code.google.com/p/origin/) that houses the code I intend to integrate into Boost. The project was started by Andrew Sutton, and I have taken an active role in its creation. The new library focuses on ease of use, the replacement of property maps with labels, data structures with more concise and clear semantics, a new graph concept hierarchy, graph algorithms that align with STL abstractions, and, finally, integrating C++0x into the BGL.

I already worked on the existing BGL back in 2009 for GSoC. While working on the BGL, I started seeing opportunities to bring it more in line with modern techniques and practices. I found that the library, though robust and quite mature, was lacking in some areas, the most unavoidable being C++0x support. The other design concern was that some of the data structures felt more like metaprograms. This is no good, since a metaprogram generates types that may model one of many concepts. Another feature that has been on the BGL's wish list is the implementation of algorithm objects; some of these are already present in Origin.

I realize that this is not a 3-month project. I fully understand that. However, I plan to continue maintaining this library in Origin, as I have been, until it is mature enough for a release and integration into other libraries. Hopefully there is enough interest to get this off the ground. Please respond with your interests, questions, and/or doubts.

Michael Lopez

At Mon, 28 Mar 2011 14:12:46 -0400, Michael Lopez wrote:
Sounds exciting!
If you're referring to adjacency_list, then I can only say that generative data structures pretty much always feel like metaprograms... because they _are_ metaprograms. Are you planning to move away from generative programming in BGLv2?
This is no good since a meta-program generates types that may model one of many concepts.
In what way does that make it "no good?"
Another feature that has been on BGL's want list is the implementation of algorithm objects.
For example? -- Dave Abrahams BoostPro Computing http://www.boostpro.com

With regards to the adjacency list, that's the plan for Origin. Whether or not the rest of the Boost community likes the approach remains to be seen :) It wouldn't be hard to write a generative wrapper around a set of specific data structures to achieve a similar effect, if it were felt that such a wrapper was needed. It's not high on my list of priorities.
I think that's a typo, "not good", which is somehow less condemning than "no good" :) The argument comes from a usability perspective. If I want to use a data structure, I expect it to work a certain way---to have a specific interface with guaranteed behaviors. Generative data structures like the adjacency list don't fit that model. All of the generated implementations may be AdjacencyLists, but they're all refinements in one way or another. One side effect of this is conditional documentation dependent on the choice of selector arguments. Basically, it's a metaprogram masquerading as a data structure, which I perceive as not great. I'm sure there are plenty of people who will disagree.
Just a class that wraps the algorithm. There are several in the BGL. The technique offers a little more flexibility when the number of parameters is large. Andrew

On Mon, 28 Mar 2011, Andrew Sutton wrote: (snip)
I think the original intent of algorithm objects for BGL is inversion of control flow: instead of something like BFS calling visitor methods at event points, you have a BFS object that suspends itself where it would have called the visitor, and then is continued by surrounding code. This makes it easier to do things like interleave two BFS runs at the same time, which you can't do with the existing model (without threads or coroutines). -- Jeremiah Willcock
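The inversion-of-control idea Jeremiah describes can be sketched as a small class: instead of BFS calling visitor methods at event points, the object suspends after each discovered vertex and is resumed by the surrounding code. This is a minimal sketch over a plain adjacency list; the names (`bfs_run`, `step`, `done`) are hypothetical and not BGL or Origin API.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Plain adjacency list: g[u] holds the out-neighbors of u.
using Graph = std::vector<std::vector<int>>;

// An "inverted" BFS: the caller drives the traversal one event point
// at a time, rather than handing control to the algorithm.
class bfs_run {
    const Graph& g_;
    std::vector<bool> seen_;
    std::queue<int> q_;
public:
    bfs_run(const Graph& g, int source)
        : g_(g), seen_(g.size(), false)
    {
        seen_[source] = true;
        q_.push(source);
    }

    bool done() const { return q_.empty(); }

    // Resume the traversal: return the next vertex in BFS order and
    // enqueue its undiscovered neighbors, then suspend again.
    int step() {
        int u = q_.front();
        q_.pop();
        for (int v : g_[u])
            if (!seen_[v]) { seen_[v] = true; q_.push(v); }
        return u;
    }
};
```

Two `bfs_run` objects over the same graph can then be stepped alternately, interleaving the traversals without threads or coroutines.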

On 29/03/2011 02:20, Jeremiah Willcock wrote:
To me that would be the most valuable aspect of algorithm classes. For the purpose of suspending and restarting the Dijkstra shortest path algorithm I built a class around existing BGL code that works perfectly fine for me. I am happy to share it, even though it is surely not up to Boost standards. Alex

Is there a reference to that description? The algorithm objects in the BGL seem to be giant function objects. I think our major use case was to use these objects as an alternative to calling functions with large numbers of input and output parameters. The idea was to create an algorithm object over its required or optional associated data and then call it using a limited number of inputs (a graph, a vertex, etc.). After termination, the associated data is also available through the object. We've also written BFS (and soon DFS) as Range models, which offer a limited form of inversion. This approach also supports interleaving two BFS traversals, though I haven't actually tried it out. Earlier designs of these classes had more control points; we weren't entirely sure what the use cases were, so we simplified them. It wouldn't be hard to bring the previous designs back. One concern with this approach is that it adds overhead to the execution of the algorithm, and I didn't want to force users to use a slower algorithm when there wasn't a need. Also, they're a lot harder to write, and it's not always straightforward. Andrew

On Tue, 29 Mar 2011, Andrew Sutton wrote:
There is an example at http://www.springerlink.com/content/20vy0y0rq5mtcqmb/ (summary also at http://www.cs.rpi.edu/~musser/gp/dagstuhl/gpdag_28.html). I didn't know there were any algorithm objects at all in BGL, at least according to the definition I have in mind.
Yes, they are definitely harder to write and can be slower (because of the state saving), so they should not be the default mode. I think they are useful in special cases, though, and so that's why they are on the to-do list. -- Jeremiah Willcock

At Mon, 28 Mar 2011 18:45:26 -0500, Andrew Sutton wrote:
I was never very fond of generative programming (despite the iterator library), so not I.
Sure; it's easier to carry state across function calls. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

On Mar 29, 2011, at 9:49 AM, Dave Abrahams wrote:
I won't argue that today, although I'm going in the opposite direction with my Metagraph project. I'm making the pro argument in my upcoming BoostCon paper/talk on MPL.Graph. I wonder if anyone can say something about the new graph concept hierarchy? I don't see any Origin docs yet. I'd like to stay compatible, and I'm also curious what was found lacking in the old graph concepts. The unifying graph concepts in BGLv1 are really impressive for making all graph libraries adaptable to its interface & algos. Thanks, Gordon

At Thu, 31 Mar 2011 03:26:05 -0400, Gordon Woodhull wrote:
+1 -- Dave Abrahams BoostPro Computing http://www.boostpro.com

The BGL concept hierarchy is being re-evaluated based on the rewrite of the data structures and algorithms, meaning that the concepts come after we implement most of the data structures and algorithms. We will then analyze the library and define the concepts based on the types and algorithms we have in place. After constructing the concept hierarchy, we will refine the graph types, then the concepts, and so on. This iterative method is the approach we've chosen for Origin. There was a time when we started to build the concept hierarchy before the data structures were in place, and even while the data structures were being implemented, but that wasn't working. We found that building data structures to model concepts, at least initially, restricted us to the design decisions we had made with the concepts.

The BGL concept hierarchy didn't address graph mutability very well. There aren't many algorithms in the library that require add/remove functions, so that's not terribly surprising. I would also have liked to see more semantic concepts: directed and undirected graphs, weighted graphs, simple graphs, multigraphs. All of these show up as requirements in various places in the library, but aren't expressed conceptually. I disagree with the statement that the *concepts* make other graph libraries adaptable to the BGL's interface. Specializations of the graph_traits class and a ton of overloaded functions accomplish that adaptation. I suspect that you could achieve exactly the same effect using wrapper classes (and tons of overloaded functions). The concepts give the requirements for adaptation. Andrew

On Mar 31, 2011, at 9:01 AM, Michael Lopez <mlopez7@kent.edu> wrote:
My understanding, from talking with Jeremy Siek back in the day (when I had a different competing graph library ;) was that the original graph concepts had been derived by looking at all existing libraries and without any one implementation in mind. To state my question more directly: is compatibility with other graph libraries a goal of the Origin concepts? I suppose it needn't be, because it will surely be possible to adapt origin to BGLv1 and that could remain the lingua franca. Cheers Gordon

My understanding, from talking with Jeremy Siek back in the day (when I had a different competing graph library ;) was that the original graph concepts had been derived by looking at all existing libraries and without any one implementation in mind.
That's right, but the concepts aren't the adaptors, which is how I read your previous comment.
To state my question more directly: is compatibility with other graph libraries a goal of the Origin concepts?
I don't think it would be in our interest to exclude the possibility---I'm not even sure how you would define concepts to do so. In other words, you should be able to "easily" write an adaptor for LEDA, SGB, or whatever. We're just not planning to write them. I will say that our design does take a slightly more object-oriented approach (g.size(), for example), but that hardly precludes adaptation. Adaptors would have to be written as classes, but that shouldn't be a big deal.
I suppose it needn't be, because it will surely be possible to adapt origin to BGLv1 and that could remain the lingua franca.
That's not going to happen. Andrew

At Thu, 31 Mar 2011 08:53:02 -0500, Andrew Sutton wrote:
I think you underestimate the cost of adaptation-via-wrapper and urge you to stick with free functions unless you have a really good reason to do otherwise. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

It's a balancing act. Operations that we know are being used in generic libraries typically get free functions (out_edges() being the most important so far). Otherwise, the data structures are "self-contained". The approach more or less follows the std containers. Now that I've said that, I think I need to write vertices(g) and edges(g) :) Andrew

At Thu, 31 Mar 2011 09:43:52 -0500, Andrew Sutton wrote:
IMO it should be: "operations that are used in generic algorithms get free functions." -- Dave Abrahams BoostPro Computing http://www.boostpro.com

On Mar 31, 2011, at 3:45 PM, Andrew Sutton wrote:
It sounds to me like you are going to end up keeping the BGLv1 functional graph concepts for compatibility, and also providing STL-style OO concepts (I prefer this aesthetic too). I'll strive to be compatible with both, if possible. I don't see how the old graph concepts are confining in any way; if anything they are sometimes too loose, e.g. when a node object is self-sufficient and you don't need the graph to look up its edges. Not that I'm complaining; I understand why they are this way. On Mar 31, 2011, at 9:21 AM, Andrew Sutton <asutton.list@gmail.com> wrote:
+1 on all this - I'll be watching with rapt attention, and following your lead. Thanks, Gordon

It sounds to me like you are going to end up keeping the BGLv1 functional graph concepts for compatibility, and also providing STL-style OO concepts (I prefer this aesthetic too). I'll strive to be compatible with both, if possible.
There are going to be some direct analogs, but they won't be the "same". Different names, (hopefully) clarified semantics, etc. For example, vertex_descriptor and edge_descriptor simply become vertex and edge and are required to be convertible to bool (so you can test to see if they're valid or not). The edge function is replaced by get_edge, with adjacency matrices (eventually) supporting m(u, v). AdjacencyGraph is probably going to go away since adjacency is just an abstraction over incidence. Oh... we don't support "edgeless" graphs, but the BGL doesn't really do that either.
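The "convertible to bool" idea above can be sketched as an index-based handle that tests whether it refers to a valid vertex, replacing BGLv1's null_vertex() convention. The class and its `npos` sentinel are hypothetical, not Origin's actual design.

```cpp
#include <cassert>
#include <cstddef>

// A vertex handle that knows whether it is valid. An invalid handle
// converts to false, so lookups can be tested directly.
class vertex {
    std::size_t index_;
    static const std::size_t npos = static_cast<std::size_t>(-1);
public:
    vertex() : index_(npos) {}                    // invalid handle
    explicit vertex(std::size_t i) : index_(i) {} // handle to vertex i
    explicit operator bool() const { return index_ != npos; }
    std::size_t index() const { return index_; }
};
```

A hypothetical `get_edge(g, u, v)` could then return an invalid edge handle instead of a separate "not found" flag, and callers would write `if (auto e = get_edge(g, u, v)) ...`.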
I don't see how the old graph concepts are confining in any way, if anything they are sometimes too loose, e.g. when a node object is self-sufficient and you don't need the graph to lookup its edges. Not that I'm complaining; I understand why they are this way.
They're confining in the sense that trying to model them would actually constrain the design to matching those semantics. We don't want to follow suit. We want to reevaluate and redesign. On a side note, we actually went through a design iteration where we considered making nodes (and edges) self-sufficient. It ends up requiring avoidable overhead for vertex and edge objects in the general case. We didn't want to mandate the additional memory costs, so we're stuck with the same kinds of interface. You need the graph for all of your interesting calls. You can wrap a vertex (or edge)/graph pair to create reliably self-sufficient abstractions. We haven't really explored this design avenue, though.
+1 on all this - I'll be watching with rapt attention, and following your lead.
Maybe we'll have a user in the future? :) Andrew

Thanks for the overview of changes, this is helpful and reassuring. On Mar 31, 2011, at 6:23 PM, Andrew Sutton wrote:
Sounds reasonable.
The edge function is replaced by get_edge, with adjacency matrices (eventually) supporting m(u, v).
Not sure what you mean here - just a rename?
AdjacencyGraph is probably going to go away since adjacency is just an abstraction over incidence.
I still have to look up the difference between adjacency list and incidence list, every time; not sure why those always sound like the same thing. But you're talking about AdjacencyGraph - yeah, I never saw the point of iterating over adjacent vertices instead of the edges that lead there.
Oh... we don't support "edgeless" graphs, but the BGL doesn't really do that either.
You mean ForwardSequence of nodes? :-D
I still don't see it, since the BGL concepts supported all known graph interfaces.. but I'll wait and see what you come up with.
Yes, my LGraph library has those annoying graph_ pointers per node/edge. I only support the dynamic case, so it's not a big deal for me.
+1 on all this - I'll be watching with rapt attention, and following your lead.
Maybe we'll have a user in the future? :)
I doubt that. The problem is still the same as a decade ago: BGL doesn't really support higher-order graphs. BGL's sort of halfway-there subgraph support was an attempt to bring me and Dynagraph into the fold. Now it seems that our designs are diverging even further. Metagraph attempts to tackle the general higher-order graph problem, and that definitely requires generative programming. And BGL is moving toward straight templated classes, it sounds like. I'd argue it's a good thing to have some friendly competition, like Polygon vs Geometry, Statechart vs MSM, Regex vs Xpressive. It'll keep us both honest and hopefully lead to an improved design for both libraries. Cheers, Gordon

More or less. It also allows us to syntactically differentiate adjacency matrices from other graph models, which means you can write a type trait for it. is_adjacency_matrix is (will be) true when m(u, v) is a valid expression for the types of m, u, and v. This further means that we can use enable_if to select algorithms based on that property.
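The trait described above can be sketched with C++11 expression SFINAE: `is_adjacency_matrix` is the name from the post, but the detection machinery and the two sample graph types below are assumptions for illustration.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// True exactly when m(u, v) is a valid expression for the types of
// m, u, and v -- i.e., the graph exposes adjacency-matrix syntax.
template <typename M, typename V, typename = void>
struct is_adjacency_matrix : std::false_type {};

template <typename M, typename V>
struct is_adjacency_matrix<M, V,
    decltype(void(std::declval<const M&>()(std::declval<V>(),
                                           std::declval<V>())))>
    : std::true_type {};

// Hypothetical sample types.
struct dense_graph {
    bool operator()(int, int) const { return false; } // edge test m(u, v)
};
struct list_graph {}; // no operator(), so the trait is false
```

An algorithm could then use `std::enable_if<is_adjacency_matrix<G, V>::value>` to select a matrix-specific overload, which is the dispatch mechanism the paragraph above alludes to.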
An IncidenceGraph gives you out_edges(g). An AdjacencyGraph gives you adjacent_vertices(g). This function is almost universally (within the BGL) implemented using an iterator adaptor over an out edge iterator. Our view is that adjacency is exactly that: an adapted view of incidence. You're right, though. I think there are close to 0 algorithms that actually use adjacent_vertices.
Yup :) I thought about what it would take to integrate this kind of graph into the library, but punted on it :) It's kind of a weird concept when you're used to thinking about G = (V, E).
I doubt that. The problem is still the same as a decade ago: BGL doesn't really support higher-order graphs. BGL's sort of halfway-there subgraph support was an attempt to bring me and Dynagraph into the fold.
Now it seems that our designs are diverging even further. Metagraph attempts to tackle the general higher-order graph problem, and that definitely requires generative programming. And BGL is moving toward straight templated classes, it sounds like.
Fair enough :) I actually don't know much about higher-order graphs or their applications, so I'll have to take a good look at the metagraph library. When we eventually get around to defining the notion of subgraphs, I may ping you for some design criteria; I'm not at all sure what motivated the BGL solution. That won't be for quite a while.
I'd argue it's a good thing to have some friendly competition, like Polygon vs Geometry, Statechart vs MSM, Regex vs Xpressive. It'll keep us both honest and hopefully lead to an improved design for both libraries.
I like to phrase this argument as "software biodiversity" :) On a side note, I tried to encourage this perspective with this year's GSoC effort. I'm not sure it worked as well as I would have liked. There seems to be a "there can be only one" mentality, and I think that hurts both our ability to attract and retain student developers and generate meaningful data points to drive design discussions. Andrew

On Fri, 1 Apr 2011, Andrew Sutton wrote:
I will make one correction on that -- the compressed_sparse_row_graph has a custom implementation of adjacent_vertices() that is much simpler than the adaptor approach. I believe grid_graph might have something similar. -- Jeremiah Willcock

On Mar 31, 2011, at 10:20 PM, Mostafa <mostafa_working_away@yahoo.com> wrote:
Can you elaborate?
Thanks Mostafa. I have the same Q. Back in the day Jeremy told me he thought wrapper classes could be as efficient as function wrappers, but he couldn't convince others, thus BGL is functional. But I do remember seeing some convincing arguments against wrappers since then. Also, do concept maps (RIP) fix this? I seem to remember some blurring between OO and functional there. Thanks, Gordon

I would be very surprised if wrapper classes imposed any additional overhead (unless they're using virtual functions). I think Dave's argument comes from a perspective of supporting broader applicability. If a generic algorithm requires x.foo(), then you have a very strong requirement on the type of x, and you're forced to use wrapper classes for adaptation. If you write foo(x), then you just require some overload of foo for x. It gives you greater freedom to adapt the implementation to your type. Origin has a tendency to use a more OO style for defining individual data structures, but uses the functional style in generic contexts. The concepts will be defined in terms of the free functions and not the data structures themselves. We'd like to support the idea that writing g.size() should be the same as size(g). This lets users use the OO style when they know the type being operated on, or the functional style when they only know the concept it models.
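The "g.size() should be the same as size(g)" idea can be sketched as a free function that forwards to the member, so generic code only ever requires the functional form. Both `tiny_graph` and the unconstrained `size` template are purely illustrative, not Origin code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Free function forwarding to the member: data structures stay
// OO-style, while concepts and algorithms require only size(g).
template <typename Graph>
auto size(const Graph& g) -> decltype(g.size()) {
    return g.size();
}

// Hypothetical graph type with an OO-style interface.
struct tiny_graph {
    std::vector<int> verts;
    tiny_graph() : verts(3) {}
    std::size_t size() const { return verts.size(); }
};
```

A generic algorithm written against `size(g)` then works both for such classes and, via further overloads, for third-party graph types that have no `size()` member at all.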
Also, do concept maps (RIP) fix this? I seem to remember some blurring between OO and functional there.
It would be interesting to see what people would do with concept maps. At the very least it adds another choice for adaptation. Free functions and traits, wrapper classes, and now concept maps (assuming they show up again). Andrew

At Thu, 31 Mar 2011 19:20:47 -0700, Mostafa wrote:
Briefly: if you have to wrap things to get them to conform to a concept, then they are no longer the original thing anymore. For example:

    T x, y;
    Wrapper w(y);
    swap(x, w);   // <=== compilation fails

A bigger issue is the question of whether the wrapper has value semantics or not: do you pay to copy y into w, or do you merely refer to y from w? The former can be costly, but the latter can be wrong. I believe other cases of this general problem will crop up repeatedly in real code. -- Dave Abrahams BoostPro Computing http://www.boostpro.com
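The value-versus-reference question Dave raises can be illustrated with two toy wrappers (both hypothetical): one copies the wrapped object and pays for the copy but sees a snapshot, the other merely refers to it and silently observes later mutation.

```cpp
#include <cassert>

// Wrapper with value semantics: pays to copy, sees a snapshot.
struct value_wrapper {
    int v;
    explicit value_wrapper(int x) : v(x) {}
};

// Wrapper with reference semantics: cheap to build, but it tracks
// mutation of the original -- "wrong" if value semantics is expected.
struct ref_wrapper {
    int& v;
    explicit ref_wrapper(int& x) : v(x) {}
};
```

Mutating the wrapped object after construction shows the divergence: the value wrapper keeps the old value, the reference wrapper reflects the new one.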

At Thu, 31 Mar 2011 09:01:01 -0400, Michael Lopez wrote:
Good going. That's one of the hardest lessons to learn in Generic Programming, but also one of the most important. -- Dave Abrahams BoostPro Computing http://www.boostpro.com
participants (7)
- Alex Hagen-Zanker
- Andrew Sutton
- Dave Abrahams
- Gordon Woodhull
- Jeremiah Willcock
- Michael Lopez
- Mostafa