Re: Re: [BGL] forced client complexity

Yea, but first the original poster should indicate if this is what he's looking for -- I never needed this myself.
What I was looking for, really, was dialog on existing design strategies. It sounds like your issues with property maps mirror my confusion with the library design, though. While I can fully appreciate the desire to write generic code, it feels like BGL has pushed the complexity of creating and enforcing coupling between the generic graph/node/visitor/algorithm/etc onto the client. At a basic conceptual level, I think of a graph as a data structure composed of nodes and edges. If I'm modeling a family tree, I expect the nodes to be objects of some family-member class. C++ exposes many robust and elegant methods for modeling the family-member - why should I, as a client, have to use property-maps at all? If the node's "index" is related to the object, why would it not be a natural member of the node-type's class? If, instead, it is a synthesized construct for communication between a graph and the algorithm operating over it, why should the client be responsible for its maintenance instead of the framework? What are the advantages of this approach? What are the disadvantages of simplifying things for the client? This is the type of design issue that I intended to question with my initial post. -prs __________________________________ Do you Yahoo!? Yahoo! Small Business $15K Web Design Giveaway http://promotions.yahoo.com/design_giveaway/

Provisional Dissonance wrote:
Yea, but first the original poster should indicate if
this is what he's
looking for -- I never needed this myself.
Hi, (btw, if possible, it would be nice to know your real name, so that I could write "Hi <your real name>")
At a basic conceptual level, I think of a graph as a data structure composed of nodes and edges. If I'm modeling a family tree, I expect the nodes to be objects of some family-member class. C++ exposes many robust and elegant methods for modeling the family-member - why should I, as a client, have to use property-maps at all?
Hmm, interesting question. In fact, something like typedef graph<City, Road> Road_map; looks quite reasonable to me.
If the node's "index" is related to the object, why would it not be a natural member of the node-type's class?
I don't think "index" should be type of 'City' class.
If, instead, it is a synthesized construct for communication between a graph and the algorithm operating over it, why should the client be responsible for its maintenance instead of the framework? What are the advantages of this approach? What are the disadvantages of simplifying things for the client? This is the type of design issue that I intended to question with my initial post.
I think the biggest problem is: how do you identify vertices? If you use "City&" you'd have some problems: 1. How vertices are really identified? Using operator==? Is it possible to insert two 'City' objects which are equal? If not, how to insure uniqueness? Do we really want 0(log n) lookup on each 'add_vertex'. 2. Even if 'City' objects in a graph are unique, how will add_edge be implemented? Do you need O(log n) lookup to find internal adjacency lists? 3. What's edge, and how it is identified? For example, 'Road' might not contain source/target vertices, but still we need to be able to find source/target vertices. One possible solution that I see is the make vertex_descriptor be always pointer type. It will point at some internal structure but also provide operator* which will return 'City'. Or, maybe, it should be possible to iterate over values of 'City', not only over values of vertex_descriptor. I still don't know if using City to identifiy vertices is feasible. - Volodya

On Friday 02 April 2004 07:35 am, Vladimir Prus wrote:
One possible solution that I see is the make vertex_descriptor be always pointer type. It will point at some internal structure but also provide operator* which will return 'City'. Or, maybe, it should be possible to iterate over values of 'City', not only over values of vertex_descriptor.
I still don't know if using City to identifiy vertices is feasible.
Why not make vertices derive from (or be aggregated with and convertable to) City? It has a whole ton of advantages: 1) The user can just think of nodes as if they were City objects, including easy access to City members. 2) Can trivially build a property map from a member pointer; wouldn't it be nice to say: dijsktra_shortest_paths(g, v, distance_map(&City::distance)) ? 3) graph<Vertex, Edge, Rep> can have Rep=adjacency_list<whatever> as the default representation, so we lose no genericity. Oh, how I wish I had free time :) Doug

Douglas Gregor wrote:
I still don't know if using City to identifiy vertices is feasible.
Why not make vertices derive from (or be aggregated with and convertable to) City? It has a whole ton of advantages:
1) The user can just think of nodes as if they were City objects, including easy access to City members. 2) Can trivially build a property map from a member pointer; wouldn't it be nice to say: dijsktra_shortest_paths(g, v, distance_map(&City::distance))
This would be nice, really!
3) graph<Vertex, Edge, Rep> can have Rep=adjacency_list<whatever> as the default representation, so we lose no genericity.
I see only one problem. If vertex== some class derived from City, then the vertex itself, as well as vector<vertex> might be very large and current BGL passes/stores vertices by value almost everywhere. The point about operator* is that it allows access to City (though less convenient), but does significally affect performance. - Volodya

On Monday 05 April 2004 02:01 am, Vladimir Prus wrote:
3) graph<Vertex, Edge, Rep> can have Rep=adjacency_list<whatever> as the default representation, so we lose no genericity.
I see only one problem. If vertex== some class derived from City, then the vertex itself, as well as vector<vertex> might be very large and current BGL passes/stores vertices by value almost everywhere.
The point about operator* is that it allows access to City (though less convenient), but does significally affect performance.
Right... we might still be able to "fake" enough to get the nice syntax working, the difference being that one would need to say "v->name" instead of "v.name". In 1.5 weeks, I would love to look into this :) Doug

Hi Guys, Here's some issues with this idea: Currently the vertex_descriptor for adjacency_list has no property-related type stuff in it, which allows adjacency_list_traits to work... that is, it is possible to determine the vertex_descriptor type without knowing the type of the vertex properties. This is important, for example, if you want an internal property containing a vertex descriptor (this is a circular dependency). vertex_descpriptor's are handles... passed by value. You would not want the City object copied, so the vertex descriptor would be a pointer to some object that inherits from City. However, dereferencing this pointer will add overhead to the execution time, especially in the case when a vertex_descriptor would have just been an int. Do you see solutions to these problems? Cheers, Jeremy On Apr 5, 2004, at 1:01 AM, Vladimir Prus wrote:
Douglas Gregor wrote:
I still don't know if using City to identifiy vertices is feasible.
Why not make vertices derive from (or be aggregated with and convertable to) City? It has a whole ton of advantages:
1) The user can just think of nodes as if they were City objects, including easy access to City members. 2) Can trivially build a property map from a member pointer; wouldn't it be nice to say: dijsktra_shortest_paths(g, v, distance_map(&City::distance))
This would be nice, really!
3) graph<Vertex, Edge, Rep> can have Rep=adjacency_list<whatever> as the default representation, so we lose no genericity.
I see only one problem. If vertex== some class derived from City, then the vertex itself, as well as vector<vertex> might be very large and current BGL passes/stores vertices by value almost everywhere.
The point about operator* is that it allows access to City (though less convenient), but does significally affect performance.
- Volodya
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
_______________________________________________ Jeremy Siek <jsiek@osl.iu.edu> http://www.osl.iu.edu/~jsiek Ph.D. Student, Indiana University Bloomington Graduating in August 2004 and looking for work C++ Booster (http://www.boost.org) _______________________________________________

Hi Jeremy, On Mon, 5 Apr 2004, Jeremy Siek wrote:
Hi Guys,
Here's some issues with this idea:
Currently the vertex_descriptor for adjacency_list has no property-related type stuff in it, which allows adjacency_list_traits to work... that is, it is possible to determine the vertex_descriptor type without knowing the type of the vertex properties. This is important, for example, if you want an internal property containing a vertex descriptor (this is a circular dependency).
If we ensure that naming an adjacency_list does not require instantiation of the City class, there is no problem, e.g., class City; class Highway; typedef digraph<City, Highway> Map; // or an adjacency_list; no matter class City { public: graph_traits<Map>::vertex_descriptor rival; }; class Highway { graph_traits<Map>::edge_descriptor other_direction; }; Since the descriptors will be pointers, I _think_ we can be careful not to instantiate City and Highway within graph_traits. Worst case, we end up using a scoped_ptr<adjacency_list<...> > so that instantiating digraph<...> does not instantiate adjacency_list.
vertex_descpriptor's are handles... passed by value. You would not want the City object copied, so the vertex descriptor would be a pointer to some object that inherits from City. However, dereferencing this pointer will add overhead to the execution time, especially in the case when a vertex_descriptor would have just been an int.
Do you see solutions to these problems?
Not this one. Do you really think this extra overhead is going to sink the idea? Doug

Hi Doug, On Apr 5, 2004, at 11:57 AM, Douglas Paul Gregor wrote:
Hi Jeremy,
On Mon, 5 Apr 2004, Jeremy Siek wrote:
Hi Guys,
Here's some issues with this idea:
Currently the vertex_descriptor for adjacency_list has no property-related type stuff in it, which allows adjacency_list_traits to work... that is, it is possible to determine the vertex_descriptor type without knowing the type of the vertex properties. This is important, for example, if you want an internal property containing a vertex descriptor (this is a circular dependency).
If we ensure that naming an adjacency_list does not require instantiation of the City class, there is no problem, e.g.,
class City; class Highway;
typedef digraph<City, Highway> Map; // or an adjacency_list; no matter
class City { public: graph_traits<Map>::vertex_descriptor rival; };
class Highway { graph_traits<Map>::edge_descriptor other_direction; };
Since the descriptors will be pointers, I _think_ we can be careful not to instantiate City and Highway within graph_traits. Worst case, we end up using a scoped_ptr<adjacency_list<...> > so that instantiating digraph<...> does not instantiate adjacency_list.
Okay, I can see how that might work.
vertex_descpriptor's are handles... passed by value. You would not want the City object copied, so the vertex descriptor would be a pointer to some object that inherits from City. However, dereferencing this pointer will add overhead to the execution time, especially in the case when a vertex_descriptor would have just been an int.
Do you see solutions to these problems?
Not this one. Do you really think this extra overhead is going to sink the idea?
Well, it's a pretty major overhead. We're talking an extra load from memory every time you want to do anything with a vertex descriptor. For the slow graph structures this is not significant, but this would change the fast graph structures into slow ones. Another idea I had was to make vertex_descriptor some kind of pointer proxy that defined operator-> to return City*. Cheers, Jeremy _______________________________________________ Jeremy Siek <jsiek@osl.iu.edu> http://www.osl.iu.edu/~jsiek Ph.D. Student, Indiana University Bloomington Graduating in August 2004 and looking for work C++ Booster (http://www.boost.org) _______________________________________________

Jeremy, On Mon, 5 Apr 2004, Jeremy Siek wrote:
On Apr 5, 2004, at 11:57 AM, Douglas Paul Gregor wrote:
Not this one. Do you really think this extra overhead is going to sink the idea?
Well, it's a pretty major overhead. We're talking an extra load from memory every time you want to do anything with a vertex descriptor. For the slow graph structures this is not significant, but this would change the fast graph structures into slow ones.
Another idea I had was to make vertex_descriptor some kind of pointer proxy that defined operator-> to return City*.
If you still want convertability to an (unsigned) int, you'll need to either store that in the class derived from City or store it directly, so we'll probably need a proxy in any case. Personally, I'd be willing to pay for the few extra memory ops (who knows? they may get optimized away) in most of my work. I think having more user-friendly graph adaptors graph<Vertex, Edge> and digraph<Vertex, Edge> would help the usability of the library; adjacency_list can be for those that need maximal performance (although I suspect that they'd roll their own graph data structures/algorithms anyway). Doug

Hi Doug, On Apr 5, 2004, at 1:50 PM, Douglas Paul Gregor wrote:
Another idea I had was to make vertex_descriptor some kind of pointer proxy that defined operator-> to return City*.
If you still want convertability to an (unsigned) int, you'll need to either store that in the class derived from City or store it directly, so we'll probably need a proxy in any case.
Personally, I'd be willing to pay for the few extra memory ops
I should have made my main point clearer: using a proxy with an int member would not necessarily require extra memory ops to get the index, as would using an actual pointer. Here's a sketch of the idea: template <class T> class vertex_descriptor { T* m_property; unsigned int m_index; public: vertex_descriptor(T* prop) : m_property(prop) { } T* operator->() const { return m_property; } unsigned int get_index() const { return m_index; } // or perhaps operator unsigned int() const { return m_index; } };
(who knows? they may get optimized away)
Probably not. Today's optimizing compilers very rarely touch pointers and heap operations. It's that whole aliasing problem.
in most of my work. I think having more user-friendly graph adaptors graph<Vertex, Edge> and digraph<Vertex, Edge> would help the usability of the library; adjacency_list can be for those that need maximal performance (although I suspect that they'd roll their own graph data structures/algorithms anyway).
I agree that if we can't find a way to make adjacency_list easier to use, we should provide another class type. However, it may be possible to make adjacency_list easier to use. Cheers, Jeremy _______________________________________________ Jeremy Siek <jsiek@osl.iu.edu> http://www.osl.iu.edu/~jsiek Ph.D. Student, Indiana University Bloomington Graduating in August 2004 and looking for work C++ Booster (http://www.boost.org) _______________________________________________

Ran into another difficulty for supporting stuff like v->color (through a pointer or a proxy). The current BGL design places the sole responsibility of constness on the property maps. Thus we don't have a const_vertex_descriptor, or const_out_edge_iterator, or const_vertex_iterator, etc. -Jeremy _______________________________________________ Jeremy Siek <jsiek@osl.iu.edu> http://www.osl.iu.edu/~jsiek Ph.D. Student, Indiana University Bloomington Graduating in August 2004 and looking for work C++ Booster (http://www.boost.org) _______________________________________________

On Monday 05 April 2004 11:11 pm, Jeremy Siek wrote:
Ran into another difficulty for supporting stuff like v->color (through a pointer or a proxy). The current BGL design places the sole responsibility of constness on the property maps. Thus we don't have a const_vertex_descriptor, or const_out_edge_iterator, or const_vertex_iterator, etc.
We discussed this offline a bit, and found that we could resolve this by creating a graph operation that maps from a graph and vertex/edge descriptor to the class that stores the properties of that descriptor. For instance: struct city { string name; }; struct highway { int speed_limit; double distance; }; typedef adjacency_list<listS, vecS, bidirectionalS, city, highway> graph; graph g; // put some nodes/edges into g graph_traits<graph>::vertex_descriptor v = vertices(g).first; g[v].name = "Foo"; graph_traits<graph>::edge_descriptor e = edges(g).first; g[e].speed_limit = 65; The vertex_descriptor/edge_descriptor types will need to be different, of course, but otherwise this is easy to do and the constness will get picked up from the graph itself. The second part of this is to be able to create a property map easily. Here's one potential syntax: member_property_map(&highway::distance, g); Or, if we're being cute, g ->* &highway::distance I personally like the overloaded ->* syntax :) Doug

Hi Jeremy,
vertex_descpriptor's are handles... passed by value. You would not want the City object copied, so the vertex descriptor would be a pointer to some object that inherits from City. However, dereferencing this pointer will add overhead to the execution time, especially in the case when a vertex_descriptor would have just been an int.
It's not always overhead. Say vertex_descriptor is int. When you want to traverse out edges you need to access element of the vector which stores adjacency structure and then iterate over it. If vertex_descriptor is pointer, you'd just directly get adjacency structure, and that's likely to be faster. In the case of accessing property map by index, we indeed get some overhead for dereferencing. But really, we've got only two variants: 1. Use pointers and bear this overhead 2. Store cached value of index inside vertex_descriptor, as you suggest in the other post. In this case we have space overhead. So, it seems we can't have nice-for-users 'v->value' or *v syntax without overhead. It's the question which kind of overhead is better. BTW, what about vertex_descriptor stability. If we use vector for storage, it means the addresses can change when we *add* vertex, which would invalidate already stored vertex descriptors. Does this mean we need to use list/deque for the 'easy-to-use' graph type? - Volodya

Hi Vladimir, On Apr 6, 2004, at 1:07 AM, Vladimir Prus wrote:
But really, we've got only two variants: 1. Use pointers and bear this overhead 2. Store cached value of index inside vertex_descriptor, as you suggest in the other post. In this case we have space overhead.
True, though vertex_descriptor's are not stored in memory very often... usually appear as local variables, so the space is just consuming more registers.
So, it seems we can't have nice-for-users 'v->value' or *v syntax without overhead. It's the question which kind of overhead is better.
BTW, what about vertex_descriptor stability. If we use vector for storage, it means the addresses can change when we *add* vertex, which would invalidate already stored vertex descriptors. Does this mean we need to use list/deque for the 'easy-to-use' graph type?
That's a good question. Invalidation is certainly not user friendly. BTW, there's another issue with the idea of automatically assigning indices when VertexList!=vecS. If we want to make sure the indices stay in the range [0,num_vertices(g)) in the face of adding and removing vertices, then there will be a price to pay in terms of renumbering vertices or recycling indices. Cheers, Jeremy _______________________________________________ Jeremy Siek <jsiek@osl.iu.edu> http://www.osl.iu.edu/~jsiek Ph.D. Student, Indiana University Bloomington Graduating in August 2004 and looking for work C++ Booster (http://www.boost.org) _______________________________________________

Hi Jeremy,
1. Use pointers and bear this overhead 2. Store cached value of index inside vertex_descriptor, as you suggest in the other post. In this case we have space overhead.
True, though vertex_descriptor's are not stored in memory very often... usually appear as local variables, so the space is just consuming more registers.
Or stack space, but anyway, local variables can be optimized just fine by compiler. I'm not sure how ofter vertex descriptor is stored, but sometimes it is. For example, the predecessor map in depth_first_search. Maybe, if we really go for convenience, we can create 'vertex' class which is not so efficient but user friendly, and still retain vertex_descriptor for performance critical parts. The vertex would keep vertex_index, provide access to the 'primary vertex data' (e.g. City), and also provide access to other properties. There also would be a method to get 'vertex', given vertex_descriptor. Two classes for the same concept might cause user confusion, but is there are way to get both efficiency and convenience?
So, it seems we can't have nice-for-users 'v->value' or *v syntax without overhead. It's the question which kind of overhead is better.
BTW, what about vertex_descriptor stability. If we use vector for storage, it means the addresses can change when we *add* vertex, which would invalidate already stored vertex descriptors. Does this mean we need to use list/deque for the 'easy-to-use' graph type?
That's a good question. Invalidation is certainly not user friendly.
Then, we either use listS for user-friend graph, or use the vertex type which stores vertex_descriptor internally.
BTW, there's another issue with the idea of automatically assigning indices when VertexList!=vecS. If we want to make sure the indices stay in the range [0,num_vertices(g)) in the face of adding and removing vertices, then there will be a price to pay in terms of renumbering vertices or recycling indices.
Yes, I think I've talked about this last time the auto-number issue was raised. But do we have a choice? Is it practical to except that user will do renumbering/recycling himself? Besides, the price for recycling is just extra vector<int>, which is not much compared with all the other bookkeeping. - Volodya

Provisional Dissonance <prosonance@yahoo.com> writes: In general, posting anonymously goes against the grain of Boost's professional culture. People like to know who they're working with. I won't insist, but if it's not infeasible I'd appreciate it if you could use your real name to post. Apologies if "Provisional Dissonance" is really your name... you never know. -- Dave Abrahams Boost Moderator

Hi, On Apr 2, 2004, at 6:48 AM, Provisional Dissonance wrote:
of the framework? What are the advantages of this approach? What are the disadvantages of simplifying things for the client? This is the type of design issue that I intended to question with my initial post.
So I think your question can be boiled down to two questions, why does the BGL use vertex descriptors? and why does the BGL use property maps? The BGL graph interface was designed so that many different graph data structures could be uniformly manipulated. It sounds like the data structure that you are most familiar with is where each node is an object, and the connecting structure of the graph is represented with pointers. However, another important graph data structure is the adjacency list representation, where there are no node objects. Instead there is just an array of lists of integers. An integer n is associated with each node, and the out-edge list for a node is at offset n in the array. The vertex_descriptor abstraction provides an opaque handle to a "vertex", whatever that may be for the data structure. For one graph type that may be node*, for another int. The purpose of the property map interface is to provide a uniform way for an algorithm to access data that is associated with a node or edge. Again, there are many ways that such an association can be implemented, with various time, space, and convenience tradeoffs. Here are some examples: 1. Associated data is a member of a struct 2. Data is in an array, organized by vertex index offset 3. Data is in a hash table, somehow keyed on the vertex. Now, the problem you're pointing out is that the extra level of indirection of the property map makes the syntax rather unwieldy for accessing data. For example, instead of: v->color one must write: get(color_map, v) (not to mention the trouble of setting up the color_map to begin with) Perhaps with some cleverness, we can get closer to the syntax of v->color, while retaining the flexibility an uniformity provided by property maps? I'll put my thinking cap on. Cheers, Jeremy _______________________________________________ Jeremy Siek <jsiek@osl.iu.edu> http://www.osl.iu.edu/~jsiek Ph.D. Student, Indiana University Bloomington Graduating in August 2004 and looking for work C++ Booster (http://www.boost.org) _______________________________________________
participants (6)
-
David Abrahams
-
Douglas Gregor
-
Douglas Paul Gregor
-
Jeremy Siek
-
Provisional Dissonance
-
Vladimir Prus