Nick Edmonds wrote:
On Dec 10, 2008, at 2:09 PM, Geoff Hilton wrote:
Andrew Sutton wrote:
Aha! I had a feeling it was being wasted (in my code). Indeed it doesn't automatically assign indices (neither mine nor that of the BGL), it always remains zero throughout execution, this seems like a rather massive waste of memory to me. Since many BGL algorithms use the index properties I do believe they and their relationship with container selectors (since you say they also provide default index properties) merit some in depth explanation in the documentation as there currently is none whatsoever aside from their brief mention of existence and a paragraph on "what selector to choose to use the desired container". Aside from being indices of some sort I had no real idea of how to best (appropriately) use them since they were taking up memory, I ended up conceding them as a necessary evil/requirement of BGL algorithms despite not touching them in my own code except near algorithm calls (I'm still trying to wrap my head around graph theory). I think you're right about the issue only being half solved, do you know if there's a better way of solving the issue? Might there be a better (read: less dependency-inducing) way to provide default arguments, or do the index properties exist to supply the defaults? Failing that (and this is only theoretical in my current case since I can use defaults because of vecS) how might I use the index property in a more productive/less wasteful manner? [*]Don't properties already have an index by virtue of belonging to vertices and edges which are stored in containers? What about creating a property that explicitly exists only to provide default arguments (and is documented as such along with their relation to container selectors)? This property could also have a default state of cours e (which would be necessary anyway to remain compatible with older code). I hope the above didn't sound like I was bashing the documentation or whatnot, I am overall very grateful to have it. :) [*] Heh, after rereading this bit I think I answered that question myself. The BGL needs to know how to access the selected container in its' implementation as much as we need to access it through its' interface, thus the index property isn't necessarily always an int, it might be a Node* (with listS for example) right? Please feel free to confirm/correct me if and wherever I'm right/wrong. Thanks for the quick reply Andrew! In a word, "yes". To all of it :) It takes a good month or so of work with the BGL to really understand property maps, interior properties, exterior properties, and bundled properties, and then another month on top of that to understand the intricacies of their interaction with different instantiations of graph classes.
Great! I'm happy to know I was right! The BGL does indeed take some getting used to.
I'm not sure if the documentation contains a statement like, "A property map abstracts the ability to read and write data associated with an edge or vertex", but it probably should. The use of the _t types is just a mechanism of "naming" a property map with a type - and of course getting that property map off of the graph. The documentation is not entirely forthcoming about all of this. And since *every* example in the distro uses vecS, vecS, its hard to tell how things work.
Who do we have to bug to get improvements in the documentation and examples? :)
The memory isn't really being wasted if you're using it. In most cases, you're going to have to provide an index map anyways.
Out of curiosity (keeping in mind I'm still wrapping my head around graph theory), algorithmically speaking; why would I need to provide an index map if iterating over vertices and in/out edges with iterators is good enough? Is it a question of base algorithm or of implementation optimizations (switching out iterators and distances with container-specific pointer or index arithmetic as an oversimplified example)?
Maintaining the indices can save you a significant amount of time if you have to keep creating new maps (and aren't removing vertices).
How so? Could you provide a brief example? I ask because iteration doesn't seem to require explicit index maps to be defined (with the default set of provided container selectors at any rate). Even then when I had the index maps defined explicitly they were never modified (all were always zero) but everything functioned normally anyway, so any code that uses index maps (that I've seen thus far) seems to use them for their side effects (ie. the fact that they point to or are contained by somewhere specific), not for their contained value.
You might want to look here: http://svn.boost.org/svn/boost/sandbox/SOC/2007/graphs. It has some abstactions that make properties a little easier to work with. It also has an undirected_ and directed_graph adjacency list implementations that only work with bundled properties, and hide some of the weirdness of using vecS, listS, etc.
If you're really feeling bold, you can look at the 2008 version - which I'm still semi-actively working on - that will essentially be a complete re-implementation.
Andrew Sutton andrew.n.sutton@gmail.com
Awesome! I'll definitely be checking that out, is the plan for it to eventually replace the current BGL implementation?
Also, I recently came across the PBGL project; I haven't tested it, but is it not ready for use in a commercial application yet? My code will definitely need to be parallelized.
Hi Geoff,
As a developer on the PBGL project I figured I'd chime in here. I'll apologize in advance for the vagueness of my answer.
Whether the PBGL is 'ready for primetime' depends on what you want to do. The core data structures and algorithms work well and scale in the environment that I use them in most commonly, clusters of workstations running Linux or OS X (I've had some success with Solaris machines as well but they're a pain to work with) and communicating using OpenMPI with reasonably high speed interconnects (i.e. Infiniband/Myrinet/Quadrics). We aggressively target distributed memory applications and rarely provide non-distributed memory optimizations (as they normally require completely different algorithms and data structures). FYI, Running on GigE is painful, I doubt 10GigE is much better, but I don't have any machines that have 10GigE networks so I can't verify that claim.
Currently we have minimal support for node-level parallelism but that's one of our active development efforts. I'm working on generalizing the library to run on clusters of highly-multithreaded machines, this should show some benefit for commodity multi-core processors as well.
There's also a lot of experimental features/algorithms/optimizations under active development and if you want to use some of them you'll definitely end up doing some digging in the code and probably writing some yourself.
Before you start working with the Parallel BGL I would encourage you to be very familiar with the (sequential) BGL and make sure your graphs are big enough to justify the work you'll put into writing PBGL code. As a general rule of thumb if your graphs can fit in memory on a single machine you might want to consider whether you can achieve your goals using an embarassingly parallel approach rather than utilizing distributed graphs.
If you've got a problem that's too big to solve in core on a single machine and you have a good grasp of the (sequential) BGL and distributed memory programming (and a bit of knowledge about MPI) I would encourage you to try the Parallel BGL, we're always happy to have new users.
If you'd like to chat more about your applications and what it would take to get them running on the PBGL let me know. I'll be putting together a release in the near future (Q1) and if you've got any requests I might be able to squeeze them in.
Thanks, Nick Edmonds
Thanks, Geoff
Hi Nick, Happy new year! We've been working with the BGL so far to get the primary algorithm down, and have the intent of parallelizing it afterwards. May I e-mail you directly? Thanks, Geoff