On Thu, 16 May 2013, Anthony Foiani wrote:
Stephen Woodbridge
writes: 2. the user loads GIS data and the vender has assigned unique ids for the vertices. These tend to be arbitrary, non-sequential, and increase over time as the dataset evolves. So big numbers and not sequential.
We really have no idea which kind of data we are going to get. The number of edges in the graph varies based on the spatial extents the the start and end points define expanded a little to catch edge conditions.
I haven't used BGL very much, but when I did play around with user-assigned labels for the vertices, it was painful all the way around.
If I were to do anything with the BGL today, I would maintain my own map from "externally meaningful id" to sequential integers (starting at 0 or 1).
That probably does make the most sense, especially if you can use a hash table to do that mapping.
My analysis was that the mapping incurs a O(n) or O(n log n) cost twice (once to forward map, once to reverse map). However, the actual graph ops are so much faster with a vector than with any other representation, and looking up vertices by id happens many more times than twice.
Maybe I don't understand the BGL. I would not be surprised. :)
I did find it telling that the majority of the examples in the manual use small integers.
Your analysis is right -- the examples (and most of the work that I do with BGL myself) use small integers so they can be used as indexes into arrays. Many BGL algorithms won't work by default on graphs that don't have that kind of mapping available, even if the vertex descriptors themselves aren't small integers. -- Jeremiah Willcock