Re: [Boost-users] [Graph] Problem with large vertex ids

16 May 2013


      On Thu, 16 May 2013, Anthony Foiani wrote:
...
Stephen Woodbridge <woodbri@swoodbridge.com> writes:
...
2. the user loads GIS data and the vender has assigned unique ids for
the vertices. These tend to be arbitrary, non-sequential, and increase
over time as the dataset evolves. So big numbers and not sequential.
We really have no idea which kind of data we are going to get. The
number of edges in the graph varies based on the spatial extents the
the start and end points define expanded a little to catch edge
conditions.
I haven't used BGL very much, but when I did play around with
user-assigned labels for the vertices, it was painful all the way
around.
If I were to do anything with the BGL today, I would maintain my own
map from "externally meaningful id" to sequential integers (starting
at 0 or 1).
That probably does make the most sense, especially if you can use a hash 
table to do that mapping.
...
My analysis was that the mapping incurs a O(n) or O(n log n) cost
twice (once to forward map, once to reverse map).  However, the actual
graph ops are so much faster with a vector than with any other
representation, and looking up vertices by id happens many more times
than twice.
Maybe I don't understand the BGL.  I would not be surprised.  :)
I did find it telling that the majority of the examples in the manual
use small integers.
Your analysis is right -- the examples (and most of the work that I do 
with BGL myself) use small integers so they can be used as indexes into 
arrays.  Many BGL algorithms won't work by default on graphs that don't 
have that kind of mapping available, even if the vertex descriptors 
themselves aren't small integers.

-- Jeremiah Willcock