Re: [boost] Geometry and spatial indexes, my opinion

8 Oct 2008


      Hi Patrick,

:)
--------------------------------------------------
From: "Patrick Mihelich" <patrick.mihelich@gmail.com>
Sent: Wednesday, October 08, 2008 3:37 AM
To: <boost@lists.boost.org>
Subject: Re: [boost] Geometry and spatial indexes, my opinion
...
Hi Brandon,
Hi Mathias,
...
I just wanted to say something here about dimensional agnosticism. I 
don't
think this is necessarily the way to go. There are a couple of reasons 
that
I say this. The more trivial point is that higher than 3D geometry isn't
likely to be supported and isn't of much practical use anyway.
Strongly disagree! Spatial index structures are very commonly used to
accelerate nearest neighbor search, an important operation in numerous
fields. There is a nice listing at
http://en.wikipedia.org/wiki/Nearest_neighbor_search#Applications. The
search space in such applications may have hundreds of dimensions. In
computer vision we use nearest neighbor search for keypoint matching, 
which
in turn is fundamental to content-based image retrieval, visual odometry,
wide-baseline stereo, simultaneous localization and mapping, panorama
stitching, and the list goes on.... Computational geometry has many, many
applications beyond the obvious ones (e.g. collision detection), and
restricting yourself to 3 dimensions or fewer is rather crippling.
In my opinion, a Boost geometry library must have at least basic support 
for
n-dimensional computational geometry. Let's start with able to handle 
points
of arbitrary dimension and calculate distances between them (given some
metric). This is easy to do and already sufficient to implement many 
useful
spatial indices. I have not looked at Barend's geometry library in detail,
but on the surface it looks generic enough to support this.
Ok, I take your point. I wasn't suggesting that a library not be able to 
handle more than 3 dimensions. Rather I would suggest that the population of 
developers who have such requirements must be very small. Considering that 
generalization on dimension is not trivial for many algorithms, this leads 
me to the conclusion that the specific algorithms will decide the 
requirements on their inputs. This is why I chose a traits based interface 
for the geometry library I've been implementing. There's nothing to stop one 
from creating a multi-dimensional Cartesian/polar/parametric access traits 
interface for any algorithm which requires it. So I suppose I agree, the 
geometry library should not constrain on the number of dimensions. But I 
don't agree with full agnosticism (to mean agnosticism of coordinate frame 
as well.) There should be a way to differentiate points further to align 
with the concrete traits of the dimension they describe.
...
The second point is on the basis of coordinate frameworks. While I agree
...
that any library shouldn't necessarily be limited to 2D, I don't think 
that
you should lose the notions of what type of coordinate frame you are 
working
in. So while you may be able to have 0, 1, or 2 to describe agnostically 
the
dimension in the framework you want, the descriptions of what those
dimensions are is lost. For example, you may have an algorithm which is
designed to work in Cartesian coordinates in 2D (or 3D). You may have
another algorithm which requires points in Polar (Spherical) coordinates.
Another might want cylindrical coordinates to be optimal. Then there are
algorithms which require translation of points into parametric 
coordinates
to perform some logic optimally. If you have points which are agnostic to
dimension, how can you distinguish 0,1,2 from x,y,z or r, theta, phi?
Surely this can be handled by the type system? Simply have distinct 
Point3D
(Cartesian), SphericalPoint3D and CylindricalPoint3D classes with
appropriate functionality. In the vision library I used to work with, 
there
was a similar problem of supporting images with different numbers of
channels and different color representations (RGB, HSV, XYZ...). We
implemented them as distinct pixel types (IIRC Boost.GIL has more or less
the same architecture), and even gave them conversion operators to each
other so that we could use them interchangeably with correct behavior.
One certainly could if they chose to take the route of providing all these 
point classes.
...
...
The constraints imposed by these concepts are rather that when given a
point of some dimensionality you need to be able to determine the type of
coordinate frame that the point is describing. I've been working on a
generative geometry library which includes support for these distinctions
via conceptually enforced interfaces and traits.
So it sounds like you are already doing something of the sort?
Yes, I have taken a slightly different route than the other libs I have 
found in the vault. In my library the algorithms require inputs conform to a 
specific access traits concept. I have been using get_x, get_y, get_z as the 
general interface for cartesian (and would have used get_r, get_theta, 
get_phi for the others.. if I had gotten to that), but I think now that I'm 
fairly well persuaded to investigate a compile time indexed mechanism like 
boost::tuple. Has anyone done a performance test on using such a mechanism 
in practical settings? While I'm sure it's constant, how big is that 
constant in practice? Is it worth paying in cases where an algorithm 
requires a constrained dimensionality? I'm guessing nobody has really 
addressed these questions... so perhaps I'll take a gander and see if it's 
worth providing both interfaces in my access traits.
...
Back to Mathias' criticisms - although I realize it is an alpha, I was
similarly a bit disappointed when looking at the Spatial Indexes library. 
It
is good code (I hope that Federico will not be discouraged by criticism!),
but it is very limited in what it can do currently. It has no nearest
neighbor search, nor even a traversal interface that would allow the user 
to
implement it himself. It only provides 2D structures when it is very 
useful
to support spaces of arbitrary dimension. I would like for the library to 
be
generic in the dimension, not just the (2D) point type. Perhaps a good 
place
to start is the quadtree - I do not see why the difference between a
quadtree and an octree should be more than an int template parameter.
I'm admittedly opening up a bit of a can of worms here - nearest neighbor
search in high dimensions is a Hard Problem. However, there are
straightforward approximate nearest neighbor algorithms that are efficient
and plenty good enough for many applications.
I can't say that I have seriously considered this. Just out of curiosity, 
how do kd-trees scale memory-wise on average in high dimension?
...
Let me summarize with a wish list:
 * tree traversal interface
 * generic in the dimension for any structure for which this makes sense
 * (k-)nearest neighbor and other standard query methods, as mentioned by
Mathias
 * kd-tree implementation
 * metric tree implementation (splits need not be axis-aligned)
 * one or more good bulk-loading algorithms for each structure (repeated
insertion is not acceptable)
 * approximate (k-)nearest neighbor search
I could add some more items, but this is already a daunting amount of 
work.
So perhaps Federico and Barend can clarify, what are their plans for
development of the Spatial Index library? Do they include any of the items
above? It seems to me that Federico's work is quite a good start, but 
there
is a great deal of room for future development.
Cheers,
Patrick
Cheers,

Brandon
...
_______________________________________________
Unsubscribe & other changes: 
http://lists.boost.org/mailman/listinfo.cgi/boost

Re: [boost] Geometry and spatial indexes, my opinion

Brandon Kohn