Preview 3 of the Geometry Library

Dear list,

Herewith the URL to the third preview of our Geometry Library, including documentation and examples. The first two previews were sent to this list in January and in March. Since then, Bruno Lalande, already a Boost contributor, has joined, and together we reworked the library and its concepts. Lines (sequences of points) are now generic, handled using iterator pairs. Concepts have been worked out and integrated everywhere in the code. A Google Summer of Code project on spatial indexing is based on the geometries of this library.

Many discussions and comments were related to the concepts and, more specifically, to the Point concept. This is the most basic element of a geometry library and has to be as generic as possible, and we think the library now implements it that way. The library can handle all point types: legacy point types, point types provided by the library, user-defined point types, Cartesian points, native arrays, and points with spherical coordinates. The library provides point_xy for Cartesian systems and point_ll for lat-long systems, but the library user is not confined to those types. Using a strategy mechanism, the library should be able to handle every point type, and lines made of those types. The distance and simplify algorithms clearly show this design.

The library is not complete; it is still a preview. It is used in real-life projects but will still be enhanced and extended. The point type is now generic, and lines (sequences of points) are generic. However, boxes, circles and polygons still have to be reviewed, and their concepts can be worked out further. Some algorithms will be reworked to use strategies and handle any point type.

The URL with documentation is here: http://geometrylibrary.geodan.nl. The library is Open Source and uses the Boost License. However, being a preview and not yet a real submission, the SVN is not at Boost but still hosted at Geodan. If you are interested you can get SVN access.
Since March, two other geometry library references were sent to the list. This really shows the need for a Geometry Library in the Boost collection. We welcome any discussion, comment or contribution, and we are open to any type of cooperation, merge or join. Barend Gehrels

Barend Gehrels wrote:
Dear list,
Herewith the URL to the third preview of our Geometry Library, including documentation and examples.
Here are a few comments from a short read, then.

Let's start with naming:
- The names of the geometric objects aren't dimension agnostic, which might or might not be a problem. A circle is a circle in 2D, but a sphere in 3D. A line, however, is always a line. Same for segment and polygon (and all yet-to-be-implemented objects which are polygons). I think naming should try to differentiate objects that stay what they are when dimension varies from those that evolve with that dimension. Either way, the naming convention should be clearly explained in the documentation.
- Also, while most names are 2D-centric, "box" is 3D-centric. This seems fairly inconsistent. Or maybe it's just me and box is actually dimension agnostic.
- Still about "box": it is really an axis-aligned (whatever that means in non-Cartesian geometries) hyperrectangle. Since "box" cannot contain an arbitrary box, maybe it should be called minmaxbox or aabox or whatever.
- envelope calculates the minimum bounding axis-aligned hyperrectangle for an object. It might also be desirable to have a way to calculate the minimum bounding hyperrectangle (not necessarily axis-aligned) or the minimum bounding sphere. What names would those algorithms have if envelope is reserved for AABHR?

Objects:
- Why can objects be default-constructible?
- Modeling sequences of polygons as iterator pairs seems an annoyance to me; ranges are much prettier to handle, since one variable is then sufficient to represent one object (while two are required for iterators).

Algorithms:
- Algorithms should be overloaded to handle all geometric objects. I assume the reason it wasn't done is simply lack of time to do so. Also, within/intersection/etc. are not symmetric (in some cases there is an overload for alg(a, b) but not alg(b, a)).
- Why isn't envelope returning its result rather than taking a reference?
- Why are intersection_segment arguments template parameters instead of segment, why is it not simply an intersection overload, and why does it return something different than intersection? Maybe the design of the generic intersection should be extended so that it can work with segments, rather than segments being made a special case. In the case of segments, it generates either nothing, a segment or a point. A segment can represent a point, so the output might as well be a segment or nothing. Then the result can be checked to see whether it's a point or not. So I don't see why it needs special treatment.
- An overlap algorithm, which would be equivalent to testing not_empty(intersection) || within, would probably be interesting, especially since it is very useful for collision detection or spatial indexing. It can obviously be much faster than performing intersection and within calls.
- centroid says it throws an exception if the polygon does not contain any point. But how can a polygon not contain any points!? The constructor of a polygon should force it to have at least one point.

Mathias wrote:
- Why can objects be default-constructible? ... - centroid says it throws an exception if the polygon does not contain any point. But how can a polygon not contain any points!? The constructor of a polygon should force it to have at least one point.

Obviously, a default constructed polygon contains no points. Polygons have to be default constructible so that they can be elements of a std container. Objects that are not default constructible are a pain. I have to deal with a legacy polygon type that is not default constructible, so I have a very clear idea about which way I prefer and why. Luke

On Tue, Oct 14, 2008 at 8:20 PM, Simonson, Lucanus J <lucanus.j.simonson@intel.com> wrote:
Mathias wrote:
- Why can objects be default-constructible? ... - centroid says it throws an exception if the polygon does not contain any point. But how can a polygon not contain any points!? The constructor of a polygon should force it to have at least one point.
Obviously, a default constructed polygon contains no points. Polygons have to be default constructible so that they can be elements of a std container. Objects that are not default constructible are a pain. I have to deal with a legacy polygon type that is not default constructible, so I have a very clear idea about which way I prefer and why.
Standard library containers don't need DefaultConstructible, only CopyConstructible and sometimes Assignable. Could the algorithms be written in such a way that they didn't care whether the polygon type was DefaultConstructible? Is there a different motivating reason to allow empty polygons than dealing with a legacy class? --Michael Fawcett

On Tue, Oct 14, 2008 at 9:10 PM, Michael Fawcett <michael.fawcett@gmail.com> wrote:
Standard library containers don't need DefaultConstructible, only CopyConstructible and sometimes Assignable.
Sorry, I believe I was mistaken. The containers all require CopyConstructible *and* Assignable. --Michael Fawcett

Simonson, Lucanus J wrote:
Mathias wrote:
- Why can objects be default-constructible? ... - centroid says it throws an exception if the polygon does not contain any point. But how can a polygon not contain any points!? The constructor of a polygon should force it to have at least one point.
Obviously, a default constructed polygon contains no points.
Yes, those two comments were indeed linked, to show they were maybe inconsistent. Also, if there is a valid reason for polygons to be able to be empty, I think the algorithm shouldn't make the check but assume it as a precondition, possibly with a deactivatable assert.
Polygons have to be default constructible so that they can be elements of a std container.
Thankfully, standard containers do not require objects to be default constructible, but only copy constructible. I'd personally be even happier if they required almost nothing at all, which hopefully they will.
Objects that are not default constructible are a pain.
Says who? Two-phase initialization is well known to be a pain. What's the point of having an object constructed that is not really initialized and usable? Constructors are made to enforce invariants. For some objects, it doesn't make sense to allow "emptiness" or "default" as a valid state, so we might as well not allow those states to exist. But anyway, that's another debate. I think it was discussed extensively lately in comp.lang.c++.moderated, so there is no need to discuss it here.
I have to deal with a legacy polygon type that is not default constructible, so I have a very clear idea about which way I prefer and why.
Dealing with non default-constructible types is no more difficult than dealing with default-constructible ones. If you have a very clear case where, for polygons or something related to geometry, it can lead to issues, feel free to expose it.

Hi, Here are the few answers I can give right now. I'll let Barend complete or correct me, and answer the questions for which he has a better view of the library than me.
- The names of the geometric objects aren't dimension agnostic, which might or might not be a problem. A circle is a circle in 2D, but a sphere in 3D. A line, however, is always a line. Same for segment and polygon (and all yet-to-be-implemented objects which are polygons). I think naming should try to differentiate objects that stay what they are when dimension varies from those that evolve with that dimension. Either way, the naming convention should be clearly explained in the documentation.
Indeed it's something that should be done; we had a quick discussion about that with Barend and decided to postpone it for now. This kind of subject is never easy. For example, I'd like to rename circle into hypersphere or nsphere, but then we'll also have to clarify the meaning of some algorithms. For example, what we currently call the area of a circle is actually the volume of a hypersphere<2>...
- Also, while most names are 2D-centric, "box" is 3D-centric. This seems fairly inconsistent. Or maybe it's just me and box is actually dimension agnostic.
box doesn't sound less dimension agnostic than anything else to me, but this can be discussed...
- Still about "box", it is really an axis-aligned (whatever that means in non-cartesian geometries) hyperrectangle. Since "box" can not contain an arbitrary box, maybe it should be called minmaxbox or aabox or whatever.
aabox could be good.
- envelope calculates the minimum bounding axis-aligned hyperrectangle for an object. It might also be desirable to have a way to calculate the minimum bounding hyperrectangle (not necessarily axis-aligned) or the minimum bounding sphere. What names would those algorithms have if envelope is reserved for AABHR?
Indeed, maybe this name could become minimum_aabox and be accompanied by other algorithms like minimum_box, minimum_sphere (minimum_hypersphere??), etc...
- Modeling sequences of polygons as iterator pairs seems an annoyance to me, ranges are much prettier to handle since one variable is thus sufficient to represent one object (while two are required for iterators).
Maybe we can provide each algorithm in two versions, to have the choice between passing an iterator pair and a range. AFAIK, this will be the approach of the algorithms in C++0x, am I right?
- Algorithms should be overloaded to handle all geometric objects. I assume the reason it wasn't done is simply because of time to do so.
Yes, I suppose that we'll be able to add those overloads once the general design is validated. About the default constructors debate: like Mathias, I'd be inclined to have constructors that guarantee the integrity of any existing object and make checks useless; that's the choice I make in most cases. Luke, could you give some actual examples of cases where you found your legacy polygon class really annoying? Bruno

Just a few additions from my side.
Here are the few answers I can give right now, I let Barend complete or correct me, and answer the questions for which he has a better view of the library than me.
- Still about "box", it is really an axis-aligned (whatever that means in non-cartesian geometries) hyperrectangle. Since "box" can not contain an arbitrary box, maybe it should be called minmaxbox or aabox or whatever.
aabox could be good.
- envelope calculates the minimum bounding axis-aligned hyperrectangle for an object. It might also be desirable to have a way to calculate the minimum bounding hyperrectangle (not necessarily axis-aligned) or the minimum bounding sphere. What names would those algorithms have if envelope is reserved for AABHR?
The names of geometric objects are, as much as possible, taken from OGC (see also http://geometrylibrary.geodan.nl/ogc.html). Those names (point, linestring, linear ring, polygon) are also used in Google's KML and in most databases with geometry support. Indeed, the "circle" and the "box" were not there; we've added them, and they are open for discussion. The names of the algorithms are also borrowed from OGC, and are used in many GIS applications and spatially enabled databases. It seems therefore useful to have an algorithm "envelope" there. See for example http://dev.mysql.com/doc/refman/5.1/en/geometry-property-functions.html#func... (which, as OGC describes, returns a polygon - we thought a box was more appropriate for C++ applications, but it is open for discussion).
- Modeling sequences of polygons as iterator pairs seems an annoyance to me, ranges are much prettier to handle since one variable is thus sufficient to represent one object (while two are required for iterators).
Maybe we can double each algorithm to have the choice between passing a pair and a range. AFAIK, this will be the approach of the algorithms in C++0x, am I right?
None of the algorithms use iterator pairs for polygons. Polygons might have holes, in our library, so having just a sequence of points is not sufficient to define a polygon. On the other hand, algorithms take linestrings by iterator pairs, which was in fact a positive result of the discussions of this list. Return of linestrings / polygons is done using output iterators.
- Algorithms should be overloaded to handle all geometric objects. I assume the reason it wasn't done is simply because of time to do so.
Yes I suppose that we'll be able to add those overloads once the general design will be validated.
Agree, where it makes sense. The "within" algorithm is not symmetric: a point can be within a polygon, but a polygon cannot be within a point.

- why isn't envelope returning in result rather than taking a reference?

The reason is, like with centroid, that the output box / point might have a different type than the input geometry. If it were a result, the library user would have to specify the type as a template parameter when calling the function. But it might be changed, it is open.

- Why are intersection_segment arguments template parameters instead of segment, why is it not simply an intersection overload and why does it return something different than intersection? Maybe the design of the generic intersection should be extended so that it can work with segments rather than segments be made a special case.

Right. The segment case is actually an internal implementation function; it should have been in namespace "impl" and will be changed. Agreed, all intersection algorithms should return the same thing and should have the same name. Internally it is useful that it returns exactly how it intersects.

- An overlap algorithm, that would be equivalent to testing not_empty(intersection) || within, would probably be interesting, especially since it is very useful for collision detection or spatial indexing. It can obviously be much faster than performing intersection and within calls.

Agree.

- centroid says it throws an exception if the polygon does not contain any point. But how can a polygon not contain any points!? The constructor of a polygon should force it to have at least one point.

A polygon with only one point also has not much meaning. See also the discussion about construction. As soon as polygons follow the Polygon concept, construction is open to the implementor / library user. Then the centroid algorithm can still get an empty polygon.
Barend

None of the algorithms use iterator pairs for polygons. Polygons might have holes, in our library, so having just a sequence of points is not sufficient to define a polygon.
Yep, I was assuming in my answer that Mathias had made a mistake in his question and was rather talking about linestrings. Mathias, correct me if I'm wrong. Bruno

Barend Gehrels wrote:
None of the algorithms use iterator pairs for polygons. Polygons might have holes, in our library, so having just a sequence of points is not sufficient to define a polygon.
I didn't say for polygons, I said for sequences of polygons (what used to be called multipolygons, I think). It's true, however, that no algorithm takes these at the moment, so let's consider sequences of points, aka linestrings.
On the other hand, algorithms take linestrings by iterator pairs, which was in fact a positive result of the discussions of this list.
It is good, but ranges would be more practical and less verbose. Instead of passing two arguments (the begin iterator and the end iterator), you only pass one, a range (and by const-reference, not by value). A range is an object from which you can extract a begin and an end iterator (using boost::begin and boost::end). A std::pair of iterators is a valid range, but so is any container. Consider std::for_each; here is how you make it range-aware:

template<typename Range, typename F>
F for_each(const Range& r, F f)
{
    return std::for_each(boost::begin(r), boost::end(r), f);
}

That way, you can directly pass a linestring, a vector of points, a deque of points, a lazily computed adaptor or whatever to the algorithms with concise syntax: i.e. just 'foo' instead of 'foo.begin(), foo.end()'.
Agree, where it makes sense. The "within" algorithm is not symmetric, a point can be within a polygon but a polygon cannot be within a point.
If the case can never happen, why not make an overload that simply returns false? If I have generic code and I want to know whether an instance of T1 is within an instance of T2, I don't really want to make special the case where T2 is a point before calling the generic algorithm.

Mathias,
On the other hand, algorithms take linestrings by iterator pairs, which was in fact a positive result of the discussions of this list.
It is good, but ranges would be more practical and less verbose. Instead of passing two arguments (the begin iterator and the end iterator), you only pass one, a range (and by const-reference, not by value). A range is an object from which you can extract a begin and an end iterator (using boost::begin and boost::end). A std::pair of iterators is a valid range, but so is also any container.
Consider std::for_each; here is how you make it range-aware:

template<typename Range, typename F>
F for_each(const Range& r, F f)
{
    return std::for_each(boost::begin(r), boost::end(r), f);
}
That way, you can directly pass a linestring, a vector of points, a deque of points, a lazily computed adaptor or whatever to the algorithms with concise syntax: ie. just 'foo' instead of 'foo.begin(), foo.end()'.
I agree, it is shorter and more elegant. We'll consider adding range support or replacing the iterator pairs by it; it will probably not largely influence the structure of the library. Indeed Phil, who suggested the use of iterators on this list, already mentioned ranges.
Agree, where it makes sense. The "within" algorithm is not symmetric, a point can be within a polygon but a polygon cannot be within a point.
If the case can never happen, why not make an overload that simply returns false? If I have generic code and I want to know whether an instance of T1 is within an instance of T2, I don't really want to make special the case where T2 is a point before calling the generic algorithm.
Also here I agree. Besides this, in the past discussions concerning the previous previews and the other submissions, discussion concentrated on the points: how they were modelled and how they should look. Until now I haven't seen any comments on this (besides construction of generic geometry objects). Can we conclude that the list is now satisfied with the provided point concept? Barend

Barend wrote:
Besides this, in the past discussions concerning the previous previews and the other submissions, discussion concentrated on the points: how they were modelled and how they should look. Until now I haven't seen any comments on this (besides construction of generic geometry objects). Can we conclude that the list is now satisfied with the provided point concept?
template <>
struct access<example_point_1>
{
    template <int I, typename PP>
    static typename forward_const<PP,
        typename support::coordinate<example_point_1>::type>::type&
    get(PP& p)
    {
        return I == 0 ? p.x : p.y;
    }
};

No, not really. First of all, what about constructing and setting values of the point? It appears to be missing from the generic interface. Also, your get function returns a reference to the coordinate. It should return by value. Many legacy point types may have a public API that prohibits direct access to their members, because that is what good little C++ programmers were taught to do in school. Those point classes could not conform to your concept. A polar coordinate point could give a Cartesian coordinate view of itself, but not if it has to return its Cartesian coordinate values by reference. It could not conform to your concept either.

A point is not a tuple. I don't think it is valuable that tuple conforms to your point concept. An interval has a low and a high value, could be implemented as a tuple of size two, and would incorrectly conform to your point concept. That's not a good thing. If you had an interval concept in your geometry you would have to ensure that it was different from the point concept in some way. Since you are defining traits for your geometry objects, why not define their conceptual type as a trait?

template <>
struct geometry_concept<example_point_1>
{
    typedef point_concept type;
};

This takes the guesswork out of things and allows you to do tag dispatching based on conceptual type. Luke

Hello, Luke, I come back to you about this point from you:
Also, your get function returns a reference to coordinate. It should return by value. Many legacy point types may have a public API that prohibits direct access to their members because that is what good little C++ programmers were taught to do in school. Those point classes could not conform to your concept.
If you are talking about using the getter for read access, this way of writing the traits specialization is not an obligation, and the user can always return by value in his own one. If you are talking about using the getter for write access (that is: get<0>(p) = value), it's more problematic. We've discussed that with Barend, and it appears that, as you say, a legacy point without direct access to its coordinates will never be adaptable to such traits. I hadn't thought about that before. The problem is that this approach (value = get<0>(p) for read access and get<0>(p) = value for write access) was chosen after having been advocated by some people on this list. The argument was that it is a very common approach in Boost, for Tuple and Fusion. And this argument was quite convincing. Must I deduce that the Boost.Fusion approach is not totally generic and that a structure with unreferencable private members couldn't be adapted to it? And what would finally be the best way to handle point write access, if the Boost.Fusion one is wrong? Thanks Bruno

On Tue, Oct 21, 2008 at 9:24 AM, Bruno Lalande <bruno.lalande@gmail.com> wrote:
Luke, I come back to you about this point from you:
Also, your get function returns a reference to coordinate. It should return by value. Many legacy point types may have a public API that prohibits direct access to their members because that is what good little C++ programmers were taught to do in school. Those point classes could not conform to your concept.
[snip]
If you are talking about using the getter for write access (that is: get<0>(p) = value), it's more problematic. We've discussed that with Barend, and it appears that, as you say, a legacy point without direct access to its coordinates will never be adaptable to such traits. I hadn't thought about that before.
Could get<0>(p) return an object (convertible to the coordinate type, so it can also be used for read access) with an overloaded assignment operator that does the appropriate thing for write access in such a point class? It seems like that would solve the problem, but I'm not sure about any drawbacks. Kind regards, Stjepan P.S. kudos for all the geometry library development and discussions. I'm looking forward to a well-designed geometry library in boost.

Stjepan wrote:
Could get<0>(p) return an object (convertible to the coordinate type, so it can also be used for read access) with an overloaded assignment operator that does the appropriate thing for write access in such a point class? It seems like that would solve the problem, but I'm not sure about any drawbacks.
That is very creative, and would work, cool. However, requiring the user of the library to be creative and solve tricky problems like this is what I'd like to avoid by defining an interface that makes their lives easier. I want them to take the library for granted, and not appreciate how hard it was to write. Luke

Luke wrote:
Also, your get function returns a reference to coordinate. It should return by value. Many legacy point types may have a public API that prohibits direct access to their members because that is what good little C++ programmers were taught to do in school. Those point classes could not conform to your concept.
Bruno wrote:
If you are talking about using the getter for read access, this way of writing traits specialization is not an obligation and the user can always return by value in his own one.
Except that wouldn't work with library code that used it as an lvalue.

If you are talking about using the getter for write access (that is: get<0>(p) = value), it's more problematic. We've discussed that with Barend, and it appears that, as you say, a legacy point without direct access to its coordinates will never be adaptable to such traits. I hadn't thought about that before.
The problem is that this approach (value = get<0>(p) for read access and get<0>(p) = value for write access) has been chosen after having been advocated by some people on this list. The argument was that it is a very common approach in Boost, for Tuple and Fusion. And this argument was quite convincing.
Must I deduce that the Boost.Fusion approach is not totally generic and that a structure with unreferencable private members couldn't be adapted to it?
And what would finally be the best way to handle point write access if the Boost.Fusion one is wrong?
I don't think it's a matter of right and wrong. It's a matter of goals and objectives. My goal is to work with all geometry types and not impose unnecessary restrictions on the way geometry types need to be implemented. Fusion is a way to work with plain old data. Because point objects have no invariant to enforce, they should be plain old data (though often aren't defined that way), but that is not true of geometry objects in general. 1D intervals are almost as simple, but do have an invariant, and therefore should not provide direct access to their data members. I was just looking at Brandon's library, and it seems he has no write access to a point, but does have a construct function that accepts the coordinates and returns a point by value. This could be used to overwrite a point type that provides an assignment operator, but would be cumbersome to use to write to a single coordinate of a point. My traits look like the following:

template <typename T>
struct point_traits
{
    typedef typename T::coordinate_type coordinate_type;

    static inline coordinate_type get(const T& point, orientation_2d orient)
    {
        return point.get(orient);
    }
    static inline void set(T& point, orientation_2d orient, coordinate_type value)
    {
        point.set(orient, value);
    }
    static inline T construct(coordinate_type x_value, coordinate_type y_value)
    {
        return T(x_value, y_value);
    }
};

This is specifically the Cartesian 2D point traits; the 3D point traits are a different structure. It needs to be specialized for point types that don't conform to my expected default behavior. Like Brandon, I provide a construct function that returns a point by value. I also provide the set function that takes a point by reference, an orientation class encapsulated enum, and the coordinate value. I prefer runtime indexing, as I've mentioned before. There is a certain degree of redundancy in the way I define the traits, since construct and set are both ways of getting data into an object.
I think that Brandon may have the right idea with a getter function and a construct function. It is rare to write to only part of an unknown object type. I conclude that the point accessor should return by value or const reference, and that the traits should also provide a construct function to build a point from coordinate values. A given point class should additionally provide an assignment operator and copy constructor to conform to the point concept. If we are able to get away without requiring a default constructor, that would be a good thing. For an ND point, the construct function should take an iterator pair over coordinate values, and the accessor function must use runtime indexing. In looking more at Brandon's code, I have to repeat my observation that he has done an impressive job of defining a generic geometry type system using boost libraries. I think that if we make his accessor dimension agnostic (either runtime or compile time indexing) we have a winner. Regards, Luke

Hi Luke, Sorry for the late reply...
If you are talking about using the getter for read access, this way of writing traits specialization is not an obligation and the user can always return by value in his own one.
Except that wouldn't work with library code that used it as an lvalue.
From my point of view, the result of this get function (ie. the one that's used for reading in our concept, not for writing) should never be used as an lvalue. It's not its goal.
My traits looks like the following:
template <typename T>
struct point_traits
{
    typedef typename T::coordinate_type coordinate_type;

    static inline coordinate_type get(const T& point, orientation_2d orient)
    {
        return point.get(orient);
    }
    static inline void set(T& point, orientation_2d orient, coordinate_type value)
    {
        point.set(orient, value);
    }
    static inline T construct(coordinate_type x_value, coordinate_type y_value)
    {
        return T(x_value, y_value);
    }
};
This is specifically the Cartesian 2D point traits, the 3D point traits are a different structure. It needs to be specialized for point types that don't conform to my expected default behavior. Like Brandon, I provide a construct function that returns a point by value. I also provide the set function that takes a point by reference, an orientation class encapsulated enum and the coordinate value. I prefer runtime indexing, as I've mentioned before.
OK then I think a set() function instead of our "get for write" one would be better, less confusing and less tricky. The construction mechanism you mention is also very useful indeed, I think we'll implement one too.
There is a certain degree of redundancy in the way I define the traits, since construct and set are both ways of getting data into an object. I think that Brandon may have the right idea with a getter function and a construct function. It is rare to write to only part of an unknown object type.
It's not that rare, and having to construct a whole new point would be annoying in those cases. I think the best way is to keep both approaches.
I conclude that the point accessor should return by value or const reference, and that the traits should also provide a construct function to construct a point from coordinate values. A given point class should additionally provide an assignment operator and copy constructor to conform to the point concept. If we are able to get away without requiring a default constructor, that would be a good thing. For an ND point, the construct function should take an iterator pair over coordinate values and the accessor function must use runtime indexing.
In our compile-time accessing world, it would rather take something like a fusion vector I guess.
In looking more at Brandon's code, I have to repeat my observation that he has done an impressive job of defining a generic geometry type system and using boost libraries to do it. I think that if we make his accessor dimension agnostic (either runtime or compile time indexing) we have a winner.
Indeed, it's a very interesting part of his library. Unfortunately we really have to find a consensus about compile-time / runtime access before we can do anything together... Maybe I will open a new thread about this precise subject. Brandon, if you read this, there's a typo in 3 places of your code: "dimenstion" instead of "dimension". Regards Bruno

Simonson, Lucanus J wrote:
Also, your get function returns a reference to the coordinate. It should return by value. Many legacy point types may have a public API that prohibits direct access to their members, because that is what good little C++ programmers were taught to do in school. Those point classes could not conform to your concept.
This is not a problem, you just have to write a proxy. That kind of thing is done very often to achieve operator[] overloading with different semantics for reads and writes. It's also done a lot for iterators, which also return references.

Agree, where it makes sense. The "within" algorithm is not symmetric, a point can be within a polygon but a polygon cannot be within a point.
Mathias Gaunard wrote:
If the case can never happen, why not make an overload that simply returns false? If I have generic code and I want to know whether an instance of T1 is within an instance of T2, I don't really want to make special the case where T2 is a point before calling the generic algorithm.
Here I disagree. First of all, it is an absurd amount of work. Fully generic functions cannot be overloaded directly, so you would have to use tag dispatching.

    // legal input combinations:
    // polygon   polygon
    // polygon   rectangle
    // polygon   segment
    // polygon   point
    // rectangle polygon
    // rectangle rectangle
    // rectangle segment
    // rectangle point
    // segment   segment
    // segment   point
    // interval  interval
    // interval  coordinate
    template <typename geo_type_1, typename geo_type_2>
    bool within(const geo_type_1& geo1, const geo_type_2& geo2,
                bool boundary_is_in = true)
    {
        return within(geo1, typename geometry_tag<geo_type_1>::type(),
                      geo2, typename geometry_tag<geo_type_2>::type(),
                      boundary_is_in);
    }

You can't specialize it either, because that would mean rewriting the entire library for every new type. I want to emphasize that we are not talking about making boost::point and boost::polygon objects and writing a bunch of functions that accept those types. We are talking about writing fully generic functions that work with anyone's point and polygon objects. Second, if you write code that is doing something nonsensical, it is more likely that you are doing something wrong conceptually and would prefer a compiler error to remind you that a polygon cannot be within a point. The purpose of templates is not to make every line of code the developer could write compile by breaking down the type safety of C++. It is desirable to provide conceptual type safety for geometry types such that you cannot pass point and polygon to the within() function in the wrong order and get a bug that you have to figure out during testing. It is much better to get a compiler error. It is even better if the error looks like: "No function within() in point_concept that accepts user_polygon_type", so that you can easily see from the error itself what your mistake was. Luke

Simonson, Lucanus wrote:
Here I disagree. First of all, it is an absurd amount of work.
Yes, you have to make an overload per combination for optimality. However, since most concepts are refinements of polygon, handling polygon-in-polygon should get you support for almost everything but spheres and interpolated stuff, though not in an optimal way.
Fully generic functions cannot be overloaded directly, so you would have to use tag dispatching.
While that solution should work fine, I suppose using SFINAE (which might be considered "direct overloading") would be best.
You can't specialize it either because that would mean rewriting the entire library for every new type. I want to emphasize that we are not talking about making a boost::point and boost::polygon objects and writing a bunch of functions that accept those types. We are talking about writing fully generic functions that work with anyone's point and polygon objects.
Overloading over concepts is not much more difficult than overloading over types using SFINAE. There is no best-match resolution with SFINAE, however. (There may be ways to emulate that; does anyone know of a trick?)
Second of all, if you write code that is doing something nonsensical it is more likely that you are doing something wrong conceptually and would prefer a compiler error to remind you that a polygon cannot be within a point.
I don't see how checking whether an object is within another is nonsense, even if the combination can never be true. The type of your object may not be known by your (meta-)program, and it may need to know whether that relation holds. Would you rather have a trait can_be_within<T1, T2>::value than just overloading within to return false? While it could be useful, requiring that check before using within is quite a big hassle IMO.
The purpose of templates is not to make every line of code the developer could write compile by breaking down the type safety of C++.
You'll have to show how it is type unsound.
It is desirable to provide conceptual type safety for geometry types such that you cannot pass point and polygon to the within() function in the wrong order and get a bug that you have to figure out during testing.
You are assuming you're never using 'within' from a generic context. It seemed obvious to me that a geometry library within Boost should be generic, and enable people to abstract the type (or meta-type/concept) of the objects they're dealing with. If you have to special-case everything, then you can't write a generic program. As for 'within', note also that any object may be within a point if you consider bounds and if that object is reducible to that same point (a polygon with all points being the same or having only one point, a sphere with zero radius, etc.). Which is quite funny if you think about it: the point is actually the most refined concept, since it's a polygon, a sphere, a line and everything at the same time, even though it's the basic building block.

- Also, while most names are 2D-centric, "box" is 3D-centric. This seems fairly inconsistent. Or maybe it's just me and box is actually dimension agnostic.
box doesn't sound less dimension agnostic than anything else to me, but this can be discussed...
Convex Hull? Jeff

Box has a dimension-agnostic flavor to me (if not to all geometers out there). I believe the orthodox term would be (axis-aligned) parallelotope; some people also use hyperrectangle. Box is shorter and implies the pairwise orthogonality of the sides, if not the axis-alignment. Personally I'd prefer to keep box for the most common use, with a clear definition in the docs, and polytope (most general) and parallelepiped (2^n sides, not necessarily orthogonal) for the more exotic uses... My two cents... On Oct 15, 2008, at 11:23 AM, Jeff Flinn <TriumphSprint2000@hotmail.com> wrote:
- Also, while most names are 2D-centric, "box" is 3D-centric. This seems fairly inconsistent. Or maybe it's just me and box is actually dimension agnostic. box doesn't sound less dimension agnostic than anything else to me, but this can be discussed...
Convex Hull?
Jeff

About the default constructors debate: like Mathias, I'd be inclined to have constructors that guarantee the integrity of any existing object and make checks useless; that's the choice I make in most cases. Luke, could you give some actual examples of cases where you found your legacy polygon class really annoying?
Ah, yes, I remember now. It wasn't that it didn't have a default constructor that was the real problem. I removed the requirement from my library that polygon be default constructible, but that didn't even begin to solve the problems the legacy polygon caused me. The problem with the legacy polygon was that it had no constructor that would allow me to create a non-rectangular polygon without it running a series of checks on the input sequence of points that was horribly slow. There was a way to get data into it directly through a friend-declared helper class that encapsulated a list of points. Once the data was in that class I could call a make-polygon member function of the helper class, which dynamically allocated the polygon and returned a bare pointer. The effect was that it was impossible to construct the legacy polygon efficiently without dynamically allocating it as well. This made it impossible to use the legacy polygon as an element of a std container efficiently. Upon reflection, I could use the constructor that took a rectangle, dynamically allocate a polygon, assign it to the first polygon and delete the dynamically allocated polygon. Really annoying, but not because it didn't have a default constructor. Sorry for the confusion.

Back to the issue of default vs. non-default constructible polygons, I believe that while the boost geometry library may choose to define a polygon data type that may not be default constructible, it cannot make that choice for legacy polygon types. We have to assume that the library's generic interfaces will work with polygons that are default constructible and that the library's algorithms need to be implemented in a way that is robust to all kinds of things that might normally be considered an invariant of a well designed polygon class.
For instance, it is a common practice to make the first and last point in the sequence of points that represents a polygon be equal to each other, to show that it is a closed figure and not a line string. We have to assume that it may or may not be the case, and handle either, if we are going to have a generic concept of polygon that works with legacy data structures. Similarly, it is often assumed that a certain winding convention is used with a point sequence that represents a polygon. The algorithms must either be informed of the winding direction or compute it. Other common invariants of a polygon class include that there be no zero length edges or collinear edges described by the points in the point sequence. These also cannot be relied upon if we provide a generic interface to an unknown polygon type. The library should not be dividing by zero in these cases. I don't, frankly, believe that there is a meaningful runtime overhead incurred by performing the checks necessary to ensure that algorithms are invariant to these cases.

Similar to polygon, rectangle (axis aligned rectangle, mind you) classes also have invariants they like to provide. Specifically, a rectangle defined by a pair of points usually guarantees that one point is less than the other in some geometric sorting order of points. A generic interface that accepts arbitrary rectangles can't really depend on the class being defined this way, so I made the interface enforce my own invariant. My generic interface gets and sets the horizontal and vertical intervals of an unknown rectangle type. The interval itself enforces the invariant that its low end is less than or equal to its high end. I assume that user defined rectangles may be default constructible, and furthermore, that default constructed rectangles may contain garbage data. I, therefore, don't have a concept of an "empty" rectangle. Luke

Back to the issue of default vs. non-default constructible polygons, I believe that while the boost geometry library may choose to define a polygon data type that may not be default constructible, it cannot make that choice for legacy polygon types. We have to assume that the library's generic interfaces will work with polygons that are default constructible and that the library's algorithms need to be implemented in a way that is robust to all kinds of things that might normally be considered an invariant of a well designed polygon class.
Indeed, my preference towards non-default constructors is available for a class but becomes much less ideal when it comes to generalize it to a concept. In that precise context, runtime checks inside the algorithms are probably preferable. Bruno

Bruno wrote:
Indeed it's something that should be done; we had a quick discussion about that with Barend and decided to postpone it for now. This kind of subject is never easy. For example, I'd like to rename circle into hypersphere or nsphere, but then we'll also have to clarify the meaning of some algorithms. For example, what we currently call the area of a circle is actually the volume of a hypersphere<2>...
My preference would be to have a circle data type, a sphere data type and a hypersphere data type. Only the hypersphere would be parameterized on dimension. The user would pick the object that is named for what they want. If a user wants a circle they are going to be frustrated and confounded that we provide them nsphere<2>. There is something of a conceptual disconnect between circle and sphere. Perimeter has a non-obvious meaning to a sphere, but has a clear meaning with a circle. Area of a circle is more analogous to volume of a sphere, but it would be confounding to call area(sphere) to get its volume, and equally confounding to call volume(circle) to get its area. Just so that everyone who tries to use the library isn't completely confused I think we should make the naming of things specific, and err on the side of having more functions, more objects and more concepts in the library rather than reducing it to something minimalistic but thoroughly confusing. Geometry is the ideal thing to apply a concept based type system to because everyone knows and understands geometry at a conceptual level. We have an opportunity to write a library that provides an intuitive and productive API that results in highly readable user code. We don't want to let that opportunity slip through our fingers chasing some goal of template elegance that only we appreciate. The goal of templates should not be to save typing implementing the library, and we should make the user experience we want to achieve with our library the foremost thought in our minds during its design.
box doesn't sound less dimension agnostic than anything else to me, but this can be discussed... ...
- Still about "box", it is really an axis-aligned (whatever that means in non-cartesian geometries) hyperrectangle. Since "box" cannot contain an arbitrary box, maybe it should be called minmaxbox or aabox or whatever. aabox could be good.
How can you model a rectangle that is not axis aligned and is still a rectangle? The only way I can think of is to model it as width, height, center point and rotation. If you snap its corners to the coordinate grid it will, in general, no longer be a rectangle. Similar to the circle, I like rectangle, prism and hyperprism for 2D, 3D and ND rectangles. While non-axis-aligned rectangles could be useful, you have the option of modeling them as polygons with 4 vertices that are as nearly rectangular as you can make them within your coordinate system. Luke

Simonson, Lucanus J wrote:
How can you model a rectangle that is not axis aligned and is still a rectangle? The only way I can think of is to model it as width, height, center point and rotation. If you snap its corners to the coordinate grid it will, in general, no longer be a rectangle. Similar to the circle, I like rectangle, prism and hyperprism for 2D, 3D and ND rectangles. While non-axis-aligned rectangles could be useful, you have the option of modeling them as polygons with 4 vertices that are as nearly rectangular as you can make them within your coordinate system.
In my domain we call this an oriented bounding box. Typically in 2D it's represented by a center point, x and y axis vectors, width and height. Something like:

    template <typename T>
    struct obb
    {
        point<T, 2>  center;
        vector<T, 2> x_axis, y_axis;
        T width, height;
    };

You could also represent it as 3 points, I think. Although I suppose in your domain, where you use mostly integers, these definitions might not be workable. -- Michael Marcin

AMDG Simonson, Lucanus J wrote:
My preference would be to have a circle data type, a sphere data type and a hypersphere data type. Only the hypersphere would be parameterized on dimension. The user would pick the object that is named for what they want. If a user wants a circle they are going to be frustrated and confounded that we provide them nsphere<2>. There is something of a conceptual disconnect between circle and sphere.
What is wrong with typedef nsphere<2> circle; ? In Christ, Steven Watanabe

Simonson, Lucanus J wrote:
My preference would be to have a circle data type, a sphere data type and a hypersphere data type. Only the hypersphere would be parameterized on dimension. The user would pick the object that is named for what they want. If a user wants a circle they are going to be frustrated and confounded that we provide them nsphere<2>. There is something of a conceptual disconnect between circle and sphere.
Steven Watanabe:
What is wrong with typedef nsphere<2> circle;
Nothing. I'm not against saving typing with templates; I just don't want to do it at the expense of readability and intuitiveness of the library API. Some compilers will report the typedef name circle in errors and warnings, while others will report nsphere<2>, which could hurt the readability of compiler errors. Also, there is more at issue here than what to name the geometry objects the library provides. The more important issue is whether there is a separate circle concept, sphere concept and nsphere concept. Circle and sphere could potentially be related to nsphere in some manner; both circle and sphere might satisfy nsphere, for instance, but not necessarily the other way around. We might use inheritance or specialization to produce a certain effect with concepts. Something that satisfies circle would also satisfy nsphere, because circle could be derived from nsphere, but nsphere would not satisfy circle unless it was 2D. I was thinking something like the following might be a good idea:

    template <int n>
    struct hypersphere_concept { ... };

    struct sphere : hypersphere_concept<3>
    {
        // optionally add stuff specific to the sphere concept
        // not applicable to nsphere
        ...
    };

    struct circle : hypersphere_concept<2>
    {
        // optionally add stuff specific to circle
        // not applicable to nsphere
        ...
    };

This general pattern should be applicable everywhere, and is similar to what I'm doing with the generic geometry type system in my library. Luke

What is wrong with typedef nsphere<2> circle; ?
Actually I made a mistake in the mail where I was talking about nsphere<2>; the class would obviously not take the dimension directly but the point type, which is itself of a certain dimension. So I guess we'd rather have something like:

    template <class P>
    struct circle : nsphere<P>
    {
        BOOST_STATIC_ASSERT(/* assert here that the dimension is 2 */);
    };

Bruno

Simonson, Lucanus J wrote:
Bruno wrote:
Indeed it's something that should be done; we had a quick discussion about that with Barend and decided to postpone it for now. This kind of subject is never easy. For example, I'd like to rename circle into hypersphere or nsphere, but then we'll also have to clarify the meaning of some algorithms. For example, what we currently call the area of a circle is actually the volume of a hypersphere<2>...
My preference would be to have a circle data type, a sphere data type and a hypersphere data type. Only the hypersphere would be parameterized on dimension. The user would pick the object that is named for what they want. If a user wants a circle they are going to be frustrated and confounded that we provide them nsphere<2>.
circle would just be a typedef for nsphere<2>. Still, that has issues, since you will only be able to use it in 2D space. So if you want to use a 'circle' in 3D, it won't work, since it needs to know in which plane the circle is. You might as well use CSG to intersect a sphere with a plane and construct a circle in 3D. By the way, I think the geometry library will need to consider CSG at one point or another so that intersection can always return something, at least for combinations with spheres. Unless it is chosen to approximate spheres with polyhedra to avoid the issue.
How can you model a rectangle that is not axis aligned and is still a rectangle?
I think the minimal representation just needs three points (while an aligned one requires only two), whatever the dimension, but I'm not sure about this. Of course, you can choose to represent it as a quadrilateral, i.e. a polygon with 4 points and no holes. I fail to see how not being axis-aligned makes its existence as a rectangle more difficult. Considering 2D, take for example this non-axis-aligned rectangle (actually a square, because I'm lazy) defined by those points, in direct order: (0, 1), (1, 0), (0, -1), (-1, 0). Yes, that can model a rectangle that's not axis aligned, even if the coordinate system is made of integers.
The only way I can thing of is to model it is width, height, center point and rotation. If you snap its corners to the coordinate grid it will, in general, no longer be a rectangle.
I don't really see how it stops being a rectangle once rotated. Are you thinking in terms of pixels maybe, because straight lines that are not axis-aligned aren't straight when rendered as pixels?
Similar to the circle, I like rectangle, prism and hyperprism for 2D, 3D and ND rectangles. While non-axis-aligned rectangles could be useful, you have the option of modeling them as polygons with 4 vertices that are as nearly rectangular as you can make them within your coordinate system.
Rectangles have some nice properties that general quadrilaterals do not have, so you might be able to use more efficient algorithms by making use of a rectangle (meta-)type (or concept) instead of a quadrilateral one.

Mathias Gaunard wrote:
Simonson, Lucanus J wrote:
Bruno wrote:
Indeed it's something that should be done; we had a quick discussion about that with Barend and decided to postpone it for now. This kind of subject is never easy. For example, I'd like to rename circle into hypersphere or nsphere, but then we'll also have to clarify the meaning of some algorithms. For example, what we currently call the area of a circle is actually the volume of a hypersphere<2>...
My preference would be to have a circle data type, a sphere data type and a hypersphere data type. Only the hypersphere would be parameterized on dimension. The user would pick the object that is named for what they want. If a user wants a circle they are going to be frustrated and confounded that we provide them nsphere<2>.
circle would just be a typedef for nsphere<2>. Still, that has issues, since you will only be able to use it in 2D space. So if you want to use a 'circle' in 3D, it won't work, since it needs to know in which plane the circle is.
Do you mean typedef in the literal sense? I thought these structures were supposed to work with multiple data types. Template typedefs aren't available until C++0x. Is this library still targeting the current standard? I'm a bit confused. Thanks, -- Michael Marcin

Michael Marcin wrote:
Do you mean typedef in the literal sense?
Yes and no. The circle concept could be equal to the 2-nsphere concept, if you prefer. And the circle class (or template class) that implements this concept could be a real typedef of the nsphere template class.
Template typedefs aren't available until C++0x.
You can still do it in C++03; it's just slightly more verbose (::type). I don't see how template typedefs are related to concepts, though.
Is this library still targeting the current standard?
I suppose so. It's true using C++0x would make it cleaner, but you can have the same features in C++03 for not much more verbosity.

I don't have much time at present though I hope to be able to look at least at the documentation for this latest preview. But for the time being I can't let this aspect of the discussion go without a comment: Bruno Lalande wrote:
I'd like to rename circle into hypersphere or nsphere
Personally I prefer to call a spade a spade, a circle a circle and a rectangle a rectangle. "Two dimensional hypersphere"??? "Two dimensional paralleloflange"??? "Quadropoint cultivational utensil"!!! Yes, a library that works in arbitrary dimensions is useful to more people than one that's limited to 2 or 3 dimensions. But it _must_ be done without this sort of vocabulary. Can you imagine the reaction of someone who learnt what "square", "rectangle" and "circle" meant aged 4 and now "just wants a simple library to do some graphics", when faced with the parallelotope? Phil.

Phil wrote:
Personally I prefer to call a spade a spade, a circle a circle and a rectangle a rectangle. "Two dimensional hypershepere"??? "Two dimensional paralleloflange"??? "Quadropoint cultivational utensil"!!!
Whats wrong with typedef digging_object<long_handled, wood_shaft, foot_assist, pointed_end> spade; ? Sorry Steven, I couldn't resist ;) Luke

Phil Endecott wrote:
Yes, a library that works in arbitrary dimensions is useful to more people than one that's limited to 2 or 3 dimensions. But it _must_ be done without this sort of vocabulary. Can you imagine the reaction of someone who learnt what "square", "rectangle" and "circle" meant aged 4 and now "just wants a simple library to do some graphics", when faced with the parallelotope?
What if that someone uses circle in a 3D environment, expecting a circle, and realizes it's actually a sphere?

Barend Gehrels wrote:
Many discussions and comments were related to the concepts and, more specific, to the Point concept. This is the most basic thing of a geometry library and has to be as generic as possible. We think we implemented it like that into the library. The library can handle all point types including legacy point types, point types provided by the library, user defined point types, Cartesian points, native arrays, points with spherical coordinates. The library provides point_xy for Cartesian systems and point_ll for lat long systems, but the library user is thus not confined to those types.
I've just recently thought about the point concept, because I was more focused on geometric objects the first time I looked into that library. I think it's doing two things at a time:

- abstracting the actual point type being used, so that it can work with types from other libraries.
- specifying the coordinate system, which also defines the dimension.

Therefore, wouldn't it be a better idea that a point trait directly specifies a coordinate system? Then a coordinate system can specify its strategy traits. Decoupling the two would result in less duplication, I think.

Mathias Gaunard wrote:
- specifying the coordinate system, which also defines the dimension.
Actually, this is wrong. It doesn't specify a coordinate system, but a geometry. That means the topology of the space is different, and thus the rules of what objects are and can be in the space are different. So the distance trait is no mere strategy: it actually defines the topology of the space. I think it should clearly be redesigned to show that we are selecting a type of geometry, defining the space we're working in. Instead of pythagoras/haversine/vincenty, it should be euclidean/spherical/ellipsoid. By the way, the objects need to be defined not only in a dimension-agnostic way, but also in a space-agnostic way. Apart from that, allowing multiple coordinate systems within a given space would be fairly useless, and in Euclidean space the Cartesian system is the best choice since it was made for algebra and computations. For example, points expressed in polar coordinates should automatically perform conversions from/to Cartesian when using get.

Mathias Gaunard wrote:
Apart from that, allowing multiple coordinate systems within a given space would be fairly useless, and in Euclidean space the Cartesian system is the best choice since it was made for algebra and computations. For example, points expressed in polar coordinates should automatically perform conversions from/to Cartesian when using get.
I disagree with this. The best choice of coordinate system for internal representation is problem dependent. Imposing transformations to and from Cartesian internally can cause both numeric stability problems and performance problems. Every coordinate system was created for algebra, calculus and computations. Making the best choice of coordinates for a representation can speed calculations by orders of magnitude, or even in some cases make problems solvable that weren't in Cartesian coordinates (because of a lack of an appropriate representation for the solution). This is a huge deal in analytic calculations (see, for example, the books by Morse and Feshbach, or the early editions of Arfken, where chapters are dedicated to finding and operating in the best coordinate system for the job; also see facilities for this work in Maple or Mathematica), but it is a sometimes underappreciated aspect of numeric calculation as well. This is a conceptually similar problem to the discussions about unit systems from a couple of years ago. High accuracy and high performance calculations can't afford to have hidden conversion costs in computation and precision imposed on them. A well designed library should make such hidden conversions possible for those who want them, while also not making them mandatory for those who can't afford to pay for them. John Phillips

John Phillips wrote:
Every coordinate system was created for algebra, calculus and computations.
Yet when dealing with vectors and matrices, I'm pretty sure the Cartesian coordinate system is what everyone uses.
Making the best choice of coordinates for a representation can speed calculations by orders of magnitude, or even in some cases make problems solvable that weren't in Cartesian coordinates (because of a lack of an appropriate representation for the solution). This is a huge deal in analytic calculations (see, for example, the books by Morse and Feshbach, or the early editions of Arfken, where chapters are dedicated to finding and operating in the best coordinate system for the job; also see facilities for this work in Maple or Mathematica), but it is a sometimes underappreciated aspect of numeric calculation as well.
I've heard that various analytical problems are much more easily represented in other coordinates. But is that really the kind of problem you want a Boost geometry library to deal with? Apart from that, I've seen people use homogeneous coordinates (which are very similar to Cartesian) to avoid doing some divisions that might lead to numerical instability. That should be an easy addition.
This is a conceptually similar problem to the discussions about unit systems from a couple of years ago. High accuracy and high performance calculations can't afford to have hidden conversion costs in computation and precision imposed on them. A well designed library should make such hidden conversions possible for those who want them, while also not making them mandatory for those who can't afford to pay for them.
I thought it was better for simplicity; dealing with different geometries seems hard enough, but it is a prerequisite of the library since it aims at dealing with GIS, and hopefully there won't be too many geometries to handle. On the other hand, different representations of points in the same space are all equivalent, and one representation is enough to describe everything we want to deal with, even if not in the most precise way.

The problem is that it seems hard for algorithms to consider all the possible ways the points they're dealing with might be represented, and to select the right way to compute something accordingly. Latitude/Longitude seems close enough to Cartesian so I supposed that a lot of code can be shared.

Maybe doing only vector operations and generalizing to any orthogonal coordinate system is possible without much hassle by working with normalized basis vectors, but wouldn't that be quite similar to converting to Cartesian every time? I honestly have zero knowledge of non-Cartesian coordinates, so I have no knowledge of the vector theory behind them (and I have no knowledge of how vectors apply to alternative geometries either). Maybe that could somehow also be applied to help in dealing with other geometries that are locally Euclidean?

If there are ways to write algorithms generically in a coordinate-system-independent way, and maybe even a geometry-independent way, then it would be great, for sure. But is that really possible?

Mathias Gaunard wrote:
John Phillips wrote:
Every coordinate system was created for algebra, calculus and computations.
Yet when dealing with vectors and matrices, I'm pretty sure the cartesian coordinate system is what everyone uses.
It is usually the first thing to try, but it is not always the best idea. I won't claim that a large fraction of the computing community is conscious of the other possibilities, or of when they are useful. However, much of the scientific computing part of the community is familiar with multiple systems and reasons to use them.
Making the best choice of coordinates for a representation can speed calculations by orders of magnitude, or even in some cases make problems solvable that weren't in Cartesian coordinates (because of a lack of an appropriate representation for the solution). This is a huge deal in analytic calculations (see, for example, the books by Morse and Feshbach, or the early editions of Arfken, where chapters are dedicated to finding and operating in the best coordinate system for the job; also see facilities for this work in Maple or Mathematica), but it is a sometimes underappreciated aspect of numeric calculation as well.
I've heard that various analytical problems are much more easily represented in other coordinates. But is that really the kind of problem you want a Boost geometry library to deal with?
Apart from that, I've seen people use homogeneous coordinates (which are very similar to Cartesian) to avoid doing some divisions that might lead to numerical instability. That should be an easy addition.
I would like a boost library that serves as large a segment of the problem space as is reasonably possible. I haven't tried to design one, so I don't know whether the extension somehow collides with the abstractions and implementation details needed for a clearly designed Cartesian library. I don't think so, with the small amount of thought I've put into it so far, but that isn't the same as a real study of the problem.

Some problems are simplified incredibly by a good choice of coordinate system. As a trivial example, consider a circle in two dimensional coordinates. There is no properly defined function in Cartesian coordinates that produces a circle. (There is a relation, but a circle is not single-valued in Cartesian coordinates, and so can't be written as a function.) The function in plane polar coordinates is r = c (the radius equals a constant), when talking about a circle of radius c centered at the origin. This is a function, and a particularly simple one.

Physics problems in orbital mechanics and the quantum mechanics of electron orbitals, and mathematical problems of interpolation using radial basis functions, are less trivial examples. In each case, the selection of a good coordinate system for the problem allows for a clear separation of concerns in the calculation. Analytically, this can make it possible to apply techniques to the isolated components of the problem. Computationally, it reduces the multi-dimensional problem into a set of independent or nearly independent one-dimensional problems. Since the computational effort of many common numerical techniques scales exponentially in the number of dimensions, this separation moves us out of exponential scaling and into linear scaling in the number of dimensions. I've never met a computer programmer who wouldn't approve of that trade.
This is a conceptually similar problem to the discussions about unit systems from a couple of years ago. High accuracy and high performance calculations can't afford to have hidden conversion costs in computation and precision imposed on them. A well designed library should make such hidden conversions possible for those who want them, while also not making them mandatory for those who can't afford to pay for them.
I thought it was better for simplicity; dealing with different geometries seems hard enough, but it is a prerequisite of the library since it aims at dealing with GIS, and hopefully there won't be too many geometries to handle. On the other hand, different representations of points in the same space are all equivalent, and one representation is enough to describe everything we want to deal with, even if not in the most precise way.
The problem is that it seems hard for algorithms to consider all the possible ways the points they're dealing with might be represented, and to select the right way to compute something accordingly. Latitude/Longitude seems close enough to Cartesian so I supposed that a lot of code can be shared.
All of the typical algebraic, differential and integral structures have well-known translations into these different coordinate systems. For orthonormal systems the key is the Jacobian. Even systems that are not orthonormal can be approached with tensor methods. For some very common structures, like the dot product or the cross product, the algorithm doesn't even change. For differential and integral structures, the change is typically covered by throwing in a multiplicative factor. Details of this are available in most vector calculus books, or in the books on mathematical physics mentioned in my previous note.
Maybe doing only vector operations and generalizing to any orthogonal coordinate system is possible without much hassle by working with normalized basis vectors, but wouldn't that be quite similar to converting to Cartesian every time?
No, it wouldn't. The Cartesian unit vectors are constants, and the Cartesian area and volume elements do not depend on current coordinate values. These facts are generally not true for non-Cartesian systems. (In plane polar coordinates, the same change of angle and the same difference in inner radius compared to outer radius sweeps out different areas depending on the actual value of the inner radius. As the inner radius gets bigger, the area swept out increases.)

For problems where the natural description of the paths and shapes looks like rectangles, Cartesian coordinates provide a simple and accurate basis for calculation. However, if the natural paths and shapes don't look like parts of rectangles, Cartesian coordinates provide a bad description of the system, and there is a need for substantial extra calculation to process the system in those coordinates. The answer is to look for a different system that matches the natural paths and shapes. (Better put, one that shares the same symmetries as the problem.) These symmetries are automatically preserved just by writing the problem in the new coordinate system. There is no need to put computational time and effort into calculating them, or into sanity-check functions that test for them and correct the results if they stray away from the required symmetries.

In a best-choice coordinate system, one characteristic change in the system can be well represented as one small change in a single system variable. (If it is a geometric system we are describing, the variables are coordinates.) This is good for the exact same reason that isolating effects in different modules in a computer program is good.
I honestly have zero knowledge of non-Cartesian coordinates, so I have no knowledge of the vector theory behind them (and I have no knowledge of how vectors apply to alternative geometries either). Maybe that could somehow also be applied to help in dealing with other geometries that are locally Euclidean?
If there are ways to write algorithms generically in a coordinate-system-independent way, and maybe even a geometry-independent way, then it would be great, for sure. But is that really possible?
This is the real question. My first guess is that it is possible for many algorithms. However, this is only an educated guess, and not the result of time spent producing real and practical implementations, so it could be wrong. In the cases where it isn't possible, we should try to do the same thing all generic programming does: define a minimal set of concepts that must be modeled by the coordinate system for the algorithm to work correctly, and embed such concepts and concept checks in the library.

John

John Phillips wrote:
Some problems are simplified incredibly by a good choice of coordinate system. As a trivial example, consider a circle in two dimensional coordinates. There is no properly defined function in Cartesian coordinates that produces a circle. (There is a relation, but a circle is not single-valued in Cartesian coordinates, and so can't be written as a function.) The function in plane polar coordinates is r = c (the radius equals a constant), when talking about a circle of radius c centered at the origin. This is a function, and a particularly simple one.
That seems useful only if you define your geometric objects as functions on coordinates. The geometry library doesn't take this analytical approach at all, since it provides a finite set of basic geometric objects. Basically, points, lines, segments, circles (which are really hyperspheres) and polygons are all you have to describe shapes with this library. I suppose it should be extended to also include half-lines, planes, half-planes, half-spaces, hyperplanes, etc. Adding the ability to construct objects from the union/intersection/difference of other objects could also be useful.

Those objects are not defined in terms of functions, nor even equations on their coordinates, but with more practical relations around points. For example, a circle is defined as the set of points that are at equal distance from some other point. A circle is thus represented as a point plus a distance. A polygon with no hole is the space which is inside a closed path of points. It is thus represented as a sequence of points in some specific order. Whether the points use one or another coordinate system has no consequence on the way the set of points is represented.

The algorithms, however, will perform computations on those points, and thus require some properties of them. Orthonormal coordinates give a well-known framework to code in. The point is, I don't see what gain you can have by allowing other coordinate systems, since it will not allow easier expression of shapes or problems in that library. Maybe working with other coordinate systems would be more the job of a different library dedicated to differential geometry than of this library, which only cares about simple, basic and mostly discrete shapes. To clarify, this library should maybe be called "computational geometry".
The scope of the library is actually fairly close to the description at Wikipedia: <http://en.wikipedia.org/wiki/Computational_geometry>

I don't even know how one could define a polygon as a function in any coordinate system, by the way. As for your circle example, it only works if the center is at the origin, so how do you generalize it without parameterizing? By the way, in case you haven't noticed, I have zero differential geometry knowledge, so I barely understand most of what you've been saying ;).

Mathias wrote:
I don't even know how one could define a polygon as a function in any coordinate system, by the way.
You define it as a piecewise-linear logic inequality. A planar polygon can be thought of as a function on x and y such that for any (x, y) that lies inside the polygon the function returns 1, and for any (x, y) that lies outside the polygon the function returns 0. The actual mathematical expression for doing that isn't really interesting, but thinking of a polygon as a mathematical function is.

I take the partial derivative of a polygon in x and y to reduce it from something that is conceptually two-dimensional to a series of zero-dimensional quantities. These quantities are impulse functions with either positive or negative sign at a given point with a given direction. This means I can model them as positive or negative unit vectors. When I scan those quantities, I am taking the integral with respect to x then y, and I get back my original polygon. When I scan many polygons this way, I can do boolean logic operations on the 2D logic functions that are the input polygons. Because of the way I model polygons, the algorithm employs arithmetic rather than flow control to do the core computational effort.

Obviously, this approach extends easily into higher-dimensional space at the conceptual level, but in practice it is very hard to write an efficient n-1 dimensional sweep-plane algorithm due to a lack of suitable data structures, while it is easy to write a scan-line that handles the 2D case in near-linear time using, for example, the STL red-black tree. Figures that have curves are not piecewise linear and are harder to differentiate/integrate, because there is no order reduction taking place. In a different coordinate system the curves might be analogous to straight lines, allowing for order reduction, but the more common approach in computational geometry is simply to model the curves as their piecewise-linear approximation and solve the problem in the Cartesian coordinate system.
While non-Euclidean geometries are undeniably valuable, I think that it is not worthwhile to try to generalize the geometry away from the geometry library. Refactoring algorithms that are similar in different coordinate systems to be shared in their implementation is probably harder than duplicating that effort in separate libraries that target separate geometries. I think it is preferable to design the library to interoperate well with other geometries, rather than try to encompass them. Luke

After a little more thought, I have a different way to talk about coordinate systems that may make more sense in a computer science setting. Philosophical arguments aside, let us assume that some piece of space is the real object you are trying to use and describe in your program. In that case, as with any real thing brought into a computer program, you produce some abstraction and work with the abstraction as an approximate replacement for the real thing.

The quality of that abstraction is determined by a few factors. Does it capture the important facets of the problem at hand? Does it provide an interface that allows for a clear description and solution of the problem? Can it be implemented in a way that translates from the interface to the hardware in a reasonably efficient and accurate way? Does it allow the user to solve the problem in terms of already familiar concepts? (There are others, but this is enough of a list for the moment.)

Cartesian coordinates are one example of such an abstraction. They are very useful for a number of real problems and they are the simplest to visualize, so they are often the first version anyone learns. By analogy, a simple array is typically the first data structure anyone learns when programming. It is easy to visualize and easy to implement, so it is a good first example. It is also useful for many problems, so the time spent learning it is valuable. In the context of this analogy, other coordinate systems (such as spherical, or cylindrical, or confluent hypergeometric) are equivalent to other data structures (such as lists, queues or trees). They are a different abstraction of the same underlying real object (the piece of space, or the collection of data). They are a good choice exactly in those cases where they make problems easier to think about and to solve.
The same can be said for coordinate systems as for data structures: anyone who does anything with a computer must know how to use at least one data structure, experts know the proper situations in which to apply many different ones, and some specialty data structures are known only to the small communities that need them regularly. Everyone who has ever made a graph knows at least one coordinate system (usually Cartesian), experts know many (most physicists are comfortable with at least 3 and could apply dozens at need), and some specialized communities have systems they use frequently that the rest of us have never spent time thinking about.

The quality of this abstraction is measured the same way as any other, and by that measure of quality it is true that Cartesian is just not always the best choice. I don't know if this helps, but I'm trying to move my reasons out of specialized math and science considerations and into the language more common in good design.

John

Barend Gehrels wrote:
Herewith the URL to the third preview of our Geometry Library, including documentation and examples.
Hi Barend,

Some quick comments follow. Apologies for not looking at this in more depth.

- Can you please see if it's possible to make the URL quoted above work better when Javascript is disabled?

- I note that you're supporting homogeneous coordinates in arbitrary dimensions. My current requirements are for at least partially heterogeneous coordinates in 2 or 3 dimensions. Of course you can't support all permutations and combinations of requirements, but you should perhaps qualify some names to indicate their limited scope.

- I have always used a very simple point type: struct point { T x; T y; };. It seems that I could continue to use "user" code in this style with your library; however, as a potential library author, I need to replace "p.x" with something like "access<P>::get<0>(p)". I'm not wildly enthusiastic about this verbosity. I accept that I'm outvoted on this, but I'm going to continue to remind you (all) that I'm unhappy about it.

- I have previously noted that you can advertise your distance function as returning a "type convertible to T" in order to avoid unnecessary sqrt calculations while, for example, scanning a list of points to find the one furthest from p.

- I look forward to seeing concepts for shapes.

- Is there a strategy for "within" that can be used when the shapes being tested store their bounding boxes?

Regards, Phil.

Hi Phil,

>> Herewith the URL to the third preview of our Geometry Library, including documentation and examples. http://geometrylibrary.geodan.nl
> - Can you please see if it's possible to make the URL quoted above work better when Javascript is disabled?

As far as I knew, and can see now, there is no Javascript at all in the pages. The documentation is generated by Doxygen. It uses CSS stylesheets but no further tricks. I just tried with Lynx and everything seems to be all right for me. Do more people have problems with it?

> - I note that you're supporting homogeneous coordinates in arbitrary dimensions. My current requirements are for at least partially heterogeneous coordinates in 2 or 3 dimensions. Of course you can't support all permutations and combinations of requirements, but you should perhaps qualify some names to indicate their limited scope.

OK, good to clarify this.

> - I have always used a very simple point type: struct point { T x; T y; };. It seems that I could continue to use "user" code in this style with your library; however, as a potential library author, I need to replace "p.x" with something like "access<P>::get<0>(p)". I'm not wildly enthusiastic about this verbosity. I accept that I'm outvoted on this, but I'm going to continue to remind you (all) that I'm unhappy about it.

It is true that you can continue using p.x and p.y, and it is also true that if you want to follow the concepts you have to use a more verbose style. A shorter variant of that verbose style is geometry::get<0>(p), and if you are, as a library author, already in the namespace geometry, it is just get<0>(p). I got used to it. The advantage is that the resulting code will support many point types.

> - I have previously noted that you can advertise your distance function as returning a "type convertible to T" in order to avoid unnecessary sqrt calculations while, for example, scanning a list of points to find the one furthest from p.

I just reread your previous note and my answer to it. It is possible with the library to avoid square roots, when you take the implementation functions, which is allowed. However, it is interesting and we will look at your alternative again.

> - I look forward to seeing concepts for shapes.
> - Is there a strategy for "within" that can be used when the shapes being tested store their bounding boxes?

It is not yet implemented like this. It is currently not possible because the input is a "geometry::polygon<...>" and not yet a concept. But when it is a concept, and we can attach strategies to polygon types, it would be possible and indeed very interesting. I will try to work it out as a useful test case.

Regards, Barend

Phil wrote:
- I note that you're supporting homogeneous coordinates in arbitrary dimensions. My current requirements are for at least partially heterogeneous coordinates in 2 or 3 dimensions. Of course you can't support all permutations and combinations of requirements, but you should perhaps qualify some names to indicate their limited scope.
Can your coordinates cast to and from one coordinate type at the interface between your data structure and the algorithms provided by a generic library?
- I have always used a very simple point type: struct point { T x; T y; };. It seems that I could continue to use "user" code in this style with your library; however, as a potential library author, I need to replace "p.x" with something like "access<P>::get<0>(p)". I'm not wildly enthusiastic about this verbosity. I accept that I'm outvoted on this, but I'm going to continue to remind you (all) that I'm unhappy about it.
You only need to use the accessor function when you don't know the data type of the object. If you are writing code within the library you can use std::pair<coordinate_type, coordinate_type> for a point, throw it in a vector, sort it, and just get on with your life. The accessors are just for the interfaces.
- I have previously noted that you can advertise your distance function as returning a "type convertible to T" in order to avoid unnecessary sqrt calculations while, for example, scanning a list of points to find the one furthest from p.
I have a euclidean_distance_squared function on geometries that does what you say. Distance calls distance squared and returns its square root.
- I look forward to seeing concepts for shapes. - Is there a strategy for "within" that can be used when the shapes being tested store their bounding boxes?
I have a winding direction accessor function on my generic interface to unknown polygon type T. If the polygon knows its winding direction in constant time it can return it, otherwise it returns unknown and I compute the winding direction. A similar thing could be done for bounding box, where the cached bounding box can be returned in constant time. Because these things make the interface bigger, I'd like to come up with a way to make them syntactically optional. Any competent implementation of within on polygon would do bounding box comparison first to rule out simple cases. Getting the bounding box would go to the bounding box accessor first and could get your cached bounding box to speed up the check. Luke

- I have always used a very simple point type: struct point { T x; T y; };. It seems that I could continue to use "user" code in this style with your library; however, as a potential library author, I need to replace "p.x" with something like "access<P>::get<0>(p)". I'm not wildly enthusiastic about this verbosity. I accept that I'm outvoted on this, but I'm going to continue to remind you (all) that I'm unhappy about it.
You don't have to use this syntax in your own code, it's only used in the library. The only case in which you'll have to use this syntax yourself is when you want to write a library based on the Geometry Library and with the same level of genericity. Bruno

--------------------------------------------------
From: "Bruno Lalande" <bruno.lalande@gmail.com>
Sent: Monday, October 27, 2008 11:11 AM
To: <boost@lists.boost.org>
Subject: Re: [boost] Preview 3 of the Geometry Library
Brandon, if you read this, there's a typo in 3 places in your code: "dimenstion" instead of "dimension".
Hi Bruno, I'm not sure why you felt the need to point this out; the stuff I put up on the vault is a work in progress which was submitted for feedback. These have already been fixed. You can get the latest version which likely has other typos from: http://gengeomalg.sourceforge.net/ There is svn access. Regards, Brandon
Regards
Bruno

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

I'm not sure why you felt the need to point this out; the stuff I put up on the vault is a work in progress which was submitted for feedback. These have already been fixed. You can get the latest version which likely has other typos from:
Oops, sorry, I was indeed looking at an old local copy at that moment, and forgot to check your SVN. What you describe about compile-time / run-time access sounds very good! I'll have a look as soon as possible to see it in detail. Bruno

--------------------------------------------------
From: "Bruno Lalande" <bruno.lalande@gmail.com>
Sent: Monday, October 27, 2008 11:11 AM
To: <boost@lists.boost.org>
Subject: Re: [boost] Preview 3 of the Geometry Library
In looking more at Brandon's code, I have to repeat my observation that he has done an impressive job of defining a generic geometry type system and using boost libraries to do it. I think that if we make his accessor dimension-agnostic (either run-time or compile-time indexing) we have a winner.
Indeed, it's a very interesting part of his library. Unfortunately we really have to find a consensus about the compile-time / runtime access before we can do anything together... Maybe I will open a new thread about this precise subject.
Brandon, if you read this, there's a typo in 3 places in your code: "dimenstion" instead of "dimension".
Sorry for the double reply, but I thought I should point out that my latest code base also tackles the run-time/compile-time access issue via specialization of the access types. I have a type indexed_access_traits which is specialized for coordinate_sequences (of which point and vector types specialize via coordinate_sequence_traits). The specialization of indexed_access_traits incorporates a flag which says whether the code should use run-time access or compile-time access for each particular coordinate_sequence type. If the user's point/vector types have either get<Index> or operator[] for the corresponding compile-time/run-time accessor, then they can simply use a supplied macro to declare their point as a valid coordinate_sequence type, and all the indexed_access_traits work for free.

I've even been playing around with operators for dot product and scalar arithmetic on vector types defined using this mechanism, so all in all pretty good stuff in my code base so far. I've also tackled the reference frame issue a bit, providing mechanisms for users to use either implicit frame transforms or to proceed as if they are in complete control of the state at any given time. Please do have a look. I would love to hear feedback from any who are interested.

Cheers, Brandon

P.S. This is a preliminary site that a friend put together for me; the text is a bit brash :D. Please ignore that and go to the download link to find SVN access: http://gengeomalg.sourceforge.net/
participants (13)

- Barend Gehrels
- Brandon Kohn
- Bruno Lalande
- Herve Bronnimann
- Jeff Flinn
- John Phillips
- Mathias Gaunard
- Michael Fawcett
- Michael Marcin
- Phil Endecott
- Simonson, Lucanus J
- Steven Watanabe
- Stjepan Rajko