
Dear All,

Here is my review of GGL. I have attempted to try the same examples that I did for my Boost.Polygon review. Unfortunately limited time has made it impossible for me to be as thorough as I would have liked. Here is what I have done, and the issues that I encountered along the way. (You may want to look at my Boost.Polygon review for background. It can be seen here: http://thread.gmane.org/gmane.comp.lib.boost.devel/193204/focus=193758 )

My first exercise was a GUI application in which I wanted to merge a large number of small rectangles into a single more complex polygon. Unfortunately I failed at the first step: I wanted to teach the library about my existing types (point, box, etc.), but how to do this is not properly documented. It seems that there are some macros (with non-Boost-style names), but their only mention in the documentation is in an example. Yes, I could probably have worked out what to do from the example, but I have limited time, and furthermore I don't think it's acceptable to expect a library end-user to learn something as fundamental as this from an example. (From my brief look at the example, this mechanism does look significantly less verbose than what I had to do to teach Boost.Polygon about my types, which is a good thing.)

So I moved on to my second example, which reads a file containing the borders of the states and provinces of the US and Canada and determines which ones contain points that the user enters. I implemented this using the geometry types provided by the library, so the question of adapting to types did not arise. This all worked quite smoothly, though I did encounter the following issues, mainly with the documentation, which I imagine could be corrected quite easily:

- The docs do not say what the C template parameter in point_xy<T,C> is.
- The "complete list of members" for ggl::linear_ring is empty. (Presumably the issue here is that the members are all inherited from the std container; it could at least say that.)
- The documentation for ggl::linear_ring says that V is a "container type e.g. std::vector, list, deque"; it should describe this in terms of std concepts (presumably bidirectional container) so that I know whether my own custom type will work. Looking at the code I don't see any concept-checking for this; that would be nice.
- The args to the within() function are reversed wrt the equivalent function in Boost.Polygon (see the sketch below). This is a good example of the issue that I raised in a previous email, where the generic nature of the concisely-named functions can lead to mistakes.
- Doxygen doesn't give the full path to the headers, i.e. "point_xy.hpp" rather than "geometries/point_xy.hpp".
- There seems to be a dependency on Boost.Math that isn't mentioned in the list of dependent libraries.
- The library doesn't seem to be in namespace boost. I guess the intention is to add this later. This is unusual.
- I was surprised that the box class doesn't have width and height members.
- I got a page full of warnings from my test program; the cause seemed to be unused parameters in within.hpp, lines 118 and 178. Since I have touched only a small fraction of the library, I imagine there are other places with the same issue.

As I said, these are minor issues that I imagine could be corrected quickly. However, they did give me the impression that the library is still very much a work in progress - especially the documentation - rather than a fully-formed finished work.
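To make the within() argument-order point above concrete, here is a minimal sketch of that kind of point-in-ring test. The exact header paths, the two-argument point_xy constructor and the within(point, ring) overload are assumptions pieced together from the documentation remarks above, not verified GGL API:

#include <iostream>
#include <ggl/ggl.hpp>                       // assumed umbrella header
#include <ggl/geometries/point_xy.hpp>       // path per the Doxygen remark above
#include <ggl/geometries/linear_ring.hpp>    // assumed

int main()
{
    typedef ggl::point_xy<double> point;

    // One border as a closed ring (no holes); linear_ring is documented as
    // wrapping a std container, so push_back is assumed to work.
    ggl::linear_ring<point> border;
    border.push_back(point(0.0, 0.0));
    border.push_back(point(0.0, 5.0));
    border.push_back(point(5.0, 5.0));
    border.push_back(point(5.0, 0.0));
    border.push_back(point(0.0, 0.0));

    point user_point(2.5, 2.5);

    // Note the argument order: the contained geometry first, the containing
    // geometry second -- the reverse of Boost.Polygon's contains(polygon, point).
    std::cout << std::boolalpha
              << ggl::within(user_point, border) << std::endl;
}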
I don't necessarily believe that a library has to be perfect before it can be accepted into Boost, but I think it needs to be a little less rough around the edges than this.

That unfortunately was all of the actual testing that I have been able to do. I'll now discuss other issues.

Firstly there's the situation concerning Boost.Polygon. My feeling is that the situation that has developed here is very unfortunate. I think it's extraordinary that Boost can even consider having two incompatible, overlapping geometry libraries reviewed within a few weeks of each other, with authors who are so publicly antagonistic towards each other. How on earth did we get into this situation? (Answer: IMO, the progress towards review flowed to the current situation by following the path of least resistance, and "someone" - a "leader"? - should have jumped in earlier to steer things in a more constructive direction.) Anyway, Boost.Polygon has now been accepted (within an hour of this review starting! I mean, was that calculated to be antagonistic?) and we have to consider what that implies. One option is to imagine the two libraries in parallel universes. That may be the most stress-free way, but it's not what would deliver the best results for Boost's end users. What I would prefer to see is some work to investigate how more common ground can be introduced. For example:

- Common terminology. ("within" vs. "contains", "extents" vs. "envelope", and dozens more. GGL's not-unreasonable adoption of OGC terminology is an issue here.)
- Inter-operable types. I would like to hope that the basic geometry traits classes could be shared. Failing that, perhaps a method of forwarding from one set of traits to the other could be used. I am unsure how deeply the choice of tag dispatch vs. SFINAE makes inter-operation more difficult. Maybe someone could comment on that.

Let me say a little about performance. There has been a great deal of discussion about this on the list. My feeling is that all libraries should document the worst-case complexity of all of their functions. In the current case we have polygon boolean operations, where I believe I learnt as an undergraduate 20 years ago that they should be O(n log n). Here, IIUC, the implementation is worst-case O(n^2) but typically better. I feel uncomfortable with that, however many benchmarks and years of real-world use have gone into it. Is it not possible to contrive an introsort-like "fallback" solution that gives both good typical-case performance and an optimally-bounded worst-case guarantee?

The other controversial implementation issue is floating-point robustness. I'm still not clear what the position of this library is on that. Are we just given assurances that it seems to be robust in real-world use, or is it believed that it actually is certainly robust? I expected to see this discussed in the library documentation, but I don't see anything - maybe I've missed it. If it is "believed to be robust in real-world use", it would be helpful to describe the possible failure modes (e.g. segfault, runs forever, or just outputs the wrong answer.)

In summary: Congratulations to the authors on getting their library to this point. I'm sure that it contains much useful code. At this time, the library is let down by incomplete documentation, and I feel it should be rejected for that. I believe it is also essential that some effort is made to find common ground (I'm avoiding saying "merge") with Boost.Polygon.
If the authors are unable to work together, then I would prefer that Boost keep things simple for its end users, and avoid years of antagonism on the mailing list, by suggesting that one (or the other!) library should exist outside of Boost.

To answer the usual questions:

- What is your evaluation of the potential usefulness of the library? Very useful.
- Did you try to use the library? With what compiler? Yes; gcc-4.x on Linux.
- How much effort did you put into your evaluation? I've spent about 5 hours in all.
- Are you knowledgeable about the problem domain? I'm not a geometry expert, but I have done projects that could make use of this sort of library.

Regards, Phil.

Hi Phil, Thanks for your review. Let me clarify a few points: Phil Endecott wrote:
- The library doesn't seem to be in namespace boost. I guess the intention is to add this later. This is unusual.
- I got a page full of warnings from my test program; the cause seemed to be unused parameters in within.hpp, lines 118 and 178. Since I have touched only a small fraction of the library, I imagine there are other places with the same issue.

As I said, these are minor issues that I imagine could be corrected quickly. However, they did give me the impression that the library is still very much a work in progress - especially the documentation - rather than a fully-formed finished work. I don't necessarily believe that a library has to be perfect before it can be accepted into Boost, but I think it needs to be a little less rough around the edges than this. At this time, the library is let down by incomplete documentation, and I feel it should be rejected for that.
Our library is used by people. We have a mailing list, a wiki and there is a user base. Therefore:

- It is not in namespace boost. The preparation for the formal review took several months. We cannot surprise our users by adding namespace boost and then, after a rejection, removing it. Besides that, I asked about this on the list because I didn't feel comfortable publishing a library in namespace boost while not being accepted. Dave Abrahams answered: "Thank you very much for being so responsible." and, about use before acceptance: "However, I'm neither encouraging nor discouraging that practice."
- This is also the case for the macro definitions; we didn't want to prefix them with BOOST.
- This is actually also part of the documentation issue. You are right, the documentation is not perfect; not everything is there yet. It might seem rough, but behind the scenes a lot of effort has been put into it. If we are accepted, we want to move to BoostBook. If not, we will (probably) select something else, e.g. a wiki. Because the library is large, we could not do this before the review and take the risk of moving to something else after a rejection. We know that Doxygen is not perfect. What IS perfect in Doxygen is the integration between small code samples and documentation. All the samples you see really do compile. They are extracted automatically from sample sources (included in the distribution). We want to keep this feature, so when selecting another doc system we have to take stock of the possibilities (tips welcome).

Regards, Barend

AMDG Barend Gehrels wrote:
If we are accepted, we want to move to BoostBook. If not, we will (probably) select something else, e.g. a wiki. Because the library is large, we could not do this before the review and take the risk of moving to something else after a rejection. We know that Doxygen is not perfect. What IS perfect in Doxygen is the integration between small code samples and documentation. All the samples you see really do compile. They are extracted automatically from sample sources (included in the distribution). We want to keep this feature, so when selecting another doc system we have to take stock of the possibilities (tips welcome).
Quickbook has a way to extract fragments from source code http://tinyurl.com/ygjn8oh In Christ, Steven Watanabe

Quickbook has a way to extract fragments from source code http://tinyurl.com/ygjn8oh
Thank you, Steven! Regards, Barend

Phil Endecott wrote:
The other controversial implementation issue is floating-point robustness. I'm still not clear what the position of this library is on that. Are we just given assurances that it seems to be robust in real-world use, or is it believed that it actually is certainly robust? I expected to see this discussed in the library documentation, but I don't see anything - maybe I've missed it. If it is "believed to be robust in real-world use", it would be helpful to describe the possible failure modes (e.g. segfault, runs forever, or just outputs the wrong answer.)
This is an issue that troubles me as well. I think that a floating point computational geometry library is possibly a first for Boost, in that you often have heuristics rather than algorithms due to the fuzzy effect of using floating types in comparison predicates.

One of the more useful features of the library (GGL) would of course be the boolean operations. The problem however is clearly going to be robustness. I have never encountered a robust floating point boolean operation library in my 9 years of working in the geometry domain. While this may not mean it's impossible, I think it does mean it's unlikely. This is the reason why Boost.Polygon uses integral types.

So a good question is: how should the community judge whether such an algorithm based on floating types is acceptable for Boost? I think we've all come to expect that when we adopt a Boost library into our work, it should be correct. I would suggest the requirement that the library authors demonstrate that their algorithm is both correct and robust. We as a community should help define how this is done.

Regards, Brandon

On Mon, Nov 16, 2009 at 7:06 AM, Brandon Kohn <blkohn@hotmail.com> wrote:
One of the more useful features of the library (GGL) would of course be the boolean operations. The problem however is clearly going to be robustness.
For some.
I have never encountered a robust floating point boolean operation library in my 9 years of working in the geometry domain. While this may not mean it's impossible, I think it does mean it's unlikely.
I believe that this is because for many use-cases, it isn't worth the run-time performance and development-time trade-off.
I think we've all come to expect that when we adopt a Boost library into our work, it should be correct. I would suggest the requirement that the library authors demonstrate that their algorithm is both correct and robust. We as a community should help define how this is done.
In my application domain, I'm really not that interested in numerical instability due to floating point imprecision. For my use-cases, a few things work fine w/ single precision, and double precision works well for everything else. If you can give me 100% numerical stability without making my code slow, your code clunky to use, or delay release, then great! Jon

Jonathan Franklin wrote:
In my application domain, I'm really not that interested in numerical instability due to floating point imprecision. For my use-cases, a few things work fine w/ single precision, and double precision works well for everything else.
Yes, but for something like boolean operations, this means it works sometimes and completely fails at others. I can see the logic of accepting the uncertainty and hoping for the best in practice, but it does beg the question whether Boost should contain such libraries. There are many buggy floating point boolean op libraries out there. Why do we need another? Regards, Brandon

On Mon, Nov 16, 2009 at 9:33 AM, Brandon Kohn <blkohn@hotmail.com> wrote:
Jonathan Franklin wrote:
In my application domain, I'm really not that interested in numerical instability due to floating point imprecision. For my use-cases, a few things work fine w/ single precision, and double precision works well for everything else.
Yes, but for something like boolean operations, this means it works sometimes and completely fails at others.
Well, for me, it means that it *almost always* works, and that *one* time it failed, I didn't notice. ;-)
I can see the logic of accepting the uncertainty and hoping for the best in practice, but it does beg the question whether Boost should contain such libraries.
As long as the known or suspected numerical issues are documented, then the user can make her own judgment as to whether the library is suitable for her needs. Clearly, there are people on this list that need complete and total numerical stability.
There are many buggy floating point boolean op libraries out there. Why do we need another?
Do we have a *standard* buggy FP boolean op library? Because I need one to build some other buggy algorithms on top of. ;-) And are we talking about numerical floating point issues, or algorithmic bugs? I don't consider those to be the same thing. Jon

Brandon Kohn wrote:
There are many buggy floating point boolean op libraries out there. Why do we need another?
I would like to ask you to refrain from spreading FUD with regards to the library under review without any concrete proof. Regards Hartmut GGL Review Manager ------------------- Meet me at BoostCon http://boostcon.com

Hartmut Kaiser wrote:
Brandon Kohn wrote:
There are many buggy floating point boolean op libraries out there. Why do we need another?
I would like to ask you to refrain from spreading FUD with regards to the library under review without any concrete proof.
I don't think this is a fair appraisal of my statements. It is common knowledge in computational geometry that floating point arithmetic in boolean operations (orientation tests, intersections, ...) is inherently flawed, and so in the absence of any proof by the authors that they have attempted to compensate, I think it is justified to bring up these points. There has been some insinuation in recent posts that the GGL boolean ops could be used to perform floating point boolean operations as a proxy for Boost.Polygon. Such a usage would require that we know whether the implementation actually works. It wasn't intended as a personal attack on GGL. Brandon

Hartmut Kaiser wrote:
Brandon Kohn wrote:
There are many buggy floating point boolean op libraries out there. Why do we need another?
I would like to ask you to refrain from spreading FUD with regards to the library under review without any concrete proof.
Hartmut, please see Barend's reply to my question about robustness: "
4. If the non-self-intersecting invariant is violated at the input of the polygon pair intersection algorithm what does the algorithm do?
This depends on how the polygon self-intersects, and how the self-intersection intersects with the other polygon. If the self-intersection is untouched by the other polygon, you'll get exactly that self-intersection back (for a union) or it is discarded (for an intersection). No attempt is even made to detect the self-intersections. This is on purpose, because it saves another call of get_intersection_points, which is the most time-consuming part of the current implementation. If the self-intersection interferes with the other polygon, you can get an invalid polygon back. As the input was already considered invalid, this is what is to be expected. We've researched this recently because it occurred in practice. The algorithm will find an invalid path, detect that, and skip the output built up so far. " and "
7. When intersection points are snapped to the floating point grid at the output of the polygon pair intersection algorithm edges may move laterally some small distance which may introduce self intersection artifacts. How is this handled by the polygon intersection algorithm?
This is not handled explicitly. On segment-segment intersection in the epsilon regions, the intersection will fall between the endpoints of both segments. Even if it is rounded, it will not surpass those endpoints. Therefore, self-intersections will be unlikely in most cases. I'll not say that it is impossible, but it is unlikely. Our implementation is not yet checked for 100% numerical robustness. If using floating point (IEEE single), there might be errors in the 1e-3x regions. If using double precision, there will be fewer errors. That is to say: in close-to-100% of the use cases, no errors. But in the regions of 1e-3xx differences, errors might be expected, and in some cases they might even cause self-intersections; I'll not deny that now, we've to do more research here. "

I might add that the robustness problem with line segment intersection is worse than that caused by snapping to the floating point grid, since many bits of precision are lost during the line segment intersection point computation.

I am not arguing for rejection of the library. I am arguing that it should be accepted when it meets requirements of algorithmic correctness, performance and numerical robustness, and provides sufficiently generic interfaces for performing multi-polygon intersection operations. I'd like to add that the work to support infinite precision floating point coordinates in the intersection should be done before it is accepted, since it turns out that isn't actually supported yet. Basically I'm arguing that the library should be finished before becoming a part of Boost, and that we accept it on condition that Barend finish it and do the things he says he plans to do before it is released. Thanks, Luke

Hartmut Kaiser wrote:
Brandon Kohn wrote: There are many buggy floating point boolean op libraries out there. Why do we need another?
I would like to ask you to refrain from spreading FUD with regards to the library under review without any concrete proof.
Hartmut, I'm unhappy that you as review manager made that comment. You ought to be more impartial. I don't think it's unreasonable to call the FP robustness issues "bugs". I think it's very likely that GGL has these issues when used with FP coordinates, and I don't accept your "concrete proof" requirement for making that allegation. The library's approach seems to be, "if you're not happy with undefined behaviour, use an arbitrary-precision coordinate type". Please correct me if I'm wrong, but I believe the library doesn't provide any way to hide the arbitrary-precision type and the type conversions from the user as an implementation detail, nor does it have a mechanism to fall back only when it detects a problem with the FP calculation, nor have any benchmarks been done on the arbitrary-precision case. Phil.

Brandon Kohn wrote: There are many buggy floating point boolean op libraries out there. Why do we need another?
Hartmut Kaiser wrote: I would like to ask you to refrain from spreading FUD with regards to the library under review without any concrete proof.
Hartmut, I'm unhappy that you as review manager made that comment. You ought to be more impartial. I don't think it's unreasonable to call the FP robustness issues "bugs". I think it's very likely that GGL has these issues when used with FP coordinates, and I don't accept your "concrete proof" requirement for making that allegation. The library's approach seems to be, "if you're not happy with undefined behaviour, use an arbitrary-precision coordinate type". Please correct me if I'm wrong, but I believe the library doesn't provide any way to hide the arbitrary-precision type and the type conversions from the user as an implementation detail, nor does it have a mechanism to fall back only when it detects a problem with the FP calculation, nor have any benchmarks been done on the arbitrary-precision case.
I'm sorry to disagree. The OP stated:

<quote> There are many buggy floating point boolean op libraries out there. </quote>

So far so good; I might agree with this. But then:

<quote> Why do we need another? </quote>

which refers to the library under review, clearly saying that it is buggy, no? Did I misunderstand that? If yes, I apologize.

Please don't get me wrong. I've been working in the GIS industry long enough to fully understand the problems of robustness and arithmetic stability (or the lack of it) in geometry algorithms. That's not my point. I reacted to the unsupported claim of a reviewer stating that the library we're currently looking at is buggy. It may very well be that GGL has those bugs, as you seem to be insinuating. But we're not here to point fingers; we're here to help improve the design and implementation and, if needed, to pinpoint bugs. Broad allegations don't help with this.

I'm not going to argue here about your statement that arbitrary precision has to be an implementation detail. I accept this as your personal opinion, but whether this should be a requirement for GGL has to be decided by the reviewers.

Regards Hartmut GGL Review Manager ------------------- Meet me at BoostCon http://boostcon.com

Hi Brandon, Jonathan,
I believe that this is because for many use-cases, it isn't worth the run-time performance and development-time trade-off.
GGL provides the following approaches; let me summarize:

1) library users can use points with double, long double or float coordinates
2) for higher numerical robustness: users can call algorithms using another calculation type (e.g. GMP or CLN, ...)
3) idem: users can use points with GMP or CLN coordinates
There are many buggy floating point boolean op libraries out there. Why do we need another?

Can you explain this? I've done several benchmarks comparing 7 libraries and I think they all reported the correct results. Do you refer to cgal, geos, gpc, terralib, gtl (b.p.), wykobi or ggl? Or do you use them in other domains than we've tested them?
And are we talking about numerical floating point issues, or algorithmic bugs? I don't consider those to be the same thing.
I totally agree with this. Algorithmic bugs should not be there, not in any library, of course. Numerical floating point issues can be addressed using arbitrary arithmetic types, and besides that other measures can be taken using the default floating point types. Only the very last thing is not done in GGL. Regards, Barend

Barend Gehrels wrote:
There are many buggy floating point boolean op libraries out there. Why do we need another? Can you explain this? I've done several benchmarks comparing 7 libraries and I think they all reported the correct results. Do you refer to cgal, geos, gpc, terralib, gtl (b.p.), wykobi or ggl? Or do you use them in other domains than we've tested them?
Hi Barend, I'm referring to OpenGL's CSG boolean ops and PolyBoolean, both of which routinely fail in an application with which I work. (It requires partitioning space by iteratively subtracting isovists from a triangle partition of a polygon with holes.) I have read that CGAL also has issues for similar reasons.
And are we talking about numerical floating point issues, or algorithmic bugs? I don't consider those to be the same thing.
I totally agree with this. Algorithmic bugs should not be there, not in any library, of course. Numerical floating point issues can be addressed using arbitrary arithmetic types, and besides that other measures can be taken using the default floating point types. Only the very last thing is not done in GGL.
I think this goes without saying. I was of course talking about issues where an algorithm fails because the underlying numeric type cannot be used correctly in the algorithm.

As for GGL's working with arbitrary arithmetic types, I would have to take your word for it. I have found a few bits in the code that seem to be converting to double, and so I'm not so sure about your claim.

// excerpt from relate_cartesian_segments::relate
double ra = double(da) / double(d);
double rb = double(db) / double(d);

Also, when I perform:

test_all<ggl::point_xy<int> >(); // to test correct

it uses the following predicate (presumably for sorting coordinates):

std::less<double> predicate ...

// From correct_polygon:
correct_ring<ring_type, std::greater<double> >::apply(*it);

// From get_intersection_points.hpp
double eps = 1.0e-10;
if (dir.how == 'i' && (dir.ra < eps || dir.rb < eps
    || 1.0 - dir.ra < eps || 1.0 - dir.rb < eps

So these are at least four places where double/FP issues could creep into calculations with arbitrary precision types.

Regards, Brandon
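To spell out why such conversions matter in a generic library: an explicit cast to double inside a function template silently clamps that step of the computation to double precision, so supplying a higher-precision coordinate type no longer helps there. A tiny self-contained illustration follows; it uses long double as a stand-in for a higher-precision type (on platforms where it is wider than double) and is not GGL code:

#include <iostream>

template <typename T>
T ratio_via_double(T da, T d)
{
    return T(double(da) / double(d));   // the pattern in the excerpt above
}

template <typename T>
T ratio_in_type(T da, T d)
{
    return da / d;                      // stays in the coordinate type
}

int main()
{
    long double da = 1.0L, d = 3.0L;
    std::cout.precision(21);
    std::cout << ratio_via_double(da, d) << "\n";   // correct only to double precision
    std::cout << ratio_in_type(da, d) << "\n";      // correct to long double precision
}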

Hi Brandon,
I'm referring to OpenGL's CSG boolean ops and PolyBoolean, both of which routinely fail in an application with which I work. (It requires partitioning space by iteratively subtracting isovists from a triangle partition of a polygon with holes.) I have read that CGAL also has issues for similar reasons.
Thanks. I didn't test them, nor the use case you describe, but I would be happy to test it (I mean: the use case).
Numerical floating point issues can be addressed using arbitrary arithmetic types, and besides that other measures can be taken using the default floating point types. As for GGL's working with arbitrary arithmetic types, I would have to take your word for it. I have found a few bits in the code that seem to be converting to double.. and so I'm not so sure about your claim.
Sure, you are right here. I think I mentioned this in another mail but will summarize it here with references. I was talking about how the design is set up. We've designed it with strategies and (optional) calculation types. That design is used in e.g. area calculation. Because the numeric_adaptor (this was another posting on the Boost list) was not 100% completed, and because it seems to have much overlap with the Boost.Math bindings, we chose to show the design in some algorithms and didn't yet apply it everywhere. So it is not yet applied to intersection indeed, and as the focus of the whole library is currently on intersections, I admit that I regret that and we should have chosen to apply it there. Anyway, I hope you'll believe me; please look at area, http://tinyurl.com/ggl-area1 and http://tinyurl.com/ggl-area2, where it is applied (it is also applied in centroid and distance). It will be in (among others) intersections too. Regards, Barend

Jonathan Franklin wrote:
On Mon, Nov 16, 2009 at 7:06 AM, Brandon Kohn <blkohn@hotmail.com> wrote:
One of the more useful features of the library (GGL) would of course be the boolean operations. The problem however is clearly going to be robustness.
For some.
I have never encountered a robust floating point boolean operation library in my 9 years of working in the geometry domain. While this may not mean it's impossible, I think it does mean it's unlikely.
I believe that this is because for many use-cases, it isn't worth the run-time performance and development-time trade-off.
The acceptability may depend on the failure mode: is it "looks OK but subtly imperfect", "wrong but valid output", "invalid output", "runs forever", "segfaults", or "reformats disk"?

A while ago, Fernando Cacciola posted some links to a set of presentations and papers about FP line intersection robustness that I found very enlightening: http://www.mpi-inf.mpg.de/departments/d1/ClassroomExamples/ or directly to the PDF: http://www.mpi-inf.mpg.de/~kettner/pub/nonrobust_cgta_06.pdf

I strongly encourage anyone who wants to express an opinion on this subject in the context of the current review to look at this material; it is very accessible and on page 2 there's a compelling example of what can go wrong.

Personally, I find it better to use fixed point (e.g. for latitude/longitude) and relax knowing that I don't have to worry.

Regards, Phil.
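For readers who do not follow the link: the predicate at the heart of the paper's examples is the 2D orientation test, the sign of a small determinant. A minimal sketch of the naive double-precision version is below; it is illustrative only, not code from GGL or CGAL. For nearly collinear points the exact determinant can be smaller than the rounding error of the evaluation, so the returned sign can be wrong - which is exactly the kind of failure the paper catalogues.

// Naive 2D orientation predicate: +1 = left turn, -1 = right turn, 0 = collinear.
// With plain double arithmetic the sign is unreliable when p, q, r are nearly
// collinear, because the exact determinant can be smaller than the rounding error.
#include <iostream>

int orientation(double px, double py,
                double qx, double qy,
                double rx, double ry)
{
    double det = (qx - px) * (ry - py) - (qy - py) * (rx - px);
    if (det > 0.0) return 1;
    if (det < 0.0) return -1;
    return 0;
}

int main()
{
    // Well separated points: the naive predicate is fine here.
    std::cout << orientation(0.0, 0.0, 1.0, 0.0, 0.0, 1.0) << "\n";   // 1 (left turn)

    // Exactly collinear points: still fine.  The problematic inputs are nearly
    // collinear points differing only in the last few bits; see the tables in
    // the Kettner et al. paper for concrete coordinates where the sign comes
    // out wrong.
    std::cout << orientation(0.5, 0.5, 12.0, 12.0, 24.0, 24.0) << "\n";   // 0
}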

On Mon, Nov 16, 2009 at 6:34 PM, Phil Endecott <spam_from_boost_dev@chezphil.org> wrote:
A while ago, Fernando Cacciola posted some links to a set of presentations and papers about FP line intersection robustness that I found very enlightening: ....
I strongly encourage anyone who wants to express an opinion on this subject in the context of the current review to look at this material; it is very accessible and on page 2 there's a compelling example of what can go wrong.
I include at the end of this message the 24/3/2009 email on geometry robustness, as it provides a succinct opinion. It's unfortunate that this email encouraged a confrontational approach rather than the usual cooperative Boost approach!
Personally, I find it better to use fixed point (e.g. for latitude/longitude) and relax knowing that I don't have to worry.
One thing is to answer what the best library design is for multiple application domains; another is what is typical in each domain. Is it common to use fixed point in the GIS domain? I think libraries have to capture common practice so that it is easy to migrate to them.

--------------------------- 24/3/2009 email on Geometry Robustness by Fernando Cacciola --------------------------

Hi boost geometers ;)

When Barend first presented his geometric library proposal I argued that the so-called "epsilon-based" approach used by the library is just fundamentally wrong, and most importantly, totally unnecessary, since much better methodologies exist. It deeply worries me to find that the GGL proposal hasn't been updated in this regard, because the instrumentation of the proper techniques has a significant impact on the overall design. Robustness can't be treated as an implementation detail to be dealt with later on (and I'm sorry if I haven't made this point clear when I warned about it before). If I were to review the library today I would have to strongly vote for rejection because of this.

Here is a simple example of the fundamental problems I see all over the proposed library: there is a point_in_circle algorithm which (as I can see from its implementation) essentially returns whether the distance from the point to the center is smaller than the circle radius. No doubt *that* is the one and only correct *mathematical* formulation for the predicate, but it is one that just doesn't stand up in practice at all. Computing *that very particular predicate ROBUSTLY* has been the subject of active research during the past decade, to such an extent that pretty much every paper or article on the subject of robustness in geometric computing uses the in-circle test as a common example.

The library proposed by Lucanus, OTOH, at least acknowledges the problem appropriately and approaches a solution in the right direction (there are still some important design-influencing considerations, but they are not showstoppers).

Here is a starting point: http://www.mpi-inf.mpg.de/departments/d1/ClassroomExamples/

Best
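To make concrete the formulation Fernando criticises: a naive in-circle test simply compares a (squared) distance against the (squared) radius. The sketch below illustrates that naive form; it is not GGL's actual point_in_circle code. Robust formulations instead evaluate an equivalent determinant with exact or filtered arithmetic.

// Naive in-circle test: is (px, py) strictly inside the circle with center
// (cx, cy) and radius r?  Mathematically exact, but with doubles the comparison
// is unreliable when the point lies within rounding error of the boundary.
#include <iostream>

bool naive_point_in_circle(double px, double py,
                           double cx, double cy, double r)
{
    double dx = px - cx;
    double dy = py - cy;
    // Squared distance against squared radius (avoids the sqrt, but the
    // subtractions and multiplications still round).
    return dx * dx + dy * dy < r * r;
}

int main()
{
    std::cout << std::boolalpha
              << naive_point_in_circle(1.0, 1.0, 0.0, 0.0, 2.0) << "\n";  // true
    // A point exactly on the boundary happens to give false here, but for
    // inputs that are not exactly representable the rounded comparison can go
    // either way.
    std::cout << naive_point_in_circle(3.0, 4.0, 0.0, 0.0, 5.0) << "\n";  // false
}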

http://www.mpi-inf.mpg.de/~kettner/pub/nonrobust_cgta_06.pdf is indeed thought-provoking.
Personally, I find it better to use fixed point (e.g. for latitude/longitude) and relax knowing that I don't have to worry.
But isn't this assuming that your latitude/longitude (or whatever) is exact? Although the latitude/longitude values you are using obviously are 'exact', there is always uncertainty about the real-life position. If you can't tell the algorithm what this (non-zero) uncertainty is, you can't expect an exact answer. Is this why one group of users accepts that their input is uncertain and accepts (slightly) uncertain results, while others (like you and Luke) assert their input is exact and demand an exact result?

Paul --- Paul A. Bristow Prizet Farmhouse Kendal, UK LA8 8AB +44 1539 561830, mobile +44 7714330204 pbristow@hetp.u-net.com

Hi,
Is this why one group of users accepts that their input is uncertain and accepts (slightly) uncertain results
There are many discussions going on about (numerical) robustness and the problems it can cause in libraries. We want to clarify our approach once again, and give some background. People seem to understand different things by the word robustness, as pointed out on this list:

a) numerical floating point issues
b) algorithmic bugs
c) those two are distinct, but also related

I'll first explain c) in more detail. Intersections in the past were often done using perturbations. See for example http://tinyurl.com/ggl-rb1 (Greiner/Hormann, a variant of Weiler/Atherton). This makes some lives easier: easier for the implementor, but not for the user. It is *not* correct behaviour; it results (on a large scale) in small deviations and might actually cause bugs, if moved vertices happen to cause new intersections (see e.g. Kim & Kim, http://tinyurl.com/ggl-rb2). All this is *not* done in GGL. We do *not* follow this approach and perturb vertices. We also do not follow the approach of Kim & Kim. These perturbations might be one cause of the fear people apparently have of FP intersections.

Still about c): another approach that widely used GIS packages take is the tolerance value. For intersecting or cleaning, a tolerance can be specified and all points within that tolerance are merged; there is a default tolerance as well. This results in moving vertices. This tolerance approach is also *not* followed by GGL. This tolerance behaviour is probably another cause of problems people have had in the past (and in the present!) using FP intersections.

About b): the perturbation option was invented to make programmers' lives easier, but can cause bugs as explained. If it is *not* used, the programmer's life is more difficult, and this can lead to bugs. This is probably another cause of problems; I do not have a reference for it (one is welcome).

About a): numerical floating point calculation can lead to values being considered equal while they are not, to overflow and rounding errors, etc. There are several ways to approach this:

- using high precision arithmetic
- using methods to avoid overflow or rounding errors. This is e.g. done in the Boost.Math hypot function. A classical example is the calculation of triangular area, where terms can be sorted to get a more precise result (e.g. http://tinyurl.com/ggl-rb3); a small sketch of that trick follows below.

So there are several methods to avoid instability and we don't know them all; some of them might be applicable to our algorithms. This is what I meant by "not checked on 100% robustness". As pointed out in the discussions on this list this spring (among others: "The dependent concept should explicitely require unlimited precision since the library specifically uses it to *guarantee* robustness. [...] Users should be left with no choice but to pick some external component fulfilling the unlimited precision requirement"), it seems that it is not possible to solve all these issues using any FP number. Therefore we decided to go for supporting high precision numeric types first.

Our approach (I presented this to the list earlier but realized that I forgot option 4):

1) library users can use points with double, long double or float coordinates
2) for higher numerical robustness: users can call algorithms using another calculation type (e.g. GMP or CLN, ...)
3) idem: users can use points with GMP or CLN coordinates
4) GGL supports strategies (http://tinyurl.com/ggl-rb4). They are selected by default but can be overridden by the user. This is like 2), but it is also possible to *implement* a strategy yourself using another numeric approach (still having the same coordinate type). The design allows this. This approach can be seen in area, distance and centroid, but not yet in intersections.

If this approach doesn't fully make sense, please state exactly why.

Regards, Barend
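As an illustration of the term-sorting idea mentioned above: the classical example is Kahan's rearrangement of Heron's formula for the area of a triangle given its side lengths. This is a generic numerical technique shown with plain doubles, not GGL code:

// Heron's formula computed naively vs. with Kahan's sorted and carefully
// parenthesised terms.  For needle-like triangles the naive form loses many
// digits to cancellation; the rearranged form stays accurate.
#include <algorithm>
#include <cmath>
#include <iostream>

double area_naive(double a, double b, double c)
{
    double s = (a + b + c) / 2.0;
    return std::sqrt(s * (s - a) * (s - b) * (s - c));
}

double area_sorted_terms(double a, double b, double c)
{
    // Sort so that a >= b >= c, then keep the parentheses exactly as written.
    if (a < b) std::swap(a, b);
    if (a < c) std::swap(a, c);
    if (b < c) std::swap(b, c);
    return 0.25 * std::sqrt((a + (b + c)) * (c - (a - b))
                          * (c + (a - b)) * (a + (b - c)));
}

int main()
{
    // A very thin ("needle") triangle: two long sides and one tiny side.
    double a = 100000.0, b = 99999.99999, c = 0.00003;
    std::cout.precision(17);
    std::cout << "naive : " << area_naive(a, b, c) << "\n";
    std::cout << "sorted: " << area_sorted_terms(a, b, c) << "\n";
}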

I think this discussion has deviated a bit off course from what Jonathan and I were discussing. I should have been a bit more explicit when I said "Why do we need another buggy library?". If you follow our discussion, I started by stating my concerns about the floating point robustness of GGL with respect to the boolean operations. The point I was trying to make was that I believe that GGL would be the first library of its kind in Boost that walks the line of being provably correct (WRT boolean operations with FP coordinates) and that this perhaps means that more care needs to be taken in reviewing it.

Jonathan replied that for his needs it didn't matter if the algorithms were correct, only that they worked most of the time. My own view is that something that only works part of the time really doesn't work. From these two polarized views came my statement, which was essentially meant to say: Why would we really want an algorithm that only works some/most of the time in Boost?

Of course a good argument to that is that we can use GMP as the coordinate type if we need the added stability. Fine. This still comes back to my original post/reply that (IMO) these considerations are treading new ground in a review process and that the author of the library needs to address these concerns by providing tests proving any (implied) claims of an algorithm's correctness. If this cannot be done at the start, perhaps the boolean operations should be omitted until such a time as they can be properly proven.

Luke seems to have much expertise in the area of boolean operations, and has already suggested a battery of tests. I think this would be an excellent place to start.

Regards, Brandon

and has already suggested a battery of tests. I think this would be an excellent place to start.
Hi Brandon, some questions on this:

- did you look here: http://geometrylibrary.geodan.nl/formal_review/sets.html ? Probably yes, but I want to be sure that you read it.
- did you look in the tests? I don't think we need a "place to start" because we already have many tests on this, in the distribution, and more. I mean by this that we're started, on the way. But thanks for pointing this out.
- clearly we didn't cover *all* cases, because of the bug in the (new) assemble process, but we know that a lot of tests are necessary and therefore we have a battery of them.
- personally I don't think that batteries of random data will cover *all* cases. I understand that exceptions or endless loops might become clear. But with random data it is difficult to verify the results. We can ask Luke of course, how he verifies them, and I've no problem to add them then.
- it is still not clear to me if you refer to "algorithmic robustness" (handling all cases) or "100% numeric stability" (still being correct in the ranges where float or double cannot be expressed anymore), see also our last post about this.

Regards, Barend

Barend Gehrels wrote:
and has already suggested a battery of tests. I think this would be an excellent place to start.
Hi Brandon, some questions on this:
Hello Barend,
- did you look here: http://geometrylibrary.geodan.nl/formal_review/sets.html Probably yes but I want to be sure that you read it
I have read this.
- did you look in the tests? I don't think we need a "place to start" because we already have many tests on this, in the distribution, and more. I mean by this that we're started, on the way. But thanks for pointing this out.
I have looked at test_overlay.hpp, which seems to contain around 15 or so tests of very small polygons. My use cases require robust usage on polygonal regions (with holes) that come from CAD for airports, cities etc., with many operations performed sequentially and progressively on the outputs of previous iterations. So my point is that the tests we have seem inadequate in proving more than a rudimentary level of robustness/correctness. Perhaps I missed more tests?
- clearly we didn't cover *all* cases, because of the bug in the (new) assemble process, but we know that a lot of tests are necessary and therefore we have a battery of them
- personally I don't think that batteries of random data will cover *all* cases. I understand that exceptions or endless loops might become clear. But with random data it is difficult to verify the results. We can ask Luke of course, how he verifies them, and I've no problem to add them then.
Batteries of tests will never cover all cases in geometry because there are infinitely many input configurations. I think large numbers of tests on reasonably constructed random inputs do inspire more confidence that things work and are a good way to uncover problems. Real world use cases are probably the best, but these are much harder to come by in the sizes needed to really hammer the algorithm. I'll see if I can come up with some data to help on this.
- it is still not clear to me if you refer to "algorithmic robustness" (handling all cases) or "100% numeric stability"
In this thread I have been talking about how it performs using native FP types. So what can I expect from using a double? Clearly this is where a majority of your users will come in. I don't expect that it will work in all cases, as we all know it probably cannot. My own experience with FP boolean operations has shown them to be flawed with models of only moderate complexity (which is why I hesitate recommending them for Boost). Perhaps your algorithms fare better than the ones I've used (I hope!). At this point I don't know that we (from the tests you have) are certain if it works even when using something like GMP. We won't know until we test, find something that breaks using double and then works with GMP. I would define algorithmic robustness under this condition: when something works with a GMP type but fails using a native FP type. I don't really say 100% numeric stability anywhere, do I? I think you never have this under native FP types. My assumption there is that numeric stability is defined as how the number's representation changes with use in arithmetic. Regards, Brandon

Brandon,
- did you look in the tests? I don't think we need a "place to start" because we already have many tests on this, in the distribution, and more. I mean by this that we're started, on the way. But thanks for pointing this out.
I have looked at test_overlay.hpp, which seems to contain around 15 or so tests of very small polygons. My use cases require robust usage on polygonal regions (with holes) that come from CAD for airports, cities etc., with many operations performed sequentially and progressively on the outputs of previous iterations. So my point is that the tests we have seem inadequate in proving more than a rudimentary level of robustness/correctness. Perhaps I missed more tests?
We have more tests, and most of them test very small polygons. What I test with those is different cases / configurations (see your point below). I'm testing the configurations where small polygons (preferably triangles, sometimes more) are formed, in the cases important for the algorithm implemented. The more detailed tests exercise the phases independently; they are rougher, so I didn't include them in the formal distribution. However, they can be provided.
Batteries of tests will never cover all cases in geometry because there are infinitely many input configurations.
I mentioned this because you (or Luke) mentioned batteries of tests... I agree, the total number of configurations is infinite. The cases that are different for the algorithm should be finite.
I think large numbers of tests on reasonably constructed random inputs do inspire more confidence that things work and are a good way to uncover problems. Real world use cases are probably the best, but these are much harder to come by in the sizes needed to really hammer the algorithm. I'll see if I can come up with some data to help on this.
Your testdata is very welcome.
- it is still not clear to me if you refer to "algorithmic robustness" (handling all cases) or "100% numeric stability"
In this thread I have been talking about how it performs using native FP types. So what can I expect from using a double? Clearly this is where a majority of your users will come in. I don't expect that it will work in all cases, as we all know it probably cannot. My own experience with FP boolean operations has shown them to be flawed with models of only moderate complexity (which is why I hesitate recommending them for Boost). Perhaps your algorithms fare better than the ones I've used (I hope!). At this point I don't know that we (from the tests you have) are certain if it works even when using something like GMP. We won't know until we test, find something that breaks using double and then works with GMP. I would define algorithmic robustness under this condition: when something works with a GMP type but fails using a native FP type. I don't really say 100% numeric stability anywhere, do I? I think you never have this under native FP types. My assumption there is that numeric stability is defined as how the number's representation changes with use in arithmetic.
I have cases where float goes wrong and (long) double does not. I don't yet have a case where double goes wrong, but that should be feasible with some trial & error. So I can test this against GMP types. Only (as you mentioned) I have to do some search/replace in segment-intersection (actually it is somewhat more than search/replace, of course) to have them there as well. However, if float goes wrong and double goes right, GMP should go right as well... Barend

Barend Gehrels wrote:
- personally I don't think that batteries of random data will cover *all* cases. I understand that exceptions or endless loops might become clear. But with random data it is difficult to verify the results. We can ask Luke of course, how he verifies them, and I've no problem to add them then.
Batteries of random tests won't cover all cases, but will cover more than hand-designed tests, and much faster. You should use both: hand-designed tests for cases you think are important to test, and randomly generated tests for the cases you wouldn't think of (and didn't think of while writing the algorithm). I explained how to verify the correctness of the results when I suggested the random tests: use a known correct algorithm and perform an XOR operation on the results of your algorithm and the known-good one, then shrink (negative offsetting) the result of that by a small value to eliminate artifacts of numerical error. Anything left in the output is an area of discrepancy that you need to investigate. Thanks, Luke
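A rough, self-contained sketch of that verification loop is below. To keep it short it uses intervals on a line instead of polygons, and the "algorithm under test" is just the reference union with artificially jittered endpoints, but the structure - random inputs, XOR against a known-good result, shrink by a small epsilon, inspect whatever remains - is the recipe Luke describes. All names are illustrative:

// Self-contained 1D analogue of the verification recipe, using intervals on a
// line instead of polygons.  Structure: random inputs -> XOR the result of the
// algorithm under test against a known-good result -> shrink what is left by a
// small epsilon -> anything remaining is a real discrepancy to investigate.
#include <algorithm>
#include <cstdlib>
#include <iostream>
#include <vector>

struct interval { double lo, hi; };
typedef std::vector<interval> interval_set;   // kept sorted and disjoint

// "Known good" union: sort by lower bound and merge overlapping intervals.
interval_set union_reference(interval_set in)
{
    std::sort(in.begin(), in.end(),
              [](const interval& a, const interval& b) { return a.lo < b.lo; });
    interval_set out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (!out.empty() && in[i].lo <= out.back().hi)
            out.back().hi = std::max(out.back().hi, in[i].hi);
        else
            out.push_back(in[i]);
    }
    return out;
}

// "Algorithm under test": the reference union with its endpoints perturbed by a
// tiny amount, standing in for floating point jitter in a real implementation.
interval_set union_under_test(const interval_set& in)
{
    interval_set result = union_reference(in);
    for (std::size_t i = 0; i < result.size(); ++i) {
        result[i].lo += 1e-12;
        result[i].hi -= 1e-12;
    }
    return result;
}

bool covers(const interval_set& s, double x)
{
    for (std::size_t i = 0; i < s.size(); ++i)
        if (x >= s[i].lo && x < s[i].hi) return true;
    return false;
}

// XOR (symmetric difference): sweep all endpoints and keep the stretches that
// are covered by exactly one of the two sets.
interval_set symmetric_difference(const interval_set& a, const interval_set& b)
{
    std::vector<double> cuts;
    for (std::size_t i = 0; i < a.size(); ++i) { cuts.push_back(a[i].lo); cuts.push_back(a[i].hi); }
    for (std::size_t i = 0; i < b.size(); ++i) { cuts.push_back(b[i].lo); cuts.push_back(b[i].hi); }
    std::sort(cuts.begin(), cuts.end());

    interval_set out;
    for (std::size_t i = 0; i + 1 < cuts.size(); ++i) {
        double mid = 0.5 * (cuts[i] + cuts[i + 1]);
        if (covers(a, mid) != covers(b, mid))
            out.push_back(interval{cuts[i], cuts[i + 1]});
    }
    return out;
}

int main()
{
    interval_set input;                       // random input intervals
    for (int i = 0; i < 100; ++i) {
        double lo  = std::rand() / double(RAND_MAX) * 100.0;
        double len = std::rand() / double(RAND_MAX) * 5.0;
        input.push_back(interval{lo, lo + len});
    }

    interval_set diff = symmetric_difference(union_under_test(input),
                                             union_reference(input));

    // Shrink by a small epsilon: slivers narrower than 2*eps are numerical
    // artifacts; anything wider is a genuine discrepancy.
    const double eps = 1e-9;
    int real_discrepancies = 0;
    for (std::size_t i = 0; i < diff.size(); ++i)
        if (diff[i].hi - diff[i].lo > 2.0 * eps)
            ++real_discrepancies;

    std::cout << "slivers: " << diff.size()
              << ", real discrepancies: " << real_discrepancies << "\n";
}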

On Tue, Nov 17, 2009 at 6:35 AM, Brandon Kohn <blkohn@hotmail.com> wrote:
The point I was trying to make was that I believe that GGL would be the first library of its kind in Boost that walks the line of being provably correct (WRT boolean operations with FP coordinates) and that this perhaps means that more care needs to be taken in reviewing it.
And if we can provide a strategy (in addition to "normal" FP-based operations) that fulfills this, then that's totally awesome! I *might* actually need it some day. You need it now.
Jonathan replied that for his needs it didn't matter if the algorithms were correct, only that they worked most of the time.
WRT FP-induced instability.
My own view is that something that only works part of the time really doesn't work.
I'm sure that's great for whatever you do. It is valid in many use cases. The visualization software that I work on, on the other hand, prefers speed over making sure that point p1 is provably inside or outside the box. If it's *that* close, then it doesn't matter. Or more aptly, when computing the union of many polygons to simply display an area of coverage, it is much more important to be fast than to use the provably correct point p1 or p2, which happen to be so close together, that uncertainty/error in measurement is more significant than the FP precision. And it turns out that for my use case, the points are so close together, either one will do fine. I posit that the majority of users of a geometry library have similar needs to my own.
Why would we really want an algorithm that only works some/most of the time in Boost?
I stand by my use case, and I accept yours.
... the author of the library needs to address these concerns by providing tests proving any (implied) claims of an algorithm's correctness.
I agree.
If this cannot be done at the start, perhaps the boolean operations should be omitted until such a time as they can be properly proven.
If the operations cannot be shown correct *without* consideration of FP-precision, then they should be omitted. In that case, they clearly would not be ready. If they can be shown correct, up to FP-precision, then include them, and document the fact that they are unstable WRT FP-precision. Once we have provably stable and correct versions, with an easy and intuitive way to choose the version we want, then we'll rule the pool. ;-)
Luke seems to have much expertise in the area of boolean operations, and has already suggested a battery of tests. I think this would be an excellent place to start.
I agree. Thanks, Jon

Jonathan Franklin wrote:
My own view is that something that only works part of the time really doesn't work.
I'm sure that's great for whatever you do. It is valid in many use cases.
The visualization software that I work on, on the other hand, prefers speed over making sure that point p1 is provably inside or outside the box. If it's *that* close, then it doesn't matter.
Or more aptly, when computing the union of many polygons to simply display an area of coverage, it is much more important to be fast than to use the provably correct point p1 or p2, which happen to be so close together, that uncertainty/error in measurement is more significant than the FP precision. And it turns out that for my use case, the points are so close together, either one will do fine.
Hi Jonathan, I figured I should point out here that not all failures will be in the form of marching points, merged points, or point-in-polygon tests failing. The effects can manifest in the form of failed predicate checks which lead to large changes in the outputs. For example, if an algorithm at some point performs an orientation test and determines a state based on that orientation, another part of the algorithm may derive values and make decisions which are in conflict with that state. For boolean operations this can mean you have vertices which are skipped on output, incorrect point ordering when recovering faces/rings, or any number of other odd artifacts. These are of course algorithm dependent, and I don't claim to know how GGL would fail. Regards, Brandon

Jonathan Franklin wrote:
On Tue, Nov 17, 2009 at 6:35 AM, Brandon Kohn <blkohn@hotmail.com> wrote:
My own view is that something that only works part of the time really doesn't work.
I'm sure that's great for whatever you do. It is valid in many use cases.
The visualization software that I work on, on the other hand, prefers speed over making sure that point p1 is provably inside or outside the box. If it's *that* close, then it doesn't matter.
Hi Jonathan, Do please have a look at the paper that I linked to before, and in particular the figure at the top of the second page: http://www.mpi-inf.mpg.de/~kettner/pub/nonrobust_cgta_06.pdf Two points: 1. Failures may not occur only in cases when things are very close. A problematic input may cause the output to be grossly wrong. 2. You may not get a wrong answer; the program might instead segfault or loop forever. Regards, Phil.

On Tue, Nov 17, 2009 at 12:23 PM, Phil Endecott <spam_from_boost_dev@chezphil.org> wrote:
Jonathan Franklin wrote:
On Tue, Nov 17, 2009 at 6:35 AM, Brandon Kohn <blkohn@hotmail.com> wrote:
My own view is that something that only works part of the time really doesn't work.
I'm sure that's great for whatever you do. It is valid in many use cases.
The visualization software that I work on, on the other hand, prefers speed over making sure that point p1 is provably inside or outside the box. If it's *that* close, then it doesn't matter.
Hi Jonathan,
Do please have a look at the paper that I linked to before, and in particular the figure at the top of the second page:
http://www.mpi-inf.mpg.de/~kettner/pub/nonrobust_cgta_06.pdf
Two points:
1. Failures may not occur only in cases when things are very close. A problematic input may cause the output to be grossly wrong.
2. You may not get a wrong answer; the program might instead segfault or loop forever.
This discussion raises some important implementation considerations, but for the purposes of this review, the discussion should simply be "Are we requiring that the implementation be provably 100% robust and 100% numerically stable before acceptance?". If the answer to that is yes, then I think we are holding GGL to a *much* higher standard than previously accepted libraries (I'm not talking about any specific library). Note that even CGAL, a highly regarded library, failed that paper's tests. I believe the interfaces and concepts should be what is discussed. Implementation bugs and optimizations can come later if the interfaces are well designed. I'm hoping to do a full review within a day or two, but I just wanted to add my two cents to this discussion. --Michael Fawcett

On Tue, Nov 17, 2009 at 12:36:46PM -0500, Michael Fawcett wrote:
This discussion raises some important implementation considerations, but for the purposes of this review, the discussion should simply be "Are we requiring that the implementation be provably 100% robust and 100% numerically stable before acceptance?".
If the answer to that is yes, then I think we are holding GGL to a *much* higher standard than previously accepted libraries (I'm not talking about any specific library). Note that even CGAL, a highly regarded library, failed that paper's tests.
I don't think that is what the paper says. CGAL uses a traits mechanism to allow the user to select the geometric kernel, which includes the coordinate number type and the kind of predicates and constructions used. You are allowed to use floating point numbers and naive predicates. The algorithms will certainly fail. The point of the paper is precisely this. But CGAL does not stop there: you can use the *same* algorithms but with exact number types for coordinates. This will work, but generally be very slow. One of the features of CGAL, however, is that you can get the best of both worlds: use floating point coordinates but still perform exact predicates. Using floating-point filters the speed can be competitive with the naive algorithm, but without the crashes. -Steve
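For readers unfamiliar with the term: a floating-point filter evaluates a predicate in plain double arithmetic together with an error bound, and only falls back to exact arithmetic when the double result is within that bound of zero. A minimal sketch for the 2D orientation predicate is below. It is illustrative only, not CGAL or GGL code; the error-bound constant is deliberately conservative rather than tight, and boost::multiprecision::cpp_rational is used purely as a stand-in for whatever exact type one would really choose (e.g. GMP):

// Filtered 2D orientation predicate: fast double evaluation with an error
// bound, exact fallback only when the double result is inconclusive.
#include <cmath>
#include <iostream>
#include <limits>
#include <boost/multiprecision/cpp_int.hpp>   // provides cpp_rational

namespace mp = boost::multiprecision;

int orientation_exact(double px, double py, double qx, double qy,
                      double rx, double ry)
{
    // Conversion from double to cpp_rational is exact, so this determinant is exact.
    mp::cpp_rational Px(px), Py(py), Qx(qx), Qy(qy), Rx(rx), Ry(ry);
    mp::cpp_rational det = (Qx - Px) * (Ry - Py) - (Qy - Py) * (Rx - Px);
    if (det > 0) return 1;
    if (det < 0) return -1;
    return 0;
}

int orientation_filtered(double px, double py, double qx, double qy,
                         double rx, double ry)
{
    double l = (qx - px) * (ry - py);
    double r = (qy - py) * (rx - px);
    double det = l - r;

    // If |det| exceeds this rounding-error bound, its sign is already certain.
    // The constant 4*epsilon is a deliberately loose bound for illustration.
    double errbound = 4.0 * std::numeric_limits<double>::epsilon()
                    * (std::fabs(l) + std::fabs(r));
    if (det >  errbound) return 1;
    if (det < -errbound) return -1;

    return orientation_exact(px, py, qx, qy, rx, ry);   // inconclusive: go exact
}

int main()
{
    std::cout << orientation_filtered(0.0, 0.0, 1.0, 0.0, 0.0, 1.0) << "\n";  // 1
    std::cout << orientation_filtered(0.0, 0.0, 1.0, 1.0, 2.0, 2.0) << "\n";  // 0 (exact path)
}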

Hi Steve, [disclaimer: I'm just responding to current traffic totally out of thread and without knowing the current state of the GGL]
Using floating-point filters, the speed can be competitive with the naive algorithm, but without the crashes.
There is in fact something VERY important to add here: I know for a fact, since I have carefully looked into it, that Boost.Polygon is 100% robust when used with integer coordinates provided the user properly sets the "extended integer type" needed by the filtering mechanism (which is already set when the coordinate type is "int" and sizeof(long) == 2*sizeof(int)). IOW, Boost.Polygon uses a filter, just like CGAL. This is why I specifically said that the library should be considered to be restricted to integer coordinates. While this was not brought up during the formal review, it was in the preceding discussions. I was aware of this and this *fundamental library property* was the most important reason why I accepted the library into Boost. If you google the developers list you should find my very lengthy explanations about why this is an absolute requirement, why GTL (at that time) was already meeting it, and why GGL was not. I even sketched a general approach that could be followed to make GGL robust with floating point coordinates (essentially just explaining what a floating point filter is and how it works). FWIW, Barend is very well aware of all this and has always been more than willing to do any required fixing. What happened is that I never had all the time needed to dedicatedly give him the necessary guidance, so the issue is probably still unresolved. Best -- Fernando Cacciola SciSoft Consulting, Founder http://www.scisoft-consulting.com
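[Editorial note: a sketch of what that "extended integer type" buys, with illustrative names rather than Boost.Polygon's actual traits. The 2x2 determinant behind an orientation test needs roughly twice the bits of the coordinate type, so a wider integer makes the sign exact.]

    #include <cstdint>

    // Sketch only; names are illustrative, not Boost.Polygon's traits.
    // With coordinates bounded by roughly 2^30 the differences and products
    // fit in a 64-bit "extended" integer and the sign is computed exactly;
    // for the full 32-bit range an even wider type would be needed, which is
    // exactly the knob Fernando describes.
    inline int orientation(std::int32_t ax, std::int32_t ay,
                           std::int32_t bx, std::int32_t by,
                           std::int32_t cx, std::int32_t cy)
    {
        std::int64_t abx = std::int64_t(bx) - ax, aby = std::int64_t(by) - ay;
        std::int64_t acx = std::int64_t(cx) - ax, acy = std::int64_t(cy) - ay;
        std::int64_t det = abx * acy - aby * acx;
        return (det > 0) - (det < 0);   // +1 = left turn, -1 = right turn, 0 = collinear
    }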

Hi list,
I even sketched a general approach that could be followed to make GGL robust with floating point coordinates (essentially just explaining what a floating point filter is and how it works).
FWIW, Barend is very well aware of all this and has always been more than willing to do any required fixing. What happened is that I never had all the time needed to dedicatedly give him the necessary guidance, so the issue is probably still unresolved.
Because the discussions on robustness are over different threads, we created a new documentation page describing our approach: http://geometrylibrary.geodan.nl/formal_review/robustness.html Regards, Barend

On Wed, Nov 18, 2009 at 10:52 PM, Steve M. Robbins <steve@sumost.ca> wrote:
On Tue, Nov 17, 2009 at 12:36:46PM -0500, Michael Fawcett wrote:
This discussion raises some important implementation considerations, but for the purposes of this review, the discussion should simply be "Are we requiring that the implementation be provably 100% robust and 100% numerically stable before acceptance?".
If the answer to that is yes, then I think we are holding GGL to a *much* higher standard than previously accepted libraries (I'm not talking about any specific library). Note that even CGAL, a highly regarded library, failed that paper's tests.
I don't think that is what the paper says.
CGAL uses a traits mechanism to allow the user to select the geometric kernel, which includes the coordinate number type and the kind of predicates and constructions used. You are allowed to use floating point numbers and naive predicates. The algorithms will certainly fail. The point of the paper is precisely this.
But CGAL does not stop there: you can use the *same* algorithms but with exact number types for coordinates. This will work, but generally be very slow. One of the features of CGAL, however, is that you can get the best of both worlds: use floating point coordinates but still perform exact predicates. Using floating-point filters, the speed can be competitive with the naive algorithm, but without the crashes.
I don't know CGAL at all, so I won't argue any of your points, but the paper does point out cases where it fails with the floating point types, which I'm fairly sure GGL would fail on as well. The GGL authors have advised using exact number types if you require exactness, and their interfaces are extensible to support this. It would be great if they could be extended to the point CGAL is at. Maybe they already can? I have no clue. Regardless, the point I was trying to make was that (to me) the interface is more important than the implementation at this stage in the library's life, and that I don't think it's fair to base acceptance solely on whether the implementation is 100% robust and numerically stable unless the documentation says otherwise. --Michael Fawcett

On Thu, Nov 19, 2009 at 5:24 AM, Michael Fawcett <michael.fawcett@gmail.com> wrote:
Regardless, the point I was trying to make was that (to me) the interface is more important than the implementation at this stage in the library's life, and that I don't think it's fair to base acceptance solely on whether the implementation is 100% robust and numerically stable unless the documentation says otherwise.
Hi Michael, This is a very important point on which many agree completely. This is an issue related to updating the Boost review process. The summary to me is that a proposed library should not merit a Boost review if its scope doesn't match Boost goals (clearly stated at the beginning of the home page). It's for this reason that I argued strongly to Fernando that Boost.Polygon should be withdrawn to avoid setting a precedent (despite the other technical merits the library has!) regards

Jose wrote:
On Thu, Nov 19, 2009 at 5:24 AM, Michael Fawcett <michael.fawcett@gmail.com> wrote:
Regardless, the point I was trying to make was that (to me) the interface is more important than the implementation at this stage in the library's life, and that I don't think it's fair to base acceptance solely on whether the implementation is 100% robust and numerically stable unless the documentation says otherwise.
Hi Michael,
This is a very important point on which many agree completely. This is an issue related to updating the Boost review process.
The summary to me is that a proposed library should not merit a Boost review if its scope doesn't match Boost goals (clearly stated at the beginning of the home page). It's for this reason that I argued strongly to Fernando that Boost.Polygon should be withdrawn to avoid setting a precedent (despite the other technical merits the library has!)
Please state clearly in what ways the scope of Boost.Polygon doesn't match Boost goals. Regards, Luke

Hello Jose, I tried to answer both your current mail and the issues raised previously. I tried to answer your current mail straight and direct in the first part of the mail (so don't interpret the tone as unfriendly, it's just meant to be clear and succinct), and then tried to go into more detail on the issues raised previously. ---------- Jose wrote:
Michael Fawcett wrote:
Regardless, the point I was trying to make was that (to me) the interface is more important than the implementation at this stage in the library's life, and that I don't think it's fair to base acceptance solely on whether the implementation is 100% robust and numerically stable unless the documentation says otherwise.
This is a very important point on which many agree completely. This is an issue related to updating the Boost review process.
The summary to me is that a proposed library should not merit a Boost review if its scope doesn't match Boost goals (clearly stated at the beginning of the home page). It's for this reason that I argued strongly to Fernando that Boost.Polygon should be withdrawn to avoid setting a precedent (despite the other technical merits the library has!)
So you think the review process should be updated. Furthermore, you think Boost.Polygon should be withdrawn, because its scope doesn't match Boost goals. Let's have a look at the review process then. The page http://www.boost.org/development/submissions.html says: Steps for getting a library accepted by Boost: * Learn about Boost * Determine interest * Preliminary submission * Refinement * Submission for review * Formal Review * Web site posting * People page * Lifecycle The three steps "Determine interest", "Preliminary submission" and "Refinement" should be able to determine whether a proposed library matches Boost goals. Otherwise, the "Formal Review" would be quite a waste of time for all the reviewers, and even more a waste of time for the author of the proposed library. The decision that a library should be withdrawn after acceptance by "Formal Review" should certainly have better reasons than that it doesn't match "Boost goals" and nobody noticed this earlier. It might happen that we are forced to withdraw a library because of license issues. I could also imagine that Boost.Numeric.Bindings (let's hope that it will even reach the "Formal Review" step) could be withdrawn after acceptance by "Formal Review", because the burden on the regression testing system might turn out to be intolerable. See also Paul A. Bristow's remark on his own review http://lists.boost.org/Archives/boost/2009/09/156419.php and David Abrahams' answer http://lists.boost.org/Archives/boost/2009/09/156044.php to Michael Fawcett's review http://lists.boost.org/Archives/boost/2009/09/155852.php ---------- However, the reason I answer your "summary" is that you indicated you were "interested in your objective viewpoint" in a thread where I already stated "I don't think it is wise to make any statements in this thread". Jose wrote:
Hi Thomas, I apologized for some bad wording on my side. Questioning the process is not an attack.
I think I am more interested in your objective viewpoint (which is kind of hard to do in this thread), but you already provided it in your reply to me before and it was useful.
So, first of all I want to tell you that I really appreciate your apologies. And it will be OK even if you react heatedly to this reply. I was just unhappy that I was forced to add anything to that already unfortunate thread. And I'm not even sure whether my reply about "GENERIC GEOMETRY" set the unfortunate tone of that thread. That reply was from an unsent mail trying to explain why I don't like the name GGL and would prefer the name Boost.Geometry. (So be careful with shouting, otherwise you might be tempting people to reuse parts of unsent mails...) Your writing about the unfair treatment of GGL, and that Boost.Polygon should be rewritten in terms of GGL, gave me the impression that you do not fully appreciate the complicated situation: Jose wrote:
4) The GGL library proposal, which had been iterating the design with input from Boost, in my opinion received unfair treatment in the way the schedule was managed.
Jose wrote:
I think in this case it should be determined by at least two experts whether it is possible to have a generic library that can satisfy the multiple application domains, and if yes, then take GGL as the base, update the design and incorporate boost polygon algorithms as appropriate.
Jose wrote:
Boost.Polygon doesn't provide that base although it may have algorithms brilliantly implemented!
I am asking you to reconsider your decision and also provide your expert opinion so that GGL can be acceptable as a base where the Polygon algorithms can be included.
At that point I was still trying to evaluate GGL, so I certainly knew about the strong and weak points of Boost.Polygon, but I couldn't say much about the strengths and weaknesses of GGL. (Now I know that the main issue of GGL is that it is still unfinished!) There are two types of issues for Boost.Polygon: 1) The design of the geometric concepts in Boost.Polygon is well drafted, but the implementation still has room for improvements. (But at least it is finished!) Fernando Cacciola's review gives a staggering overview of the implementation issues: http://lists.boost.org/Archives/boost/2009/09/156180.php One could even classify the mediocre performance of the intersection algorithm as an implementation issue, because Luke knows exactly which part of the algorithm still uses a suboptimal implementation. 2) The other issue is floating-point. This one is really tough, because of all the subtly different opinions about what the correct solution should look like. Some of us had even encouraged Luke to stay with fixed point, and only provide assistance to correctly convert floating-point to fixed point. (I guess both Fernando Cacciola and I were guilty of encouraging Luke to take that approach.) I don't even know whether the ones that reported a loss of accuracy simply made the mistake of using an arbitrary scaling factor instead of a power of two, or used the same scaling factor for both x- and y-coordinates. But it was exactly this fixed point approach which led some reviewers to the conclusion that Boost.Polygon is only useful for PCB, VLSI and CAD. The approach of GGL to the floating-point issue might receive more approval than the approach of Boost.Polygon, but Barend Gehrels' detailed answer to the robustness questions http://lists.boost.org/Archives/boost/2009/11/158809.php (he refers to http://geometrylibrary.geodan.nl/formal_review/robustness.html) shows that GGL's approach is also not without drawbacks. So let's finally address the most difficult point. You don't like to have two incompatible Boost geometry libraries (and don't see a good reason why it should not be possible to make them compatible). So you might suggest rejecting both libraries, and telling the authors that they have to agree on common concepts for the interface to the basic geometric objects, at least in 2D. But you may not realize the amount of work this might imply, even if the authors succeed in agreeing on common concepts at all. And rejecting a library means first of all that the library is not wanted in Boost. (This is exactly the reason you gave why Boost.Polygon should be withdrawn: because Boost is the wrong place for it in your opinion.) And it would be a rejection for political reasons, not based on concrete technical deficiencies that could be fixed by just improving the library, like when Boost.Serialization was first rejected. So perhaps I could help you a bit to appreciate that Fernando Cacciola's decision was not as wrong as it may seem. And please remember that it wasn't my decision. I'm just one of the reviewers that took part in both reviews, and I just tried to explain to you my personal point of view. And you will most certainly disagree with some parts of my view, and this is perfectly OK. And if you have spent the time to read all this, let me also apologize that I didn't succeed in being more succinct. I hope my answer at least doesn't make you feel worse. Regards, Thomas

Hi Thomas, Your email is objective and I have no interest in a heated debate now that there is an intent to collaborate. I just felt Michael's email added to my previous argument and I replied to that. My reasoning for any library is to try to look at this goal stated in the Boost home page: "Boost libraries are intended to be widely useful, and usable across a broad spectrum of applications" We probably disagree on "widely useful". To me, in the specific geometry case it translates as "valid for multiple application domains". Also, in this case, I think there is more value in collaboration from authors in complementary areas than in confrontation. We should agree to disagree! regards jose On Sun, Nov 22, 2009 at 3:43 AM, Thomas Klimpel <Thomas.Klimpel@synopsys.com> wrote:
Hello Jose,
I tried to answer both your current mail and the issues raised previously. I tried to answer your current mail straight and direct in the first part of the mail (so don't interpret the tone as unfriendly, it's just meant to be clear and succinct), and then tried to go into more detail on the issues raised previously.

On Tue, Nov 17, 2009 at 10:23 AM, Phil Endecott <spam_from_boost_dev@chezphil.org> wrote:
Do please have a look at the paper that I linked to before, and in particular the figure at the top of the second page:
http://www.mpi-inf.mpg.de/~kettner/pub/nonrobust_cgta_06.pdf
I have read this, and several other papers. I've even taken a few numerical analysis classes at the university.
1. Failures may not occur only in cases when things are very close. A problematic input may cause the output to be grossly wrong.
Lady luck has smiled on me thus far. ;-) And a small amount of data pre-processing goes a *long* way. Know your algorithms, and their pre/post-conditions.
2. You may not get a wrong answer; the program might instead segfault or loop forever.
This is not an FP-induced instability problem. This is a bug in your algorithm. Coincidentally, I just fixed a loop-forever bug in one of our geometry algorithms the other day. Jon

Barend Gehrels wrote:
It is *not* correct behaviour and results in (on a large scale) small deviations and might actually cause bugs (if moved vertices happen to cause new intersections (see e.g. Kim&Kim, http://tinyurl.com/ggl-rb2)).
The mentioned URL doesn't work for me.

Barend Gehrels wrote:
It is *not* correct behaviour and results in (on a large scale) small deviations and might actually cause bugs (if moved vertices happen to cause new intersections (see e.g. Kim&Kim, http://tinyurl.com/ggl-rb2)).
The mentioned URL doesn't work for me.
If I click on it I get the PDF. Anyway, it directs to: http://www.cadanda.com/CAD_A_3_1-4_48.PDF Does that work? Barend

Barend Gehrels wrote:
If I click on it I get the PDF. Anyway, it directs to: http://www.cadanda.com/CAD_A_3_1-4_48.PDF
Does that work?
works like a charm.

Paul A. Bristow wrote:
Phil Endecott wrote:
Personally, I find it better to use fixed point (e.g. for latitude/longitude) and relax knowing that I don't have to worry.
But isn't this assuming that your latitude/longitude (or whatever) is exact?
Both fixed and floating-point formats represent latitude and longitude values snapped to a grid. In the case of fixed point it's a grid of squares (on a flat earth, anyway); in the case of floating point it's a grid of rectangles whose sizes change depending on how far you are from the equator and prime meridian. Floating point is only "less exact" in the sense that we often don't worry about what the grid size is, but it is still there.
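[Editorial note: a sketch of what that grid looks like for the fixed-point case, with hypothetical names; coordinates are stored as signed microdegrees, so the grid step is the same 1e-6 degree everywhere.]

    #include <cstdint>
    #include <cmath>

    // Hypothetical fixed-point representation: signed microdegrees.
    // The grid step is 1e-6 degree everywhere; +/-180 degrees fits in 32 bits.
    struct fixed_latlon {
        std::int32_t lat_udeg;   // -90000000 .. +90000000
        std::int32_t lon_udeg;   // -180000000 .. +180000000
    };

    inline fixed_latlon snap_to_grid(double lat_deg, double lon_deg)
    {
        fixed_latlon p;
        p.lat_udeg = static_cast<std::int32_t>(std::lround(lat_deg * 1e6));
        p.lon_udeg = static_cast<std::int32_t>(std::lround(lon_deg * 1e6));
        return p;
    }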
but others (like you and Luke) assert your input is exact and demand an exact result?
Really, I'd just like to know what the limitations of GGL's FP algorithms are, and I'd like to know that the possibility of gross errors, invalid output, etc., has been understood and documented. The "Classroom Examples" paper shows one way in which one can construct test cases that are likely to trigger any problems. I don't think that testing with random inputs or real-world data is very likely to find anything.
Paul A. Bristow Prizet Farmhouse Kendal, UK LA8 8AB
I'm going to be in Kendal at the weekend for the Mountain Film Festival. Fancy a beer? We could have a mini-boostcon-a-deux :-) Phil.

On Tuesday, November 17, 2009 5:39 PM, Phil Endecott wrote:
Paul A. Bristow wrote:
Phil Endecott wrote:
Personally, I find it better to use fixed point (e.g. for latitude/longitude) and relax knowing that I don't have to worry.
But isn't this assuming that your latitude/longitude (or whatever) is exact?
Both fixed and floating-point formats represent latitude and longitude values snapped to a grid. In the case of fixed point it's a grid of squares (on a flat earth, anyway); in the case of floating point it's a grid of rectangles whose sizes change depending on how far you are from the equator and prime meridian. Floating point is only "less exact" in the sense that we often don't worry about what the grid size is, but it is still there.
But the *actual position* is never known exactly; it always has some uncertainty, say + or - 1 grid unit. So, for example, if it appears to be in the wrong (next door or so) grid, you could decide that it doesn't matter to you. Is this why users seem to be happy with a potentially uncertain FP implementation? Is an interval based implementation another way to go - except that it will probably be too slow to be useful? (But I note that one author of http://www.mpi-inf.mpg.de/~kettner/pub/nonrobust_cgta_06.pdf, Sylvain Pion, an interval enthusiast and Boost Interval author, doesn't mention it, so perhaps not?). Paul --- Paul A. Bristow Prizet Farmhouse Kendal, UK LA8 8AB +44 1539 561830, mobile +44 7714330204 pbristow@hetp.u-net.com
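[Editorial note: on the interval question, a sketch of what an interval-based orientation test could look like with Boost.Interval; this is an illustration, not code from GGL or Boost.Polygon. Interval evaluation acts as a filter: when the interval straddles zero the sign is undecided and an exact fallback is still needed, which is why it is usually combined with, rather than used instead of, exact arithmetic.]

    #include <boost/numeric/interval.hpp>

    typedef boost::numeric::interval<double> I;

    // Sign of the orientation determinant, evaluated with interval arithmetic.
    // Returns +1 or -1 when the sign is certain, 0 when the interval contains
    // zero and a slower exact computation would have to decide.
    int orientation_filtered(double ax, double ay, double bx, double by,
                             double cx, double cy)
    {
        I det = (I(bx) - I(ax)) * (I(cy) - I(ay))
              - (I(by) - I(ay)) * (I(cx) - I(ax));
        if (boost::numeric::zero_in(det))
            return 0;                    // undecided: fall back to exact arithmetic
        return det.lower() > 0.0 ? +1 : -1;
    }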

Paul A. Bristow wrote:
Phil Endecott wrote:
Paul A. Bristow wrote:
Phil Endecott wrote:
Personally, I find it better to use fixed point (e.g. for latitude/longitude) and relax knowing that I don't have to worry.
But isn't this assuming that your latitude/longitude (or whatever) is exact?
Both fixed and floating-point formats represent latitude and longitude values snapped to a grid. In the case of fixed point it's a grid of squares (on a flat earth, anyway); in the case of floating point it's a grid of rectangles whose sizes change depending on how far you are from the equator and prime meridian. Floating point is only "less exact" in the sense that we often don't worry about what the grid size is, but it is still there.
But the *actual position* is never known exactly; it always has some uncertainty, say + or - 1 grid unit. So, for example, if it appears to be in the wrong (next door or so) grid, you could decide that it doesn't matter to you.
Yes, for both fixed and FP.
Is this why users seem to be happy with a potentially uncertain FP implementation?
I hope that's not why they seem happy with it, because it overlooks the more serious consequences i.e. gross errors as illustrated in the "Classroom Examples" paper. Phil.

On Monday 16 November 2009 15:06:54, Brandon Kohn wrote:
Phil Endecott wrote:
The other controversial implementation issue is floating-point robustness. I'm still not clear what the position of this library is on that. Are we just given assurances that it seems to be robust in real-world use, or is it believed that it actually is certainly robust? I expected to see this discussed in the library documentation, but I don't see anything - maybe I've missed it. If it is "believed to be robust in real-world use", it would be helpful to describe the possible failure modes (e.g. segfault, runs forever, or just outputs the wrong answer.)
This is an issue that troubles me as well. I think that a floating point computational geometry library is possibly a first for Boost in that you often have heuristics rather than algorithms due to the fuzzy effect of using floating types in comparison predicates. One of the more useful features of the library (GGL) would of course be the boolean operations. The problem however is clearly going to be robustness. I have never encountered a robust floating point boolean operation library in my 9 years of working in the geometry domain.
I think the question of this review is not whether some algorithms in GGL have limitations, as long as they are documented, but whether the concepts and algorithm/strategy interface are sufficient to allow the implementation of specialized algorithms. Is this the case? Luke seems to be willing to integrate his Boost.Robust2DIntegerPolygon into GGL, but he also seems to have some concerns about whether he'll be able to do that using GGL's polygon representation. Those should be addressed before GGL is accepted. I was surprised that Polygon was accepted on a 6 to 4 vote, even though a "geometry core" library was right around the corner. I don't think people were aware of that during the review; IIRC there was a lot of talk about "waiting" for a generic geometry library. A future review process should make it possible to postpone or reschedule a review.

2009/11/16 Stefan Strasser <strasser@uni-bremen.de>:
A future review process should make it possible to postpone or reschedule a review.
I wonder whether it would be worthwhile to have a formalized "review readiness" process, even if just something simple like a weekend poll that asks people, "Do you think a review of this should go ahead? (Based on when it's held, the previous discussion, the extent of documentation and examples, etc.)" and requires a critical mass before the library gets added to the formal review queue. Alternatively, it could be part of the normal process, where the review manager would make a call on whether to postpone the review after the first few days.

On Mon, Nov 16, 2009 at 11:16 AM, Stefan Strasser <strasser@uni-bremen.de> wrote:
I think the question of this review is not whether some algorithms in GGL have limitations, as long as they are documented, but whether the concepts and algorithm/strategy interface are sufficient to allow the implementation of specialized algorithms. Is this the case?
The concepts, interfaces, and algorithms provided by a library are indeed of paramount importance during a review. However, the implementation is also extremely important. Any serious algorithmic bugs identified in the implementation should probably be fixed prior to release (e.g. acceptance is contingent on fixing them). My argument is that numerical stability issues due to floating point limitations are not algorithmic bugs, but rather "properties" of the chosen algorithm(s). </flame_bait> I greatly like the ability to generically pick the representation of my choice, w/ the ability to use a more "robust" algorithm when needed. Any bugs that would prevent the more robust algorithm from being as robust as it claims should be fixed.
Luke seems to be willing to integrate his Boost.Robust2DIntegerPolygon into GGL, but he also seems to have some concerns about whether he'll be able to do that using GGL's polygon representation. Those should be addressed before GGL is accepted.
I was under the impression that Luke was basing his reservations on robustness issues due to floating point limitations. Perhaps I misread, or missed a point or two. Jon

Luke seems to be willing to integrate his Boost.Robust2DIntegerPolygon into GGL, but he also seems to have some concerns about whether he'll be able to do that using GGL's polygon representation. Those should be addressed before GGL is accepted.
I was under the impression that Luke was basing his reservations on robustness issues due to floating point limitations. Perhaps I misread, or missed a point or two.
It is reversed indeed: Luke wants to integrate our FP overlay algorithm in his library, so that he can specialize on numeric type, using his algorithm for integers and ours for floats. We're happy to extend our Polygon Concept for this feature, if the std:: container is not sufficient. But I don't know if you can ask that it should be there before acceptance, because Boost.Polygon was accepted after the announcement of the GGL review... Barend
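[Editorial note: a few lines to illustrate the "specialize on numeric type" idea; all names are hypothetical and this is not the interface of either library. The coordinate type selects the overlay implementation at compile time.]

    #include <type_traits>
    #include <utility>
    #include <vector>

    // Hypothetical polygon and overlay stubs, purely to illustrate the dispatch.
    template <class Coord>
    struct polygon { std::vector< std::pair<Coord, Coord> > outer; };

    template <class Coord>
    void integer_overlay(const polygon<Coord>&, const polygon<Coord>&, polygon<Coord>&)
    { /* stand-in for a robust integer-coordinate algorithm */ }

    template <class Coord>
    void floating_point_overlay(const polygon<Coord>&, const polygon<Coord>&, polygon<Coord>&)
    { /* stand-in for a floating-point algorithm */ }

    template <class Coord>
    void intersection(const polygon<Coord>& a, const polygon<Coord>& b, polygon<Coord>& out)
    {
        // The coordinate type picks the implementation.
        if (std::is_integral<Coord>::value)
            integer_overlay(a, b, out);
        else
            floating_point_overlay(a, b, out);
    }

With this, intersection() on polygon<int> routes to the integer path and on polygon<double> to the floating-point path, without the caller changing anything.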

Jonathan Franklin wrote:
On Mon, Nov 16, 2009 at 11:16 AM, Stefan Strasser <strasser@uni-bremen.de> wrote:
I think the question of this review is not whether some algorithms in GGL have limitations, as long as they are documented, but whether the concepts and algorithm/strategy interface are sufficient to allow the implementation of specialized algorithms. Is this the case?
The concepts, interfaces, and algorithms provided by a library are indeed of paramount importance during a review. However, the implementation is also extremely important. Any serious algorithmic bugs identified in the implementation should probably be fixed prior to release (e.g. acceptance is contingent on fixing them).
My argument is that numerical stability issues due to floating point limitations are not algorithmic bugs, but rather "properties" of the chosen algorithm(s). </flame_bait>
I greatly like the ability to generically pick the representation of my choice, w/ the ability to use a more "robust" algorithm when needed. Any bugs that would prevent the more robust algorithm from being as robust as it claims should be fixed.
Luke seems to be willing to integrate his Boost.Robust2DIntegerPolygon into GGL, but he also seems to have some concerns about whether he'll be able to do that using GGL's polygon representation. Those should be addressed before GGL is accepted.
I was under the impression that Luke was basing his reservations on robustness issues due to floating point limitations. Perhaps I misread, or missed a point or two.
Barend requires that the polygon provide a type with stl container interfaces that holds its holes. This prevents it from adapting the majority of polygon data types (including my own) that don't provide direct access to private members. My request was that he make the interface for his polygon concept more generic. If you bother to look at my submission you will find that the interfaces for the geometry concepts are fully generic, and the fixed point requirement for polygon intersection is a property of the algorithm I chose, and documented. I can easily call e.g. Barend's algorithm when the coordinate is floating point without the need to change my interfaces at all, so I think it is inaccurate to describe my library as "not generic". The concepts I define are fully generic and equivalent to what Barend has proposed. I don't handle different coordinate systems, but that is a difference in scope between my library and GGL and not a design flaw. I'd also like to point out that fixed point is not a VLSI-specific requirement, nor is robustness. Some GIS applications use fixed point coordinates, and closed source solutions for polygon clipping that are robust and fast are used in VLSI all the time. Thanks, Luke
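[Editorial note: a sketch of the adaptation point, using a hypothetical legacy type rather than either library's real traits. When holes are only reachable through accessors there is no std-style container of holes to hand back by reference, so an adaptor would have to materialise one by copying.]

    #include <cstddef>
    #include <utility>
    #include <vector>

    // Hypothetical legacy polygon: holes live in a private member and are
    // exposed only through accessors, not as an STL container.
    class legacy_polygon {
    public:
        typedef std::vector< std::pair<double, double> > ring;
        std::size_t hole_count() const        { return holes_.size(); }
        const ring& hole(std::size_t i) const { return holes_[i]; }
        const ring& outer() const             { return outer_; }
    private:
        ring outer_;
        std::vector<ring> holes_;
    };

    // An adaptor that must return "a container of holes" can only copy them,
    // which is the kind of requirement the adaptation argument is about.
    inline std::vector<legacy_polygon::ring> holes_of(const legacy_polygon& p)
    {
        std::vector<legacy_polygon::ring> h;
        for (std::size_t i = 0; i < p.hole_count(); ++i)
            h.push_back(p.hole(i));
        return h;
    }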

On Mon, Nov 16, 2009 at 12:35 PM, Simonson, Lucanus J <lucanus.j.simonson@intel.com> wrote:
I was under the impression that Luke was basing his reservations on robustness issues due to floating point limitations. Perhaps I misread, or missed a point or two.
Barend requires that the polygon provide a type with stl container interfaces that holds its holes. This prevents it from adapting the majority of polygon data types (including my own) that don't provide direct access to private members. My request was that he make the interface for his polygon concept more generic.
Sorry, I was remembering this comment from the 13th or so, and confused it w/ the current discussion: Simonson, Lucanus J wrote:
Upon reflection the polygon formation algorithm requires robust comparisons in the tree, which floating point arithmetic cannot provide, so the code cannot be adapted to solve the problem with holes in GGL.
If you bother to look at my submission you will find that the interfaces for the geometry concepts are fully generic, and ...
This isn't about your library. However, coming out of boostcon, it was at least *my* understanding that you were supposed to *at least* work out these issues w/ Barend prior to submitting your library for review. That clearly didn't happen, so let's just move on. It would be nice if we could just focus on reviewing GGL for its merits.
I'd also like to point out that fixed point is not a VLSI-specific requirement, nor is robustness. Some GIS applications use fixed point coordinates, and closed source solutions for polygon clipping that are robust and fast are used in VLSI all the time.
I totally agree. I have done some work in the past where we needed precise lat/lon computations, and used fixed point. I'm just fighting against the "it has to be 100% numerically stable a la fixed point" mentality that has been prevalent, because it doesn't serve my (and many others') needs. Jon

Brandon Kohn wrote:
Phil Endecott wrote:
The other controversial implementation issue is floating-point robustness. I'm still not clear what the position of this library is on that. Are we just given assurances that it seems to be robust in real-world use, or is it believed that it actually is certainly robust? I expected to see this discussed in the library documentation, but I don't see anything - maybe I've missed it. If it is "believed to be robust in real-world use", it would be helpful to describe the possible failure modes (e.g. segfault, runs forever, or just outputs the wrong answer.)
This is an issue that troubles me as well. I think that a floating point computational geometry library is possibly a first for Boost in that you often have heuristics rather than algorithms due to the fuzzy effect of using floating types in comparison predicates. One of the more useful features of the library (GGL) would of course be the boolean operations. The problem however is clearly going to be robustness. I have never encountered a robust floating point boolean operation library in my 9 years of working in the geometry domain.
Certainly, I'm not going to argue with the floating-point issues you point out. They exist, of course. I've been working in GIS for a while and I understand the problem we're discussing here. However, I'd like to make, perhaps, an interesting observation. There is an Open Source computational geometry library [1] that is dedicated to the GIS domain and is widely used in GIS products. I've been involved in its development for a couple of years, and I can only confirm, again, that FP-vs-robustness problems loomed large in our minds. In spite of the fact that this library is configurable regarding the precision model the user wants to use, what's interesting is that in most use cases of the library I've seen - in geographical data processing libraries, geospatial databases, as well as in the regression test cases of this particular library - the floating-point model is the one used. BTW, GGL has been tested in comparison to this library. [1] http://trac.osgeo.org/geos/ -- Mateusz Loskot, http://mateusz.loskot.net Charter Member of OSGeo, http://osgeo.org
participants (16)
- Barend Gehrels
- Brandon Kohn
- Fernando Cacciola
- Hartmut Kaiser
- Jonathan Franklin
- Jose
- Mateusz Loskot
- Michael Fawcett
- Paul A. Bristow
- Phil Endecott
- Scott McMurray
- Simonson, Lucanus J
- Stefan Strasser
- Steve M. Robbins
- Steven Watanabe
- Thomas Klimpel