
The formal review of the Boost.Polygon library by Lucanus Simonson starts today, August 24, 2009 and will finish September 2, 2009.
First, congratulations Luke, and also Gyuszi, with having finished this library and having it for review. As one of the authors of another Geometry Library, also aimed for Boost, I'll try to be as objective as possible. I'll not compare it too much to our library, though I'll give some performance measures compared with many other libraries, also ours (ggl). When I started testing Boost.Polygon, in February or March, there was not yet any documentation, it was difficult to write the benchmarks, it didn't compile using MSVC 2005. That is all solved now, it is documented, it compiles using MSVC 2005 now, there are some positive reviews on design, implementation and the paper. So this is a good step forward. In my domain (GIS) we, normally, only use floating point coordinates on arbitrary angle polygons. I'll not review the 45/90 polygons because I don't know them and I'm not interested in them. Expectations: I think a templated Boost.Polygon library should at least: 1) support most of the basic polygon geometry algorithms 2) have a performance comparable to other libraries 3) support both integer and floating point coordinate types 1) the library supports many operations, but some basic polygon algorithms are missing: - convex hull - centroid - is_convex Those operations are found in many libraries and I think they should be supported by Boost.Polygon. It would also make the performance comparison (even) more interesting. 2) though some reviewers were satisfied about the performance, and the paper indicates it is well performing, our benchmark shows that it is in general slower or much slower than other libraries: - the performance of the "contains" operation is perfect, fastest available - the "area" operation is about a factor 1.5 slower than most other libraries. This is surprising because "area" is quite a simple operation, it should be possible to implement it with the same performance as e.g. CGAL or wykobi. - the "intersection" operation is too slow compared with other libraries. The paper (referred to in the documentation) compares the intersection performance with GPC and CGAL, but does not give exact figures. It gives some plots about scalability using a very high number of points per polygon. In normal usage, in nearly all cases, GPC outperforms Boost.Polygon with a high factor: --> GPC: 9.0 seconds, GTL/Boost.Polygon: 92.2 seconds for the same operation (for 100 points per overlayed ellipse) --> GPC: 13.7 seconds, GTL/Boost.Polygon: 115.3 seconds for the same operation (1000 points per overlayed ellipse, which is already large). --> Also with 10000 points GPC is still much faster (factor 5). --> Only in high-intersection-densities with a very high number of points GTL is faster than GPC. If I was a user and license would allow me, I would definitely take GPC and not BP, because of performance and because of floating point... --> compared with CGAL: in normal usage, GTL is indeed faster. But in intersection-dense environments CGAL scales better (this in contrary to the paper) --> compared with GGL: in normal usage Boost.Polygon is a factor 81.1 slower than GGL. While it was suggested that we should use BP for our boolean operations, and we evaluated this (resulting in this comparison), we're glad that we first compared it to other libraries because a user can wait for 1 second, but cannot wait for 92 seconds... for a less precise result... In all tests GGL is way faster. --> for the complete overview, see http://trac.osgeo.org/ggl/wiki/Performance and http://trac.osgeo.org/ggl/wiki/IntersectionMore Most of our benchmark results are contrary to the results of your paper. I think that if you write things like this, you should publish the benchmark sources because it is very sensitive information. People should be able to reproduce them. Also exact figures in the paper would help to interpret the results. The Boost.Polygon library is specialized for boolean operations and the paper is all about performance. But in the end it is really not performing well... 3) Boost.Polygon supports integer but does not support double for boolean operations. For the operations "area" and "contains", coordinates in "double" floating precision are supported. Also the boolean operations accept double coordinates, but crashes immediately. I think it should not crash, slightly better is to prevent floating point coordinates by giving a compilation error. Much better: floating point coordinates should be implemented. For me as GIS-user, a library without floating point support is unusable. For the comparison I converted to integer coordinates (multiplying using a factor of 10000.0). But in the resulting summed area you can see that this is not working perfectly. The area differs, and where it should increase (star-ellipse with more and more points) it is decreasing... I tried other integer-double-conversion factors as well but that made things worse.
- What is your evaluation of the design?
- What is your evaluation of the implementation? It is curious that Boost.Polygon went from Tag Dispatching to SFINAE last year, while we (GGL) went from SFINAE to Tag Dispatching. We're very glad with our change, while most compiler problems and warnings in Boost.Polygon origin from the SFINAE implementation. I'm still convinced
- What is your evaluation of the documentation? I didn't read all. It is looking good but contains several errors in the
- What is your evaluation of the potential usefulness of the library? I think most people using geometry / polygons are using floating point coordinates. As these are not supported in the algorithms where this
It is difficult to answer this question because I (with Bruno, and some folks of this list) designed another library in another way, it is hard to be objective here. I would prefer a geometry library, instead of a polygon library. Polygons come seldom alone. It would be useful to have other geometry types as well. The 45/90 polygons could be implemented as specializations (special cases) of the normal polygons, overriding implementations. This library is designed the other way round, it started with 90/45 polygons and arbitrary polygons were added to be submitted as a boost library. The diagram shows the polygon_90 as a refinement to a rectangle, the polygon_45 as a refinement to a polygon_90, the polygon as a refinement to a polygon_90, and a polygon_45_with_holes as a refinement to both a polgon_90_with_holes and a polygon_45. All this makes really no sense to me, I think it could and should be much more simple. that Tag Dispatching, suggested to us on this list by Dave Abrahams, is a very good choice in a geometry library implementation. We're happy with it and all compiler and compatability problems are vanished. Things are copied from other boost libraries. File isotropy.hpp contains a nearly literal copy of enable_if. There is also an extraction of MPL. While this is an explicit choice, I don't think it is a good choice. The Boost Concept Check Library is not used to check concepts. I didn't try everything, but I'm getting the feeling that the concepts are not implemented as they should be. The file polygon_set_concept.hpp, which should give an outline of the polygon set concept, is dependant on polygon_set_data.hpp (which implements that concept). That should be the other way round. Inside that concept file, points are often copied into a polygon_set_data temporary. That is not how concepts should be implemented, and is very bad for performance. The file transform.hpp contains a large number of enumarions for transforming polygons up/down/west/east etc. I think this is not necessary, and besides that "west" / "east" gives the impression of spherical coordinate systems while this library is cartesian only. Left/right should already be better, but why not two offset numbers in two dimensions? details. E.g. documentation of the polygon_90_with_holes_concept, function begin_holes has the description "Returns the begin iterator over the range of coordinates that correspond to horizontal and vertical edges.". This is probably copy/paste error. Another example is the "void clean() const" function, which cannot be const of course, indicating that this documentation is written / copied / pasted and not generated with e.g. Doxygen. library is specialized on (booleans), I think it is useful for a limited public (VSLI?)
- Did you try to use the library? With what compiler? Did you have any problems? MSVC 2005, MSVC 2008, GCC 4.4 using MinGW. Everything compiled but the * notation for intersections didn't compile anymore (it did before) - How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
An in-depth study to several aspects of the library
- Are you knowledgeable about the problem domain?
I'm designing and implementing GIS / geometry libraries since 1992. I hope that my review was as objective as can be expected from an alternative library writer. I conclude with the remark that I've asked several times on this list and off-list for cooperation, I suggested that Luke could join us writing a combined Geometry Library for Boost. All these requests were never answered. I would still be in favour of one library. If this library is accepted it will create a different situation for subsequent geometry libraries. In reviews it will be expected that these libraries work together or that one is built on top of the other. If that is the case, the base should be very very good. It is to the other reviewers to decide that this is the case. Regards, Barend