
Hi Luke,
Fernando Cacciola wrote:
The library proposed by Lucanus, OTOH, at least acknowledges the problem appropriately and approaches a solution in the right direction (there are still some important design-influencing considerations, but they are not showstoppers).
Thank you, Fernando. I agree with you 100% that robustness needs to be designed in up front and not tacked onto an existing geometry code base as an afterthought. In particular, the robustness strategy needs to be comprehensive and not patchwork. I acknowledge that my library needs a lot of work in this area to make it robust for both integer and floating point, particularly since it was only designed with integer robustness in mind. I will have to redesign the low level stuff quite significantly to get both integer and floating point fully robust and sharing the same algorithm. I don't really have the knowledge I need to make my library floating point robust. I need to read some of the papers you mention.
I'll try to sketch the general ideas below to get you started.
Making the coordinate data type a template parameter and "allowing" the user to specify an infinite precision numerical data type if they want robustness is not a solution
Precisely :) life would be too easy if that were already fast enough.
, because there are very simple and effective ways to make integer and floating point operations robust. I don't know if we need to design in points of customization to expose the ability to do that to the user, but it needs to be at least possible for the library developer to specify different strategies for different coordinate data types and different contexts.
Exactly... and the library, as a service to other higher level libraries, should and can provide a general methodology and the corresponding tools for doing that. Ultimately users shouldn't be concerned at all with any of this, except when there is a reasonable speed/robustness tradeoff that users would like to tune themselves.
I think Brandon's library is trying to handle both integer and floating point robustness, and I've noticed he is at least as interested in multi-precision numerical data types as I am, which is a good sign that he's doing the right things. I'm interested in learning more about his approach.
As I'll try to sketch below, bigfloats are part of the toolset, but there is more to it.
Performance is fine as a goal, but if the algorithms fail (crash, hang, drop polygons or "gracefully" throw an exception) even a small percentage of the time due to robustness issues then they are worthless, regardless of how fast.
Exactly.
If we tried to use a non-robust library and it failed in a mission-critical situation, it would cost us over one million dollars a day to fix the problem. Failure is not an option. When I say 100% integer robust I'm serious. It's all or nothing and the stakes for me are high.
And so they are for end users... it is the developers who think: "this works in all the cases I tried, so that's good enough". But it isn't, and there is no need to compromise, since lots of techniques exist, and while they don't cover all possible algorithms, there is clearly little justification for not using them where they are applicable.

=== LONG EXPLANATION BELOW ===

OK, I'll try to sketch the general idea behind floating point robustness in geometric computing. There is a lot on the subject in the literature, it is just difficult to find, so consider this merely an introduction.

On the one hand, if an algorithm only needs to perform algebraic operations, a "Bigfloat", i.e. a software-based adaptive unlimited precision floating point type, delivers totally exact results. If division is needed you can use a rational of bigfloats. If other operations are needed, such as sqrt() or transcendentals (say cos, sin), then you can use a number type that tracks the expression trees and avoids actually performing the non-exact operations as much as possible, especially when you query for discrete properties such as its sign.

On the other hand, it is possible to design and implement any geometric algorithm in terms of "predicates" and "constructions", and it is possible to consider the output as having separate topological and geometric parts. Let's consider an example: the straight skeleton (and the Voronoi diagram in this case) of a square contains one and only one vertex in the center and one and only one edge from that center node to each square corner. That result has a topology (how vertices and skeleton nodes are connected) and a geometry (the coordinates of the center node). The topology might be correct even if the geometry is not: i.e. the center node is not exactly in the center, yet the nodes are connected correctly.

For many, many geometric applications it is the topology that needs to be correct, not the geometry. If you do clipping, for instance, it might very well not matter at all (say, in drawing applications) if the vertices are slightly off. But if an entire region is included in the result when it should not be, then that is a disaster for whatever application.

Since the input to an algorithm is error free no matter the representation (or more precisely: there is nothing the algorithm can do about its implicit error), it is easy to see that, if an algorithm is designed in such a way that the topological construction is based exclusively on the results of predicates called exclusively on the input (and never ever on an intermediate result), then the topology will be correct no matter what capabilities the underlying input arithmetic has, provided the predicates can do whatever magic is necessary to produce exact and correct results given input of any type.

That is, if there is a predicate that can take machine integer or floating point coordinates and, in whatever way, produce a guaranteed error-free result, then you can write the code of an algorithm totally agnostic of the limitations of the type of the input coordinates. That is, the algorithm itself can just pretend the computation is performed on the RealRAM.

Researchers spent a lot of time figuring out how to implement exact predicates based on finite precision integer or floating-point input. While the results are limited and somewhat complicated from an engineering POV, they do allow for incredibly fast algorithms that take simple native type coordinates (int or double) and produce 100% robust code.
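To make that concrete, here is a minimal sketch of such a predicate for machine-integer input (the names and the coordinate bound are just illustrative assumptions on my part, not part of any proposed library; a version for double needs adaptive techniques or the filter I explain further below):

  #include <cstdint>

  enum orientation_t { collinear, left_turn, right_turn };

  // Exact as long as |coordinate| < 2^30: the 64-bit differences stay
  // below 2^31, the products below 2^62 and their sum below 2^63, so
  // nothing can overflow and nothing rounds.
  orientation_t orientation( std::int32_t ax, std::int32_t ay,
                             std::int32_t bx, std::int32_t by,
                             std::int32_t cx, std::int32_t cy )
  {
    std::int64_t det = ( std::int64_t(bx) - ax ) * ( std::int64_t(cy) - ay )
                     - ( std::int64_t(by) - ay ) * ( std::int64_t(cx) - ax );
    return det == 0 ? collinear : ( det > 0 ? left_turn : right_turn );
  }

The caller just passes plain ints and never sees any of this: the predicate alone knows how much precision the determinant needs.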
But even then, the *concept* of a black box predicate is still useful, because there are other programming-specific techniques to provide simple-to-program but fast predicates: the so-called "floating point filters", which I'll explain below.

IMO, any geometric library should provide a toolbox of exact predicates (which should be exact independently of the input type) and state as a guideline that any algorithm should only make critical decisions based on the result of predicates, and, unless totally impossible, the predicates should never be called with intermediate results.

Here's a simple example:

  T d1 = distance(a,b);
  T d2 = distance(c,d);
  if ( d1 < d2 ) blabla

That is fundamentally flawed beyond repair, because robustness becomes the responsibility of T, and at the end of the day the poor user finds himself wrapping his input into a fancy numeric type just because of that flaw. If OTOH that were written like so:

  if ( compare_distance(a,b,c,d) ) blabla

then the input could still be of any type, yet some basic library tool could go to whatever extent is needed to give the correct answer. And if you are like me and love to exploit C++'s unsurpassed expressive power, you can even write that like so:

  distance_result d1 = distance(a,b);
  distance_result d2 = distance(c,d);
  if ( d1 < d2 ) blabla

where the comparison between distance_result types does the magic of producing an exact answer no matter what the type of the input coordinates is (I'll sketch this concretely at the end of this section). And this would even prevent the algorithms in the library from mistakenly doing the wrong thing.

Following this methodology at large needs support from the bottom layer of the library, i.e. it needs an exact predicates toolbox that can operate on arbitrary input types and always yield correct results, but it also needs a general methodology: i.e. predicates need to be used instead of "direct" logic, and predicates need to be given only the input and never intermediate results (or, as I'll explain below, that must be carefully considered and instrumented with its own special technique). These guidelines can be enforced by the library given the right design (i.e. having "distance" return a "distance_result" that actually stores the input).

All of the above allows algorithms to be called on arbitrary input types that don't need to carry the weight of providing robustness, i.e. machine integer and floating point numbers can be used. Couldn't be any simpler for the end user.

Constructions, OTOH, are similarly black boxes like predicates are, but serve a different purpose, since it is presumed that the algorithm is well written and never uses the results from constructions to feed predicates (or rather it does so only very carefully and by design, and only in those cases where it was impossible to express the algorithm output as a direct function of its input). That is, while the library provides, say,

  Point p = centroid(a,b,c);

that 'p' is never used to drive the decisions of the algorithm (it is never passed to a predicate), yet still that centroid can be a higher level facility that *internally* uses whatever it needs to come up with a reasonable approximated result.

Whenever you can express an algorithm in terms of predicates and constructions such that predicates only operate on the input and never on the result of constructions, your algorithm is robust automagically, without any requirement on the input type and without the need to do anything at all to work around numerical issues within the algorithm itself. That's the job of the predicate oracles.
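Going back to the distance example above, here is a minimal sketch of the idea (again with purely hypothetical names and assuming integer coordinates below 2^30 in magnitude, so the 64-bit arithmetic is exact):

  #include <cstdint>

  struct point { std::int32_t x, y; };

  // The "construction" keeps the input points themselves instead of a
  // rounded numeric distance.
  struct distance_result { point a, b; };

  inline distance_result distance( point a, point b )
  {
    distance_result r = { a, b };
    return r;
  }

  // Exact squared length: with |coordinate| < 2^30 the 64-bit products
  // and their sum cannot overflow.
  inline std::int64_t squared_length( distance_result const& d )
  {
    std::int64_t dx = std::int64_t(d.b.x) - d.a.x;
    std::int64_t dy = std::int64_t(d.b.y) - d.a.y;
    return dx * dx + dy * dy;
  }

  // The comparison owns the robustness: no sqrt, no rounding, and the
  // caller never has to wrap its coordinates in a fancy numeric type.
  inline bool operator<( distance_result const& l, distance_result const& r )
  {
    return squared_length(l) < squared_length(r);
  }

With that in place the user-level code stays exactly as written above, "if ( distance(a,b) < distance(c,d) ) ...". A floating point version would keep the same interface and only change the comparison to use a filter such as the one sketched next.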
If an algorithm just can't call predicates with only the input and needs to step over intermediate results, things get more complicated. However, if all those intermediate results that would be fed into predicates are ONLY produced with "constructions", sound alternatives can be explored.

One alternative, the one used the most, is to have the constructions produce higher level abstractions instead of actual numerical results. For instance, they can return expression trees that refer to the input, and have the predicates evaluate them lazily on demand (so the predicate is still only really operating on the input). Again, this needs clever support from the library at design time in that, for example, the result from "centroid" wouldn't be a point but a centroid_result, storing the input points as well as an approximated result, which could be perfectly fed into predicates that could then do the right thing having the input at hand. This works reasonably well given the inherent complexity, but is still the subject of active research. Most of the time the use of DAGs for results is not really needed, so the ideal framework would be able to mix both native input and lazy expressions in a coherent manner. One starting idea (that I had many years ago but never had the time to evolve) would be to have a hybrid geometric type that uses a dynamic dual representation: input type or DAG.

Now a word on predicates. In C++ it is actually somewhat trivial to have a black box predicate that can produce correct results in spite of the input type, using the following technique known as a floating-point filter (and it would be equally trivial *within* a virtual machine for any dynamic language FWIW):

  template<class T>
  bool pred( T a, T b )
  {
    interval<T> ia = a;
    interval<T> ib = b;
    interval<T> r = /* some operations on ia and ib */;
    if ( certainly( r == 0 ) )
      return true;
    else if ( certainly( r != 0 ) )
      return false;
    else
      return pred<some_exact_type>(a,b);
  }

That exploits the capability of interval arithmetic to implement certified logic. I won't explain how or why here, but "r == 0" doesn't return bool but a tribool, so you can explicitly query for the "I don't know" answer, and *only* in that case fall back to the exact solution.

All this machinery must ultimately use some mapping from the input type to the interval and exact types needed. Designing that mapping needs to consider different aspects. For instance, not all algorithms require the same exact type: some might need an exact sqrt(), for example. So the mapping could need to be something like this:

  coordinate_traits<T>::template exact<requirements_tag>::type

Also, predicates and constructions serve different purposes, so they also have different requirements on the exact types. Going back to the previous straight skeleton example, while the topology must be right, the coordinates of the center nodes need not be, BUT a user could reasonably expect a black box construction to nevertheless come up with a decent approximation, to avoid for instance points totally off because of floating point overflow. In this case the algorithm should use a "constructor" object that can somehow be told what exact type to use, which might be different from the one used for predicates. Thus, the traits might need to be more like this:

  coordinate_traits<T>::template exact<for_predicate_tag, requirements_tag>::type

Best

--
Fernando Cacciola
SciSoft Consulting, Founder
http://www.scisoft-consulting.com