
John Phillips <phillips <at> delos.mps.ohio-state.edu> writes:
The basis for choosing the units is instead attempting to maximize numeric stability, and so choosing units that keep things as close to order 1 as possible.
BTW, this post will only be of interest to those with a mathematics or numerical analysis background. Others can safely skip it without missing anything.

This is a very interesting issue to me! I'm not sure exactly what you have in mind as far as keeping things close to order 1 improving numerical stability, but let me explain what comes to mind for me.

First, there may be a possibility of overflow or underflow with values far from order 1, especially with single-precision floating point values. In my application I use double precision for everything, so this doesn't concern me.

The second issue relates to things like inversion of ill-conditioned matrices, where a matrix may be nearly singular due to a large difference in the orders of magnitude of its eigenvalues. Clearly this can cause severe numerical problems, and keeping things near order 1 helps prevent it.

FWIW, I think that in some cases this problem can be resolved in another way - or even seen as a symptom of a different problem, which can be addressed, as I'll explain. And the discipline imposed by strongly-typed dimensional analysis may force the implementer of the algorithm to correct the problem. (Of course, you can read this as "annoy the implementer because his mathematically correct algorithm won't compile," depending on your perspective :-) )

Consider, for example, a function that inverts a matrix A using an SVD, in order to avoid numerical problems. Let's assume A is a symmetric, positive definite matrix composed of values having different physical dimensions - such as a covariance matrix of a set of parameters having different dimensions. I don't use SVD myself (so forgive me if I display my ignorance), but as I understand it, it essentially finds the eigenvalues and eigenvectors and inverts the eigenvalues - except that for any very small eigenvalue it replaces the very large reciprocal value with a zero instead.

If we're solving Av=b, then loosely speaking the numerical problem is that many values of v solve the equation, some with very large coordinate values. Geometrically, the SVD can be described as choosing, from among these values of v, the one closest to the origin.

What has always bugged me about this description is that "closest" requires a distance metric, like d=sqrt(x*x+y*y), which is meaningless in a physical sense if the components x and y have different physical dimensions. Sure, if you have numerical values in some particular units system, you can do the computation and get a number. But what doesn't sit right with me is that if you change your units, you can get a different "closest" point! So even without roundoff errors and the like, the answer that comes out of the algorithm will change depending on what physical units are chosen. To me, it seems that a physical problem should have a particular solution regardless of the units chosen (aside from issues of computational accuracy).

Likewise, the eigenvalues of this matrix will have some sort of hybrid dimensions, just like the distance metric above. And the eigenvectors can vary with the choice of units, because even the notion of two vectors being orthogonal depends on the relationship among the units of the parameters defining the matrix. Because of this, as you pointed out in another post, a strongly-typed dimensions library would not even be able to compile the SVD (unless all the elements of the matrix are of the same dimensionality).

And how do I set the threshold for what counts as a "very small" eigenvalue in the SVD? Perhaps .000001? In what units, since the matrix elements are of different dimensions? Again, the meaning of my threshold changes depending on the relationship among my parameter units.
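To make the units dependence concrete, here's a tiny example of the kind of thing I mean. (I'm writing it in C++ with Eigen purely for illustration - nothing here comes from any particular library, the numbers are made up, and the "units" live only in the comments.) One equation in two unknowns is solved twice by SVD; the only difference is whether the second unknown is expressed in seconds or in milliseconds, yet the two "closest to the origin" answers are physically different:

    #include <Eigen/Dense>
    #include <iostream>

    int main()
    {
        using namespace Eigen;

        // One equation in two unknowns: v(0) in meters, v(1) in seconds,
        // with coefficients 1/m and 1/s, so A*v = b is dimensionless.
        MatrixXd A(1, 2);
        A << 1.0, 1.0;
        VectorXd b(1);
        b << 1.0;

        // The SVD picks the solution of minimum Euclidean norm: (0.5, 0.5).
        VectorXd v1 = A.jacobiSvd(ComputeThinU | ComputeThinV).solve(b);

        // Same physical equation, but with the second unknown expressed in
        // milliseconds, so its coefficient becomes 0.001 (per millisecond).
        MatrixXd A2(1, 2);
        A2 << 1.0, 0.001;
        VectorXd v2 = A2.jacobiSvd(ComputeThinU | ComputeThinV).solve(b);
        v2(1) *= 0.001;  // convert the second component back to seconds

        std::cout << "minimum-norm solution in (m, s): "
                  << v1.transpose() << "\n";   // approx. 0.5   0.5
        std::cout << "same problem posed in (m, ms),  "
                  << "converted back to (m, s): "
                  << v2.transpose() << "\n";   // approx. 1.0   1e-06
        return 0;
    }

Nothing about the physics changed between the two runs; only the units did. Yet the "closest" solution moved from (0.5, 0.5) to roughly (1.0, 1e-6), because the minimum-norm criterion was measuring distance in different units.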
I used to think all this was just the price you pay for using an SVD. But with the way I've constructed my matrix, it can be decomposed as A = DCD, where D is a diagonal matrix of dimensioned quantities and C is a dimensionless symmetric matrix. In fact, it's easy to find such a diagonal matrix D, with no a priori knowledge of the magnitude of the values, by taking the square roots of the diagonal elements of A. Then C is just the correlation matrix corresponding to the covariance A, and its elements tend to be close to order 1.

Now the matrix to be inverted (C) is likely to avoid numerical problems caused by large differences in magnitude, and the solution will be independent of the units chosen - since the units just become multipliers on the elements of D. The SVD/eigenvalue computation can be done without violating the strong typing, and the choice of units no longer affects the numerical stability of the matrix inversion. There may be a small penalty (such as calculating the square roots), but compared to the cost of an SVD I think it's almost negligible. I expect similar techniques could be found for indefinite matrices, non-square matrices, and other types of matrix calculations.

In the parts of an algorithm that are not "mixing" dimensions (as the distance formula above does) - the parts where strongly-typed computations would compile - you're always adding apples to apples, so the choice of units won't introduce numerical problems like summing values of very different magnitudes. Most matrix multiplications should be type-safe, for instance, and thus independent of units as far as numerical stability is concerned.

These ideas would never have occurred to me when I was just manipulating matrices as sets of numbers. It wasn't until I started to think of them as strongly-typed physical quantities that this perspective emerged. I think that's also part of the value of using a dimensional analysis library.

I hope what I've suggested here makes some sense. There may be reasons it doesn't apply in certain situations. But if anyone finds it helpful, it's something you can try.

Regards,

-- Leland
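P.S. In case it helps to see the rescaling trick spelled out, here is a rough sketch, again in C++ with Eigen just for illustration. The function name, the default cutoff, and the use of a relative threshold are all my own choices, and the sketch assumes A has strictly positive diagonal elements:

    #include <Eigen/Dense>

    // Invert a symmetric positive (semi)definite matrix A - e.g. a covariance
    // whose elements carry mixed physical dimensions - by first rescaling it
    // to a dimensionless correlation matrix C:
    //
    //   A = D * C * D,  with  D = diag(sqrt(A(i,i))),
    //   so  inv(A) = inv(D) * inv(C) * inv(D).
    //
    // The SVD cutoff is applied to C, where it is a pure (dimensionless) number.
    Eigen::MatrixXd invert_via_correlation(const Eigen::MatrixXd& A,
                                           double rel_threshold = 1e-6)
    {
        using namespace Eigen;

        // D and D^-1, taken straight from the diagonal of A.
        VectorXd d     = A.diagonal().cwiseSqrt();
        VectorXd d_inv = d.cwiseInverse();

        // Dimensionless correlation matrix: C = D^-1 * A * D^-1.
        MatrixXd C = d_inv.asDiagonal() * A * d_inv.asDiagonal();

        // Pseudo-inverse of C via SVD, zeroing the very small singular values.
        JacobiSVD<MatrixXd> svd(C, ComputeThinU | ComputeThinV);
        VectorXd s = svd.singularValues();          // sorted, largest first
        VectorXd s_inv = VectorXd::Zero(s.size());
        for (int i = 0; i < s.size(); ++i)
            if (s(i) > rel_threshold * s(0))        // dimensionless cutoff
                s_inv(i) = 1.0 / s(i);
        MatrixXd C_inv = svd.matrixV() * s_inv.asDiagonal()
                       * svd.matrixU().transpose();

        // Map back: inv(A) = D^-1 * inv(C) * D^-1.
        return d_inv.asDiagonal() * C_inv * d_inv.asDiagonal();
    }

Because C is dimensionless, the cutoff applied to its singular values is a pure number; changing the units of the underlying parameters only rescales D and leaves the inversion of C - and hence the physical answer - untouched.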