
Hello everybody,

my background is in software development for scientific and engineering computing, facing complex applications in industry. Therefore I may take the opposite stand here. For fairness, I must add that I am somewhat of a competitor, as I have started my own little (not yet grown-up) project in this area.

Practically speaking, when developing such a technically minded piece of software, roughly 30% of the development effort goes into the computation kernel, whereas the remaining 70% goes into the overall infrastructure, business layer, presentation layer (batch, GUI, API, ...), interoperability, productizing, etc. It is sad to say, but such is life. The proposed solution only targets the computation kernels. There, as a matter of fact, you have to ensure that the right things get in and the right things get out. What happens inside, however, is a completely different story. Once you know for sure what you have got, you take a pencil and a piece of paper and write down what you are going to implement. Once you are sure what you are doing, you go and implement it (using plain dimensionless floating point arithmetic). That has worked well for over 50 years now.

While a template-based solution might seem appealing and might even be applicable in some cases, there are many others where it is not. Consider, for example, solving an ordinary differential equation: a mechanical system described by locations and velocities, the excitation given by some forces depending on various factors. As long as you restrict yourself to explicit time integration, you may get by with type-encoded units. For implicit schemes, however, you will have to solve linear or nonlinear systems of equations, with unknowns of different physical quantities (i.e. types) in a single vector (locations and velocities in the above case). This is not possible in C++, and that is the point where the approach of this library no longer holds.
You can apply some fancy transforms beforehand to basically make all numbers dimensionless, but then you do not need a unit library anyway. Similar situations occur in all kinds of fluid dynamics, finite difference, finite element and dynamic simulation codes and the like.

I feel a far stronger need for a physical unit library in two other parts, namely the business logic and the presentation layer.

A) The business logic is about intermediate storage of computation results for use by other parts of the software, probably written by other people, at other times, in other locations. Here you have to ensure that the units _really_ match in every possible case, and there are lots of possible cases. If they do not match, a conversion is usually acceptable, as it does not happen over and over again. We are also talking about serialisation aspects here. Language-dependent output to files (as suggested for future development in the documentation) is an absolute no-go: otherwise a file saved by one engineer cannot be read by his colleague, only because they do not have the same language setting. Furthermore, many times you do not know, while writing the software, what kind of quantity you are actually handling. This is not so true for computation kernels, but more for backend storage that abstractly hands the results of one computation over to some other computation, postprocessing tools or the like. Nevertheless, correct unit handling is indispensable.

B) The presentation layer consists of everything that is visible to the user, namely the GUI and the like. People in academia can hardly imagine what expectations engineers have towards the units of their data. (Before entering industry, I was no exception to that rule :-) ) This is partly due to historical reasons; the choice of units in the presentation layer may vary by corporate guidelines, personal preference, professional background of the user, or geographical location.
If you want to get by with a single piece of code, then you cannot live with fixed units for the quantities in your program. (The presentation layer, that is -- computation kernels are another story.) As another example, consider simply a nifty little graph plotting widget. Clearly, this widget must handle whatever unit may arrive. Furthermore, users may decide to add or multiply some channels or combine them by some other means. Therefore the arithmetic operations on units are needed at the runtime of the program.

All together, the library does not fit my needs or those of my present or previous coworkers. As mentioned, there are severe obstacles to applying it where it could be deployed, and it does not help where the real problems are. In my opinion, it simply does not go far enough. Template-based solutions have been known for many, many years now (the book of Barton & Nackman is the oldest I personally know of) but have not been widely adopted so far. In my opinion, this is not so much due to the lack of compiler support, but because the template-based approach can only be applied to some small portion of the issue, and thus can only be a minor (though not unimportant) building block of a solution.

I like the notion of a unit system as the means of choice for the internal representation of quantities (and have already taken that idea into my own project). On the other hand: the length of my desk is a physical quantity that exists independently of the unit it is expressed in. The numerical value of that quantity is only fixed for a given unit system, but this is an implementation detail, not a principal thought or definition. Thus I find the definition "A quantity is defined as a value of an arbitrary value type that is associated with a specific unit." somewhat irritating.

Some more details:
------------------

The author mentions the problem of distinguishing the units for energy and torque.
Both quantities result as the product of a length and a force but have completely different physical meanings. When reading the documentation, I got the impression that these two should be distinguishable somehow, but the author does not tell us of what type the product (2*meter)*(5*Newton) should actually be, nor how a compiler should tell the difference.

I also got the impression that the intent of the following code is not so obvious:

  quantity<force> F(2.0*newton);
  std::cout << "F = " << F << std::endl;

The author states that the intent is obviously

  F = 2 m kg s^(-2)

In contrast to the author, I personally would naturally expect something like:

  F = 2.0[N]

This issue is not half as superficial as it might seem at first glance. In fact, that is what the whole presentation layer discussion is all about.

I have yet to see any stringent need for rational exponents. They rather obscure the physical meaning of certain terms, IMHO.

To define all possible conversions between n unit systems, you will need about O(n*n) helper structs. I may be overlooking some more elaborate automatism, though. However, we are likely to face a large number of unit systems, as already the conversion of "cm" into "m" requires different systems.

The handling of temperatures (affine units in general) is correct but not very practical.

The executive summary:
----------------------

* What is your evaluation of the design?

I strongly miss concepts that help to create additional functionality instead of "only" error avoidance. To me, the semantics of quantities and units are not clear and do not conform well with those of NIST (National Institute of Standards and Technology: http://physics.nist.gov/cuu/Units/introduction.html).

* What is your evaluation of the implementation?
From the technical aspects, this library probably looks good. My objections focus on the semantic level.
* What is your evaluation of the documentation?

The documentation looks very good, though the "quick start" and the following pages are too much about the internal implementation and not helpful enough for those people who simply want to use the library.

* What is your evaluation of the potential usefulness of the library?

In its current form, it is of no direct use for me. I am sorry to say that. Basically, it is what it was originally written for: "An example of generic programming for dimensional analysis for the Boost mailing list". It was not designed with a focus on the problems of realistic large-scale scientific and engineering C++ programming. Nevertheless, I have already taken ideas from it.

* Did you try to use the library? With what compiler? Did you have any problems?

I briefly compiled some examples and tried to extend them. I had no technical difficulties using VC8. I rather had to struggle to find out, syntactically, how to do the things I wanted to. But that would only be a minor documentation issue.

* How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?

I thoroughly read the documentation, thought deeply about it, and briefly played around with the implementation.

* Are you knowledgeable about the problem domain?

Yes, definitely. I have been working on this topic for about 2 years now, analyzed our existing solutions, held meetings with different coworkers about the strengths and weaknesses of their existing solutions, and evaluated some other existing libraries. Because all were unsatisfactory, I finally started to go for my own solution a few months ago.

* Do you think the library should be accepted as a Boost library?

No, I do not think this library should be accepted as it is. I am sorry.

I apologize as well for the lengthy mail, but I hope my requirements get clearer that way.

Yours,
Martin Schulz.

--
Dr.
Martin Schulz (schulz@synopsys.com)
Software Engineer
Sigma-C Software AG / Synopsys GmbH, Thomas-Dehler-Str. 9, D-81737 Munich, Germany
Phone: +49 (89) 630257-79, Fax: -49
http://www.synopsys.com