Re: [boost] [Review] Quantitative units - compile time vs runtime

Hello everybody,

my background is in software development for scientific and engineering computing, facing complex applications in industry. Therefore, I may take the opposite stand here. For fairness, I must add that I am somewhat from the competition, as I have started my own little (not yet grown-up) project in this area.

Practically speaking, when developing such a technically minded piece of software, approximately 30% of the development effort goes into the computation kernel, whereas the remaining 70% goes into the overall infrastructure, business layer, presentation layer (batch, GUI, API, ...), interoperability, productizing etc. It is sad to say, but such is life. The proposed solution is only targeted towards the computation kernels.

As a matter of fact, you have to ensure that the right things get in, and the right things get out there. What happens inside, however, is a completely different story. Once you know for sure what you got, you take a pencil and a piece of paper and write down what you are going to implement. Once you are sure what you are doing, you go and implement it (using plain dimensionless floating point arithmetic). That has worked well for over 50 years now.

While a template-based solution might seem appealing and might even be applicable in some cases, there are many others where it is not. Consider for example solving an ordinary differential equation: a mechanical system described by locations and velocities, the excitation given by some forces depending on various factors. As long as you restrict yourself to explicit time integration, you may get by with type-encoded units. For implicit schemes, however, you will have to solve linear or nonlinear systems of equations, with various unknowns of different physical quantities (aka types) in a single vector (locations and velocities in the above case). This is not possible in C++ and thus the point where the approach of this library no longer holds. You can apply some fancy transforms beforehand to basically make all numbers dimensionless, but then you don't need a unit library anyway. Similar situations occur in all kinds of fluid dynamics, finite difference, finite element codes, dynamic simulation codes and the like.

I feel a far stronger need for a physical unit library in two other parts: namely the business logic and the presentation layer.

A) The business logic is about intermediate storage of computation results for the use by other parts of the software, probably written by other people, at other times, in other locations. Here you have to ensure that the units _really_ match in every possible case. And there are lots of cases possible. If they do not match, a conversion is usually acceptable, as it does not happen over and over again. We also talk about serialisation aspects here. Language-dependent output to files (as suggested for future development in the documentation) is an absolute no-go: otherwise the file saved by one engineer cannot be read by his colleague, only because they do not have the same language setting. Furthermore, many times you do not know while writing the software what kind of quantity you actually handle. This is not so true for computation kernels, but more for backend storage that abstractly hands results of one computation over to some other computation, some postprocessing tools or the like. Nevertheless, correct unit handling is indispensable.

B) The presentation layer consists of everything that is visible to the user, namely the GUI or the like.
People in academia can hardly imagine what expectations engineers might have towards the units of their data. (Before entering industry, I was no exception to that rule :-) ) This is partly due to historical reasons; the choice of units in the presentation layer may vary by corporate guidelines, personal preference, professional background of the user or geographical location. If you want to get by with a single piece of code, you cannot live with fixed units for the quantities in your program. (The presentation layer, that is -- computation kernels are another story.) As another example, consider simply a nifty small graph plotting widget. Clearly enough, this widget must handle whatever unit may arrive. Furthermore, users may decide to add or multiply some channels or combine them by some other means. Therefore the arithmetic operations for units are necessary at the runtime of the program.

Altogether, the library does not fit my needs or those of my present or previous coworkers. As mentioned, there are severe obstacles to the application where it could be deployed, and it does not help where the real problems are. In my opinion, it simply does not go far enough. Template-based solutions have been known for many, many years by now (the book of Barton & Nackman is the oldest I personally know of) but have not been widely adopted until now. In my opinion, this is not so much due to the lack of compiler support, but because the template-based approach can only be applied to some small portion of the issue, and thus can only be a minor (though not so unimportant) building block of a solution.

I like the notion of a unit system as a means of choice for the internal representation of the quantities (and have already taken that idea to my own project). On the other hand: the length of my desk is a physical quantity that exists independently of the unit it is expressed in. The numerical value of that quantity is only fixed for a given unit system, but this is an implementation detail, not a principal thought or definition. Thus I find the definition "A quantity is defined as a value of an arbitrary value type that is associated with a specific unit." somewhat irritating.

Some more details:
------------------

The author mentions the problem of distinguishing the units for energy and torque. Both quantities result as a product of a length and a force but have completely different physical meanings. When reading the documentation, I got the impression that these two should be distinguishable somehow, but the author does not tell us of what type the product (2*meter)*(5*Newton) should actually be and how a compiler should tell the difference.

I got the impression that the intent of the following code is not so obvious:

    quantity<force> F(2.0*newton);
    std::cout << "F = " << F << std::endl;

The author states that the intent is obviously

    F = 2 m kg s^(-2)

In contrast to the author, I personally would naturally expect something like:

    F = 2.0[N]

This issue is not half as superficial as it might seem at first glance. In fact, that is what the whole presentation layer stuff is all about.

I have yet to see any stringent need for rational exponents. They rather obscure the physical meaning of certain terms, imho.

To define all possible conversions of n unit systems, you'll need about O(n*n) helper structs to be defined. I may be overlooking some more elaborate automatism, though. However, we are likely to face a larger number of unit systems, as already the conversion of "cm" into "m" requires different systems.
The handling of temperatures (affine units in general) is correct but not very practical.

The executive summary:
----------------------

* What is your evaluation of the design?

I am strongly missing concepts that help to create additional functionality instead of "only" error avoidance. To me the semantics of quantities or units is not clear, and does not conform well with that of NIST (National Institute of Standards and Technology: http://physics.nist.gov/cuu/Units/introduction.html).

* What is your evaluation of the implementation?
From the technical aspect, this library is probably good. My objections focus on the semantic level.
* What is your evaluation of the documentation?

The documentation looks very good, though the "quick start" and the following pages are too much about the internal implementation and not helpful enough for those people who simply want to use the library.

* What is your evaluation of the potential usefulness of the library?

In the current form, it is of no direct use for me. I am sorry to say that. Basically, it is what it was originally written for: "An example of generic programming for dimensional analysis for the Boost mailing list". It was not designed with focus on the problems of realistic large-scale scientific and engineering C++ programming. Nevertheless, I already took ideas from it.

* Did you try to use the library? With what compiler? Did you have any problems?

I briefly compiled some examples and tried to extend them. I had no technical difficulties using VC8. I rather had to struggle to find out syntactically how I should do the things I wanted to. But that would be only a minor documentation issue.

* How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?

I thoroughly read the documentation, thought deeply about it, and briefly played around with the implementation.

* Are you knowledgeable about the problem domain?

Yes, definitely. I have been working on this topic for about two years now, analyzed our existing solutions, held meetings with different coworkers about the strengths and weaknesses of their existing solutions, and evaluated some other existing libraries. Because all were unsatisfactory, I finally started to go for my own solution a few months ago.

* Do you think the library should be accepted as a Boost library?

No, I don't think this library should be accepted as it is. I am sorry. I apologize as well for the lengthy mail, but I hope my requirements get clearer that way.

Yours, Martin Schulz.

--
Dr. Martin Schulz (schulz@synopsys.com)
Software Engineer
Sigma-C Software AG / Synopsys GmbH, Thomas-Dehler-Str.9, D-81737 Munich, Germany
Phone: +49 (89) 630257-79, Fax: -49
http://www.synopsys.com

AMDG Martin Schulz <Martin.Schulz <at> synopsys.com> writes:
Hello everybody,
my background is in software development for scientific and engineering computing, facing complex applications in industry. Therefore, I may take the opposite stand here. For fairness, I must add that I am somewhat from the competition, as I have started my own little (not yet grown-up) project in this area.
Practically speaking, when developing such a technically minded piece of software, approximately 30% of the development effort goes into the computation kernel, whereas the remaining 70% goes into the overall infrastructure, business layer, presentation layer (batch, GUI, API, ...), interoperability, productizing etc. It is sad to say, but such is life.
<snip>
I feel a far stronger need for a physical unit library in two other parts: namely the business logic and the presentation layer.
<snip>
I'm sorry, I don't understand what you are getting at. Would something like this be more acceptable?

    unit meters(_symbol = "m", _conversion_factor_to_si = 1);
    unit kilograms(_symbol = "kg", _conversion_factor_to_si = 1);
    unit seconds(_symbol = "s", _conversion_factor_to_si = 1);
    unit joules(kilograms * meters * meters / (seconds * seconds),
                _symbol = "J", _name = "Joule");
    unit hours(_name = "hour", _conversion_factor_to_si = 1/(3600.0));
    unit kilowatt_hours(joules / seconds * hours, _name = "kilowatt-hour");

    quantity<double> energy(2.89 * joules);

Or is it primarily IO that you want?

    istringstream stream("1.29038761 J");
    quantity<double> energy;
    stream >> energy;
    cout << symbol << energy << endl; // 1.29038761 J
    cout << name << energy << endl;   // 1.29038761 Joule
    assert(energy.value() == 1.29038761);
    assert(energy.unit() == joules);
    energy.convert_to(kilowatt_hours);
    cout << name << energy << endl;   // ... kilowatt-hours
I like the notion of a unit system as a means of choice for the internal representation of the quantities (and have already taken that idea to my own project). On the other hand: the length of my desk is a physical quantity that exists independently of the unit it is expressed in. The numerical value of that quantity is only fixed for a given unit system, but this is an implementation detail, not a principal thought or definition. Thus I find the definition "A quantity is defined as a value of an arbitrary value type that is associated with a specific unit." somewhat irritating.
Trying to give a quantity an absolute meaning leads to defining everything in terms of SI.
Some more details: ------------------
<snip>
I got the impression that the intent of the following code is not so obvious:

    quantity<force> F(2.0*newton);
    std::cout << "F = " << F << std::endl;

The author states that the intent is obviously

    F = 2 m kg s^(-2)

In contrast to the author, I personally would naturally expect something like:

    F = 2.0[N]

This issue is not half as superficial as it might seem at first glance. In fact, that is what the whole presentation layer stuff is all about.
I agree. The output that we have is not really useful for anything other than debugging.
I have yet to see any stringent need for rational exponents. They rather obscure the physical meaning of certain terms, imho.
They are present because someone needed them.
To define all possible conversions of n unit systems, you'll need about O(n*n) helper structs to be defined. I may be overlooking some more elaborate automatism, though. However, we are likely to face a larger number of unit systems, as already the conversion of "cm" into "m" requires different systems.
Yes. That is a problem. There is no way to avoid it that I know of. Suppose that you need meters, feet, furlongs, and miles. Then you will need 12 conversion factors. Now, it would be possible for the library to automatically work out meters->feet, e.g., if feet->meters is already defined, but that still requires O(n*n) specializations. Unless the library makes one unit special, there is no way to reduce this to O(n).
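For illustration, the pairwise helpers could look roughly like this (a minimal sketch with hypothetical names, not the library's actual conversion API):

    // hypothetical tag types standing in for two unit systems
    struct meter_system {};
    struct foot_system {};

    // one helper specialization per ordered pair of systems: O(n*n) total
    template<class From, class To> struct conversion_helper;

    template<>
    struct conversion_helper<meter_system, foot_system> {
        static double factor() { return 3.2808399; } // feet per meter
    };

    template<>
    struct conversion_helper<foot_system, meter_system> {
        static double factor() { return 0.3048; }    // meters per foot
    };

Adding a fifth system to the four above means writing eight more such specializations.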
The handling of temperatures (affine units in general) is correct but not very practical.
The executive summary: ----------------------
<snip>
Thanks for your comments. In Christ, Steven Watanabe

Practically speaking, when developing such a technically minded piece of software, approximately 30% of the development effort goes into the computation kernel, whereas the remaining 70% goes into the overall infrastructure, business layer, presentation layer (batch, GUI, API, ...), interoperability, productizing etc. It is sad to say, but such is life.
A Boost library isn't mandated to completely solve all aspects of a given problem domain for all potential users. I could easily have written a comparably negative review of, for example, GIL because it doesn't meet my specific needs. However, it appears to be fine for the purposes that many users intend to use it for. Similarly, one could argue that the Boost quaternion/octonion library is too restricted - why not implement complete Clifford algebras instead? And, as has been noted, that library also does not provide transparent support for 3D graphics applications, which are the most likely candidates for real-world use of quaternions.

In any case, like most Boost authors, I wrote a library that was useful for my applications - in research, I spend basically all my time implementing and testing new equations and have little or no use for GUI features. I certainly am not arguing that what you're asking for is not useful - but if you wanted to submit a runtime unit library to Boost, should I reject it because it incurs runtime overhead that is not acceptable to me? At this point, unfortunately, a unit library either does or does not incur this runtime overhead - for some users it is acceptable and for others it is not.

I would be delighted to see you put forth your library as a complementary contribution to Boost and help ensure interoperability. I don't really understand why this particular topic appears to have become a zero-sum game...
to implement. Once you are sure what you are doing, you go and implement it (using plain dimensionless floating point arithmetic). That has worked well for over 50 years now.
The Mars Climate Orbiter team would disagree with this assessment... http://www.space.com/news/orbiter_error_990930.html
apply some fancy transforms beforehand to basically make all numbers dimensionless, but then you don't need a unit library anyway. Similar situations occur in all kinds of fluid dynamics, finite difference, finite element codes, dynamic simulation codes and the like.
Ensuring that an equation is dimensionless is sufficient in all cases...
the units _really_ match in every possible case. And there are lots of cases possible. If they do not match, a conversion is usually acceptable, as it does not happen over and over again. We also talk about serialisation aspects here. Language-dependent output to files (as suggested for future development in the documentation) is an absolute no-go: otherwise the file saved by one engineer cannot be read by his colleague, only because they do not have the same language setting.
The current submission fully supports Boost.Serialization which provides language-independent input and output of units and quantities.
Furthermore, many times you do not know while writing the software what kind of quantity you actually handle. This is not so true for computation kernels, but more for backend storage that abstractly hands results of one computation over to some other computation, some postprocessing tools or the like. Nevertheless, correct unit handling is indispensable.
If you are going to work in a fixed internal unit system, you can use explicit unit conversion through the constructor to do user input.
distinguishable somehow, but the author does not tell us of what type the product (2*meter)*(5*Newton) should actually be and how a compiler should tell the difference.
2 is a scalar value and 5 is a scalar value, so 10*Newton*meter is an energy. Torque is a pseudovector, so two value types whose product forms a pseudovector would result in a torque. No library can substitute for a complete understanding of the problem domain.
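To sketch what the value-type distinction might look like (hypothetical value types - the point is only that the value type, not the dimension, carries the distinction):

    struct vec3  { double x, y, z; }; // polar vector (force, lever arm)
    struct pvec3 { double x, y, z; }; // pseudovector (axial vector)

    // the cross product of two polar vectors is a pseudovector
    pvec3 cross(const vec3& a, const vec3& b)
    {
        pvec3 r = { a.y*b.z - a.z*b.y,
                    a.z*b.x - a.x*b.z,
                    a.x*b.y - a.y*b.x };
        return r;
    }

    // both carry the dimension N*m, but the value types differ:
    // quantity<SI::energy, double> work;   // scalar -> an energy
    // quantity<SI::energy, pvec3>  torque; // pseudovector -> a torque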
The author states that the intent is obviously

    F = 2 m kg s^(-2)

In contrast to the author, I personally would naturally expect something like:

    F = 2.0[N]
It would be relatively easy to generate specializations for unit output that address the named derived units in the SI system. However, in general a unit may be expressed in many ways if you allow a non-orthogonal basis, so there is no general solution that can unambiguously assign a set of fundamental and derived units to it. You could possibly reduce it to the minimal combination of fundamental and derived units, but this may not give the user what they expect, either...
I have yet to see any stringent need for rational exponents. They rather obscure the physical meaning of certain terms, imho.
Some people need them.
To define all possible conversions of n unit systems, you'll need about O(n*n) helper structs to be defined. I may be overlooking some more elaborate automatism, though. However, we are likely to face a larger number of unit systems, as already the conversion of "cm" into "m" requires different systems.
There is no way around this if you want to avoid having a common unit system through which all conversions occur. The reasons why that is a bad idea have been exhaustively discussed in previous conversations in this mailing list on the topic. Furthermore, we may want to only allow implicit conversions to go in one direction, so two specializations are needed for each pair of units.
The handling of temperatures (affine units in general) is correct but not very practical.
I'd be happy to hear of a more practical solution that preserves the zero runtime overhead of the library.
From the technical aspect, this library is probably good. My objections focus on the semantic level.
Thanks for your input. Matthias

Matthias, I'll split my answer in two parts. The more technical will follow.
A Boost library isn't mandated to completely solve all aspects of a given problem domain for all potential users.
Maybe the whole conflict is only due to fundamentally different expectations. When I read the introduction, full of words like "zero runtime overhead", "in a general and extensible manner", "generic", "arbitrary unitsystem models", "arbitrary value types", "general facility", "fine-grained control", "complete SI and CGS systems" etc., then I (and I suppose others as well) get the impression that this library certainly is the definitive unit library that claims to solve any unit conversion problem under the sun. Generally speaking, Boost libraries have a very high reputation for actually solving the problem they claim to solve. But this one does not live up to my personal expectations raised by that introduction. This is clear to me, but I am afraid others may take some time to discover that fact the hard way. Maybe not the library, but just the expectation is wrong.

To avoid such misconceptions, please give your potential users the chance to adjust their expectations beforehand. I'd suggest adding to the documentation a section like the following (not exhaustive, you will probably add other points that are important to you), clearly stating what can and what cannot be expected:

"Domain of application and restrictions

This library enables the user to specify the exact physical dimension of any scalar or fixed aggregate quantity by encoding the dimension in the respective type. The type system is able to deduce the internal type of a physical quantity from arithmetic expressions involving other quantities fixed at compile time. For storage in variables, however, the user of the library will need to specify the exact type. It is believed that the library is most useful when this type system is applied to the whole program.

Since the library does not allow for mixed storage of quantities, e.g. in vectors, further workarounds will be necessary when dealing with linear algebra like BLAS, LAPACK or other advanced numerical libraries.

Because the physical dimensions of all quantities are fixed at compile time, there is no support for code that acts on certain quantities regardless of what these quantities actually represent, i.e. quantities specified at runtime of the software. This includes GUI frameworks, post-processing or visualization tools.

The library provides support for language-independent serialization by the use of the Boost.Serialization library. No attempt is made to ensure that the units of the deserialized quantities match those of the serialized ones.

No attempt is made to provide formatting for quantities in human-readable form (beyond debugging output). No attempt is made to provide parsing of unit input in human-readable form."

Does this reflect the admissible expectations more closely?

Yours, Martin.

On 3/28/07, Martin Schulz <Martin.Schulz@synopsys.com> wrote:
Since the library does not allow for mixed storage of quantities, e.g. in vectors, further workarounds will be necessary when dealing with linear algebra like BLAS, LAPACK or other advanced numerical libraries.
Wouldn't the run-time library suffer even more restrictions when it came to BLAS, LAPACK, etc.? Those functions want a float or double pointer that points to a contiguous array (the functions do not have a stride parameter), as well as an array size. quantity<abstract> will have to be more than a simple value if you want mixed storage and run-time conversions. It requires polymorphism, or carrying around an extra 'unit type' variable. The run-time library does allow things like:

    vector<quantity<abstract> > quantities;

but I really don't see how interfacing with BLAS et al. would be possible directly. The compile-time version, on the other hand, DOES interface directly if the vector is full of, say, quantity<SI::length>, since those are simply doubles.
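For instance, handing such a vector to a C BLAS routine could look like this (a sketch; it assumes quantity<SI::length> holds exactly one double, which holds in practice but is not guaranteed by the C++ standard):

    #include <vector>
    #include <cblas.h> // assuming a CBLAS implementation is available

    void scale_lengths()
    {
        std::vector<quantity<SI::length> > x(100, 1.0 * meters);

        // reuse the quantities as a bare contiguous double array
        // and scale every length in place by 2.0
        cblas_dscal(static_cast<int>(x.size()), 2.0,
                    reinterpret_cast<double*>(&x[0]), 1);
    }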
Because the physical dimensions of all quantities are fixed at compile time, there is no support for code that acts on certain quantities regardless of what these quantities actually represent, i.e. quantities specified at runtime of the software. This includes GUI frameworks, post-processing or visualization tools.
The current version does not preclude use with any of those. Currently I have a GUI that accepts meters, nautical miles, miles, and feet, depending on what the user wants to input his values in. It processes that input with other values gathered from various databases (feet and nautical miles) and then displays the results to a window rendering with OpenGL (so I had to convert the values to OpenGL units).
The library provides support for language-independent serialization by the use of the Boost.Serialization library. No attempt is made to ensure that the units of the deserialized quantities match those of the serialized ones.
No attempt is made to provide formatting for quantities in human-readable form (beyond debugging output).
No attempt is made to provide parsing of unit input in human-readable form."
I look at all of those as huge wins for this submission. The can of worms you open up when trying to solve the I/O issues with units and locales is huge, and IMHO warrants a separate, interoperable submission. --Michael Fawcett

Martin, Thanks for your comments.
But this one does not live up to my personal expectations raised by that introduction. This is clear to me, but I am afraid others may take some time to discover that fact the hard way. Maybe not the library, but just the expectation is wrong. To avoid such misconceptions, please give your potential users the chance to adjust their expectations beforehand.
This is a good suggestion; as you point out, there is no benefit in having the expectations of potential users set beyond the scope of problems that the library actually addresses. I will add a clarifying section on "Domain of application and restrictions" to the documentation as you suggest.
Since the library does not allow for mixed storage of quantities, e.g. in vectors, further workarounds will be necessary when dealing with linear algebra like BLAS, LAPACK or other advanced numerical libraries.
Perhaps Boost.Any provides a solution to this issue? This code compiles OK:

    boost::numeric::ublas::vector<boost::any> v(3);
    v(0) = boost::any(quantity<SI::length>(3.0*meters));
    v(1) = boost::any(quantity<SI::mass>(3.0*kilograms));
    v(2) = boost::any(quantity<SI::time>(3.0*seconds));

I honestly don't know how frequently the problem of containers of heterogeneous quantities will arise, but I understand that there may be situations where this is the case.
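(Getting an element back out again requires naming its static type, e.g.

    quantity<SI::length> l =
        boost::any_cast<quantity<SI::length> >(v(0));

which throws boost::bad_any_cast if the element actually holds a different quantity type.)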
The library provides support for language-independent serialization by the use of the Boost.Serialization library. No attempt is made to ensure that the units of the deserialized quantities match those of the serialized ones.
Clearly, there is a limit to what one can accomplish when converting from a data stream of raw bytes to a class. Furthermore, the overhead involved in storing specific type information on units and quantities could be significant, so that storing a large array of quantities would occupy dramatically more disk space than the corresponding array of value types. This problem is a general one, anyway: I don't know of any Boost libraries that provide type-checked serialization.
No attempt is made to provide formatting for quantities in human-readable form (beyond debugging output).
At present this is true. However, I believe that extending the formatting options is a reasonable request and will be given serious consideration for future incarnations of the library. This would include IO manipulators and facets to control human language dependent output.
No attempt is made to provide parsing of unit input in human-readable form."
This is true at present. Since, as you are already aware, the unit must be specified at compile time, this would provide minimal utility at present. However, we would be delighted to work with you to ensure compatibility and interoperability if you are interested in implementing a library for handling runtime units.

Ultimately, as many if not most participants in Boost are volunteers doing the work on their own time, expecting complete solutions to significant problems like those posed by units to be provided in monolithic form will, I believe, just end up preventing anything at all from being accomplished in the problem domain. The process functions best via collaboration, with multiple individuals contributing their expertise to solve problems within their domain of experience and interest.

Regards, Matthias

AMDG Matthias Schabel <boost <at> schabel-family.org> writes:
I don't know of any Boost libraries that provide type checked serialization.
If this is done at all it should be done by using an archive that stores the type of each element. In Christ, Steven Watanabe

Since the library does not allow for mixed storage of quantities, e.g. in vectors, further workarounds will be necessary when dealing with linear algebra like BLAS, LAPACK or other advanced numerical libraries.
Perhaps Boost.Any provides a solution to this issue? This code compiles OK:
BLAS does not digest boost::anys. It crunches numbers: bare-bones floating point numbers, in single precision and contiguous memory for best performance.

I like the picture of a conveyor belt for the transportation of my quantities: near the end of the belt, there is a last check that everything is in correct shape (aka unit) and then - off it goes: undressed of all other attributes, the naked numbers get fed into that horrible number-crunching monster machinery. After getting out on the other side, they will get nicely dressed again and released into the shiny world anew. :-)

Consequently, I consider a separation of concerns here: to the number-crunching monster, it should look like a POD floating point field. To the outside, we need to dress these numbers with their further attributes. That could be done by some proxy or facade object that models a quantity but only holds a reference to the actual location of the numerical value instead of the numerical value itself. Those proxy objects could be either static or dynamic in nature, just as needed. A separate object for each number would clearly incur a noticeable overhead. To keep that manageable, it appears quite practicable to provide one such object for larger parts of the floating point field, implementing the usual sequence semantics.

(More gentle minds do not feed those cute quantities to a monster; they perform a conversion into a dimensionless system beforehand.)
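Such a facade might look roughly like this (a hypothetical sketch; runtime_unit stands in for whatever runtime unit descriptor one settles on):

    #include <cstddef>

    struct runtime_unit; // hypothetical runtime unit descriptor

    // Dresses a contiguous slice of the solver's raw double field
    // with a unit, without copying or owning the numbers.
    class quantity_view {
        double*             data_; // points into the POD field
        std::size_t         size_;
        const runtime_unit* unit_;
    public:
        quantity_view(double* d, std::size_t n, const runtime_unit* u)
            : data_(d), size_(n), unit_(u) {}

        double*             raw() const  { return data_; } // fed to the monster
        std::size_t         size() const { return size_; }
        const runtime_unit* unit() const { return unit_; }
    };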
Clearly, there is a limit to what one can accomplish when converting from a data stream of raw bytes to a class.
I would have expected an attempt to catch at least the major sources of errors. Otherwise you end up with the same kind of vulnerability as the Mars Climate Orbiter teams - this time not on the level of subroutine calls, but of files written by one program and read by another one. For example, you could check whether the stringified unit of the quantity that is to be deserialized is the same as the stringified unit of the quantity that was previously serialized. I.e. something along the lines of:

    template<class Archive, class Unit, class Y>
    inline void serialize(Archive& ar,
                          boost::units::quantity<Unit, Y>& q,
                          const unsigned int version)
    {
        static const std::string s1 = stringify(Unit());
        std::string s2 = s1; // written on save, overwritten on load
        ar & q.value();
        ar & s2;
        if (s1 != s2) throw incompatibility_exception(q, s2);
    }
Furthermore, the overhead involved in storing specific type information on units and quantities could be significant, so that storing a large array of quantities would occupy dramatically more disk space than the corresponding array of value types.
Clearly, safety does not come for free. However, in the case of a large vector of quantities of the same type (the only one you considered so far), the unit needs to be checked only once for the whole vector, not repeatedly for each and every element therein (much like the sequence semantics I mentioned above). That should alleviate the associated overhead sufficiently to become manageable again.
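A sketch of that idea, reusing the hypothetical names (and the q.value() convention) from the sketch above:

    template<class Archive, class Unit, class Y>
    void serialize_vector(Archive& ar,
                          std::vector<boost::units::quantity<Unit, Y> >& v)
    {
        static const std::string s1 = stringify(Unit());
        std::string s2 = s1;          // unit string stored/checked once
        ar & s2;
        if (s1 != s2) throw incompatibility_exception(v, s2);

        std::size_t n = v.size();
        ar & n;
        v.resize(n);
        for (std::size_t i = 0; i < n; ++i)
            ar & v[i].value();        // raw values only, no per-element unit
    }

Yours, Martin.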

AMDG Martin Schulz <Martin.Schulz <at> synopsys.com> writes:
    template<class Archive, class Unit, class Y>
    inline void serialize(Archive& ar,
                          boost::units::quantity<Unit, Y>& q,
                          const unsigned int version)
    {
        static const std::string s1 = stringify(Unit());
        std::string s2 = s1;
        ar & q.value();
        ar & s2;
        if (s1 != s2) throw incompatibility_exception(q, s2);
    }
It would be better to make an archive that checks the types of all the elements. In Christ, Steven Watanabe

The Mars Climate Orbiter team would disagree with this assessment...
Thank you for the link; that article nicely illustrates what I meant when I said: "As a matter of fact, you have to ensure that the right things get in, and the right things get out there [the kernel]. What happens inside however is a completely different story."

If we may believe the article, that software falls into at least two parts (say, "kernels", "modules", "packages", "libraries", ...) connected by some interface. While each team did proper handling within their part, they did not make sure that they really got (kilometers) what they expected to get (miles). The issue is particularly tricky as, at the same time, all modules probably passed their respective quality assurance tests without any failure...
Ensuring that an equation is dimensionless is sufficient in all cases...
But then again, there would be no need for a unit library after all. But since we both feel the need for such a library, we seem to agree that working in dimensionless quantities will not be the overall solution. (Though it is certainly very helpful in specific situations.)
of what type the product (2*meter)*(5*Newton) should actually be and how a compiler should tell the difference.
2 is a scalar value and 5 is a scalar value, so 10*Newton*meter is an energy. Torque is a pseudovector, so two value types whose product forms a pseudovector would result in a torque.
Well, basically what makes the difference in my understanding is nothing more or less than whether the distance and the force are considered perpendicular or collinear to each other - the "to each other" being the key words here. But then again, how could that relation be encoded independently in the types of the two scalar quantities (2*meter) and (5*newton)?
No library can substitute for a complete understanding of the problem domain.
Certainly true, but we draw different conclusions. While you insist on making a distinction, I came to the conclusion that both torque and energy should have the very same internal representation. The interpretation of that internal representation then clearly requires an understanding of the domain.
The author states that the intent is obviously

    F = 2 m kg s^(-2)

In contrast to the author, I personally would naturally expect something like:

    F = 2.0[N]
It would be relatively easy to generate specializations for unit output that address the named derived units in the SI system.
I am afraid the problem is deeper than that. IMHO, and contrary to ad hoc expectations, the mapping of physical quantities to products of exponentials of basic units is not injective; that is, the mapping is not invertible. (The dimension s^(-1), for instance, belongs to both hertz and becquerel.) So the mapping generated by the suggested specializations can at best represent a kind of "maximum likelihood solution".
rational exponents....
Some people need them.
Do you know about their background? Could you please elaborate on that? Yours, Martin.

On 3/26/07, Martin Schulz <Martin.Schulz@synopsys.com> wrote:
B) The presentation layer consists of everything that is visible to the user, namely the GUI or the like. People in academia can hardly imagine what expectations engineers might have towards the units of their data. (Before entering industry, I was no exception to that rule :-) ) This is partly due to historical reasons; the choice of units in the presentation layer may vary by corporate guidelines, personal preference, professional background of the user or geographical location. If you want to get by with a single piece of code, you cannot live with fixed units for the quantities in your program. (The presentation layer, that is -- computation kernels are another story.)
I'm a little unclear on why some people think run-time support is required for robust interaction with GUIs. It could very well be a misunderstanding on my part, but with a user interface that allows arbitrary units, wouldn't you still need something like this:

    // Assume quantity<abstract> provides run-time support
    quantity<abstract> on_dialog_ok()
    {
        int selection = get_combobox_selection(ID_COMBO_DISTANCE);
        double value = get_editbox_value<double>(ID_EDIT_DISTANCE);
        quantity<abstract> distance;

        // This must be kept in sync with the dialog resource
        switch (selection)
        {
        case Meters: distance.reset(value * meters); break;
        case Feet:   distance.reset(value * feet);   break;
        // etc
        }
        return distance;
    }

FWIW, the compile-time version of the above function looks very similar. You still have to have a case for each type; the only difference is that you have to decide during implementation what unit to convert everything to. Granted, the run-time version allows things like:

    // Note: conversion takes place behind the scenes here,
    // and you don't know which way the conversion happens.
    // Is this comparison done in meters or in feet?
    if (distance_in_meters > distance_in_feet)

(Aside: not that I have any understanding of the complexities involved, but wouldn't it be cool if the run-time component automatically chose the unit that would maximize precision? If it knew an overflow wouldn't happen, wouldn't doing the arithmetic at the largest level possible always be better? E.g. feet + meters would be done in meters, feet + nanometers would be done in feet.)
From my understanding, the run-time component allows mixed unit arithmetic. It allows for a nice syntax, but if one wants control over which way the conversion takes place, the user will still need to resort to being explicit (casts or constructors), at which point he has the same (or very similar) syntax as the compile-time version, except he's paying a run-time cost as well.
I'm not saying that a run-time version isn't needed, just that I don't understand the use cases yet, nor do I understand why a compile-time component isn't useful enough to be included in Boost as is.
As another example, consider simply a nifty small graph plotting widget. Clearly enough, this widget must handle whatever unit may arrive. Furthermore, users may decide to add or multiply some channels or combine them by some other means. Therefore the arithmetic operations for units are necessary at the runtime of the program.
Why wouldn't that widget use some base underlying unit and just convert all incoming values? --Michael Fawcett

Michael Fawcett wrote:
On 3/26/07, Martin Schulz <Martin.Schulz@synopsys.com> wrote:
B) The presentation layer consists of everything that is visible to the user, namely the GUI or the like. People in academia can hardly imagine what expectations engineers might have towards the units of their data. (Before entering industry, I was no exception to that rule :-) ) This is partly due to historical reasons; the choice of units in the presentation layer may vary by corporate guidelines, personal preference, professional background of the user or geographical location. If you want to get by with a single piece of code, you cannot live with fixed units for the quantities in your program. (The presentation layer, that is -- computation kernels are another story.)
I'm a little unclear on why some people think run-time support is required for robust interaction with GUIs. It could very well be a misunderstanding on my part, but with a user interface that allows arbitrary units, wouldn't you still need something like this:
    // Assume quantity<abstract> provides run-time support
    quantity<abstract> on_dialog_ok()
    {
        int selection = get_combobox_selection(ID_COMBO_DISTANCE);
        double value = get_editbox_value<double>(ID_EDIT_DISTANCE);
        quantity<abstract> distance;

        // This must be kept in sync with the dialog resource
        switch (selection)
        {
        case Meters: distance.reset(value * meters); break;
        case Feet:   distance.reset(value * feet);   break;
        // etc
        }
        return distance;
    }
Nope. Your dialog manager would keep track of the unit and set it appropriately upon initialization and/or user interaction. Your reader would then be as simple as setting the value directly on the quantity. Alternatively you might have something like this:

    quantity<abstract> on_dialog_ok()
    {
        quantity<abstract> dist;
        dist.set_unit(combo.selected_item().data());
        dist.set_value(entry.value().as_double());
        return dist;
    }

No more switch statement smell.

Nope. Your dialog manager would keep track of the unit and set it appropriately upon initialization and/or user interaction. Your reader would then be as simple as setting the value directly on the quantity. Alternatively you might have something like this:

    quantity<abstract> on_dialog_ok()
    {
        quantity<abstract> dist;
        dist.set_unit(combo.selected_item().data());
        dist.set_value(entry.value().as_double());
        return dist;
    }

No more switch statement smell.
Yes, but then again, the unit of "dist" cannot be hardcoded in its C++ type anymore. I may be missing something, but I do not see how the above code could be made to work with the library under review. Yours, Martin.

Martin Schulz wrote:
Nope. Your dialog manager would keep track of the unit and set it appropriately upon initialization and/or user interaction. Your reader would then be as simple as setting the value directly on the quantity. Alternatively you might have something like this:

    quantity<abstract> on_dialog_ok()
    {
        quantity<abstract> dist;
        dist.set_unit(combo.selected_item().data());
        dist.set_value(entry.value().as_double());
        return dist;
    }

No more switch statement smell.
Yes, but then again, the unit of "dist" cannot be hardcoded in its C++ type anymore. I may be missing something, but I do not see how the above code could be made to work with the library under review.
Yeah, it doesn't. That isn't what is being discussed. What is being discussed is the lack of support for this feature, hence discussing how the feature would work is important.

On 3/27/07, Noah Roberts <roberts.noah@gmail.com> wrote:
Nope. Your dialog manager would keep track of the unit and set it appropriately upon initialization and/or user interaction. Your reader would then be as simple as setting the value directly on the quantity. Alternatively you might have something like this:

    quantity<abstract> on_dialog_ok()
    {
        quantity<abstract> dist;
        dist.set_unit(combo.selected_item().data());
        dist.set_value(entry.value().as_double());
        return dist;
    }
I see. Would the implementation allocate a derived class based on the unit type, or just dispatch during arithmetic based on unit type? I suppose it works out to about the same penalty either way: virtual function or if statement for every arithmetic operation. It would probably be a lot cleaner to go the derived class route, though.
No more switch statement smell.
I agree on the smell... Thankfully it's very contained when using the compile-time version. It seems like with a run-time component that was interoperable with the compile-time one, you could have zero run-time overhead during computations as well as get rid of all switch statements at the boundaries (reading/writing from database or GUI). Something like:

    quantity<SI::length> on_dialog_ok()
    {
        quantity<abstract> dist;
        dist.set_unit(combo.selected_item().data());
        dist.set_value(entry.value().as_double());
        return quantity<SI::length>(dist);
    }

--Michael Fawcett

AMDG Michael Fawcett <michael.fawcett <at> gmail.com> writes:
<snip>
No more switch statement smell.
I agree on the smell... Thankfully it's very contained when using the compile-time version. It seems like with a run-time component that was interoperable with the compile-time one, you could have zero run-time overhead during computations as well as get rid of all switch statements at the boundaries (reading/writing from database or GUI). Something like:

    quantity<SI::length> on_dialog_ok()
    {
        quantity<abstract> dist;
        dist.set_unit(combo.selected_item().data());
        dist.set_value(entry.value().as_double());
        return quantity<SI::length>(dist);
    }
Here is how I would do it:

    quantity<SI::length> on_dialog_ok()
    {
        double conversion = combo.selected_item().data();
        double value = entry.value().as_double();
        return quantity<SI::length>::from_value(conversion * value);
    }

Absolutely minimum overhead.

In Christ, Steven Watanabe

On 3/28/07, Steven Watanabe <steven@providere-consulting.com> wrote:
Here is how I would do it:

    quantity<SI::length> on_dialog_ok()
    {
        double conversion = combo.selected_item().data();
        double value = entry.value().as_double();
        return quantity<SI::length>::from_value(conversion * value);
    }
There may be some confusion (again) on my part as to what combo.selected_item().data() is returning, and how it's useful to the quantity directly. I was assuming from Noah's post that it was either the integer index of the currently selected item in the combo box, or the string of the currently selected item. I can see how both of those could be immediately useful to the run-time quantity: it would either know that the number, say 5, mapped to feet, or it would be passed "feet". What does your version do exactly? What is combo.selected_item().data() returning, why is it a double, and how does from_value use it? --Michael Fawcett

AMDG Michael Fawcett <michael.fawcett <at> gmail.com> writes:
On 3/28/07, Steven Watanabe <steven <at> providere-consulting.com> wrote:
Here is how I would do it:

    quantity<SI::length> on_dialog_ok()
    {
        double conversion = combo.selected_item().data();
        double value = entry.value().as_double();
        return quantity<SI::length>::from_value(conversion * value);
    }
There may be some confusion (again) on my part as to what combo.selected_item().data() is returning, and how it's useful to the quantity directly. I was assuming from Noah's post that it was either the integer index of the currently selected item in the combo box, or the string of the currently selected item. I can see how both of those could be immediately useful to the run-time quantity: it would either know that the number, say 5, mapped to feet, or it would be passed "feet".
What does your version do exactly? What is combo.selected_item().data() returning, why is it a double, and how does from_value use it?
--Michael Fawcett
The confusion could be on my part. I was thinking that the combo box would allow arbitrary data to be associated with each item. If this is not the case, then you need either a map or a switch statement regardless of whether the units are compile time or runtime. In Christ, Steven Watanabe

Steven Watanabe wrote:
The confusion could be on my part. I was thinking that the combo box would allow arbitrary data to be associated with each item. If this is not the case, then you need either a map or a switch statement regardless of whether the units are compile time or runtime.
Well, not arbitrary, but some data that is meaningful to the code using it and that is not displayed to the user. In my case I was thinking of a unit type that exists solely for providing conversions. Not always is that a multiplication: for instance, a gage pressure adds an extra value on top of any conversion when converted from an absolute one.

There are also cases, though they may be rather questionable, in which two disparate dimensions are used for the same kind of thing. For instance, a user can often request that pressures be reported as a "hydrostatic" pressure, which is a length. A hydrostatic unit might contain the information necessary to do the dimensional conversion to be used in a pressure quantity. For this reason the unit type cannot be a primitive if the library is to encompass this use, which I think it should.

Also, your idea has problems when you account for rounding. Rounding is often done before you display data to a user. In cases when this data can come from a calculation or from the user, it is hard to decide when to round and when not. If you apply rounding and conversions on both input and output, you can have a user-entered value that gets reported back to the user as something else. This is not acceptable in many cases.

At any rate, there are ways to solve the whole thing regardless. The question is: what is the use of a library that only does static conversions? How often is one going to use multiple "systems"? The problem of writing expressions in one set of units that may be different than the rest of the system is easily accomplished with the use of static constants. I believe that the cases where a static system is better are actually very small, and it is for that reason that I don't believe this library is a good candidate for Boost inclusion, since the object of Boost is to provide generic and commonly useful libraries. The dimensional analysis part is very generic and useful, but the static-only unit part detracts from this usefulness.
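A runtime unit that covers the gage pressure case would need an affine conversion, something like this hypothetical sketch (scale plus offset, rather than a bare factor):

    // Hypothetical runtime unit descriptor: v_base = v * scale + offset.
    // A plain multiplicative unit has offset == 0; a gage pressure or a
    // Celsius temperature needs the offset as well.
    struct runtime_unit {
        double scale;
        double offset;
        double to_base(double v) const   { return v * scale + offset; }
        double from_base(double v) const { return (v - offset) / scale; }
    };

    // e.g. gage pressure in bar relative to 1 atm, expressed in pascals:
    // runtime_unit bar_gage = { 1.0e5, 101325.0 };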

AMDG Noah Roberts <roberts.noah <at> gmail.com> writes:
<snip>
In my case I was thinking of a unit type that exists solely for providing conversions. Not always is that a multiplication.
<snip>
Ok. Good point. You can use type erasure to get this effect and be able to use the static conversions at runtime:

    template<class Quantity, class InputQuantity>
    Quantity quantity_converter(const typename Quantity::value_type& v)
    {
        return quantity_cast<Quantity>(InputQuantity::from_value(v));
    }
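(The erasure step, for illustration: all instantiations for a given target quantity share one signature, so they can be stored behind a single function-pointer type, e.g.

    typedef quantity<SI::length> (*length_converter)(const double&);
    length_converter conv =
        &quantity_converter<quantity<SI::length>, quantity<CGS::length> >;

where the CGS::length input type is just an example.)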
Also, your idea has problems when you account for rounding. Rounding is often done before you display data to a user. In cases when this data can come from a calculation or from the user it is hard to decide when to round and when not. If you apply rounding and conversions on both input and output and add rounding into the mix you can have a user entered value that gets reported back to the user as something else. This is not acceptable in many cases.
I may be wrong, but in order to avoid any excess loss of precision you have to store a set of all the base units and track the actual unit at runtime, involving a merge with every multiply or divide. This kind of overhead is not always acceptable.
At any rate, there are ways to solve the whole thing regardless. The question is: what is the use of a library that only does static conversions?
As has been stated many times conversions are not the primary point of the library.
How often is one going to use multiple "systems"?
Hopefully not very often. In Christ, Steven Watanabe

Steven Watanabe wrote:
I may be wrong, but in order to avoid any excess loss of precision you have to store a set of all the base units and track the actual unit at runtime, involving a merge with every multiply or divide. This kind of overhead is not always acceptable.
I don't understand what you are saying.
At any rate, there are ways to solve the whole thing regardless. The question is: what is the use of a library that only does static conversions?
As has been stated many times conversions are not the primary point of the library.
I've never seen that stated. I don't believe it either. If conversions are not the point of the library, then why is a very significant portion of the library dealing with units and conversions?

AMDG Noah Roberts <roberts.noah <at> gmail.com> writes:
Steven Watanabe wrote:
I may be wrong, but in order to avoid any excess loss of precision you have to store a set of all the base units and track the actual unit at runtime, involving a merge with every multiply or divide. This kind of overhead is not always acceptable.
I don't understand what you are saying.
Suppose that you store the actual value and the conversion factor:

    struct quantity {
        double conversion_factor;
        double value;
    };

Now multiplying two quantities requires two multiplications. Alternately, you convert to SI before doing the multiplication and return a different type. Either way you introduce extra operations, thus reducing the precision of the result. The only way I can think of to enable runtime units without losing precision is:

    struct base_unit {
        double conversion_factor;
    };

    struct unit_impl {
        typedef boost::rational<int> exponent_t;
        // keep this sorted to allow merge
        std::vector<std::pair<base_unit*, exponent_t> > impl;
        double conversion_factor;
    };

    static std::set<unit_impl*> all_units;

    typedef boost::shared_ptr<unit_impl> unit;

    struct quantity {
        unit u;
        double value;
    };

Now all unit multiplications add the exponents of identical dimensions. Every time you create a new unit you look it up to see whether an identical unit has already been created. If so, then you return a pointer to the existing unit. This is so that you can explicitly set the conversion factors for complex units and thus get maybe another bit of precision.
I've never seen that stated. I don't believe it either. If conversions are not the point of the library, then why is a very significant portion of the library dealing with units and conversions?
A lot of code is dedicated to conversions because they are rather difficult to implement given the current representation. Most of it is in detail/conversion_impl.hpp, which is highly repetitious. If I ever get around to simplifying it, the portion of the library dealing with it will appear much smaller. In Christ, Steven Watanabe

Michael Fawcett wrote:
On 3/27/07, Noah Roberts <roberts.noah@gmail.com> wrote:
Nope. Your dialog manager would keep track of the unit and set it appropriately upon initialization and/or user interaction. Your reader would then be as simple as setting the value directly on the quantity. Alternatively you might have something like this:

    quantity<abstract> on_dialog_ok()
    {
        quantity<abstract> dist;
        dist.set_unit(combo.selected_item().data());
        dist.set_value(entry.value().as_double());
        return dist;
    }
I see. Would the implementation allocate a derived class based on the unit type, or just dispatch during arithmetic based on unit type? I suppose it works out to about the same penalty either way: virtual function or if statement for every arithmetic operation. It would probably be a lot cleaner to go the derived class route, though.
No virtual function would be required:

    template <typename D>
    struct unit {
        explicit unit(double factor) : cv_fact(factor) {}
        double convert(double base_value) const
        {
            return base_value * cv_fact;
        }
    private:
        double cv_fact; // runtime conversion factor
    };

D is the dimension, and simply makes sure you don't intermix dimensions, as a unit would be unassignable to any quantity except ones in the same dimension.
participants (5)
- Martin Schulz
- Matthias Schabel
- Michael Fawcett
- Noah Roberts
- Steven Watanabe