
Boost.Decimal Review I want to thank Matt Borland and Chris Kormanyos for their work.
Does this library bring real benefit to C++ developers for real world use-case?
Yes. It should be helpful in "human-centric calculations". It might be useful in financial applications, although I have some question about this below.
Do you have an application for this library?
Not at the moment.
Does the API match with current best practices?
My biggest point of disagreement is this single-header design. The
documentation says "The entire library should be accessed using the
convenience header `
Is the documentation helpful and clear?
Yes. The main point I'd like to discuss is in terms of explaining the trade-offs. The documentation says "Decimal floating point numbers avoid this issue by storing the significand in base-10 (decimal)." and the "issue" mentioned in the previous sentence is "representation errors". This seems misleading. It makes it sound like `decimalX` it's a variant of a bigint type and that attempts to avoid all representation errors. But if I understand correctly, decimal numbers avoid much more specific types of errors: representation errors when the base is not a multiple of a prime factor. It has the same representation errors as base-2 floating point numbers when 1) the base is not a multiple of a prime factor (although there are more multiples of prime factors) and 2) there are more and more rounding errors in the value as the exponent gets larger and larger. So the final trade-off is *less* representation errors (because there are more multiples of prime factors) at the cost of performance ("ratio to double" between ~30 and ~150 for most operations). It might be wrong about these details, but I still believe that sentence in the documentation is misleading. There's also a technical question I still don't understand about 1) the relationship between decimal numbers and fixed-point real numbers and 2) the impact of this relationship on potential applications. For instance, Bitcoin uses integers (Satoshis) to represent "decimal" units of BTC. This representation is just an integer and a fixed/implicit scaling factor/exponent. The same representation can be used for dollars: the internal units are cents and the main unit is a dollar. When we compare that with decimal floating point numbers, the trade-off becomes 1) faster operations (even faster than native base-2 floating point types) and 2) no precision errors up / exact accurary for numbers in the predefined scale, and 3) reliable precision regardless of the best exponent to represent the number; at the cost of a more limited range of values that can be represented. With that trade-off in mind, my intuition is decimal floating point numbers are not so useful for financial applications (the only application discussed in the "Examples" section). With 32-bit fixed-point numbers, I could represent more dollars (or cents) or BTC (or satoshis) than will ever exist (the limited range of values is not a problem) while getting the performance and precision benefits of integers. If the number is larger than that maximum, that means either I made a mistake (for instance, in currency conversion) and I should definitly fix this mistake (I can't lose $1000000 and ignore it because my best exponent is now Y for some reason - $1000000 is still a lot and I can't move on until this "precision error" is fixed). If this is not a mistake, then I'm in a subdomain where the precision errors don't matter anymore because I need those large exponents to represent something else. The problem is the extra precision from more multiples of prime factors are now not as relevant to me and base-2 numbers are probably fine (for instance, in some statistics about the data). I might be wrong about all of that, but if that's the case I assume other people will also be wrong about it. The docs could correct these mistakes or, if I'm wrong, provide other examples of useful applications. That's why I think the main application of the library is "human-centric calculations" and it's only useful in other fields when there's some intersection with that, like there is when reading data in base-10 format that might not fit a fixed-point representation. This discussion could also open possibilities for representations using integers in this library or other libraries in the sense that base-10 floating-point is, in a way, fixed-point numbers of a dynamic exponent. Other minor points: - It seems like most code blocks start with some whitespaces in the first line for some reason. - Constructing a decimal from floating point numbers leads to precision errors, and constructing it from independent integers makes it difficult to read. Since we construct it from strings (Literals Support), this could be listed in "Basic Usage". I couldn't figure out the functions to construct these values from literals without using the literals in the "Literals Support" section. - The signatures could contain the explicit requires clauses and concepts instead of "Where types `T` and `T2` are integral types" in the exposition. - "Lastly the sign follows the convention of `signbit`": This could be linked to https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/signbit or removed from the beginning of the exposition. The reader might think it is important information to understand the rest of the exposition and stop reading the documentation to google it. Also, `bool sign` only needs to be explained so much because it's unintuitive. If it were called `bool is_negative`, or something like that, there would be nothing to explain. - The "3.2.8 Note" could just be part of the reference documentation. This way the user would already have this information when looking at the function. It would also be useful to explain how that behavior compares with other numeric types. - Typo in "Integeral" - I missed some better explanation about the difference between fast and regular types. Like in Boost.Unordered where they explain the data structures and then the trade-off. - I'm used to the API reference being at the end of the documentation. I was surprised to see there's a lot of exposition after the reference: Examples, Benchmarks, "Design Decisions", ... - "Comparisons" section: "between vec[i] and vec[i + 1]": where the numbers come from in the implementation is irrelevant. Only the number of replicates (20,000,000) is relevant. "This is repeated 5 times to generate stable results" doesn't add entropy to help stability. It just means you have 100,000,000 replicates with a bias. You could just generate 100,000,000 values. It's also not relevant to the experiment that the values are in a vector. In fact, unless there's a technical reason for that, generating the numbers for the tests and discarding them is much easier. In any case, 100,000,000 is way too much for any probability distribution. After much less than that you'll have a stable p-value. - Comparison results: The tables need some explaining so the reader doesn't have to look at each column and header to infer what's happening. It's usual to include something like "The second column means ____. Lower values are better.". It's also common practice to include the standard deviation in the table. - The tables are not sorted by the values. This is particularly confusing because `GCC _Decimal<X>` always beats `decimal32`. This is not discussed in the document. `GCC_Decimal<X>` isn't even defined and discussed. - There should be a conclusion about the benchmark results. Did anything change from one platform to the other, etc? If I read the results correctly, the conclusion is I should always use `decimalX_fast`, then `GCC_DecimalX` if I need other properties, then `decimalX` if not using GCC. - If section "Basic Operations" is about "+, -, *, /", then the first comparison subsection "Comparisons" should be renamed to reflect what operations it compares ">, >=, <, <=, ==, and !=" (relational operators). - Both columns in the table have the same information in a different scale proportion. This means all benchmarks for a platform could be in a single bar plot where the x-axis has each number type, and for each number type, we have the ratio to double for each operation in a bar.
Did you try to use it? What problems or surprises did you encounter?
Yes. I compiled the code and ran the examples.
What is your evaluation of the implementation?
It seems like performance is still much worse than other implentations. I don't think this is a reason to reject the library. Just I hope the authors would keep an eye on that, implementing improvements over time, if possible. Or maybe some other native types or implementations could be wrapped in the library.
Please explicitly state that you either *accept* or *reject* the inclusion of this library into boost.
Also please indicate the time & effort spent on the evaluation and give
I recommend accepting the library. the reasons for your decision. I read the documentation and the source code. I compiled the library and ran the examples. I spent time on and off over the last week. In total, I spent about 2 days evaluating the library.
Disclaimer
I'm affiliated with the C++ Alliance. Em qua., 15 de jan. de 2025 às 06:34, John Maddock via Boost < boost@lists.boost.org> escreveu:
The review of the proposed Decimal Number library by Matt Borland and Chris Kormanyos begins today and runs until 22nd Jan.
You will find documentation here: https://cppalliance.org/decimal/decimal.html
And the code repository is here: https://github.com/cppalliance/decimal/
Boost.Decimal is an implementation of IEEE 754 https://standards.ieee.org/ieee/754/6210/ and ISO/IEC DTR 24733 https://www.open-std.org/JTC1/SC22/WG21/docs/papers/2009/n2849.pdf Decimal Floating Point numbers. The library is header-only, has no dependencies, and requires C++14.
Please provide feedback on the following general topics:
- What is your evaluation of the design? - What is your evaluation of the implementation? - What is your evaluation of the documentation? - What is your evaluation of the potential usefulness of the library? Do you already use it in industry? - Did you try to use the library? With which compiler(s)? Did you have any problems? - How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? - Are you knowledgeable about the problem domain?
Ensure to explicitly include with your review: ACCEPT, REJECT, or CONDITIONAL ACCEPT (with acceptance conditions).header-only, has no dependencies, and requires C++14.
Best, John Maddock (review manager).
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
-- Alan Freitas https://alandefreitas.github.io/alandefreitas/ https://github.com/alandefreitas