
Herewith is my review of Beman's Boost.Endian library. ____________________ ____________________ What is your evaluation of the design? _____ The two-argument overloads of the [un]conditional swapping functions seem awkward at first blush, but are really the best interface for those operations. Changing to value returning functions opens the door to mistakes like the following: uint32_t const in(123); int32_t const out(reorder(in)); There might be a warning, but many programmers routinely ignore such. As is, the compiler will complain about the conditional functions: uint32_t const in(123); int32_t out; native_to_big(in, out); // Error: ambiguous The unconditional functions do not provide that benefit because they aren't function templates: uint32_t const in(123); int32_t out; reorder(in, out); _____ An interesting variation that I've used in the past is a set of cast-style function templates that permit byte swapping one type and returning the result as a different type, after verifying that the resulting value is in range of the target type. Thus, uint32_t const in(std::numeric_limits<uint32_t>::max()); int32_t const out(to_host<int32_t>(in)); // throws TMP allows streamlining the range checks done, of course. _____ Another variation I've used is a checked, two-argument function template with each potentially having a different type: template <class T, class U> void to_host(T & _result, U _value); (I always prefer output parameters to be first because their position is consistent and it leaves open the possibility of defaulted parameters in the general case. I'd prefer that in Boost.Endian, but not everyone likes that convention.) There are two possibilities for dealing with the case when T and U are different. One is that they be required to have the same, supported size and that they are both signed or both unsigned. The other is to do a runtime check that the swapped value is within the range of the target type. _____ I like that the aligned integer types have the longer names because they incur padding and are non-portable. _____ The overhead implied by the comments in the "Comment on naming" section of the Endian Integer Types page is an excellent reason for splitting the endian class template into endian_buffer and endian_integer as previously suggested. For C++11, the latter can derive from the former. For C++03, if POD-ness is important, you'll need to duplicate the former's code in the latter, but it's worthwhile. ____________________ ____________________ What is your evaluation of the implementation? There's already been a good deal of discussion on efficiency and reuse during the review, so I won't rehash it. The 64b reorder(), for example, looks like a lot of code for an inline function. I haven't compiled it to inspect the resulting assembly to see how much or little results. My concern is for the possibility of code bloat arising from the desire to keep this a header-only library. If those were function templates, then the compiler can choose whether to inline or not, despite the definition being in a header. The implementation of class endian is uncommented, yet non-trivial. A future maintainer, even Beman, will benefit from some helpful information inserted now. The implementation of class endian would be simplified, it appears to me, by passing the endianness along to a detail::store() function template and letting it select the big/little endian store logic. ____________________ ____________________ What is your evaluation of the documentation? It's incomplete. There are no examples of the conversion functions. There is no tutorial. There is no guidance on selecting between the alternatives. There is no comparison with traditional htnol()-style code. __________ Endian Home _____ Introduction to endianness s/modern Apple, Linux, or Windows computer/computer/ (the OS doesn't matter) s|a[sic] Oracle/Sun Solaris computer| (the OS doesn't matter) s/CPU's/CPUs/ (plural, not possessive) _____ Introduction to the Boost.Endian library The first and last sentences should be merged: "Boost.Endian is a header only library providing facilities for managing integer endianness." Endian conversions for native integers: The last sentence makes both absolute and relative statements. The relative statements are, presumably, relative to endian integer types, but that's not made clear. I suggest rewording as, "This approach is simple and efficient, but when compared to endian integer types, is less flexible regarding size and alignment, can be harder to manage, is more error prone because of the need to apply endian conversions in unrelated logic paths, all of which can lead to logic errors that are difficult to debug." Endian integer types: The second sentence has a similar problem, and the last sentence could be combined to correspond with the list of the previous paragraph. Try this variation: "This approach is simple and, while it can be less efficient than endian conversions, it avoids strict alignment requirements allowing opportunities to pack data more tightly." I'd follow with this variation of the third sentence: "Furthermore, types of length from 1 to 8 bytes are supported, rather than just the usual 1, 2, 4, and 8 byte lengths." __________ Conversion Functions This page is odd. The synopsis does not include links to the description of the functions. When I first saw the page, I thought there simply was no documentation for using the various functions. That documentation does appear in the latter part of the page, but it isn't obvious. There should be information explaining why conditional and unconditional functions are useful and each should refer to its counterpart. That is, reorder(uint16_t &) should note that if the intent is not to reorder the bytes regardless of the platform, that bit_to_native(uint16_t &), for example, is appropriate. The unconditional nature or reorder() escaped me until I realized that these functions always swap bytes, unlike the htonl() family, which I'd never before encountered as necessary. Thus, use cases for unconditional byte swapping are warranted. __________ Endian Integer Types The Conversion Functions page links to the description of big and little endian on the Endian Home page, but the Integer Types page does not. The example shown on the Integer Types page should be recreated using the Conversion Functions to show the difference in usage. That example should use sizeof(h) rather than sizeof(header) since that is a safer practice. _____ Limitations This section, while important, should not appear so soon on the page. s/compilers do layout memory/compilers lay out memory/ s/C++0x/C++11/ s/it will be possible to specify the default constructor as trivial/the default constructor is marked as trivial/ s/will no longer disqualify/do not disqualify/ s/will no longer be relying/does not rely/ _____ Feature set s/Big endian | little endian | native endian byte ordering./Big, little, and native endian byte ordering/ s/Signed | unsigned/Signed and unsigned/ s/Unaligned | aligned/Data may be aligned or unaligned/ s/1-8 byte (unaligned) | 2, 4, 8 byte (aligned)/2, 4, and 8 byte aligned types and 1-8 byte unaligned types/ _____ Typedefs The column order should correspond with the order of information in the typedef names: alignment, sign, endianness, size. The discussion of the difference between aligned and unaligned types should not be "buried" in the Typedefs section. _____ Comment on naming s/these type grows/these types grows/ Why does "the realization creep in" that the endian integer types are lousy arithmetic integers? Only profiling would reveal that. Consequently, you should call that out, perhaps in a "Design Details" section along with the aligned/unaligned discussion, thereby explicitly stating rather than leaving to each library user the discovery of the overhead of these types. _____ Synopsis The link from "cover_operators" to a header is surprising. _____ Members The description of the default constructor fails to indicate whether the value is left uninitialized, which I presume to be the case. The description of the conversion operator mixes T and value_type. While those two are the same, one has to examine the class definition to see that. Either s/T/value_type/ or s/value_type/T/. _____ FAQ s/and both speed and/plus both speed and/ s/disk records/records/ (applies to networking, too) s/Yes, for sure, compared/There is overhead compared/ s/are usually be faster/are usually faster/ s/types POD's?/types POD?/ s/endian integer types not being POD's/of endian types not being POD/ s/data files are portable/data files that are portable/ The "These types are really just byte-holders" entry should not be a FAQ. That information, along with a discussion of the overhead of the simple syntax, should be part of the "Design Details" section suggested above. _____ Design considerations for Boost.Endian integers Only "Design" is underlined in the section heading. s/memcpyable/memcpy-able/ _____ Experience s/used very successful/used very successfully/ _____ Compilation s/This is ensures that , and/This ensures that class endian is POD, even in C++03, and/ s/POD's/POD/ ____________________ ____________________ What is your evaluation of the potential usefulness of the library? Highly useful, but could be more so. ____________________ ____________________ Did you try to use the library? No. Sorry. ____________________ ____________________ How much effort did you put into your evaluation? I spent about an hour reading the documentation plus a lot of time on the mailing list participating in endian discussions. ____________________ ____________________ Are you knowledgeable about the problem domain? Yes. I've used the old standby functions, htonl(), ntohl(), and friends over many years. I've created template functions to automatically select the right function based upon the size of the type being swapped, etc. ____________________ ____________________ Do you think the library should be accepted as a Boost library? There are issues that keep me from saying yes at this time. There are too many suggested variations and ideas under consideration to accept the library in its present state. However, a mini-review should be sufficient to evaluate the final form, once Beman determines a course of action, and determine whether to accept it or not. _____ Rob Stewart robert.stewart@sig.com Software Engineer using std::disclaimer; Dev Tools & Components Susquehanna International Group, LLP http://www.sig.com ________________________________ IMPORTANT: The information contained in this email and/or its attachments is confidential. If you are not the intended recipient, please notify the sender immediately by reply and immediately delete this message and all its attachments. Any review, use, reproduction, disclosure or dissemination of this message or any attachment by an unintended recipient is strictly prohibited. Neither this message nor any attachment is intended as or should be construed as an offer, solicitation or recommendation to buy or sell any security or other financial instrument. Neither the sender, his or her employer nor any of their respective affiliates makes any warranties as to the completeness or accuracy of any of the information contained herein or that this message or any of its attachments is free of viruses.

On Tue, Sep 13, 2011 at 3:18 PM, Stewart, Robert <Robert.Stewart@sig.com> wrote:
Herewith is my review of Beman's Boost.Endian library.
____________________ ____________________ What is your evaluation of the design?
_____ The two-argument overloads of the [un]conditional swapping functions seem awkward at first blush, but are really the best interface for those operations. Changing to value returning functions opens the door to mistakes like the following:
uint32_t const in(123); int32_t const out(reorder(in));
Wow, you have a devious mind:-) I had to read that three times before I saw the problem!
There might be a warning, but many programmers routinely ignore such.
Documentation can help a bit. But there are enough different concerns about the conversion interfaces I'm not sure we can address them all.
As is, the compiler will complain about the conditional functions:
uint32_t const in(123); int32_t out; native_to_big(in, out); // Error: ambiguous
The unconditional functions do not provide that benefit because they aren't function templates:
uint32_t const in(123); int32_t out; reorder(in, out);
_____ An interesting variation that I've used in the past is a set of cast-style function templates that permit byte swapping one type and returning the result as a different type, after verifying that the resulting value is in range of the target type. Thus,
uint32_t const in(std::numeric_limits<uint32_t>::max()); int32_t const out(to_host<int32_t>(in)); // throws
TMP allows streamlining the range checks done, of course.
_____ Another variation I've used is a checked, two-argument function template with each potentially having a different type:
template <class T, class U> void to_host(T & _result, U _value);
(I always prefer output parameters to be first because their position is consistent and it leaves open the possibility of defaulted parameters in the general case. I'd prefer that in Boost.Endian, but not everyone likes that convention.)
There are two possibilities for dealing with the case when T and U are different. One is that they be required to have the same, supported size and that they are both signed or both unsigned. The other is to do a runtime check that the swapped value is within the range of the target type.
It isn't clear to me that the conversion library is the right place to do range checks. A major point of using endian conversion functions rather than endian integers is a need to perform arithmetic and related operations on native types. Doesn't the same apply to range checks? An argument to do range checking on the implicit conversions in the endian integer types might be stronger, but even there that feels more like something that should go in a separate checked integer library.
_____ I like that the aligned integer types have the longer names because they incur padding and are non-portable.
_____ The overhead implied by the comments in the "Comment on naming" section of the Endian Integer Types page is an excellent reason for splitting the endian class template into endian_buffer and endian_integer as previously suggested. For C++11, the latter can derive from the former. For C++03, if POD-ness is important, you'll need to duplicate the former's code in the latter, but it's worthwhile.
I'm not sure C++03 POD-ness is critical. AFAIK, the only impact would be use in unions. So maybe in C++03, endian_buffers but not endian_integers would be usable in POD's. One critical issue is improving the documentation, including fully functional (and tested) examples that demonstrate programming the common use cases using (1) conversions only, (2) endian_buffers, and (3) endian_integers.
____________________ ____________________ What is your evaluation of the implementation?
There's already been a good deal of discussion on efficiency and reuse during the review, so I won't rehash it.
The 64b reorder(), for example, looks like a lot of code for an inline function. I haven't compiled it to inspect the resulting assembly to see how much or little results. My concern is for the possibility of code bloat arising from the desire to keep this a header-only library. If those were function templates, then the compiler can choose whether to inline or not, despite the definition being in a header.
I'll give it some thought. What I'd really like to develop is a decent recipe for being able to support either header-only or compiled-library approaches.
The implementation of class endian is uncommented, yet non-trivial. A future maintainer, even Beman, will benefit from some helpful information inserted now.
Point taken.
The implementation of class endian would be simplified, it appears to me, by passing the endianness along to a detail::store() function template and letting it select the big/little endian store logic.
The same might applied to a load() function template? There have also been requests to build the implementation on top of the conversion functions. That implies that the conversion functions are expanded to cover unaligned cases, etc. For me, the key thing that needs to come out of this review is a complete public interfaced. The details of the implementation will get looked at, of course, but the public interface needs to be cast in stone.
____________________ ____________________ What is your evaluation of the documentation?
It's incomplete. There are no examples of the conversion functions. There is no tutorial. There is no guidance on selecting between the alternatives. There is no comparison with traditional htnol()-style code.
Agree. I was planning to do that before the review, but between evacuating before Irene, storm cleanup afterward, and a week of repeated power failures it just didn't happen.
...
I'm skipping the details of spelling and grammatical suggestions. They are appreciated and will get applied when the docs are updated.
Thus, use cases for unconditional byte swapping are warranted.
I've used unconditional swapping in a control path that only occurs when a swap is needed. For example, when doing file conversions the format of the current system isn't the determining factor. Rather, whether or not the input file ordering is the same as the ordering requested for the output file determines if a reorder is needed, and that isn't known until run time. The conditional reorder functions are specified in terms of the unconditional functions, too.
What is your evaluation of the potential usefulness of the library?
Highly useful, but could be more so.
____________________ ____________________ Did you try to use the library?
No. Sorry.
____________________ ____________________ How much effort did you put into your evaluation?
I spent about an hour reading the documentation plus a lot of time on the mailing list participating in endian discussions.
____________________ ____________________ Are you knowledgeable about the problem domain?
Yes. I've used the old standby functions, htonl(), ntohl(), and friends over many years. I've created template functions to automatically select the right function based upon the size of the type being swapped, etc.
____________________ ____________________ Do you think the library should be accepted as a Boost library?
There are issues that keep me from saying yes at this time. There are too many suggested variations and ideas under consideration to accept the library in its present state. However, a mini-review should be sufficient to evaluate the final form, once Beman determines a course of action, and determine whether to accept it or not.
Understood. I'm already committed to a min-review. Thanks for all the detailed comments. Much appreciated! --Beman
participants (2)
-
Beman Dawes
-
Stewart, Robert