[dynamic_bitset] bug in to_ulong() in dynamic_bitset.hpp

A failure of dynamic_bitset library test dyn_bitset_unit_tests3 on Tru64/cxx and HP-UX/aCC revealed what appears to be a bug in <boost/dynamic_bitset/dynamic_bitset.hpp> introduced after Boost 1.34. The bug affects platforms where 'long' type is larger than 'int'. On both Tru64 and HP-UX, 'int' is a 32-bit type. On Tru64, 'long' is a 64-bit type. On HP-UX, 'long' is a 64-bit type in 64-bit data model (+DD64, which is how Boost testing is done on this platform) and 32-bit type otherwise. The test also fails on other platforms and while I cannot verify it, from the test output it apears that these platforms are affected by the same bug. What fails is the two testcases below in b.to_long() suite in boost/libs/dynamic_bitset/dyn_bitset_unit_tests3.cpp invoked with run_test_cases<unsigned int>() : { std::string str(ul_width - 1, '1'); boost::dynamic_bitset<Block> b(str); Tests::to_ulong(b); } { std::string ul_str(ul_width, '1'); boost::dynamic_bitset<Block> b(ul_str); Tests::to_ulong(b); } The code in <boost/libs/dynamic_bitset/bitset_test.hpp> converts dynamic_bitset<unsigned int> argument to unsigned long and checks that the appropriate bits in the result are set: result_type num = lhs.to_ulong(); // Be sure the number is right if (sz == 0) BOOST_CHECK(num == 0); else { for (std::size_t i = 0; i < sz; ++i) BOOST_CHECK(lhs[i] == (i < n ? nth_bit(num, i) : 0)); } This check fails for the bits in the upper half of the result. Here is an excerpt from dynamic_bitset<Block, Allocator>::to_ulong() function in <boost/dynamic_bitset/dynamic_bitset.hpp> in the trunk: result_type result = 0; for (size_type i = 0; i <= last_block; ++i) { assert((size_type)bits_per_block * i < (size_type)ulong_width); result |= (m_bits[i] << (bits_per_block * i)); } In our case, last_block = 1 and bits_per_block = 32. m_bits[i] has 'unsigned int' type. On the second loop iteration, a 32-bit integer is left-shifted by 32 bit positions. This triggers undefined behaviour. From 5.8 - Shift operators [expr.shift] : "The behavior is undefined if the right operand is [...] greater than or equal to the length in bits of the promoted left operand." With cxx and aCC, such shift results in zeroing of all bits of the left operand. Here is the code from <boost/dynamic_bitset/dynamic_bitset.hpp> in Boost 1.34 : unsigned long result = 0; for (size_type i = 0; i <= last_block; ++i) { assert((size_type)bits_per_block * i < (size_type)ulong_width); // gps unsigned long piece = m_bits[i]; result |= (piece << (bits_per_block * i)); } Here on the second loop iteration, a 64-bit long is left-shifted by 32 bit positions which yields the intended result. x.cpp below illustrates the difference between trunk and Boost 1.34: x.cpp ----- #include <stdio.h> #include <limits.h> int main() { unsigned int m_bits[] = { UINT_MAX, UINT_MAX }; unsigned int bits_per_block = sizeof(int) * CHAR_BIT; unsigned long result = 0; for(int i = 0; i <= 1 ; ++i) { #ifndef Boost_1_34 result |= (m_bits[i] << (bits_per_block * i)); #else unsigned long piece = m_bits[i]; result |= (piece << (bits_per_block * i)); #endif printf("result = %llx\n", result); } } cxxosf.zko.hp.com> cxx x.cpp && a.out result = ffffffff result = ffffffff cxxosf.zko.hp.com> cxx x.cpp -DBoost_1_34 && a.out result = ffffffff result = ffffffffffffffff cxxosf.zko.hp.com> Replacing result |= (m_bits[i] << (bits_per_block * i)); with unsigned long piece = m_bits[i]; result |= (piece << (bits_per_block * i)); on the trunk makes the test succeed: verified on Tru64/cxx and on HP-UX/aCC. In <http://lists.boost.org/Archives/boost/2007/12/131126.php> I said:
What I was referring to is the code in 1.34 applying bitwise shift operator to a long operand. It would trigger an undefined behavior in C where "The operands shall have integer type", but it is legal in C++ where "The operands shall be of integral or enumeration type". So, I no longer claim that the code in 1.34 is relying on undefined behaviour. And the last thing: why the bug affects only platforms where 'long' type is larger than 'int'? Because if they are the same, as is the case on VMS or HP-UX in 32-bit data model, last_block = 0 and the single iteration of the for loop just assigns m_bits[0] to the result: the right operand of the shift operator is zero. To fix the problem, I suggest to replace the buggy code in <boost/dynamic_bitset/dynamic_bitset.hpp> in the trunk with that in 1.34. If there are no objections, I can do it. Thanks, Boris

{I'm reposting this (with slight changes); the original version doesn't seem to have made it in the moderation queue} Boris Gubenko wrote:
Sigh... it was introduced in the passage from rev. 1.10 to rev. 1.11 (see <http://boost.cvs.sourceforge.net/boost/boost/boost/dynamic_bitset/dynamic_bitset.hpp>).
Yeah. I'm well aware of this; you can see e.g. that operator <<= and operator >>= have comments, aimed at maintainers, whose purpose is exactly to prevent the introduction of such undefined behavior. And even in the code above, the intent of the assert was to check that I wouldn't incur in that error; but, the issue is, the assert assumes I'll shift an unsigned long, which I'm not doing (except when Block happens to be unsigned long). Of course using an intermediate unsigned long variable ('piece') does the trick, as would a static_cast<> of m_bits[i].
Yes. [...]
Yes, but please use the "result_type" typedef (or drop it altogether, if you prefer that). Thanks for looking into this, -- \|||/ Gennaro Prota - For hire (o o) https://sourceforge.net/projects/breeze/ --ooO-(_)-Ooo-----
participants (3)
-
Boris Gubenko
-
Gennaro Prota
-
Markus Schöpflin