On 5/2/2017 1:47 PM, Pavel Kretov via Boost wrote:
Hi Edward and Matt! Thank you for your replies. Let me comment on them both in one (maybe overly long) message.
Edward Diener wrote:
Are you suggesting that the <, >, <=, and >= operators always return 'false' if the sizes do not match, rather than assert ?
Definitely not. I'm suggesting we discuss the rules of meaningful and useful comparison for dynamic bitsets of different lengths. Returning 'false' whenever the sizes differ would make 'operator <' not only inappropriate for std::set or std::map, but would also force many STL algorithms to treat such objects as equivalent, since equivalence is defined in the standard library as:
equiv(a, b) = !(a<b) && !(b<a)
We need a more elaborate solution.
Shouldn't BOOST_ASSERT be used instead of the one from assert.h?
Possibly. I think the intention was always to assert, whereas BOOST_ASSERT lets the assertion be overridden. I do agree with you that the current way dynamic_bitset works with the assertion should have been documented.
I think these assertions should be eliminated, not documented. We can get rid of them without breaking backward compatibility in the slightest. Would you be happy if innocent-looking code like
std::string("c++") < "boost"
called std::terminate just because the class authors didn't want to handle all possible inputs for some reason? Less-than comparison must either be completely and safely defined, or not defined at all. That's how the math works.
If we cannot reliably define a total ordering, we could probably return std::optional<bool> or even boost::tribool from 'operator <', or, at the very worst, throw an exception. Terminating the whole application immediately is *the worst possible* option, since it is not the expected behavior for a comparison operator.
I do not see the point of having <, <=, >, or >= for dynamic bitsets. I was not involved in the original discussion about what this was supposed to mean for dynamic_bitset, but I would rather they had not been implemented at all. Given that they were implemented, and I have no idea when that functionality would ever be used, I have no great stake in how they should work when the bitsets are of different sizes. I do agree with your point that using an assert when the bitsets are of different sizes is a very poor solution.
Matt Calabrese wrote:
I agree that such comparisons should be valid. We should represent a total order, just like standard containers do (assuming their elements do, of course). I'm actually rather surprised that this isn't already supported.
At the moment, I can see two completely different, mutually exclusive total-order generalizations:
1. Mimic lexicographic string comparison: compare a[0] with b[0]; if they're equal, compare a[1] with b[1], and so on. This is also how std::vector<T> works when T is less-than-comparable (including the special case of std::vector<bool>).
This seems the most logical method.
2. Mimic little-endian (or big-endian) numeric comparison, where leading zeros do not affect the value:

    bit: (9)    (0)
    a:   0000001101
    b:      0001101
    c:     00010000

    a < b: false    b < a: false    equiv(a, b): true
    a < c: true     b < c: true
I strongly prefer the former approach, since the latter makes equivalence (!(a<b) && !(b<a)) so badly different from equality (a==b). Equality already works lexicographically; there is little reason to introduce an incompatible ordering scheme for 'operator <', unless we're going to generalize the bitwise operators to mimic numbers, too.
It must also be noted that std::bitset avoids defining 'operator <' for some reason. I think they did it in order not to give users false expectations about the ordering. The very name 'bitset' is a source of confusion. If we could rename boost::dynamic_bitset to simply boost::bitstring, no one would have second thoughts about how the class does its comparisons.
I do not think that bitsets are bitstrings in any way. I admit I am baffled as to why, other than wanting to know whether two bitsets are equal or unequal, any sort of less-than or greater-than comparison makes sense at all. But given that it is already implemented in dynamic_bitset, I am certainly not against a change which would work with bitsets of different sizes.
I'm not sure, though I agree that it's definitely worth investigating with further discussion on the list and some example use-cases. I intuitively suspect it's a reasonable generalization to act as though implicit 0s were present for the shorter operand, based on an interpretation as a binary value. The side on which to pad should be consistent with the lexicographical ordering. Even this generalization is somewhat debatable though, since extraneous 0s contribute to value in a dynamic bitset.
If we continue thinking about dynamic_bitset as a bitstring (or, more generally, a bitlist), it would be consistent to define the bitwise operators in terms of the 'zip' function commonly found in functional programming languages (some pseudo-Haskell code follows):
(&) :: [Bit] -> [Bit] -> [Bit]
a & b = zipWith (&) a b
Would you please explain this in C++ rather than in Haskell ?
Usually, 'zip' combines lists only until the shorter one is exhausted (see [1] and [2], though). Hence, applying 'operator &' to bitsets of different sizes should create a new bitset with the length of the shorter one. The great feature of this definition is that it naturally obeys De Morgan's laws:
(~a & ~b) == ~(a | b)
(~a | ~b) == ~(a & b)
I also tried different padding schemes and had no luck keeping them consistent with De Morgan's laws. If anyone is interested, I can post my generalization attempts to the list.
With best wishes and hope that my suboptimal language skills won't totally ruin my arguments,
Your English language skills are fine. If you create a PR I will look at it.
――― Pavel K.
[1]: https://softwareengineering.stackexchange.com/q/274983 (Why does 'zip' ignore the dangling tail of the collection?)
[2]: http://stackoverflow.com/a/8513803 (TLDR: Boost's zip_iterator doesn't behave like Haskell's zip.)