On 5/2/2017 1:47 PM, Pavel Kretov via Boost wrote:
Hi Edward and Matt! Thank you for your replies. Let me comment on them both in one (maybe overly long) message.
Edward Diener wrote:
Are you suggesting that the <, >, <=, and >= operators always return 'false' if the sizes do not match, rather than assert ?
Definitely not. I'm suggesting we discuss the rules of meaningful and useful comparison for dynamic bitsets of different lengths. Returning 'false' whenever the sizes differ would make 'operator <' not only inappropriate for std::set or std::map, but would also force many STL algorithms to treat such objects as equivalent, since equivalence is defined in the standard library as:
equiv(a, b) = !(a<b) && !(b<a)
We need a more elaborate solution.
Shouldn't BOOST_ASSERT be used instead of the one from assert.h?
Possibly. I think the intention was always to assert, whereas BOOST_ASSERT lets the assertion be overridden. I do agree with you that the current way dynamic_bitset works with the assertion should have been documented.
I think these assertions should be eliminated, not documented. We can get rid of them without breaking backward compatibility in the slightest. Would you be happy if innocent-looking code like
std::string("c++") < "boost"
called std::terminate just because the class authors didn't want to handle all possible inputs for some reason? Less-than comparison must either be completely and safely defined, or not defined at all. That's how the math works.
If we cannot reliably define a total ordering, we could probably return std::optional<bool> or even boost::tribool from 'operator <', or, at the very worst, throw an exception. Terminating the whole application immediately is *the worst possible* option, since it is not the expected behavior for a comparison operator.
I do not see the point of having <, <=, >, or >= for dynamic bitsets. I was not involved in the original discussion about what this was supposed to mean for dynamic_bitset, but I would rather they had not been implemented at all. Given that they were implemented, and I have no idea when that functionality would ever be used, I have no great stake in how they should work when the bitsets are of different sizes. I do agree with your point that using an assert when the bitsets are of different sizes is a very poor solution.
Matt Calabrese wrote:
I agree that such comparisons should be valid. We should represent a total order, just like standard containers do (assuming their elements do, of course). I'm actually rather surprised that this isn't already supported.
At the moment, I can see two completely different, mutually exclusive total-order generalizations:
1. Mimic lexicographic string comparison: compare a[0] with b[0]; if they're equal, compare a[1] with b[1], and so on. This is also how std::vector<T> works when T is less-than-comparable (including the special case of std::vector<bool>).
This seems the most logical method.
2. Mimic little-endian (or big-endian) numeric comparison, where leading zeros do not affect the value:

    bit: (9)    (0)
    a:   0000001101
    b:      0001101
    c:     00010000

    a < b: false    b < a: false    equiv(a, b): true
    a < c: true     b < c: true
I strongly prefer the former approach, since the latter makes equivalence (!(a<b) && !(b<a)) so badly different from equality (a==b). Equality already works lexicographically; there is little reason to introduce an incompatible ordering scheme for 'operator <', unless we're going to generalize the bitwise operators to mimic numbers, too.
It must also be noted that std::bitset avoids defining 'operator <' for some reason. I think they did it in order not to give users false expectations about the ordering. The very name 'bitset' is a source of confusion. If we could rename boost::dynamic_bitset to simply boost::bitstring, no one would have second thoughts about how the class does its comparisons.
I do not think that bitsets are bitstrings in any way. I admit I am baffled as to why, other than wanting to know whether two bitsets are equal or unequal, any sort of less-than or greater-than comparison makes sense at all. But given that it is already implemented in dynamic_bitset, I am certainly not against a change which would work with bitsets of different sizes.
I'm not sure, though I agree that it's definitely worth investigating with further discussion on the list and some example use-cases. I intuitively suspect it's a reasonable generalization to act as though implicit 0s were present for the shorter operand, based on an interpretation as a binary value. The side on which to pad should be consistent with the lexicographical ordering. Even this generalization is somewhat debatable though, since extraneous 0s contribute to value in a dynamic bitset.
If we continue thinking about dynamic_bitset as a bitstring (or, more generally, a bitlist), it would be consistent to define the bitwise operators in terms of the 'zip' function commonly found in functional programming languages (some pseudo-Haskell code follows):
(&) :: [Bit] -> [Bit] -> [Bit]
a & b = zipWith (&) a b
Would you please explain this in C++ rather than in Haskell ?
Usually, 'zip' combines lists only until the shorter one is exhausted (see [1] and [2], though). Hence, applying 'operator &' to bitsets of different sizes should create a new bitset with the length of the shorter one. The great feature of this definition is that it naturally obeys De Morgan's laws:
(~a & ~b) == ~(a | b)
(~a | ~b) == ~(a & b)
I also tried different padding schemes and had no luck keeping them consistent with De Morgan's laws. If anyone is interested, I can post my generalization attempts to the list.
With best wishes and hope that my suboptimal language skills won't totally ruin my arguments,
Your English language skills are fine. If you create a PR I will look at it.
――― Pavel K.
[1]: https://softwareengineering.stackexchange.com/q/274983 (Why does 'zip' ignore the dangling tail of the collection?)
[2]: http://stackoverflow.com/a/8513803 (TLDR: Boost's zip_iterator doesn't behave like Haskell's zip.)