bug found in boost::intrusive::detail::ihashtable<...>::rehash

I hope I'm posting it to the correct newsgroup. I wanted to mail this directly to the developer of boost::intrusive (Ion Gazta?aga), but I can't find his e-mail anywhere, or any way to directly report the bug. I am not a boost developer myself. Position of a contained element modulo m (new_buckets_len) is known to remain equal to its previous position modulo n (old_buckets_len) ONLY IF m < n AND n %m == 0, but the latter condition isn't being checked. The code assumes it can skip rehashing of m first buckets if m < n: //If we are shrinking the bucket array, just rehash the last nodes if(same_buffer && (old_buckets_len > new_buckets_len)){ n = new_buckets_len; } //Iterate through nodes for(; n < old_buckets_len; ++n){ ... } Also, the following comment: //If this is a buffer expansion don't move if it's not necessary if(same_buffer && new_n == n){ ++before_i; ... Needs to be changed simply to "don't move if it's not necessary" - it needs to be checked/may happen when shrinking the hash table also. Here's a suggested bugfix: - check if (old_bucket_len > new_buckets_len && 0 == old_bucket_len % new_buckets_len). - if true, i'th bucket needs to store the contents of all previous buckets at positions i + j*new_buckets_len for j in [0, old_buckets_len / new_buckets_len) spliced one after another. There is no need to check on a per-element basis. I believe this optimization should be done even when the rehash isn't in-place. A special case where this optimization is useful is when using power-of-2 num_buckets. - if false, buckets have to be re-scanned, using the existing code in the body of the existing loop, but starting from n = 0 always: for(; n < old_buckets_len; ++n){ ... } Please correct me if I'm wrong on this.. but just to be sure, I made a test program that verified rehash() can break the container's invariants when shrinking the container in place, which was indeed found to be the case. As a totally unrelated note, I don't understand why iunordered_set doesn't support a default constructor (so that buckets = 0 and num_buckets = 0, given the existence of a rehash() function. I see it like the reset() function in a scoped_ptr. If a "reset" style function is supported, the class shouldn't force you to initialize it at construction time. Then again, it's a matter of taste in design more than anything...

Hi Yoni, Yoni Lavi wrote:
I hope I'm posting it to the correct newsgroup. I wanted to mail this directly to the developer of boost::intrusive (Ion Gazta?aga), but I can't find his e-mail anywhere, or any way to directly report the bug. I am not a boost developer myself.
Posting to the boost.users or boost.devel newsgroups is a very effective way to report bugs.
Position of a contained element modulo m (new_buckets_len) is known to remain equal to its previous position modulo n (old_buckets_len) ONLY IF m < n AND n %m == 0, but the latter condition isn't being checked. The code assumes it can skip rehashing of m first buckets if m < n:
Thanks for reporting this.
Please correct me if I'm wrong on this.. but just to be sure, I made a test program that verified rehash() can break the container's invariants when shrinking the container in place, which was indeed found to be the case.
You are right. Take for instance integer values with trivial hash (bucket_count() == 5): {0, 5}, {11}, {}, {3}, {9} Now if we rehash it to bucket bucket_count() == 2, it should lead to something similar to: {0}, {5, 3, 9} Which means that the element with value 5 must be rehashed.
As a totally unrelated note, I don't understand why iunordered_set doesn't support a default constructor (so that buckets = 0 and num_buckets = 0, given the existence of a rehash() function. I see it like the reset() function in a scoped_ptr. If a "reset" style function is supported, the class shouldn't force you to initialize it at construction time. Then again, it's a matter of taste in design more than anything...
I have no problem to add it. Maybe the programmer wants to delay resource/bucket allocation and a default constructor can be handy. But constructs a container where no value can be inserted, and this seems dangerous. I could add several assertions to catch such errors in debug mode but, let me think about it a bit more. Thanks again for the report, Ion
participants (2)
-
Ion Gaztañaga
-
Yoni Lavi