Re: [boost] Re: [multi_index][hash][rfc] Hashed indices preview+Boost.Hash

----- Mensaje original ----- De: Daniel James <daniel@calamity.org.uk> Fecha: Domingo, Febrero 20, 2005 7:53 pm Asunto: [boost] Re: [multi_index][hash][rfc] Hashed indices preview +Boost.Hash
JOAQUIN LOPEZ MU?Z wrote:
The implementation I adopted (2 pointers per node + 2 pointers per bucket) has O(1) iteration, insert() and erase(), whatever the load conditions (provided the hash function behaves, of course.) I agree with you bidirectional iteration is of little value, and my main motivation was to achieve O(1). So, I'll be happy to go for a lighter implementation if these issues can be solved in another way --or if there's consensum that costly iteration under low load conditions is acceptable.
FWIW, if you can find a spare bit in a pointer, I've thought of a non-portable way to get something closer to O(1) without the extra memory load. The nodes are stored in a singly-linked list which includes the buckets, so the last node in a bucket points to the next bucket. The spare bit is used to indicate if a pointer is pointing to a node or a bucket. So the data structure is something like:
[...] Hi Daniel, We can get rid of the spare bit if only iterators store a pointer to the container: then we can test whether a pointer points to a true element or a bucket just by checking if it is in the range [buckets.begin(),buckets.end()). Still constant, though admittedly slower than the bit thing. Main problem with this is, as you point out, that traces of empty buckets will be left when erasing. Still, it seems somewhat better than the classic implementation, at least wrt to theoretical complexity (in practice it'd probably be significantly slower.) Joaquín M López Muñoz Telefónica, Investigación y Desarrpññp

JOAQUIN LOPEZ MU?Z wrote:
We can get rid of the spare bit if only iterators store a pointer to the container: then we can test whether a pointer points to a true element or a bucket just by checking if it is in the range [buckets.begin(),buckets.end()). Still constant, though admittedly slower than the bit thing.
I had thought of that, but dismissed it as too slow. But on second thoughts, maybe you could say: bool is_bucket(node_or_bucket* ptr) const { return static_cast<std::size_t>(ptr - buckets_) < bucket_count_; } I'm not sure how portable or fast/slow that would be.
Main problem with this is, as you point out, that traces of empty buckets will be left when erasing. Still, it seems somewhat better than the classic implementation, at least wrt to theoretical complexity (in practice it'd probably be significantly slower.)
Maybe not. The most important operation is a bucket lookup, here's the traditional single-linked implementation: node* ptr = bucket[i].head_; while(ptr && !key_equal_(get_key(*ptr), k)) ptr = ptr->next; This becomes: node* ptr = bucket[i].next_; while(!(ptr & 1) && !key_equal_(get_key(*ptr), k)) ptr = ptr->next; Which is pretty good. Iteration becomes faster (unless you have a lot of empty buckets), insert remains the same, erase suffers (unless you don't care about iteration). I'm just comparing with a singly linked implementation, not doubly-linked. Daniel
participants (2)
-
Daniel James
-
JOAQUIN LOPEZ MU?Z