
on Sat Nov 01 2008, "JOAQUIN M. LOPEZ MUÑOZ" <joaquin-AT-tid.es> wrote:
________________________________________
From: boost-bounces@lists.boost.org [boost-bounces@lists.boost.org] On behalf of David Abrahams [dave@boostpro.com]
Sent: Saturday, November 01, 2008 0:47
To: boost@lists.boost.org
Subject: Re: [boost] [MultiIndex] Random Access mental model?
on Fri Oct 31 2008, Joaquin M Lopez Munoz <joaquin-AT-tid.es> wrote:
Yep, this is done. Only one node per element is allocated. This is true regardless of the indices used, including random access indices. If you want to visualize this, each element node has as many node headers as there are indices, each one stacked on top of the next.
I think random access shouldn't need to add a node header unless you want to do something like look up a key in the hash index and delete that element. Maybe a feature-model interface would be a more powerful way to approach this space, so we don't have to pay for what we don't use.
It is very difficult (I suspect impossible) to come up with a satisfactory global picture without this up-pointer. Consider just the following scenario:
I have a multi_index_container<hashed,random_access> m and an iterator it into the hashed index. m.erase(it) cannot be implemented without the up-pointer in the random-access index. (If you don't immediately see why, picture the hand-rolled composition of a hash_map and a deque<pointer>: you can't erase an element from the combined structure given only a hash_map::iterator.)
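To make the scenario concrete, here is a minimal sketch using the public Boost.MultiIndex interface; the element type and key member are made up for illustration:

    #include <boost/multi_index_container.hpp>
    #include <boost/multi_index/hashed_index.hpp>
    #include <boost/multi_index/random_access_index.hpp>
    #include <boost/multi_index/member.hpp>

    struct element { int key; };  // made-up element type

    typedef boost::multi_index_container<
        element,
        boost::multi_index::indexed_by<
            boost::multi_index::hashed_unique<
                boost::multi_index::member<element, int, &element::key> >,
            boost::multi_index::random_access<>
        >
    > container;

    void erase_by_key(container& m, int k)
    {
        container::iterator it = m.find(k);  // iterator of the hashed index (#0)
        if (it != m.end())
            m.erase(it);  // must also unlink the element from the random-access
                          // index in O(1) -- hence the up-pointer
    }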
Yes, yes, I understood all that. It's exactly what I was referring to when I wrote "look up a key in the hash index and delete that element." If you don't immediately see why that is the same thing you're talking about, well, please tell me.
There are lots of additional problems,
I disagree with your premise. It's only a problem if it's a problem for me, and since I don't need that functionality I can use a lighter-weight data structure with no up-pointers, just like I'll use a vector instead of a deque when I don't need O(1) modification at the front.
but I think this is sufficiently serious to justify my decision. Basically, Boost.MultiIndex assumes O(1) inter-index iterator projection (getting from an iterator in an index to an iterator in another index pointing to the same element, in constant time).
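Concretely, reusing the container typedef from the sketch above, projection is roughly this operation:

    // Given an iterator into index #0 (hashed), get an iterator into index #1
    // (random access) pointing to the same element, in constant time.
    void project_example(container& m, container::iterator it)
    {
        container::nth_index<1>::type::iterator rit = m.project<1>(it);
        (void)rit;  // rit now refers to the same element in the random-access index
    }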
That's fine; it's not the right library for my application, then.
[...]
The deque-based structure you propose would save, as you say, roughly one pointer per node with respect to sequenced indices, and also with respect to random access indices, which have the up-pointer your proposed data structure lacks. On the other hand, it requires an extra pointer, since the deque stores pointers while multi_index_containers enjoy the header-stacking property referred to above.
So, roughly speaking, things are even.
Sorry, I don't see it. Since I need O(1) pop-front I am using sequenced<>, which IIUC is a doubly-linked list. Therefore your nodes have this overhead:
next-in-list, prev-in-list, and next-in-hash-bucket = 3 pointers
and mine have only next-in-hash-bucket. Add one pointer of overhead in the deque and I am storing 2 pointers.
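Roughly, the layouts being compared look like this (field names are illustrative only, not the actual Boost.MultiIndex or hash-map internals):

    struct payload { int key; int data; };   // stand-in for the stored element

    struct sequenced_plus_hashed_node {      // one node per element in multi_index_container
        void* next_in_list;                  // sequenced<> index
        void* prev_in_list;                  // sequenced<> index
        void* next_in_bucket;                // hashed<> index
        payload value;                       // => 3 pointers of overhead per element
    };

    struct hash_only_node {                  // one node per element in a plain hash map
        void* next_in_bucket;                // => 1 pointer of overhead per element
        payload value;
    };
    // ...plus roughly one pointer per element stored in the deque: 2 in total.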
But your deque is a deque of *pointers*,
Uh, yes. One pointer stored in the deque per element.
so the (amortized) penalty of the deque is 2 pointers per element.
Maybe you should explain why you think so.
Please take a look at the attached figure, where the resulting data structures are depicted for:
* Boost.MultiIndex (hashed + sequenced),
* Boost.MultiIndex (hashed + random access),
* Hand-rolled composition (hash_map + deque).
That's not the data structure I was proposing. The deque elements would point directly to the elements in the hash_map, i.e.

    hash_map<key, value> + deque<std::pair<key const, value>*>

The picture you drew is something like

    hash_map<key, value> + deque<std::pair<key const, value>**>

I can't imagine why you added that extra node.
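For the record, a minimal sketch of that composition, with boost::unordered_map standing in for hash_map and made-up key/value types:

    #include <boost/unordered_map.hpp>
    #include <deque>
    #include <string>
    #include <utility>

    typedef boost::unordered_map<int, std::string> hash_map;
    typedef std::deque<std::pair<const int, std::string>*> fifo;  // points directly at map elements

    void example()
    {
        hash_map m;
        fifo order;

        std::pair<hash_map::iterator, bool> r = m.insert(std::make_pair(1, "one"));
        order.push_back(&*r.first);       // element addresses are stable in a node-based hash map

        // O(1) pop-front without any up-pointer:
        m.erase(order.front()->first);
        order.pop_front();
    }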
(The array labeled "deque<pointer>" does not do justice to the actual chunk-based structure used by most deque implementations, but you get the idea: there's on average one pointer per element).
That would only be true if the block size of the deque were 1, but who would build a deque that way? I'm very familiar with deque. There are basically two implementations, one of which uses a circular buffer for the array of block pointers, but fundamentally they have the same storage overhead. If there are N elements and a block can hold B elements, the block array stores ceiling(N/B) pointers and there are at most (N + 2*B)*sizeof(element) bytes of storage for blocks. As N grows, the 2*B constant becomes unimportant and you have, amortized, N*sizeof(element) bytes of element storage and (N/B)*sizeof(block*) bytes of block storage. When the element is a pointer that's (N + N/B)*sizeof(pointer). The only way that can be equal to N*2*sizeof(pointer) is if B == 1.
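A quick check of that arithmetic under example assumptions (64-bit pointers, 512 pointers per block; both numbers are illustrative):

    #include <cstdio>

    int main()
    {
        const double N = 1e6;  // elements
        const double B = 512;  // pointers per block
        const double P = 8;    // sizeof(pointer) in bytes

        double deque_of_pointers = (N + N / B) * P;  // ~1.002 pointers per element
        double two_per_element   = 2 * N * P;        // the claimed figure
        std::printf("deque<T*>: %.2f bytes/element, claimed: %.2f bytes/element\n",
                    deque_of_pointers / N, two_per_element / N);
        return 0;
    }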
Do you see you've got four pointers (black dots) per element in the three structures?
Yes, but the deque you picture is pathologically inefficient.

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com