Re: [boost] Shared Map

7 Aug 2007

      On 8/7/07, François Duranleau <duranlef@iro.umontreal.ca> wrote:
...
On Tue, 7 Aug 2007, Anthony Williams wrote:
...
How does this work with iterators? On a normal map you can get an iterator to
an entry and it remains valid until that entry is deleted. Can you do this
with your shared map? I'm inclined to think that any modification to the map
would have to result in all iterators being invalidated, as they may now
potentially refer to the wrong copy of a node.
I also wonder how he implements iterators without a pointer to the parent.
AFAIK, that is one way to allow stepping from one element to the next (or
the previous) from any node, with no knowledge of how we accessed that
node. Without a parent pointer, aside from using threaded trees (which
would created cycles of shared pointers), I don't know how to do it
efficiently (keeping a stack inside the iterator would make them costly to
copy, and would require O(log N) memory).
First, in reference to François, you are correct that the iterators contain
log(n) components, and thus copy of an iterator is log(n). In reply to
Anthony, given the current implementation, you are correct, as my
project only needs const_iterators.  However there is a solution as
follows:

First note that the iterators are not a single pointer as mentioned,
but a stack of  pointers,since they must maintain parent data in some
way.  The  iterator also maintainsa pointer to the collection it was
generated from. The stack of pointers allows amortized constant time
access to previous and next elements without starting over from root.
Now to deal with modification of the underlying collection, we put a
'version number' on each red black tree node, and when we generate the
iterator, place the version number alongside the pointers as we do our
find(), etc. When we copy red-black node, we inc the version number on
the source node (the copy starts at 0).  Now our iterators check the
pointer's versions as they  traverse them to make sure the cached and
current numbers are identical.  If this check ever fails, the iterator
resets itself by looking up the key in the collection anew, which is
transparent to the user (although it incurs another log(n) bit of
work, breaking the constant time requirement on transversal, but only
for the cases where a collection is under heavy modification during
transversal). This makes the shared_map have the same iterator
stability guarantees as std::map.  Additionally, I need to cache  the
'begin' iterator, since begin should be constant time, this is also a
mild pain but doable.  Note the  pointers cached by the iterator can't
become invalid since an iterator hold reference counts.  Also note:
the dereference operator for non const iterators may need to do a
log(n) copy down the tree if the node in question has a ref-count > 1,
otherwise changes to the mapped_type would be visible to others.  To
avoid needless work when the user isn't going to modify the
mapped_type  during a dereference, we could return a proxy object with
a cast and operator=, etc, but that's probably more hassle than it's
worth.

Another tricky question is const_iterators.  The general semantics for
const_iterators in std::map are that they have visibility to changes
to the map, but  do not allow changes to the map.  I can make my
const_iterators simpler (not check version number, etc), if I make
them see a 'frozen' version of the collection (effectively they
iterate over a
copy of the collection, since copies are free).  This is actually very
convenient in many cases, but causes the shared_map to have slightly
different semantics from std::map, and I'm not sure if it's a good
design decision.  On the other hand, there is a high cost to tracking
changes which could be avoided for const_iteators, and isn't scanning
the collection with a const_iterator while modifying  it a fairly
cracked thing to do anyway?  I can't think of any std::algorithm for
example that does any such thing.

-Jeremy

Re: [boost] Shared Map

Jeremy Bruestle