Re: [boost] [lockfree::fifo] Review

18 Dec 2009

      "Helge Bahmann" <hcb@chaoticmind.net> wrote in message 
news:alpine.DEB.1.10.0912101115460.456@m65s28.vlinux.de...
...
On Wed, 9 Dec 2009, Anthony Williams wrote:
...
...
So far I have not yet met any 64-bit platform that provides 64 bits of
(user-space!) address space (the same cannot be said of 32-bit
platforms), so I would say that any use of spare bits on 64 bit is
safe (for user-space!).
As Chris also noted, I would be wary of using them for ABA checking
though: it's easy to overflow a 16-bit counter quite quickly, and that's
all you've got even if the OS only uses 48 of your 64 bits for address
space.
This concern is certainly valid for basically any algorithm that uses 
"generation counters" or similar constructs for disambiguation (it could 
be argued that even 32 bits may not be enough). Using spare bits at all 
should however be considered a valid technique.
...
...
FWIW there are other ways to avoid ABA than tagging pointers, and
without DCAS on 32-bit, see e.g. this sample implementation of a
lock-free stack:
http://www.chaoticmind.net/~hcb/projects/boost.atomic/stack.hpp
Yup. Incidentally, you don't even need to count the nodes in push,
provided you're careful when reclaiming nodes in pop.
hm, I'm not sure what you mean, I don't think I am counting anything but 
the number of threads in the critical section?
...
The downside of this technique is that if the traffic is too heavy then
the ref count never reaches zero, so the nodes can never be reclaimed.
Right, and although I would argue that such a highly contended hotspot is 
a design flaw in itself,
You have created a proxy garbage collector that reminds me of one I did a 
long time ago which also suffered from this exact same problem. I was using 
it for memory reclamation:

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/3d...

it went something like, membars aside for a moment:
_____________________________________________________________
struct node
{
    struct node* next;
};

struct proxy
{
    struct node* head; // = NULL
    uint32_t count; // = 0
};

void
proxy_acquire(struct proxy* const self)
{
    struct proxy cmp, xchg;

    cmp.head = self->head;
    cmp.count = self->count;

    do
    {
        xchg.head = cmp.head;
        xchg.count = cmp.count + 1;
    }

    while (! DWCAS(self, &cmp, &xchg));
}

struct node*
proxy_release(struct proxy* const self)
{
    struct proxy cmp, xchg;

    cmp.head = self->head;
    cmp.count = self->count;

    do
    {
        xchg.count = cmp.count - 1;

        if (! xchg.count)
        {
            xchg.head = NULL;
        }

        else
        {
            xchg.head = cmp.head;
        }
    }

    while (! DWCAS(self, &cmp, &xchg));

    if (! xchg.count)
    {
        return cmp.head;
    }

    return NULL;
}

struct node*
proxy_defer(struct proxy* const self,
            struct node* node)
{
    struct proxy cmp, xchg;

    cmp.head = self->head;
    cmp.count = self->count;

    do
    {
        node->next = cmp.head;

        xchg.count = cmp.count;

        if (! xchg.count)
        {
            xchg.head = NULL;
        }

        else
        {
            xchg.head = node;
        }
    }

    while (! DWCAS(self, &cmp, &xchg));

    if (! xchg.count)
    {
        return node;
    }

    return NULL;
}
_____________________________________________________________

This solves memory reclamation, but will pile up nodes during times of 
sustained load...
...
there are ways to do reliable reclaim even in that case (e.g. that allow 
reclaiming an element after the last thread that could actually have 
"seen" it leaves the critical section
Yes. My next incarnation of the proxy collector above was coded to 
explicitly handle this situation. It simply breaks off collector objects 
during deferment and backlinks them all together:

http://webpages.charter.net/appcore/misc/pc_sample_h_v1.html
...
-- can be done with checkpoints similar RCU, so it can even be more 
efficient than the atomic counter I used), yet I am not sure this is 
really worth it -- I would like to make an entirely different point that 
has been nagging me:
I don't think it's worth using RCU just to get around ABA in a lock-free 
stack. Heck, it does not even fit in with the general usage pattern which 
does not expect a lot of writes. Humm, well, of course you can take 
tremendous pressure off of RCU by combining it with version counters and 
only using them to determine when to defer nodes. For instance, you don't 
need to defer a node into RCU until a version counter is just about to 
rollover.

Re: [boost] [lockfree::fifo] Review

Chris M. Thomasson