You are right. I fixed it and here is the new implementation:

inline void atomic_increment( int * pw )
{
    _Asm_fetchadd(_FASZ_W, _SEM_REL, pw, +1, _LDHINT_NONE);
}

inline int atomic_decrement( int * pw )
{
    int r = static_cast<int>(_Asm_fetchadd(_FASZ_W, _SEM_REL, pw, -1, _LDHINT_NONE));
    if (1 == r)
    {
        _Asm_mf();
    }
    return r - 1;
}

inline int atomic_conditional_increment( int * pw )
{
    int v = *pw;
    for (;;)
    {
        if (0 == v)
        {
            return 0;
        }
        _Asm_mov_to_ar((_Asm_app_reg)_AREG_CCV, v,
            (_Asm_fence)(_UP_CALL_FENCE | _UP_SYS_FENCE | _DOWN_CALL_FENCE | _DOWN_SYS_FENCE));
        int r = static_cast<int>(_Asm_cmpxchg((_Asm_sz)_SZ_W, (_Asm_sem)_SEM_ACQ, pw,
            v + 1, (_Asm_ldhint)_LDHINT_NONE));
        if (r == v)
        {
            return r + 1;
        }
        v = r;
    }
}

-----Original Message-----
From: Peter Dimov [mailto:pdimov@mmltd.net]
Sent: Wednesday, July 25, 2007 5:55 PM
To: boost-users@lists.boost.org
Subject: Re: [Boost-users] Support for sp_counted_base for HP Itanium aCC compiler

I'm not familiar with IA64 or the HP intrinsics, but a few quick comments:

Baruch Zilber wrote:
inline void atomic_increment( int * pw )
{
    _Asm_mf();
    static_cast<int>(_Asm_fetchadd(_FASZ_W, _SEM_REL, (void*)pw, +1, _LDHINT_NONE) + 1);
}
The mf is redundant; _SEM_REL has the same effect. This should probably be

inline void atomic_increment( int * pw )
{
    _Asm_fetchadd(_FASZ_W, _SEM_REL, pw, +1, _LDHINT_NONE);
}
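For readers without an aCC toolchain, the same operation can be sketched with C++11 std::atomic. This is a hypothetical portable equivalent, not the HP intrinsic: fetch_add with release ordering plays the role of the fetchadd4.rel instruction that _SEM_REL requests.

```cpp
#include <atomic>

// Hypothetical portable sketch using C++11 <atomic>, not the aCC intrinsic.
// fetch_add with release ordering is the analogue of fetchadd4.rel; the
// returned old value is discarded, as in atomic_increment above.
inline void atomic_increment( std::atomic<int> * pw )
{
    pw->fetch_add( 1, std::memory_order_release );
}
```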
inline int atomic_decrement( int * pw )
{
    _Asm_mf();
    return (static_cast<int>(_Asm_fetchadd(_FASZ_W, _SEM_REL, (void*)pw, -1, _LDHINT_NONE) - 1) - 1);
}
The leading mf is redundant here, too. In addition, an acquire barrier in the
zero case is missing, and there are two -1's where probably just one is
needed. In pseudocode, atomic_decrement needs to be:

int r = fetchadd4.rel( pw, -1 );
if( r == 1 ) ld4.acq( pw ); // or mf
return r - 1;

So, a wild guess:

inline int atomic_decrement( int * pw )
{
    int r = (int)_Asm_fetchadd( _FASZ_W, _SEM_REL, pw, -1, _LDHINT_NONE );
    if( r == 1 ) _Asm_mf();
    return r - 1;
}

We might be able to replace the _Asm_mf with _Asm_ld, but I'm not sure
whether it will generate ld4.acq.
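The ordering described here can also be sketched portably with C++11 std::atomic (an illustrative equivalent, not the aCC code): the release decrement orders the owner's prior writes before the count drops, and the acquire fence on the last-owner path pairs with those releases before the object is destroyed.

```cpp
#include <atomic>

// Hypothetical portable sketch of the decrement ordering above.
// fetch_sub(release) corresponds to fetchadd4.rel; the acquire fence in
// the r == 1 case plays the role of the ld4.acq / mf on the last owner.
inline int atomic_decrement( std::atomic<int> * pw )
{
    int r = pw->fetch_sub( 1, std::memory_order_release ); // returns the old value
    if( r == 1 )
    {
        std::atomic_thread_fence( std::memory_order_acquire );
    }
    return r - 1; // the new value, as in the pseudocode
}
```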
inline int atomic_conditional_increment( int * pw )
{
    return _Asm_mov_to_ar((_Asm_app_reg)_AREG_CCV, *pw,
        (_Asm_fence)(_UP_CALL_FENCE | _UP_SYS_FENCE | _DOWN_CALL_FENCE | _DOWN_SYS_FENCE)),
        _Asm_mf(),
        (_Asm_cmpxchg((_Asm_sz)4, (_Asm_sem)_SEM_REL, pw, *pw + 1, (_Asm_ldhint)_LDHINT_NONE));
}
This doesn't look correct to me. In pseudocode, I believe that it needs to be:

int v = *pw;

for( ;; )
{
    if( v == 0 ) return 0;

    int r = cmpxchg( pw, v /*old*/, v+1 /*new*/ );
    if( r == v ) return r+1;

    v = r;
}

The above code seems to implement just the cmpxchg primitive. The mf is
redundant in this case, too.
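The CAS loop above can be sketched with C++11 compare_exchange_weak (a hypothetical portable rendering, not the aCC _Asm_cmpxchg). On failure, compare_exchange_weak reloads the expected value with the current contents, which replaces the explicit v = r step of the pseudocode.

```cpp
#include <atomic>

// Hypothetical portable sketch of the conditional-increment CAS loop.
// Returns 0 if the count was already 0 (do not resurrect a dead object),
// otherwise the incremented value.
inline int atomic_conditional_increment( std::atomic<int> * pw )
{
    int v = pw->load( std::memory_order_relaxed );
    for( ;; )
    {
        if( v == 0 ) return 0;

        // On failure, v is updated to the current value, as v = r does above.
        if( pw->compare_exchange_weak( v, v + 1,
                std::memory_order_acquire, std::memory_order_relaxed ) )
        {
            return v + 1;
        }
    }
}
```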