Re: [Boost-users] Support for sp_counted_base for HP Itanium aCC compiler
Hi Baruch,
thank you for doing this.
Can you, please, post your code here (I guess, it is a header file like
sp_counted_base_gcc_ia64.hpp)?. It will be reviewed by the Boost community
and we'll review it at HP. And, then, it can be added to Boost, hopefully,
in Boost 1.35 time frame.
Thanks again,
Boris
I copy the file sp_counted_base_gcc_ia64.hpp to
sp_counted_base_acc_ia64.hpp and change the following methods:
atomic_increment(),atomic_decrement() and atomic_conditional_increment()
The change was:
#include
I'm not familiar with IA64 or the HP intrinsics, but a few quick comments: Baruch Zilber wrote:
inline void atomic_increment( int * pw ) { _Asm_mf(); static_cast<int>(_Asm_fetchadd(_FASZ_W, _SEM_REL, (void*)pw, +1, _LDHINT_NONE) + 1); }
The mf is redundant; _SEM_REL has the same effect. This should probably be inline void atomic_increment( int * pw ) { _Asm_fetchadd(_FASZ_W, _SEM_REL, pw, +1, _LDHINT_NONE); }
inline int atomic_decrement( int * pw ) { _Asm_mf(); return (static_cast<int>(_Asm_fetchadd(_FASZ_W, _SEM_REL, (void*)pw, -1, _LDHINT_NONE) - 1) - 1); }
The leading mf is redundant here, too. In addition, an acquire barrier in the zero case is missing, and there are two -1's where probably just one is needed. In pseudocode, atomic_decrement needs to be: int r = fetchadd4.rel( pw, -1 ); if( r == 1 ) ld4.acq( pw ); // or mf return r - 1; So, a wild guess: inline int atomic_decrement( int * pw ) { int r = (int)_Asm_fetchadd( _FASZ_W, _SEM_REL, pw, -1, _LDHINT_NONE ); if( r == 1 ) _Asm_mf(); return -1; } We might be able to replace the _Asm_mf with _Asm_ld, but I'm not sure whether it will generate ld4.acq.
inline int atomic_conditional_increment( int * pw ) { return _Asm_mov_to_ar((_Asm_app_reg)_AREG_CCV, *pw, (_Asm_fence)(_UP_CALL_FENCE | _UP_SYS_FENCE | _DOWN_CALL_FENCE | _DOWN_SYS_FENCE)), _Asm_mf(), (_Asm_cmpxchg((_Asm_sz)4, (_Asm_sem)_SEM_REL, pw, *pw + 1, (_Asm_ldhint)_LDHINT_NONE)); }
This doesn't look correct to me. In pseudocode, I believe that it needs to be: int v = *pw; for(;;) { if( v == 0 ) return 0; int r = cmpxchg( pw, v /*old*/, v+1 /*new*/ ); if( r == v ) return r+1; v = r; } The above code seems to implement just the cmpxchg primitive. The mf is redundant in this case, too.
You are right. I fixed it and here is the new implementation: inline void atomic_increment( int * pw ) { _Asm_fetchadd(_FASZ_W, _SEM_ACQ, (void*)pw, +1, _LDHINT_NONE); } inline int atomic_decrement( int * pw ) { int r = static_cast<int>(_Asm_fetchadd(_FASZ_W, _SEM_ACQ, (void*)pw, -1, _LDHINT_NONE)); if (1 == r) { _Asm_mf(); } return r - 1; } inline int atomic_conditional_increment( int * pw ) { int v = *pw; for (;;) { if (0 == v) { return 0; } _Asm_mov_to_ar((_Asm_app_reg)_AREG_CCV, v, (_Asm_fence)(_UP_CALL_FENCE | _UP_SYS_FENCE | _DOWN_CALL_FENCE | _DOWN_SYS_FENCE)); int r =_Asm_cmpxchg((_Asm_sz)_SZ_W, (_Asm_sem)_SEM_ACQ, pw, v + 1, (_Asm_ldhint)_LDHINT_NONE); if (r == v) { return r + 1; } v = r; } } -----Original Message----- From: Peter Dimov [mailto:pdimov@mmltd.net] Sent: Wednesday, July 25, 2007 5:55 PM To: boost-users@lists.boost.org Subject: Re: [Boost-users] Support for sp_counted_base for HP ItaniumaCCcompiler I'm not familiar with IA64 or the HP intrinsics, but a few quick comments: Baruch Zilber wrote:
inline void atomic_increment( int * pw ) { _Asm_mf(); static_cast<int>(_Asm_fetchadd(_FASZ_W, _SEM_REL, (void*)pw, +1, _LDHINT_NONE) + 1); }
The mf is redundant; _SEM_REL has the same effect. This should probably be inline void atomic_increment( int * pw ) { _Asm_fetchadd(_FASZ_W, _SEM_REL, pw, +1, _LDHINT_NONE); }
inline int atomic_decrement( int * pw ) { _Asm_mf(); return (static_cast<int>(_Asm_fetchadd(_FASZ_W, _SEM_REL, (void*)pw, -1, _LDHINT_NONE) - 1) - 1); }
The leading mf is redundant here, too. In addition, an acquire barrier in the zero case is missing, and there are two -1's where probably just one is needed. In pseudocode, atomic_decrement needs to be: int r = fetchadd4.rel( pw, -1 ); if( r == 1 ) ld4.acq( pw ); // or mf return r - 1; So, a wild guess: inline int atomic_decrement( int * pw ) { int r = (int)_Asm_fetchadd( _FASZ_W, _SEM_REL, pw, -1, _LDHINT_NONE ); if( r == 1 ) _Asm_mf(); return -1; } We might be able to replace the _Asm_mf with _Asm_ld, but I'm not sure whether it will generate ld4.acq.
inline int atomic_conditional_increment( int * pw ) { return _Asm_mov_to_ar((_Asm_app_reg)_AREG_CCV, *pw, (_Asm_fence)(_UP_CALL_FENCE | _UP_SYS_FENCE | _DOWN_CALL_FENCE | _DOWN_SYS_FENCE)), _Asm_mf(), (_Asm_cmpxchg((_Asm_sz)4, (_Asm_sem)_SEM_REL, pw, *pw + 1, (_Asm_ldhint)_LDHINT_NONE)); }
This doesn't look correct to me. In pseudocode, I believe that it needs to be: int v = *pw; for(;;) { if( v == 0 ) return 0; int r = cmpxchg( pw, v /*old*/, v+1 /*new*/ ); if( r == v ) return r+1; v = r; } The above code seems to implement just the cmpxchg primitive. The mf is redundant in this case, too. This message and the information contained herein is proprietary and confidential and subject to the Amdocs policy statement, you may review at http://www.amdocs.com/email_disclaimer.asp
Baruch Zilber wrote:
You are right. I fixed it and here is the new implementation:
inline void atomic_increment( int * pw ) { _Asm_fetchadd(_FASZ_W, _SEM_ACQ, (void*)pw, +1, _LDHINT_NONE); }
inline int atomic_decrement( int * pw ) { int r = static_cast<int>(_Asm_fetchadd(_FASZ_W, _SEM_ACQ, (void*)pw, -1, _LDHINT_NONE)); if (1 == r) { _Asm_mf(); }
return r - 1; }
Hmm. Why did you switch to _SEM_ACQ? atomic_increment should use _REL because it's rumored to be cheaper, and atomic_decrement needs to use _REL for correctness.
I saw in HP aCC STD implementation that thay are uses ACQ in increase and decrease. If you think REL is better it can be change. Thanks -----Original Message----- From: Peter Dimov [mailto:pdimov@mmltd.net] Sent: Thursday, July 26, 2007 2:29 AM To: boost-users@lists.boost.org Subject: Re: [Boost-users] Support for sp_counted_base for HPItaniumaCCcompiler Baruch Zilber wrote:
You are right. I fixed it and here is the new implementation:
inline void atomic_increment( int * pw ) { _Asm_fetchadd(_FASZ_W, _SEM_ACQ, (void*)pw, +1, _LDHINT_NONE); }
inline int atomic_decrement( int * pw ) { int r = static_cast<int>(_Asm_fetchadd(_FASZ_W, _SEM_ACQ, (void*)pw, -1, _LDHINT_NONE)); if (1 == r) { _Asm_mf(); }
return r - 1; }
Hmm. Why did you switch to _SEM_ACQ? atomic_increment should use _REL because it's rumored to be cheaper, and atomic_decrement needs to use _REL for correctness. This message and the information contained herein is proprietary and confidential and subject to the Amdocs policy statement, you may review at http://www.amdocs.com/email_disclaimer.asp
participants (2)
-
Baruch Zilber
-
Peter Dimov