
-----Original Message-----
From: Peter Dimov
Sent: Monday, November 07, 2011 22:38
To: Helge Bahmann
Subject: Re: [boost] [atomic] review results

Helge Bahmann wrote:
yes makes sense -- there was a concern raised by Andrey Semashev that the spinlock pool as implemented and used by shared_ptr presently may fail on Windows due to the pool being non-unique (not had a chance to test this yet), and I have found a way to produce a similar failure using dlopen, atomics private to shared libraries and RTLD_LOCAL -- currently I am therefore leaning on creating a shared library just for the spinlock pool, but since you wrote the initial implementation maybe you could comment as well?
This is a problem in principle, but requiring all users of shared_ptr to link to a shared library is a non-starter. I wouldn't use such a shared_ptr, and I doubt many others will. And I wouldn't be surprised if this sentiment applies to Boost.Atomic as well.

this could be avoided by using boost::interprocess::atomic<>, which will associate a spinlock with each instance ... or by using std::atomic on c++11 compilers.

btw, when reworking the spinlock pool, it might make sense to try two changes:
* add some padding to ensure that every spinlock is in a separate cache line
* use a test-and-test-and-set lock

cheers, tim
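A minimal sketch of the two suggested changes (hypothetical, not Boost's actual implementation; assumes 64-byte cache lines and an illustrative pool size of 41):

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical sketch: a test-and-test-and-set (TTAS) spinlock, padded so
// that each lock in the pool sits in its own cache line. The 64-byte cache
// line and the pool size of 41 are illustrative assumptions.
struct padded_spinlock {
    std::atomic<bool> locked{false};
    char padding[64 - sizeof(std::atomic<bool>)];

    void lock() {
        for (;;) {
            // TTAS: spin on a cheap plain load before attempting the
            // expensive read-modify-write
            while (locked.load(std::memory_order_relaxed)) {}
            if (!locked.exchange(true, std::memory_order_acquire))
                return; // acquired
        }
    }
    void unlock() { locked.store(false, std::memory_order_release); }
};

static padded_spinlock pool[41]; // pool size is illustrative

// Hash the object address into the pool, as the shared_ptr spinlock pool does.
inline padded_spinlock& spinlock_for(const volatile void* p) {
    return pool[reinterpret_cast<std::uintptr_t>(p) % 41];
}
```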

On Tuesday, November 08, 2011 00:02:34 Tim Blechmann wrote:
This is a problem in principle, but requiring all users of shared_ptr to link to a shared library is a non-starter. I wouldn't use such a shared_ptr, and I doubt many others will. And I wouldn't be surprised if this sentiment applies to Boost.Atomic as well.
this could be avoided by using boost::interprocess::atomic<>, which will associate a spinlock with each instance ... or by using std::atomic on c++11 compilers.
Just some random thoughts:

1. If boost::atomic<> doesn't support multi-module applications, that is, it silently works incorrectly in such environments, should it really support these platforms? I mean, the spinlock pool is used when no native atomic ops are available for the given type. Perhaps it's better not to compile at all in such a case.

2. If (1) is true then boost::atomic<>'s usefulness is greatly reduced. Most of the time one would use boost::interprocess::atomic<>, even in a single process.

On Tuesday 08 November 2011 04:42:38 Andrey Semashev wrote:

Peter Dimov <pdimov@pdimov.com> wrote:
This is a problem in principle, but requiring all users of shared_ptr to link to a shared library is a non-starter. I wouldn't use such a shared_ptr, and I doubt many others will. And I wouldn't be surprised if this sentiment applies to Boost.Atomic as well.
I share this sentiment as well -- however a separate library may not be too bad, considering that the spinlock pool is unused if all operations are natively supported (thus the large majority of users could just "forget" linking the library and would never notice -- this would include atomic reference counts)

Tim Blechmann wrote:
this could be avoided by using boost::interprocess::atomic<>, which will associate a spinlock with each instance ... or by using std::atomic on c++11 compilers.
While this problem is definitely an argument in favour of embedding the spinlock into each atomic, I am still concerned as this introduces incompatibility with std::atomic implementations which may eventually bite someone hard (data structure size, expectations -- changing boost to "using std::atomic" might have grave consequences, and it's a transition path I would like to keep open)
Just some random thoughts:
1. If boost::atomic<> doesn't support multi-module applications, that is, it silently works incorrectly in such environments, should it really support these platforms? I mean, the spinlock pool is used when no native atomic ops are available for the given type. Perhaps it's better not to compile at all in such a case.
silently working incorrectly is very very very bad -- failing to compile and/or link is much better in this case
2. If (1) is true then boost::atomic<>'s usefulness is greatly reduced. Most of the time one would use boost::interprocess::atomic<>, even in a single process.
yes, agreed

Another option that I have considered would be "piggy-backing" the spinlock pool into Boost.Thread -- the idea is that an application is either single-threaded, or if it is multi-threaded it is expected to link with Boost.Thread (even if nothing from Boost.Thread is used indeed, yes makes me feel uneasy as well). In that case I could play "weak reference" tricks such that the pool is used when linked in, and for single-threaded applications no mutual exclusion is done and they work just fine nevertheless. Too whacky? Maybe piggy-backing into Boost.Thread without weak reference tricks might also be something to consider?

Best regards Helge
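The weak-reference trick Helge describes could look roughly like this (a GCC/ELF-specific sketch with hypothetical names: an undefined weak symbol resolves to a null address, so the no-op fallback runs when the pool library is not linked in):

```cpp
// Hypothetical sketch (GCC/ELF specific): the pool's lock functions are
// declared weak and left undefined here. If the library providing them is
// not linked in, the function addresses compare equal to null and no
// mutual exclusion is performed -- fine for a single-threaded program.
extern "C" void boost_atomic_pool_lock(const volatile void*) __attribute__((weak));
extern "C" void boost_atomic_pool_unlock(const volatile void*) __attribute__((weak));

inline void lock_address(const volatile void* addr) {
    if (boost_atomic_pool_lock)   // non-null only if the pool was linked in
        boost_atomic_pool_lock(addr);
}

inline void unlock_address(const volatile void* addr) {
    if (boost_atomic_pool_unlock)
        boost_atomic_pool_unlock(addr);
}
```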

hi helge,
this could be avoided by using boost::interprocess::atomic<>, which will associate a spinlock with each instance ... or by using std::atomic on c++11 compilers.
While this problem is definitely an argument in favour of embedding the spinlock into each atomic, I am still concerned as this introduces incompatibility with std::atomic implementations which may eventually bite someone hard (data structure size, expectations -- changing boost to "using std::atomic" might have grave consequences, and it's a transition path I would like to keep open)
not sure:
* compilers are not required to implement atomics so that sizeof(atomic<T>) == sizeof(T). so the size of a data structure may change when switching compilers.
* compilers do change the size of a data structure depending on compiler flags. some compilers have a notion of `packed' structs that ensure the memory layout. however gcc and icc seem to require that all struct members are PODs, while clang++ seems to accept non-POD members ...
* i could imagine that c++11 compilers may be smart enough to pad adjacent std::atomic<> to ensure that they are placed in separate cache lines.
2. If (1) is true then boost::atomic<>'s usefulness is greatly reduced. Most of the time one would use boost::interprocess::atomic<>, even in a single process.
yes, agreed
Another option that I have considered would be "piggy-backing" the spinlock pool into Boost.Thread -- the idea is that an application is either single-threaded, or if it is multi-threaded it is expected to link with Boost.Thread (even if nothing from Boost.Thread is used indeed, yes makes me feel uneasy as well).
this would imply that boost.thread cannot be used as a static library any more.

cheers, tim

Hi Tim,

On Tuesday 08 November 2011 10:17:20 Tim Blechmann wrote:
hi helge,
this could be avoided by using boost::interprocess::atomic<>, which will associate a spinlock with each instance ... or by using std::atomic on c++11 compilers.
While this problem is definitely an argument in favour of embedding the spinlock into each atomic, I am still concerned as this introduces incompatibility with std::atomic implementations which may eventually bite someone hard (data structure size, expectations -- changing boost to "using std::atomic" might have grave consequences, and it's a transition path I would like to keep open)
not sure:
* compilers are not required to implement atomics so that sizeof(atomic<T>) == sizeof (T). so the size of a data structure may change when switching compilers.
Yes, that's right; however, I trust that ABI conventions will be worked out per platform eventually -- people will consider inability to mix icc/gcc/clang/whatever at least on a shared-library level as unacceptable, and currently it works rather well. Also note that the intent of C++11 is to make these low-level structures compatible with C1X, so there really is not that much room. I think that there is value in trying to maintain sizeof(boost::atomic<T>) == sizeof(std::atomic<T>), whatever this is going to mean per platform, and part of the thing I have in mind is indeed interprocess-safety. I also don't think that Boost.Atomic is in that state currently, but I am a bit reluctant to make a decision that might shut down this path forever.
* compilers do change the size of a data structure depending on compiler flags. some compilers have a notion of `packed' structs, that ensure the memory layout. however gcc and icc seem to require that all struct members are PODs, while clang++ seems to accept non-POD members ...
well but __attribute__((packed)) is something you have to attach to each individual data structure so it is not a global change -- and a global flag that affects data structure layout is problematic in any case because you will have a hard time linking to and interfacing with any library.
* i could imagine that c++11 compilers may be smart enough to pad adjacent std::atomic<> to ensure that they are placed in separate cache lines.
uhh, I hope they *don't* do that as it has grave implications on the ABI (currently the compiler would not be allowed to "fill" the hole with other data members), and cacheline sizes vary by processor generation
Another option that I have considered would be "piggy-backing" the spinlock pool into Boost.Thread -- the idea is that an application is either single-threaded, or if it is multi-threaded it is expected to link with Boost.Thread (even if nothing from Boost.Thread is used indeed, yes makes me feel uneasy as well).
this would imply that boost.thread cannot be used as static library any more.
That would indeed be bad, but I don't understand why this would be in conflict, could you explain a bit further? Best regards Helge

hi helge,
I think that there is value in trying to maintain sizeof(boost::atomic<T>) == sizeof(std::atomic<T>) whatever this is going to mean per platform, and part of the thing I have in my mind is indeed interprocess-safety. I also don't think that Boost.Atomic is in that state currently, but I am a bit reluctant to make a decision that might shut down this path forever.
then how do other compilers/library implementations behave? you said that gcc doesn't follow the suggestion of the standard to avoid per-process states. but what about msvc or clang/libc++?
* i could imagine that c++11 compilers may be smart enough to pad adjacent std::atomic<> to ensure that they are placed in separate cache lines.

uhh, I hope they *don't* do that as it has grave implications on the ABI (currently the compiler would not be allowed to "fill" the hole with other data members), and cacheline sizes vary by processor generation
cacheline sizes vary, but a compiler might optimize for a commonly used value. however i've been hit more than once by compilers that changed the size of structs.
Another option that I have considered would be "piggy-backing" the spinlock pool into Boost.Thread -- the idea is that an application is either single-threaded, or if it is multi-threaded it is expected to link with Boost.Thread (even if nothing from Boost.Thread is used indeed, yes makes me feel uneasy as well).
this would imply that boost.thread cannot be used as static library any more.
That would indeed be bad, but I don't understand why this would be in conflict, could you explain a bit further?
don't you need a shared library to resolve the spinlock pool? cheers, tim

On Tuesday 08 November 2011 11:22:07 Tim Blechmann wrote:
hi helge,
I think that there is value in trying to maintain sizeof(boost::atomic<T>) == sizeof(std::atomic<T>) whatever this is going to mean per platform, and part of the thing I have in my mind is indeed interprocess-safety. I also don't think that Boost.Atomic is in that state currently, but I am a bit reluctant to make a decision that might shut down this path forever.
then how do other compilers/library implementations behave? you said that gcc doesn't follow the suggestion of the standard to avoid per-process states.
well gcc certainly follows the recommendation of the standard by making the *lock-free* atomics both address-free and free from process state, I don't read the standard as suggesting (or even requiring) anything for emulated atomics
but what about msvc or clang/libc++?
libc++ atomic is not yet completed, but it does not intend to use a per-object lock. As I read it, they do in fact consider a mutex (pool?) instead of a spinlock. msvc I don't know yet.
* i could imagine that c++11 compilers may be smart enough to pad adjacent std::atomic<> to ensure that they are placed in separate cache lines.
uhh, I hope they *don't* do that as it has grave implications on the ABI (currently the compiler would not be allowed to "fill" the hole with other data members), and cacheline sizes vary by processor generation
cacheline sizes vary, but a compiler might optimize for a commonly used value. however i've been hit more than once by compilers that changed the size of structs.
under what circumstances did the sizes change? Currently I cannot imagine this *not* being an ABI violation
Another option that I have considered would be "piggy-backing" the spinlock pool into Boost.Thread -- the idea is that an application is either single-threaded, or if it is multi-threaded it is expected to link with Boost.Thread (even if nothing from Boost.Thread is used indeed, yes makes me feel uneasy as well).
this would imply that boost.thread cannot be used as static library any more.
That would indeed be bad, but I don't understand why this would be in conflict, could you explain a bit further?
don't you need a shared library to resolve the spinlock pool?
Frankly, I don't know the answer yet, so I have to experiment.

Best regards Helge

then how do other compilers/library implementations behave? you said that gcc doesn't follow the suggestion of the standard to avoid per-process states.
well gcc certainly follows the recommendation of the standard by making the *lock-free* atomics both address-free and free from process state, I don't read the standard as suggesting (or even requiring) anything for emulated atomics
in my reading the sentence `The implementation should not depend on any per-process state' is not restricted to lock-free atomics, but i am not a native speaker and i could be wrong.
cacheline sizes vary, but a compiler might optimize for a commonly used value. however i've been hit more than once by compilers that changed the size of structs.
under what circumstances did the sizes change? Currently I cannot imagine this *not* being an ABI violation
iirc a struct { int i; void* p; }; had a size of 16 bytes or 12 bytes depending on whether it was packed or not. but also the empty-base-class optimization can change the size of a class.

cheers, tim

On Tuesday 08 November 2011 12:30:58 you wrote:
then how do other compilers/library implementations behave? you said that gcc doesn't follow the suggestion of the standard to avoid per-process states.
well gcc certainly follows the recommendation of the standard by making the *lock-free* atomics both address-free and free from process state, I don't read the standard as suggesting (or even requiring) anything for emulated atomics
in my reading the sentence `The implementation should not depend on any per-process state' is not restricted to lock-free atomics, but i am not a native speaker and i could be wrong.
neither am I but from the following full excerpt as quoted by Peter Dimov:

[ Note: Operations that are lock-free should also be address-free. That is, atomic operations on the same memory location via two different addresses will communicate atomically. The implementation should not depend on any per-process state. This restriction enables communication by memory that is mapped into a process more than once and by memory that is shared between two processes. —end note ]

I think that "address-free" clearly applies only to "lock-free", and if something is not "address-free" then the point whether it can be used interprocess is kind of moot as it is most certainly not going to be mapped at the same address (if you consider numerically identical addresses in different spaces to be the same at all).
cacheline sizes vary, but a compiler might optimize for a commonly used value. however i've been hit more than once by compilers, which are changing the size of structs.
under what circumstances did the sizes change? Currently I cannot imagine this *not* being an ABI violation
iirc a struct { int i; void* p; }; had a size of 16 bytes or 12 bytes depending on whether it was packed or not.
if you apply __attribute__((packed)), yes -- but note that this is per-structure and (hopefully) not affected by compiler flags

Best regards Helge
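For illustration, the kind of interprocess-capable memory the standard's note talks about can be sketched with POSIX shared mappings (this is a sketch, not anyone's proposed implementation; it assumes std::atomic<int> is lock-free and address-free on the target, and the function name is illustrative):

```cpp
#include <atomic>
#include <new>
#include <sys/mman.h>

// Sketch (POSIX): place a lock-free atomic into a MAP_SHARED mapping --
// the kind of memory that could also be mapped into another process, or
// into this process a second time at a different address. Address-freedom
// is what lets operations through any such mapping communicate atomically.
int shared_counter_demo() {
    void* mem = mmap(nullptr, sizeof(std::atomic<int>),
                     PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    std::atomic<int>* counter = new (mem) std::atomic<int>(0);
    counter->fetch_add(1);   // would be visible through any mapping of the page
    int result = counter->load();

    munmap(mem, sizeof(std::atomic<int>));
    return result;
}
```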

Helge Bahmann wrote:
On Tuesday 08 November 2011 12:30:58 you wrote:
well gcc certainly follows the recommendation of the standard by making the *lock-free* atomics both address-free and free from process state, I don't read the standard as suggesting (or even requiring) anything for emulated atomics
in my reading the sentence `The implementation should not depend on any per-process state' is not restricted to lock-free atomics, but i am not a native speaker and i could be wrong.
neither am I but from the following full excerpt as quoted by Peter Dimov:
[ Note: Operations that are lock-free should also be address-free. That is, atomic operations on the same memory location via two different addresses will communicate atomically. The implementation should not depend on any per-process state. This restriction enables communication by memory that is mapped into a process more than once and by memory that is shared between two processes. —end note ]
I think that "address-free" clearly applies only to "lock-free", and if something is not "address-free" then the point whether it can be used interprocess is kind of moot as it is most certainly not going to be mapped at the same address (if you consider numerically identical addresses in different spaces to be the same at all).
My reading suggests that everything is related to lock-free operations. This is evident in the second sentence, since it begins with "that is", which introduces a clarification of the first sentence. The third sentence states a desired goal which is necessary for the sharing described in the last sentence. The question, then, is whether the last two sentences are specifically related to lock-free operations or are intended to apply more generally. While there is room for another interpretation and contrary intention, the fact that this is structured as a single note suggests it is a cohesive discussion all related to the thesis statement in the first sentence. I, therefore, conclude that the entire note is about lock-free operations.

Since there is confusion, it would be appropriate to file a DR on the note to get clarity. However, since it is non-binding text, I'm not sure how the committee will handle it.

_____
Rob Stewart robert.stewart@sig.com
Software Engineer using std::disclaimer;
Dev Tools & Components
Susquehanna International Group, LLP http://www.sig.com

________________________________
IMPORTANT: The information contained in this email and/or its attachments is confidential. If you are not the intended recipient, please notify the sender immediately by reply and immediately delete this message and all its attachments. Any review, use, reproduction, disclosure or dissemination of this message or any attachment by an unintended recipient is strictly prohibited. Neither this message nor any attachment is intended as or should be construed as an offer, solicitation or recommendation to buy or sell any security or other financial instrument. Neither the sender, his or her employer nor any of their respective affiliates makes any warranties as to the completeness or accuracy of any of the information contained herein or that this message or any of its attachments is free of viruses.

in my reading the sentence `The implementation should not depend on any per-process state' is not restricted to lock-free atomics, but i am not a native speaker and i could be wrong.
neither am I but from the following full excerpt as quoted by Peter Dimov:
[ Note: Operations that are lock-free should also be address-free. That is, atomic operations on the same memory location via two different addresses will communicate atomically. The implementation should not depend on any per-process state. This restriction enables communication by memory that is mapped into a process more than once and by memory that is shared between two processes. —end note ]
I think that "address-free" clearly applies only to "lock-free", and if something is not "address-free" then the point whether it can be used interprocess is kind of moot as it is most certainly not going to be mapped at the same address (if you consider numerically identical addresses in different spaces to be the same at all).
this implies that one can only use atomic integral types in shared memory, but not std::atomic<>, because the std::atomic<> template has a per-instance is_lock_free member. there is no standard-compliant way to test at compile-time if std::atomic<T> is lock-free. i cannot find any reference in the standard that std::atomic<T> is lock-free if there is a lock-free integral atomic type of the same size, so one cannot write a metafunction to dispatch at compile time ....

tim

Tim Blechmann wrote:
this implies that one can only use atomic integral types in shared memory, but not std::atomic<>, because the std::atomic<> template has a per-instance is_lock_free member.
29.4/2: "The function atomic_is_lock_free (29.6) indicates whether the object is lock-free. In any given program execution, the result of the lock-free query shall be consistent for all pointers of the same type." The query is not per-instance. It can't be performed at compile time though (this is probably motivated by instruction set differences like the famous 386-486 divide where the program can only know at run time whether general atomics are available).
i cannot find any reference in the standard, that std::atomic<T> is lockfree, if there is a lockfree integral atomic type of the same size, ...
As a practical matter, it will be. Pedantically speaking, it is not possible to give this guarantee in the general case, because the integral type may have padding bits and trap representations. But I don't think that this is true on any platform that provides atomic operations, so in practice we should be fine (an implementation could do the wrong thing in principle, but I doubt that many will).

this implies that one can only use atomic integral types in shared memory, but not std::atomic<>, because the std::atomic<> template has a per-instance is_lock_free member.
29.4/2: "The function atomic_is_lock_free (29.6) indicates whether the object is lock-free. In any given program execution, the result of the lock-free query shall be consistent for all pointers of the same type."
The query is not per-instance. It can't be performed at compile time though (this is probably motivated by instruction set differences like the famous 386-486 divide where the program can only know at run time whether general atomics are available).
isn't atomic_is_lock_free only defined for integral types? and why isn't atomic::is_lock_free a static member function? dispatching per-object would actually make sense, because there may be platforms which require objects to be aligned to certain memory boundaries for double-width CAS. cheers, tim

Tim Blechmann wrote:
isn't atomic_is_lock_free only defined for integral types?
I'm not aware of any such requirement.
and why isn't atomic::is_lock_free a static member function?
I don't know.
dispatching per-object would actually make sense, because there may be platforms which require objects to be aligned to certain memory boundaries for double-width CAS.
It can be read that way, but it doesn't make much sense from the user's perspective for objects to randomly be lock-free or non-lock-free. std::atomic<> should ensure the necessary alignment.

On Tuesday 08 November 2011 13:24:22 Tim Blechmann wrote:
in my reading the sentence `The implementation should not depend on any per-process state' is not restricted to lock-free atomics, but i am not a native speaker and i could be wrong.
neither am I but from the following full excerpt as quoted by Peter Dimov:
[ Note: Operations that are lock-free should also be address-free. That is, atomic operations on the same memory location via two different addresses will communicate atomically. The implementation should not depend on any per-process state. This restriction enables communication by memory that is mapped into a process more than once and by memory that is shared between two processes. —end note ]
I think that "address-free" clearly applies only to "lock-free", and if something is not "address-free" then the point whether it can be used interprocess is kind of moot as it is most certainly not going to be mapped at the same address (if you consider numerically identical addresses in different spaces to be the same at all).
this implies that one can only use atomic integral types in shared memory, but not std::atomic<>, because the std::atomic<> template has a per-instance is_lock_free member. there is no standard-compliant way to test at compile-time if std::atomic<T> is lock-free.
i cannot find any reference in the standard that std::atomic<T> is lock-free if there is a lock-free integral atomic type of the same size, so one cannot write a metafunction to dispatch at compile time ....
that's both correct and very annoying indeed -- for Boost.Atomic this property of course holds true and I can create a "type trait"-like class to perform the required mapping (maybe would have made sense for C++11 as well) -- this assumption also sounds reasonable enough that there may be justification for boost to offer such a type trait for the atomics of the platform compiler. lacking appropriate integer types, there is BTW currently also no way to query for atomicity of objects of several sizes

Best regards Helge
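The "type trait"-like class Helge mentions could be sketched as follows (hypothetical names; it maps an object size to an unsigned integral type whose atomic is assumed lock-free, so dispatch can happen at compile time -- C++17 later addressed the underlying problem directly with std::atomic<T>::is_always_lock_free):

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical sketch of the size-to-integral-type mapping: left
// unspecialized for unsupported sizes, so using an unsupported size fails
// to compile rather than silently falling back to a lock.
template <std::size_t Size> struct atomic_storage;  // no general case

template <> struct atomic_storage<1> { typedef std::uint8_t  type; };
template <> struct atomic_storage<2> { typedef std::uint16_t type; };
template <> struct atomic_storage<4> { typedef std::uint32_t type; };
template <> struct atomic_storage<8> { typedef std::uint64_t type; };

// A metafunction-style dispatch for a POD type T would then use
// atomic_storage<sizeof(T)>::type as the underlying atomic storage.
```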

I think that "address-free" clearly applies only to "lock-free", and if something is not "address-free" then the point whether it can be used interprocess is kind of moot as it is most certainly not going to be mapped at the same address (if you consider numerically identical addresses in different spaces to be the same at all).
this implies that one can only use atomic integral types in shared memory, but not std::atomic<>, because the std::atomic<> template has a per-instance is_lock_free member. there is no standard-compliant way to test at compile-time if std::atomic<T> is lock-free.
i cannot find any reference in the standard, that std::atomic<T> is lockfree, if there is a lockfree integral atomic type of the same size, so one cannot write a metafunction to dispatch at compile time ....

that's both correct and very annoying indeed -- for Boost.Atomic this property of course holds true and I can create a "type trait"-like class to perform the required mapping (maybe would have made sense for C++11 as well) -- this assumption also sounds reasonable enough that there may be justification for boost to offer such a type trait for atomics of the platform compiler.
i double-checked: atomic_is_lock_free takes the argument `atomic-type', not `atomic-integral'. this means it should also support atomic<T>. to me this sounds like atomic<T>::is_lock_free should have the same semantics, although it is not mentioned anywhere ...

tim

Tim Blechmann wrote:
atomic_is_lock_free takes the argument `atomic-type', not `atomic-integral'. this means it should also support atomic<T>. to me this sounds like atomic<T>::is_lock_free should have the same semantics, although it is not mentioned anywhere ...
All free functions have the same semantics as the corresponding member function. They are provided as a C-compatible interface, with the intent for the C standard to adopt them.

On Tuesday, November 08, 2011 10:53:09 Helge Bahmann wrote:
Hi Tim,
* compilers are not required to implement atomics so that sizeof(atomic<T>) == sizeof (T). so the size of a data structure may change when switching compilers.
Yes that's right, however I trust that ABI conventions will be worked out per platform eventually -- people will consider inability to mix icc/gcc/clang/whatever at least on a shared-library level as unacceptable, and currently it works rather well. Also note that the intent of C++11 is to make these low-level structures compatible with C1X so there really is not that much room.
I think that there is value in trying to maintain sizeof(boost::atomic<T>) == sizeof(std::atomic<T>) whatever this is going to mean per platform, and part of the thing I have in my mind is indeed interprocess-safety. I also don't think that Boost.Atomic it is in that state currently, but I am a bit reluctant to make a decision that might shut down this path forever.
I'm not trying to convert you or anything, but to me the requirement of binary compatibility between std::atomic and boost::atomic looks excessive. The wish for binary compatibility between different implementations of std::atomic, as well as other STL components, is quite understandable, although I doubt that in reality this is achieved beyond the most trivial cases, such as std::pair or std::auto_ptr instances. These implementations denote the same type, and from the linker's perspective different implementations of std::atomic will still be std::atomic and nothing else. ODR is violated in this case, but it might slide.

But when you try to operate on boost::atomic as if it were std::atomic, this just looks totally wrong to me. These types are different, so the linker (or compiler, for that matter) will never let it go. So the developer will have to do reinterpret_casts explicitly, and I believe no sane developer will do that and hope it to work.

On Tuesday, November 08, 2011 10:17:20 Tim Blechmann wrote:
hi helge,
Another option that I have considered would be "piggy-backing" the spinlock pool into Boost.Thread -- the idea is that an application is either single-threaded, or if it is multi-threaded it is expected to link with Boost.Thread (even if nothing from Boost.Thread is used indeed, yes makes me feel uneasy as well).
this would imply that boost.thread cannot be used as static library any more.
I was under impression that in multi-module applications Boost.Thread had to be linked in as a shared library anyway. Am I wrong?

Helge Bahmann wrote:
Tim Blechmann wrote:
this could be avoided by using boost::interprocess::atomic<>, which will associate a spinlock with each instance ... or by using std::atomic on c++11 compilers.
While this problem is definitely an argument in favour of embedding the spinlock into each atomic, I am still concerned as this introduces incompatibility with std::atomic implementations which may eventually bite someone hard (data structure size, expectations -- changing boost to "using std::atomic" might have grave consequences, and it's a transition path I would like to keep open)
You could just conditionally compile two choices when a platform's support is insufficient: a potentially incompatible implementation or nothing. If a user selects the latter, they may choose to use boost::interprocess::atomic instead. You might also provide the incompatible implementation via a distinct name in the latter case so the user could choose that instead.

_____
Rob Stewart robert.stewart@sig.com

On Tuesday, November 08, 2011 09:22:29 Helge Bahmann wrote:
On Tuesday 08 November 2011 04:42:38 Andrey Semashev wrote:
Another option that I have considered would be "piggy-backing" the spinlock pool onto Boost.Thread -- the idea is that an application is either single-threaded, or, if it is multi-threaded, it is expected to link with Boost.Thread anyway (even if nothing from Boost.Thread is actually used -- yes, this makes me feel uneasy as well).
I think this might be a good idea. Boost.Thread usage in multi-threaded apps is quite expected.
In that case I could play "weak reference" tricks such that the pool is used when linked in, and for single-threaded applications no mutual exclusion is done and they work just fine nevertheless.
Too wacky? Maybe piggy-backing onto Boost.Thread without weak reference tricks might also be something to consider?
I didn't really understand the trick, but it seems odd to use atomic<> in single-threaded apps.

Andrey Semashev wrote:
1. If boost::atomic<> doesn't support multi-module applications, that is, it silently works incorrectly in such environments, should it really support these platforms?
"Doesn't support multi-module applications" is not the same as not supporting sharing an atomic between DLLs. Many multi-module applications will never need to do that. I can't think of an example that isn't either contrived or a bad idea.
2. If (1) is true, then boost::atomic<>'s usefulness is greatly reduced.
I wouldn't say so.

On Tuesday, November 08, 2011 13:19:22 Peter Dimov wrote:
Andrey Semashev wrote:
1. If boost::atomic<> doesn't support multi-module applications, that is, it silently works incorrectly in such environments, should it really support these platforms?
"Doesn't support multi-module applications" is not the same as not supporting sharing an atomic between DLLs. Many multi-module applications will never need to do that. I can't think of an example that isn't either contrived or a bad idea.
Yes, I meant atomic<> crossing DLL boundaries, of course. However, I don't find such use cases contrived or bad.

Andrey Semashev wrote:
On Tuesday, November 08, 2011 13:19:22 Peter Dimov wrote:
"Doesn't support multi-module applications" is not the same as not supporting sharing an atomic between DLLs. Many multi-module applications will never need to do that. I can't think of an example that isn't either contrived or a bad idea.
Yes, I meant atomic<> crossing DLL boundaries, of course. However, I don't find such use cases contrived or bad.
Can you give an example?

On Tuesday, November 08, 2011 21:16:51 Peter Dimov wrote:
Andrey Semashev wrote:
On Tuesday, November 08, 2011 13:19:22 Peter Dimov wrote:
"Doesn't support multi-module applications" is not the same as not supporting sharing an atomic between DLLs. Many multi-module applications will never need to do that. I can't think of an example that isn't either contrived or a bad idea.
Yes, I meant atomic<> crossing DLL boundaries, of course. However, I don't find such use cases contrived or bad.
Can you give an example?
Reference counting? A library might return a smart pointer to a structure with a reference counter. Making the reference counting methods out-of-line or virtual may be undesirable, to allow inlining.

Andrey Semashev wrote:
Reference counting? A library might return a smart pointer to a structure with a reference counter. Making the reference counting methods out-of-line or virtual may be undesirable, to allow inlining.
Yep. This is what I meant by "bad". Yes, shared_ptr does it, but it's still a bad idea. There is no reason to inline code that operates on atomics into several separate modules - it just creates the possibility of subtle ODR violations when the inlined code gets out of sync.

On Tuesday, November 08, 2011 21:37:48 Peter Dimov wrote:
Andrey Semashev wrote:
Reference counting? A library might return a smart pointer to a structure with a reference counter. Making the reference counting methods out-of-line or virtual may be undesirable, to allow inlining.
Yep. This is what I meant by "bad". Yes, shared_ptr does it, but it's still a bad idea. There is no reason to inline code that operates on atomics into several separate modules - it just creates the possibility of subtle ODR violations when the inlined code gets out of sync.
Surely there may be caveats, as in just about every design. But in my practice one almost always compiles the application against a fixed version of Boost. It is the other cases I would call contrived and bad.

Andrey Semashev wrote:
Surely there may be caveats, as in just about every design. But in my practice one almost always compiles the application against a fixed version of Boost. It is the other cases I would call contrived and bad.
The version of Boost is not the issue. The code that operates on the atomics is. The normal way is for the module that creates the lock-free data structure to provide out-of-line functions which the other modules would call. Inlining these creates the possibility of the modules getting out of sync with each other. For plugin-type DLL uses, this is basically guaranteed.

On Tuesday, November 08, 2011 22:51:22 Peter Dimov wrote:
The version of Boost is not the issue. The code that operates on the atomics is. The normal way is for the module that creates the lock-free data structure to provide out-of-line functions which the other modules would call. Inlining these creates the possibility of the modules getting out of sync with each other. For plugin-type DLL uses, this is basically guaranteed.
If in your case modules are separately built and there is a possibility of code inconsistencies, then surely inlining is a bad idea. But that doesn't mean that inlining is generally a bad thing when multiple modules are involved. Anyway, it's going slightly off-topic.

Andrey Semashev wrote:
If in your case modules are separately built and there is a possibility of code inconsistencies, then surely inlining is a bad idea. But that doesn't mean that inlining is generally a bad thing when multiple modules are involved.
We're not discussing whether it's generally a bad thing. The specific use case under discussion involves a function that does atomic operations. Inlining is typically done for performance reasons, and the atomic operations typically dominate, rendering the performance gain from the inlining irrelevant. So you're only left with the drawbacks.
Anyway, it's going slightly off-topic.
Not really.

On Monday, November 07, 2011 23:15:43 Peter Dimov wrote:
Helge Bahmann wrote:
yes makes sense -- there was a concern raised by Andrey Semashev that the spinlock pool as implemented and used by shared_ptr presently may fail on Windows due to the pool being non-unique (not had a chance to test this yet), and I have found a way to produce a similar failure using dlopen, atomics private to shared libraries and RTLD_LOCAL -- currently I am therefore leaning on creating a shared library just for the spinlock pool, but since you wrote the initial implementation maybe you could comment as well?
This is a problem in principle, but requiring all users of shared_ptr to link to a shared library is a non-starter. I wouldn't use such a shared_ptr, and I doubt many others will. And I wouldn't be surprised if this sentiment applies to Boost.Atomic as well.
I think support for multi-module applications is a must for both libraries, but Boost.SmartPtr is luckier in that regard since it requires more widespread atomic ops to operate, so the shared spinlock pool is unlikely to become an issue. I see other ways to solve the problem, without dynamic linking. One could store a pointer to the pool in each atomic<> instance, although this would effectively have the same drawbacks as having a spinlock per instance. Or, perhaps, we could initialize a named shared memory segment with the spinlock pool. That would involve dynamic initialization, which would have to be protected against concurrent execution and may potentially fail if system limits are exceeded.

Andrey Semashev wrote:
I think support for multi-module applications is a must for both libraries, but Boost.SmartPtr is luckier in that regard since it requires more widespread atomic ops to operate, so the shared spinlock pool is unlikely to become an issue.
shared_ptr requires CAS, and if you have CAS, you have everything. It only needs to operate on single-word integers, though. Either way, it doesn't matter much, because the "multi-module" platform is Windows, and shared_ptr doesn't use the spinlock pool for the reference count on Windows. It does use it for the atomic access functions, though.
participants (5)
- Andrey Semashev
- Helge Bahmann
- Peter Dimov
- Stewart, Robert
- Tim Blechmann