
-----Original Message-----
From: Peter Dimov
Sent: Monday, November 07, 2011 22:38
To: Helge Bahmann
Subject: Re: [boost] [atomic] review results

Helge Bahmann wrote:
yes makes sense -- there was a concern raised by Andrey Semashev that the spinlock pool as implemented and used by shared_ptr presently may fail on Windows due to the pool being non-unique (not had a chance to test this yet), and I have found a way to produce a similar failure using dlopen, atomics private to shared libraries and RTLD_LOCAL -- currently I am therefore leaning on creating a shared library just for the spinlock pool, but since you wrote the initial implementation maybe you could comment as well?
This is a problem in principle, but requiring all users of shared_ptr to link to a shared library is a non-starter. I wouldn't use such a shared_ptr, and I doubt many others will. And I wouldn't be surprised if this sentiment applies to Boost.Atomic as well.

this could be avoided by using boost::interprocess::atomic<>, which will associate a spinlock with each instance ... or by using std::atomic on c++11 compilers.

btw, when reworking the spinlock pool, it might make sense to try two changes:
* add some padding to ensure that every spinlock is in a separate cache line
* use a test-and-test-and-set lock

cheers, tim
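A minimal sketch of the two suggested changes (hypothetical, not Boost's actual implementation; assumes 64-byte cache lines and an illustrative pool size of 41):

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical sketch: a test-and-test-and-set (TTAS) spinlock, padded so
// that each lock in the pool sits in its own cache line. The 64-byte cache
// line and the pool size of 41 are illustrative assumptions.
struct padded_spinlock {
    std::atomic<bool> locked{false};
    char padding[64 - sizeof(std::atomic<bool>)];

    void lock() {
        for (;;) {
            // TTAS: spin on a cheap plain load before attempting the
            // expensive read-modify-write
            while (locked.load(std::memory_order_relaxed)) {}
            if (!locked.exchange(true, std::memory_order_acquire))
                return; // acquired
        }
    }
    void unlock() { locked.store(false, std::memory_order_release); }
};

static padded_spinlock pool[41]; // pool size is illustrative

// Hash the object address into the pool, as the shared_ptr spinlock pool does.
inline padded_spinlock& spinlock_for(const volatile void* p) {
    return pool[reinterpret_cast<std::uintptr_t>(p) % 41];
}
```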

On Tuesday, November 08, 2011 00:02:34 Tim Blechmann wrote:
This is a problem in principle, but requiring all users of shared_ptr to link to a shared library is a non-starter. I wouldn't use such a shared_ptr, and I doubt many others will. And I wouldn't be surprised if this sentiment applies to Boost.Atomic as well.
this could be avoided by using boost::interprocess::atomic<>, which will associate a spinlock with each instance ... or by using std::atomic on c++11 compilers.
Just some random thoughts:

1. If boost::atomic<> doesn't support multi-module applications, that is, it silently works incorrectly in such environments, should it really support these platforms? I mean, the spinlock pool is used when no native atomic ops are available for the given type. Perhaps it's better not to compile at all in such a case.

2. If (1) is true then boost::atomic<>'s usefulness is greatly reduced. Most of the time one would use boost::interprocess::atomic<>, even in a single process.

On Tuesday 08 November 2011 04:42:38 Andrey Semashev wrote:

Peter Dimov <pdimov@pdimov.com> wrote:
This is a problem in principle, but requiring all users of shared_ptr to link to a shared library is a non-starter. I wouldn't use such a shared_ptr, and I doubt many others will. And I wouldn't be surprised if this sentiment applies to Boost.Atomic as well.
I share this sentiment as well -- however a separate library may not be too bad, considering that the spinlock pool is unused if all operations are natively supported (thus the large majority of users could just "forget" linking the library and would never notice -- this would include atomic reference counts)

Tim Blechmann wrote:
this could be avoided by using boost::interprocess::atomic<>, which will associate a spinlock with each instance ... or by using std::atomic on c++11 compilers.
While this problem is definitely an argument in favour of embedding the spinlock into each atomic, I am still concerned as this introduces incompatibility with std::atomic implementations which may eventually bite someone hard (data structure size, expectations -- changing boost to "using std::atomic" might have grave consequences, and it's a transition path I would like to keep open)
Just some random thoughts:
1. If boost::atomic<> doesn't support multi-module applications, that is, it silently works incorrectly in such environments, should it really support these platforms? I mean, the spinlock pool is used when no native atomic ops are available for the given type. Perhaps it's better not to compile at all in such a case.
silently working incorrectly is very very very bad -- failing to compile and/or link is much better in this case
2. If (1) is true then boost::atomic<>'s usefulness is greatly reduced. Most of the time one would use boost::interprocess::atomic<>, even in a single process.
yes, agreed

Another option that I have considered would be "piggy-backing" the spinlock pool into Boost.Thread -- the idea is that an application is either single-threaded, or if it is multi-threaded it is expected to link with Boost.Thread (even if nothing from Boost.Thread is used indeed, yes makes me feel uneasy as well). In that case I could play "weak reference" tricks such that the pool is used when linked in, and for single-threaded applications no mutual exclusion is done and they work just fine nevertheless. Too whacky? Maybe piggy-backing into Boost.Thread without weak reference tricks might also be something to consider?

Best regards Helge
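The weak-reference trick Helge describes could look roughly like this (a GCC/ELF-specific sketch with hypothetical names: an undefined weak symbol resolves to a null address, so the no-op fallback runs when the pool library is not linked in):

```cpp
// Hypothetical sketch (GCC/ELF specific): the pool's lock functions are
// declared weak and left undefined here. If the library providing them is
// not linked in, the function addresses compare equal to null and no
// mutual exclusion is performed -- fine for a single-threaded program.
extern "C" void boost_atomic_pool_lock(const volatile void*) __attribute__((weak));
extern "C" void boost_atomic_pool_unlock(const volatile void*) __attribute__((weak));

inline void lock_address(const volatile void* addr) {
    if (boost_atomic_pool_lock)   // non-null only if the pool was linked in
        boost_atomic_pool_lock(addr);
}

inline void unlock_address(const volatile void* addr) {
    if (boost_atomic_pool_unlock)
        boost_atomic_pool_unlock(addr);
}
```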

hi helge,
this could be avoided by using boost::interprocess::atomic<>, which will associate a spinlock with each instance ... or by using std::atomic on c++11 compilers.
While this problem is definitely an argument in favour of embedding the spinlock into each atomic, I am still concerned as this introduces incompatibility with std::atomic implementations which may eventually bite someone hard (data structure size, expectations -- changing boost to "using std::atomic" might have grave consequences, and it's a transition path I would like to keep open)
not sure:
* compilers are not required to implement atomics so that sizeof(atomic<T>) == sizeof(T). so the size of a data structure may change when switching compilers.
* compilers do change the size of a data structure depending on compiler flags. some compilers have a notion of `packed' structs that ensure the memory layout. however gcc and icc seem to require that all struct members are PODs, while clang++ seems to accept non-POD members ...
* i could imagine that c++11 compilers may be smart enough to pad adjacent std::atomic<> to ensure that they are placed in separate cache lines.
2. If (1) is true then boost::atomic<>'s usefulness is greatly reduced. Most of the time one would use boost::interprocess::atomic<>, even in a single process.
yes, agreed
Another option that I have considered would be "piggy-backing" the spinlock pool into Boost.Thread -- the idea is that an application is either single-threaded, or if it is multi-threaded it is expected to link with Boost.Thread (even if nothing from Boost.Thread is used indeed, yes makes me feel uneasy as well).
this would imply that boost.thread cannot be used as a static library any more.

cheers, tim

Hi Tim,

On Tuesday 08 November 2011 10:17:20 Tim Blechmann wrote:
hi helge,
this could be avoided by using boost::interprocess::atomic<>, which will associate a spinlock with each instance ... or by using std::atomic on c++11 compilers.
While this problem is definitely an argument in favour of embedding the spinlock into each atomic, I am still concerned as this introduces incompatibility with std::atomic implementations which may eventually bite someone hard (data structure size, expectations -- changing boost to "using std::atomic" might have grave consequences, and it's a transition path I would like to keep open)
not sure:
* compilers are not required to implement atomics so that sizeof(atomic<T>) == sizeof (T). so the size of a data structure may change when switching compilers.
Yes, that's right; however, I trust that ABI conventions will be worked out per platform eventually -- people will consider inability to mix icc/gcc/clang/whatever at least on a shared-library level as unacceptable, and currently it works rather well. Also note that the intent of C++11 is to make these low-level structures compatible with C1X, so there really is not that much room. I think that there is value in trying to maintain sizeof(boost::atomic<T>) == sizeof(std::atomic<T>), whatever this is going to mean per platform, and part of the thing I have in mind is indeed interprocess-safety. I also don't think that Boost.Atomic is in that state currently, but I am a bit reluctant to make a decision that might shut down this path forever.
* compilers do change the size of a data structure depending on compiler flags. some compilers have a notion of `packed' structs, that ensure the memory layout. however gcc and icc seem to require that all struct members are PODs, while clang++ seems to accept non-POD members ...
well but __attribute__((packed)) is something you have to attach to each individual data structure so it is not a global change -- and a global flag that affects data structure layout is problematic in any case because you will have a hard time linking to and interfacing with any library.
* i could imagine that c++11 compilers may be smart enough to pad adjacent std::atomic<> to ensure that they are placed in separate cache lines.
uhh, I hope they *don't* do that as it has grave implications on the ABI (currently the compiler would not be allowed to "fill" the hole with other data members), and cacheline sizes vary by processor generation
Another option that I have considered would be "piggy-backing" the spinlock pool into Boost.Thread -- the idea is that an application is either single-threaded, or if it is multi-threaded it is expected to link with Boost.Thread (even if nothing from Boost.Thread is used indeed, yes makes me feel uneasy as well).
this would imply that boost.thread cannot be used as static library any more.
That would indeed be bad, but I don't understand why this would be in conflict, could you explain a bit further? Best regards Helge

hi helge,
I think that there is value in trying to maintain sizeof(boost::atomic<T>) == sizeof(std::atomic<T>) whatever this is going to mean per platform, and part of the thing I have in my mind is indeed interprocess-safety. I also don't think that Boost.Atomic is in that state currently, but I am a bit reluctant to make a decision that might shut down this path forever.
then how do other compilers/library implementations behave? you said that gcc doesn't follow the suggestion of the standard to avoid per-process states. but what about msvc or clang/libc++?
* i could imagine that c++11 compilers may be smart enough to pad adjacent std::atomic<> to ensure that they are placed in separate cache lines.

uhh, I hope they *don't* do that as it has grave implications on the ABI (currently the compiler would not be allowed to "fill" the hole with other data members), and cacheline sizes vary by processor generation
cacheline sizes vary, but a compiler might optimize for a commonly used value. however i've been hit more than once by compilers that changed the size of structs.
Another option that I have considered would be "piggy-backing" the spinlock pool into Boost.Thread -- the idea is that an application is either single-threaded, or if it is multi-threaded it is expected to link with Boost.Thread (even if nothing from Boost.Thread is used indeed, yes makes me feel uneasy as well).
this would imply that boost.thread cannot be used as static library any more.
That would indeed be bad, but I don't understand why this would be in conflict, could you explain a bit further?
don't you need a shared library to resolve the spinlock pool? cheers, tim

On Tuesday 08 November 2011 11:22:07 Tim Blechmann wrote:
hi helge,
I think that there is value in trying to maintain sizeof(boost::atomic<T>) == sizeof(std::atomic<T>) whatever this is going to mean per platform, and part of the thing I have in my mind is indeed interprocess-safety. I also don't think that Boost.Atomic is in that state currently, but I am a bit reluctant to make a decision that might shut down this path forever.
then how do other compilers/library implementations behave? you said that gcc doesn't follow the suggestion of the standard to avoid per-process states.
well gcc certainly follows the recommendation of the standard by making the *lock-free* atomics both address-free and free from process state, I don't read the standard as suggesting (or even requiring) anything for emulated atomics
but what about msvc or clang/libc++?
libc++ atomic is not yet completed, but it does not intend to use a per-object lock. As I read it, they do in fact consider a mutex (pool?) instead of a spinlock. msvc I don't know yet.
* i could imagine that c++11 compilers may be smart enough to pad adjacent std::atomic<> to ensure that they are placed in separate cache lines.
uhh, I hope they *don't* do that as it has grave implications on the ABI (currently the compiler would not be allowed to "fill" the hole with other data members), and cacheline sizes vary by processor generation
cacheline sizes vary, but a compiler might optimize for a commonly used value. however i've been hit more than once by compilers that changed the size of structs.
under what circumstances did the sizes change? Currently I cannot imagine this *not* being an ABI violation
Another option that I have considered would be "piggy-backing" the spinlock pool into Boost.Thread -- the idea is that an application is either single-threaded, or if it is multi-threaded it is expected to link with Boost.Thread (even if nothing from Boost.Thread is used indeed, yes makes me feel uneasy as well).
this would imply that boost.thread cannot be used as static library any more.
That would indeed be bad, but I don't understand why this would be in conflict, could you explain a bit further?
don't you need a shared library to resolve the spinlock pool?
Frankly, I don't know the answer yet, so I have to experiment.

Best regards Helge

then how do other compilers/library implementations behave? you said that gcc doesn't follow the suggestion of the standard to avoid per-process states.
well gcc certainly follows the recommendation of the standard by making the *lock-free* atomics both address-free and free from process state, I don't read the standard as suggesting (or even requiring) anything for emulated atomics
in my reading the sentence `The implementation should not depend on any per-process state' is not restricted to lock-free atomics, but i am not a native speaker and i could be wrong.
cacheline sizes vary, but a compiler might optimize for a commonly used value. however i've been hit more than once by compilers that changed the size of structs.
under what circumstances did the sizes change? Currently I cannot imagine this *not* being an ABI violation
iirc a struct { int i; void* p; }; had a size of 16 bytes or 12 bytes depending on whether it was packed or not. but also the empty-base-class optimization can change the size of a class.

cheers, tim

On Tuesday 08 November 2011 12:30:58 you wrote:
then how do other compilers/library implementations behave? you said that gcc doesn't follow the suggestion of the standard to avoid per-process states.
well gcc certainly follows the recommendation of the standard by making the *lock-free* atomics both address-free and free from process state, I don't read the standard as suggesting (or even requiring) anything for emulated atomics
in my reading the sentence `The implementation should not depend on any per-process state' is not restricted to lock-free atomics, but i am not a native speaker and i could be wrong.
neither am I but from the following full excerpt as quoted by Peter Dimov:

[ Note: Operations that are lock-free should also be address-free. That is, atomic operations on the same memory location via two different addresses will communicate atomically. The implementation should not depend on any per-process state. This restriction enables communication by memory that is mapped into a process more than once and by memory that is shared between two processes. —end note ]

I think that "address-free" clearly applies only to "lock-free", and if something is not "address-free" then the point whether it can be used interprocess is kind of moot as it is most certainly not going to be mapped at the same address (if you consider numerically identical addresses in different spaces to be the same at all).
cacheline sizes vary, but a compiler might optimize for a commonly used value. however i've been hit more than once by compilers, which are changing the size of structs.
under what circumstances did the sizes change? Currently I cannot imagine this *not* being an ABI violation
iirc a struct { int i; void* p; }; had a size of 16 bytes or 12 bytes depending on whether it was packed or not.
if you apply __attribute__((packed)), yes -- but note that this is per-structure and (hopefully) not affected by compiler flags

Best regards Helge
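For illustration, the kind of interprocess-capable memory the standard's note talks about can be sketched with POSIX shared mappings (this is a sketch, not anyone's proposed implementation; it assumes std::atomic<int> is lock-free and address-free on the target, and the function name is illustrative):

```cpp
#include <atomic>
#include <new>
#include <sys/mman.h>

// Sketch (POSIX): place a lock-free atomic into a MAP_SHARED mapping --
// the kind of memory that could also be mapped into another process, or
// into this process a second time at a different address. Address-freedom
// is what lets operations through any such mapping communicate atomically.
int shared_counter_demo() {
    void* mem = mmap(nullptr, sizeof(std::atomic<int>),
                     PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    std::atomic<int>* counter = new (mem) std::atomic<int>(0);
    counter->fetch_add(1);   // would be visible through any mapping of the page
    int result = counter->load();

    munmap(mem, sizeof(std::atomic<int>));
    return result;
}
```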

Helge Bahmann wrote:
On Tuesday 08 November 2011 12:30:58 you wrote:
well gcc certainly follows the recommendation of the standard by making the *lock-free* atomics both address-free and free from process state, I don't read the standard as suggesting (or even requiring) anything for emulated atomics
in my reading the sentence `The implementation should not depend on any per-process state' is not restricted to lock-free atomics, but i am not a native speaker and i could be wrong.
neither am I but from the following full excerpt as quoted by Peter Dimov:
[ Note: Operations that are lock-free should also be address-free. That is, atomic operations on the same memory location via two different addresses will communicate atomically. The implementation should not depend on any per-process state. This restriction enables communication by memory that is mapped into a process more than once and by memory that is shared between two processes. —end note ]
I think that "address-free" clearly applies only to "lock-free", and if something is not "address-free" then the point whether it can be used interprocess is kind of moot as it is most certainly not going to be mapped at the same address (if you consider numerically identical addresses in different spaces to be the same at all).
My reading suggests that everything is related to lock-free operations. This is evident in the second sentence, since it begins with "that is", which introduces a clarification of the first sentence. The third sentence states a desired goal which is necessary for the sharing described in the last sentence. The question, then, is whether the last two sentences are specifically related to lock-free operations or are intended to apply more generally. While there is room for another interpretation and contrary intention, the fact that this is structured as a single note suggests it is a cohesive discussion all related to the thesis statement in the first sentence. I, therefore, conclude that the entire note is about lock-free operations.

Since there is confusion, it would be appropriate to file a DR on the note to get clarity. However, since it is non-binding text, I'm not sure how the committee will handle it.

_____
Rob Stewart robert.stewart@sig.com
Software Engineer using std::disclaimer;
Dev Tools & Components
Susquehanna International Group, LLP http://www.sig.com

________________________________
IMPORTANT: The information contained in this email and/or its attachments is confidential. If you are not the intended recipient, please notify the sender immediately by reply and immediately delete this message and all its attachments. Any review, use, reproduction, disclosure or dissemination of this message or any attachment by an unintended recipient is strictly prohibited. Neither this message nor any attachment is intended as or should be construed as an offer, solicitation or recommendation to buy or sell any security or other financial instrument. Neither the sender, his or her employer nor any of their respective affiliates makes any warranties as to the completeness or accuracy of any of the information contained herein or that this message or any of its attachments is free of viruses.

in my reading the sentence `The implementation should not depend on any per-process state' is not restricted to lock-free atomics, but i am not a native speaker and i could be wrong.
neither am I but from the following full excerpt as quoted by Peter Dimov:
[ Note: Operations that are lock-free should also be address-free. That is, atomic operations on the same memory location via two different addresses will communicate atomically. The implementation should not depend on any per-process state. This restriction enables communication by memory that is mapped into a process more than once and by memory that is shared between two processes. —end note ]
I think that "address-free" clearly applies only to "lock-free", and if something is not "address-free" then the point whether it can be used interprocess is kind of moot as it is most certainly not going to be mapped at the same address (if you consider numerically identical addresses in different spaces to be the same at all).
this implies that one can only use atomic integral types in shared memory, but not std::atomic<>, because the std::atomic<> template has a per-instance is_lock_free member. there is no standard-compliant way to test at compile-time if std::atomic<T> is lock-free. i cannot find any reference in the standard that std::atomic<T> is lock-free if there is a lock-free integral atomic type of the same size, so one cannot write a metafunction to dispatch at compile time ....

tim

Tim Blechmann wrote:
this implies that one can only use atomic integral types in shared memory, but not std::atomic<>, because the std::atomic<> template has a per-instance is_lock_free member.
29.4/2: "The function atomic_is_lock_free (29.6) indicates whether the object is lock-free. In any given program execution, the result of the lock-free query shall be consistent for all pointers of the same type." The query is not per-instance. It can't be performed at compile time though (this is probably motivated by instruction set differences like the famous 386-486 divide where the program can only know at run time whether general atomics are available).
i cannot find any reference in the standard, that std::atomic<T> is lockfree, if there is a lockfree integral atomic type of the same size, ...
As a practical matter, it will be. Pedantically speaking, it is not possible to give this guarantee in the general case, because the integral type may have padding bits and trap representations. But I don't think that this is true on any platform that provides atomic operations, so in practice we should be fine (an implementation could do the wrong thing in principle, but I doubt that many will).

this implies that one can only use atomic integral types in shared memory, but not std::atomic<>, because the std::atomic<> template has a per-instance is_lock_free member.
29.4/2: "The function atomic_is_lock_free (29.6) indicates whether the object is lock-free. In any given program execution, the result of the lock-free query shall be consistent for all pointers of the same type."
The query is not per-instance. It can't be performed at compile time though (this is probably motivated by instruction set differences like the famous 386-486 divide where the program can only know at run time whether general atomics are available).
isn't atomic_is_lock_free only defined for integral types? and why isn't atomic::is_lock_free a static member function? dispatching per-object would actually make sense, because there may be platforms which require objects to be aligned to certain memory boundaries for double-width CAS. cheers, tim

Tim Blechmann wrote:
isn't atomic_is_lock_free only defined for integral types?
I'm not aware of any such requirement.
and why isn't atomic::is_lock_free a static member function?
I don't know.
dispatching per-object would actually make sense, because there may be platforms which require objects to be aligned to certain memory boundaries for double-width CAS.
It can be read that way, but it doesn't make much sense from the user's perspective for objects to randomly be lock-free or non-lock-free. std::atomic<> should ensure the necessary alignment.

On Tuesday 08 November 2011 13:24:22 Tim Blechmann wrote:
in my reading the sentence `The implementation should not depend on any per-process state' is not restricted to lock-free atomics, but i am not a native speaker and i could be wrong.
neither am I but from the following full excerpt as quoted by Peter Dimov:
[ Note: Operations that are lock-free should also be address-free. That is, atomic operations on the same memory location via two different addresses will communicate atomically. The implementation should not depend on any per-process state. This restriction enables communication by memory that is mapped into a process more than once and by memory that is shared between two processes. —end note ]
I think that "address-free" clearly applies only to "lock-free", and if something is not "address-free" then the point whether it can be used interprocess is kind of moot as it is most certainly not going to be mapped at the same address (if you consider numerically identical addresses in different spaces to be the same at all).
this implies that one can only use atomic integral types in shared memory, but not std::atomic<>, because the std::atomic<> template has a per-instance is_lock_free member. there is no standard-compliant way to test at compile-time if std::atomic<T> is lock-free.
i cannot find any reference in the standard that std::atomic<T> is lock-free if there is a lock-free integral atomic type of the same size, so one cannot write a metafunction to dispatch at compile time ....
that's both correct and very annoying indeed -- for Boost.Atomic this property of course holds true and I can create a "type trait"-like class to perform the required mapping (maybe would have made sense for C++11 as well) -- this assumption also sounds reasonable enough that there may be justification for boost to offer such a type trait for the atomics of the platform compiler. lacking appropriate integer types, there is BTW currently also no way to query for atomicity of objects of several sizes

Best regards Helge
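The "type trait"-like class Helge mentions could be sketched as follows (hypothetical names; it maps an object size to an unsigned integral type whose atomic is assumed lock-free, so dispatch can happen at compile time -- C++17 later addressed the underlying problem directly with std::atomic<T>::is_always_lock_free):

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical sketch of the size-to-integral-type mapping: left
// unspecialized for unsupported sizes, so using an unsupported size fails
// to compile rather than silently falling back to a lock.
template <std::size_t Size> struct atomic_storage;  // no general case

template <> struct atomic_storage<1> { typedef std::uint8_t  type; };
template <> struct atomic_storage<2> { typedef std::uint16_t type; };
template <> struct atomic_storage<4> { typedef std::uint32_t type; };
template <> struct atomic_storage<8> { typedef std::uint64_t type; };

// A metafunction-style dispatch for a POD type T would then use
// atomic_storage<sizeof(T)>::type as the underlying atomic storage.
```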

I think that "address-free" clearly applies only to "lock-free", and if something is not "address-free" then the point whether it can be used interprocess is kind of moot as it is most certainly not going to be mapped at the same address (if you consider numerically identical addresses in different spaces to be the same at all).
this implies that one can only use atomic integral types in shared memory, but not std::atomic<>, because the std::atomic<> template has a per-instance is_lock_free member. there is no standard-compliant way to test at compile-time if std::atomic<T> is lock-free.
i cannot find any reference in the standard, that std::atomic<T> is lockfree, if there is a lockfree integral atomic type of the same size, so one cannot write a metafunction to dispatch at compile time ....

that's both correct and very annoying indeed -- for Boost.Atomic this property of course holds true and I can create a "type trait"-like class to perform the required mapping (maybe would have made sense for C++11 as well) -- this assumption also sounds reasonable enough that there may be justification for boost to offer such a type trait for atomics of the platform compiler.
i double-checked: atomic_is_lock_free takes the argument `atomic-type', not `atomic-integral'. this means it should also support atomic<T>. to me this sounds like atomic<T>::is_lock_free should have the same semantics, although it is not mentioned anywhere ...

tim

Tim Blechmann wrote:
atomic_is_lock_free takes the argument `atomic-type', not `atomic-integral'. this means it should also support atomic<T>. to me this sounds like atomic<T>::is_lock_free should have the same semantics, although it is not mentioned anywhere ...
All free functions have the same semantics as the corresponding member function. They are provided as a C-compatible interface, with the intent for the C standard to adopt them.

On Tuesday, November 08, 2011 10:53:09 Helge Bahmann wrote:
Hi Tim,
* compilers are not required to implement atomics so that sizeof(atomic<T>) == sizeof (T). so the size of a data structure may change when switching compilers.
Yes that's right, however I trust that ABI conventions will be worked out per platform eventually -- people will consider inability to mix icc/gcc/clang/whatever at least on a shared-library level as unacceptable, and currently it works rather well. Also note that the intent of C++11 is to make these low-level structures compatible with C1X so there really is not that much room.
I think that there is value in trying to maintain sizeof(boost::atomic<T>) == sizeof(std::atomic<T>) whatever this is going to mean per platform, and part of the thing I have in my mind is indeed interprocess-safety. I also don't think that Boost.Atomic it is in that state currently, but I am a bit reluctant to make a decision that might shut down this path forever.
I'm not trying to convert you or anything, but to me the requirement of binary compatibility between std::atomic and boost::atomic looks excessive. The wish for binary compatibility between different implementations of std::atomic, as well as other STL components, is quite understandable, although I doubt that in reality this is achieved beyond the most trivial cases, such as std::pair or std::auto_ptr instances. These implementations denote the same type, and from the linker's perspective different implementations of std::atomic will still be std::atomic and nothing else. ODR is violated in this case, but it might slide.

But when you try to operate on boost::atomic as if it were std::atomic, this just looks totally wrong to me. These types are different, so the linker (or compiler, for that matter) will never let it go. So the developer will have to do reinterpret_casts explicitly, and I believe no sane developer will do that and hope it to work.

On Tuesday, November 08, 2011 10:17:20 Tim Blechmann wrote:
hi helge,
Another option that I have considered would be "piggy-backing" the spinlock pool into Boost.Thread -- the idea is that an application is either single-threaded, or if it is multi-threaded it is expected to link with Boost.Thread (even if nothing from Boost.Thread is used indeed, yes makes me feel uneasy as well).
this would imply that boost.thread cannot be used as static library any more.
I was under impression that in multi-module applications Boost.Thread had to be linked in as a shared library anyway. Am I wrong?

Helge Bahmann wrote:
Tim Blechmann wrote:
this could be avoided by using boost::interprocess::atomic<>, which will associate a spinlock with each instance ... or by using std::atomic on c++11 compilers.
While this problem is definitely an argument in favour of embedding the spinlock into each atomic, I am still concerned as this introduces incompatibility with std::atomic implementations which may eventually bite someone hard (data structure size, expectations -- changing boost to "using std::atomic" might have grave consequences, and it's a transition path I would like to keep open)
You could just conditionally compile two choices when a platform's support is insufficient: a potentially incompatible implementation or nothing. If a user selects the latter, they may choose to use boost::interprocess::atomic instead. You might also provide the incompatible implementation via a distinct name in the latter case so the user could choose that instead.

_____
Rob Stewart robert.stewart@sig.com

On Tuesday, November 08, 2011 09:22:29 Helge Bahmann wrote:
On Tuesday 08 November 2011 04:42:38 Andrey Semashev wrote:
Another option that I have considered would be "piggy-backing" the spinlock pool onto Boost.Thread -- the idea is that an application is either single-threaded, or, if it is multi-threaded, it is expected to link with Boost.Thread anyway (even if nothing from Boost.Thread is actually used -- yes, this makes me feel uneasy as well).
I think this might be a good idea. Boost.Thread usage in multi-threaded apps is quite expected.
In that case I could play "weak reference" tricks such that the pool is used when linked in, and for single-threaded applications no mutual exclusion is done and they work just fine nevertheless.
Too wacky? Maybe piggy-backing onto Boost.Thread without weak reference tricks might also be something to consider?
I didn't really understand the trick, but it seems odd to use atomic<> in single-threaded apps.

Andrey Semashev wrote:
1. If boost::atomic<> doesn't support multi-module applications, that is, it silently works incorrectly in such environments, should it really support these platforms?
"Doesn't support multi-module applications" is not the same as not supporting sharing an atomic between DLLs. Many multi-module applications will never need to do that. I can't think of an example that isn't either contrived or a bad idea.
2. If (1) is true, then boost::atomic<>'s usefulness is greatly reduced.
I wouldn't say so.

On Tuesday, November 08, 2011 13:19:22 Peter Dimov wrote:
Andrey Semashev wrote:
1. If boost::atomic<> doesn't support multi-module applications, that is, it silently works incorrectly in such environments, should it really support these platforms?
"Doesn't support multi-module applications" is not the same as not supporting sharing an atomic between DLLs. Many multi-module applications will never need to do that. I can't think of an example that isn't either contrived or a bad idea.
Yes, I meant atomic<> crossing DLL boundaries, of course. However, I don't find such use cases contrived or bad.

Andrey Semashev wrote:
On Tuesday, November 08, 2011 13:19:22 Peter Dimov wrote:
"Doesn't support multi-module applications" is not the same as not supporting sharing an atomic between DLLs. Many multi-module applications will never need to do that. I can't think of an example that isn't either contrived or a bad idea.
Yes, I meant atomic<> crossing DLL boundaries, of course. However, I don't find such use cases contrived or bad.
Can you give an example?

On Tuesday, November 08, 2011 21:16:51 Peter Dimov wrote:
Andrey Semashev wrote:
On Tuesday, November 08, 2011 13:19:22 Peter Dimov wrote:
"Doesn't support multi-module applications" is not the same as not supporting sharing an atomic between DLLs. Many multi-module applications will never need to do that. I can't think of an example that isn't either contrived or a bad idea.
Yes, I meant atomic<> crossing DLL boundaries, of course. However, I don't find such use cases contrived or bad.
Can you give an example?
Reference counting? A library might return a smart pointer to a structure with a reference counter. Making the reference counting methods out-of-line or virtual may be undesirable, to allow inlining.

Andrey Semashev wrote:
Reference counting? A library might return a smart pointer to a structure with a reference counter. Making the reference counting methods out-of-line or virtual may be undesirable, to allow inlining.
Yep. This is what I meant by "bad". Yes, shared_ptr does it, but it's still a bad idea. There is no reason to inline code that operates on atomics into several separate modules - it just creates the possibility of subtle ODR violations when the inlined code gets out of sync.

On Tuesday, November 08, 2011 21:37:48 Peter Dimov wrote:
Andrey Semashev wrote:
Reference counting? A library might return a smart pointer to a structure with a reference counter. Making the reference counting methods out-of-line or virtual may be undesirable, to allow inlining.
Yep. This is what I meant by "bad". Yes, shared_ptr does it, but it's still a bad idea. There is no reason to inline code that operates on atomics into several separate modules - it just creates the possibility of subtle ODR violations when the inlined code gets out of sync.
Surely there may be caveats, as in just about every design. But in my practice one almost always compiles the application against a fixed version of Boost. It is the other cases I would call contrived and bad.

Andrey Semashev wrote:
Surely there may be caveats, as in just about every design. But in my practice one almost always compiles the application against a fixed version of Boost. It is the other cases I would call contrived and bad.
The version of Boost is not the issue. The code that operates on the atomics is. The normal way is for the module that creates the lock-free data structure to provide out-of-line functions which the other modules would call. Inlining these creates the possibility of the modules getting out of sync with each other. For plugin-type DLL uses, this is basically guaranteed.

On Tuesday, November 08, 2011 22:51:22 Peter Dimov wrote:
The version of Boost is not the issue. The code that operates on the atomics is. The normal way is for the module that creates the lock-free data structure to provide out-of-line functions which the other modules would call. Inlining these creates the possibility of the modules getting out of sync with each other. For plugin-type DLL uses, this is basically guaranteed.
If in your case modules are separately built and there is a possibility of code inconsistencies, then surely inlining is a bad idea. But that doesn't mean that inlining is generally a bad thing when multiple modules are involved. Anyway, it's going slightly off-topic.

Andrey Semashev wrote:
If in your case modules are separately built and there is a possibility of code inconsistencies, then surely inlining is a bad idea. But that doesn't mean that inlining is generally a bad thing when multiple modules are involved.
We're not discussing whether it's generally a bad thing. The specific use case under discussion involves a function that does atomic operations. Inlining is typically done for performance reasons, and the atomic operations typically dominate, rendering the performance gain from the inlining irrelevant. So you're only left with the drawbacks.
Anyway, it's going slightly off-topic.
Not really.

On Monday, November 07, 2011 23:15:43 Peter Dimov wrote:
Helge Bahmann wrote:
yes makes sense -- there was a concern raised by Andrey Semashev that the spinlock pool as implemented and used by shared_ptr presently may fail on Windows due to the pool being non-unique (not had a chance to test this yet), and I have found a way to produce a similar failure using dlopen, atomics private to shared libraries and RTLD_LOCAL -- currently I am therefore leaning on creating a shared library just for the spinlock pool, but since you wrote the initial implementation maybe you could comment as well?
This is a problem in principle, but requiring all users of shared_ptr to link to a shared library is a non-starter. I wouldn't use such a shared_ptr, and I doubt many others will. And I wouldn't be surprised if this sentiment applies to Boost.Atomic as well.
I think support for multi-module applications is a must for both libraries, but Boost.SmartPtr is luckier in that regard since it requires more widespread atomic ops to operate, so the shared spinlock pool is unlikely to become an issue. I see other ways to solve the problem, without dynamic linking. One could store a pointer to the pool in each atomic<> instance, although this would effectively have the same drawbacks as having a spinlock per instance. Or, perhaps, we could initialize a named shared memory segment with the spinlock pool. That would involve dynamic initialization, which would have to be protected against concurrent execution and may potentially fail if system limits are exceeded.

Andrey Semashev wrote:
I think support for multi-module applications is a must for both libraries, but Boost.SmartPtr is luckier in that regard since it requires more widespread atomic ops to operate, so the shared spinlock pool is unlikely to become an issue.
shared_ptr requires CAS, and if you have CAS, you have everything. It only needs to operate on single-word integers, though. Either way, it doesn't matter much, because the "multi-module" platform is Windows, and shared_ptr doesn't use the spinlock pool for the reference count on Windows. It does use it for the atomic access functions, though.
participants (5)
- Andrey Semashev
- Helge Bahmann
- Peter Dimov
- Stewart, Robert
- Tim Blechmann