Re: [Boost-users] [atomic] ThreadSanitizer reports a data race when using the code from "reference counting" example
On 01.07.2014 10:49, Andrey Semashev wrote:
On Tue, Jul 1, 2014 at 12:41 PM, Adam Romanek wrote:
On 01.07.2014 10:36, Andrey Semashev wrote:
On Tue, Jul 1, 2014 at 12:20 PM, Andrey Semashev wrote:
On Tue, Jul 1, 2014 at 11:20 AM, Adam Romanek wrote:
Andrey,
If you don't mind, please have a look at the conversation that I started on boost-users list [1] a few days back. Since you're the maintainer of Boost.Atomic I'd like to know your opinion on this issue. Or maybe I should repost this on boost-dev list?
I don't follow users list, mostly because I can't even keep up with the developers list. So it's better to reach me on the dev list.
The code in the StackOverflow topic indeed contains a potential problem. The fetch_sub(release) operation prevents previous operations from sinking past it and atomic_thread_fence(acquire) prevents the following operations from floating above it. However, these two barriers themselves can be reordered creating a window where operations before fetch_sub(release) and after atomic_thread_fence(acquire) may overlap. I believe, this is permitted on both the compiler and hardware levels. This doesn't mean that's what is actually happening and being flagged by TSan; it is still possible that some TSan bug is exposed. The fact that replacing atomic_thread_fence(acquire) with fetch_add(acquire) speaks in favor of the bug suspicion.
Actually, no, fetch_add(acquire) is significantly different from the fence. It operates on the same object as the previous fetch_sub(release), so it must observe its effects and cannot be reordered.
After reading your first response I thought you're suggesting that the "reference counting" example in Boost.Atomic docs is broken. But hopefully we agree that it's fine, right? :)
The example in the docs is indeed not correct. I believe, it should be corrected to use acq_rel memory order in intrusive_ptr_release. The fence can be removed then. I'll fix it as soon as I have some time.
So I guess this is a false positive from TSan, as suggested by others on SO. I'll ask this on TSan's mailing list, lets see what they say.
Ok, let me know if there are other considerations from TSan devs.
See above for my private conversation with Andrey Semashev on this topic (he does not follow the boost-users mailing list). Andrey says that the "reference counting example" from Boost.Atomic docs is broken. What do you think about this? WBR, Adam Romanek
On 1 Jul 2014 at 13:34, Adam Romanek wrote:
See above for my private conversation with Andrey Semashev on this topic (he does not follow the boost-users mailing list).
Andrey says that the "reference counting example" from Boost.Atomic docs is broken. What do you think about this?
I wouldn't be surprised. Few have access to non-Intel hardware for testing, and all Intel CPUs always acquire loads and always release stores. That makes use of atomics with anything but memory_order_seq_cst superfluous on Intel at the CPU level. Additionally, modern CPUs do very well with memory_order_seq_cst across the board, even an ARM Cortex A15. If I ever find myself tempted to not use memory_order_seq_cst before benchmarking I usually slap myself hard. The only real problem with its use is that compilers basically turn off optimisation around them, so you'd rather not have them in tight loops if possible. The ONLY thing which might penalise this approach in the future is memory transactions - I can see memory_order_seq_cst would have to abort more frequently than other orderings. But until such hardware capability is here, I really wouldn't worry. If you see problem code not using memory_order_seq_cst, change it to that and bypass the hassle. Niall -- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/
On 01.07.2014 14:15, Niall Douglas wrote:
On 1 Jul 2014 at 13:34, Adam Romanek wrote:
See above for my private conversation with Andrey Semashev on this topic (he does not follow the boost-users mailing list).
Andrey says that the "reference counting example" from Boost.Atomic docs is broken. What do you think about this?
I wouldn't be surprised. Few have access to non-Intel hardware for testing, and all Intel CPUs always acquire loads and always release stores. That makes use of atomics with anything but memory_order_seq_cst superfluous on Intel at the CPU level.
Actually Andrey analyzed this example once again and he said he was wrong. Let me quote him: "After studying the standard it looks like I was wrong after all. The paragraph 1.9/14 gives the guarantee that the compiler does not reorder fetch_sub and the fence, so the example in the docs is correct." I also got confirmation from a ThreadSanitizer developer that this is a false positive as "tsan ignores stand-alone memory fences ATM" [1]. WBR, Adam Romanek [1] https://groups.google.com/d/msg/thread-sanitizer/dZ3QiQE49ns/j7rKGHg08foJ
participants (2)
-
Adam Romanek
-
Niall Douglas