[sync] [thread] Faster mutex

Hi,

One of the Mesa developers implemented a faster mutex: http://lists.freedesktop.org/archives/mesa-dev/2015-January/075260.html

Compared with glibc's pthread mutex (judging by the glibc sources), Mesa's mutex is much smaller (a single uint32_t vs. 6 ints + 2 pointers) and performs fewer actions before and after the actual syscall (no getting and setting of mutex->__data.__owner, no type field checks).

This gives a ~10% speedup.

Just curious, maybe ideas from Mesa's mutex could be useful for some of the Boost libraries.

--
Best regards, Antony Polukhin

On 30 Jan 2015 at 18:09, Antony Polukhin wrote:
> Just curious, maybe ideas from Mesa's mutex could be useful for some of the Boost libraries.

I believe Vicente's current plan is that v5 Boost.Thread will extend the C++11 STL. Therefore boost::mutex would be implemented using std::mutex in v5, with the extensions Boost.Thread provides over the STL.

Besides, last time I looked, any STL mutex was already implemented using an atomic fast path and lazy allocation of the kernel wait object after a spin. The Win32 critical section has had that design since year dot.

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/

>> Just curious, maybe ideas from Mesa's mutex could be useful for some of the Boost libraries.
>
> I believe Vicente's current plan is that v5 Boost.Thread will extend the C++11 STL. Therefore boost::mutex would be implemented using std::mutex in v5, with the extensions Boost.Thread provides over the STL.
>
> Besides, last time I looked, any STL mutex was already implemented using an atomic fast path and lazy allocation of the kernel wait object after a spin. The Win32 critical section has had that design since year dot.

Hmm, last time I checked, libc++ and libstdc++ both used pthread_mutex_t. So I'd expect futex-based mutexes to perform better, as the pthreads layer is not required and some pthreads functionality (pthread_mutexattr_t) won't get into the code paths. So futex-based mutexes would actually be quite reasonable ...

tim
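
For illustration, the futex-based approach discussed here might look roughly like the sketch below. It is a minimal, Linux-only version of the commonly described three-state futex lock (0 = unlocked, 1 = locked, 2 = locked with possible sleepers). It is not Mesa's code and not anything from Boost; the names futex_mutex and futex_counter_demo are made up for the example, and it assumes std::atomic<std::uint32_t> has the same layout as a plain uint32_t.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

// Hypothetical one-word mutex. State values:
//   0 = unlocked, 1 = locked, 2 = locked and threads may be sleeping.
class futex_mutex {
    std::atomic<std::uint32_t> state_{0};

    // Assumption: the atomic wrapper adds no padding, so its address can be
    // handed to the kernel as a 32-bit futex word.
    static_assert(sizeof(std::atomic<std::uint32_t>) == sizeof(std::uint32_t),
                  "futex word must be exactly 32 bits");

    long futex(int op, std::uint32_t val) {
        return syscall(SYS_futex, reinterpret_cast<std::uint32_t*>(&state_),
                       op, val, nullptr, nullptr, 0);
    }

public:
    void lock() {
        std::uint32_t c = 0;
        // Fast path: uncontended acquire is one CAS, no syscall.
        if (state_.compare_exchange_strong(c, 1, std::memory_order_acquire))
            return;
        // Slow path: mark the lock as contended, then sleep until it is free.
        if (c != 2)
            c = state_.exchange(2, std::memory_order_acquire);
        while (c != 0) {
            // Sleeps only if the word still holds 2; otherwise returns at once.
            futex(FUTEX_WAIT_PRIVATE, 2);
            c = state_.exchange(2, std::memory_order_acquire);
        }
    }

    bool try_lock() {
        std::uint32_t c = 0;
        return state_.compare_exchange_strong(c, 1, std::memory_order_acquire);
    }

    void unlock() {
        // Only enter the kernel if somebody may be sleeping.
        if (state_.exchange(0, std::memory_order_release) == 2)
            futex(FUTEX_WAKE_PRIVATE, 1);
    }
};

// Demo helper: several threads increment a plain counter under the lock.
unsigned long futex_counter_demo(int threads, int iters) {
    assert(threads > 0 && iters >= 0);
    futex_mutex m;
    unsigned long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                m.lock();
                ++counter;
                m.unlock();
            }
        });
    for (auto& th : pool) th.join();
    return counter;
}
```

Note that there is no owner field and no type check on either path, which is the kind of saving the original post describes. The trade-off is that error checking, recursion and pthread_mutexattr_t-style options are simply absent.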

On Friday 30 January 2015 18:09:40 Antony Polukhin wrote:
> Hi,
>
> One of the Mesa developers implemented a faster mutex: http://lists.freedesktop.org/archives/mesa-dev/2015-January/075260.html
>
> According to the sources of glibc, Mesa's mutex has a much smaller size (only 1 uint32_t vs 6 ints + 2 pointers), performs fewer actions before and after the actual syscall (no getting and setting of the mutex->__data.__owner, no type field checks).
>
> This gives a ~10% speedup.
>
> Just curious, maybe ideas from Mesa's mutex could be useful for some of the Boost libraries.

I do have plans to implement a futex-based mutex and condition variable, but it's not my top priority. I think the first release of Boost.Sync will be based on pthread. Also, I've not decided whether the futex-based implementation should be provided as an alternative, since in some situations access to the underlying primitives such as pthread_mutex_t may be needed.
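
As an illustration of what "access to the underlying primitives" can mean in practice, here is a small sketch using std::mutex::native_handle(). It assumes a POSIX platform where native_handle_type is pthread_mutex_t* (which, to my knowledge, is the case for libstdc++ and libc++, but the standard leaves it implementation-defined). The function name lock_through_native_handle is hypothetical. A purely futex-based mutex would have no such handle to offer.

```cpp
#include <cassert>
#include <mutex>
#include <pthread.h>
#include <type_traits>

// Assumption: this standard library wraps a pthread mutex.
static_assert(std::is_same<std::mutex::native_handle_type,
                           pthread_mutex_t*>::value,
              "this sketch assumes std::mutex wraps pthread_mutex_t");

// Locks and unlocks a std::mutex through its raw pthread handle, then
// checks that the C++ wrapper sees the mutex as free again.
bool lock_through_native_handle() {
    std::mutex m;
    pthread_mutex_t* raw = m.native_handle();

    int rc = pthread_mutex_lock(raw);    // acquire via the pthread API
    assert(rc == 0);
    rc = pthread_mutex_unlock(raw);      // release via the pthread API
    assert(rc == 0);
    (void)rc;                            // silence unused warning under NDEBUG

    if (!m.try_lock())                   // wrapper and handle share one state
        return false;
    m.unlock();
    return true;
}
```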

On 2 Feb 2015 at 11:54, Andrey Semashev wrote:
>> According to the sources of glibc, Mesa's mutex has a much smaller size (only 1 uint32_t vs 6 ints + 2 pointers), performs fewer actions before and after the actual syscall (no getting and setting of the mutex->__data.__owner, no type field checks).
>>
>> This gives a ~10% speedup.
>>
>> Just curious, maybe ideas from Mesa's mutex could be useful for some of the Boost libraries.
>
> I do have plans to implement a futex-based mutex and condition variable, but it's not my top priority. I think the first release of Boost.Sync will be based on pthread. Also, I've not decided whether the futex-based implementation should be provided as an alternative, since in some situations access to the underlying primitives such as pthread_mutex_t may be needed.

I also think any futex-based design will always have fairness problems. FreeBSD doesn't use futexes at all for this reason. In my C11 permit object, I use atomic increment gates to enforce fairness, but, as with BSD's threading primitives, they are orders of magnitude slower under mild contention than a futex, which is essentially a multi-state CAS lock.

I might add that I recently installed PC-BSD 10.1, which is the very latest, and was quite surprised at how snappy it feels compared to Linux and Windows 8.1 on the same hardware. Laptop battery life is now within 30 minutes of Windows too, as BSD can now power manage, and all the hardware except my wifi (a very recent Intel 7260) works straight out of the box. FreeBSD has made enormous strides recently as a desktop OS. If we could just get suspend and resume for those with Radeon video cards, I think it would become my primary dev OS (the BSD Radeon kernel driver is the Linux Radeon driver running under a Linux kernel API emulation layer, and that emulation layer doesn't support suspend/resume yet). Right now it's too annoying to lose my present state of work at the end of each work day :(

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/

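
To make the fairness point concrete, here is a minimal sketch of a classic ticket lock, one well-known way of using atomic increments to serve waiters strictly in arrival order. It is only an illustration of the general idea and is not the permit object mentioned above, whose design is not shown in this thread. The names ticket_lock and ticket_counter_demo are made up for the example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical FIFO-fair lock: each thread takes a ticket and waits
// until the "now serving" counter reaches its number.
class ticket_lock {
    std::atomic<unsigned> next_ticket_{0};
    std::atomic<unsigned> now_serving_{0};

public:
    void lock() {
        // The atomic increment fixes this thread's place in the queue.
        const unsigned mine =
            next_ticket_.fetch_add(1, std::memory_order_relaxed);
        // Wait for our turn; yielding keeps the sketch simple.
        while (now_serving_.load(std::memory_order_acquire) != mine)
            std::this_thread::yield();
    }

    void unlock() {
        // Only the lock holder writes now_serving_, so load + store suffices.
        now_serving_.store(
            now_serving_.load(std::memory_order_relaxed) + 1,
            std::memory_order_release);
    }
};

// Demo helper: several threads increment a plain counter under the lock.
unsigned long ticket_counter_demo(int threads, int iters) {
    assert(threads > 0 && iters >= 0);
    ticket_lock m;
    unsigned long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                m.lock();
                ++counter;
                m.unlock();
            }
        });
    for (auto& th : pool) th.join();
    return counter;
}
```

Unlike the CAS-style lock, a thread that releases the lock here cannot immediately grab it again ahead of earlier waiters. The cost is that every handover must go to one specific thread, which tends to hurt when that thread is not currently running.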
participants (4)
- Andrey Semashev
- Antony Polukhin
- Niall Douglas
- Tim Blechmann