
hi all,

since my first review request for boost.lockfree [1] was completely ignored, here is a second version for review. the main differences are bug fixes provided by various people trying to run boost.lockfree on different compilers/platforms.

the original review request was:

i would like to submit a small library of lock-free data structures for review. boost.lockfree provides implementations of lock-free data structures. lock-free data structures can be accessed by multiple threads without the necessity of blocking synchronization primitives such as guards. lock-free data structures can be used in real-time systems, where blocking algorithms may lead to high worst-case execution times, to avoid priority inversion, or to increase the scalability of multi-processor machines.

boost.lockfree provides:
* boost::lockfree::fifo, a lock-free fifo queue
* boost::lockfree::stack, a lock-free stack

updated links:

the code is available from the boost vault:
http://www.boostpro.com/vault/index.php?action=downloadfile&filename=boost_lockfree-241109.zip&directory=Concurrent%20Programming&

and from my personal git repository:
git://tim.klingt.org/boost_lockfree.git
http://tim.klingt.org/git?p=boost_lockfree.git

the documentation is available at:
http://tim.klingt.org/boost_lockfree/

it would be nice if someone could add this library to the review schedule or would even be willing to manage a review ...

best, tim

[1] http://article.gmane.org/gmane.comp.lib.boost.devel/192779

On Tue, Nov 24, 2009 at 3:35 AM, Tim Blechmann <tim@klingt.org> wrote:
I love lock-free data structures; hope someone looks at this. I cannot until next Saturday night, family time of the year. >.< Will soon be going through computer withdrawal'ly yours, - OvermindDL1

On Tue, Nov 24, 2009 at 6:59 PM, OvermindDL1 <overminddl1@gmail.com> wrote:
http://tim.klingt.org/boost_lockfree/lockfree/reference.html#lockfree.refere...
Shouldn't ++x return the value *after* incrementing?

    #include <iostream>

    int main(void)
    {
        int x = 3;
        std::cout << ++x << std::endl; // prints "4"
    }

I haven't tested the actual library though, so I hope this is just an error in the docs. Sincerely, AmkG

Tim Blechmann wrote:
Hopefully it will be added to the review queue; you can try to contact the current review wizard otherwise.
I assume your data structures allow multiple concurrent writers. Have you considered benchmarking them against implementations that don't allow concurrent writes, since those are usually simpler? If they're not as fast, it might be interesting to provide both. In any case, I think benchmarks would be required to properly review this library, since it is in such a performance-sensitive domain.

I also see your library provides an "atomic_int" type, which is a subset of what the standard library provides in C++0x. I believe there already are projects for implementing the standard atomic integer types within Boost; does anyone have any info on this?
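For reference, a minimal sketch of what the C++0x atomic interface offers, assuming a conforming <atomic> header; this is plain standard-library usage, not Boost.Lockfree's atomic_int:

    #include <atomic>

    int main()
    {
        std::atomic<int> counter(0);

        counter.fetch_add(1);                  // atomic increment, returns the old value
        counter.store(5);                      // atomic write
        int seen = counter.load();             // atomic read

        int expected = 5;
        counter.compare_exchange_strong(expected, 7); // CAS: 5 -> 7 only if still 5
        (void) seen;
        return 0;
    }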

these data structures are lock-free multi-writer/multi-reader implementations. i suppose the algorithms could be simplified for single writers, but i haven't done so ...
well, one aspect of lock-free data structures is throughput, the other is worst-case execution time ... the implementation currently focuses on worst-case execution time ... e.g. intel's tbb contains a concurrent_queue class, which uses spinlocks ... it probably has a higher throughput, but its api calls are blocking
the whole low-level part of boost.lockfree should be replaced once a c++0x-style atomic library is available ...

cheers, tim

Quoting Tim Blechmann <tim@klingt.org>:
since my first review request for boost.lockfree [1] was completely ignored
I have stated this multiple times: I think a mailing list is the wrong format for Boost; important posts are missed in the noise. Is this still a mailing list for historical reasons only, or was there a decision against a message board? On to your library: I have a few pre-review comments and questions, if you don't mind:
i would like to submit a small library of lock-free data structures for review.
could you post a link that lists known lock-free data structures, so that people not already familiar with them can see how your library compares?
I have only had a look at the documentation so far. There are a few things I found surprising:

- noncopyable: is the only reason for this that containers can't be copied atomically?

- fifo: the STL fifo is called std::queue and has a very different interface than your fifo class. I guess you have to change some things, e.g. std::queue::pop() has a void result, but it could resemble the STL interface, can't it?

- that free_list thing... looks very much like an allocator. I understand that you want to make sure that push/pop doesn't call the default allocator (-> mutex lock), but couldn't you make lock-free containers accept stateful allocators and provide a default allocator that keeps freed objects in a free_list? Am I missing a reason why you need a separate allocation interface? I haven't found the concept of the free_list_t template parameter in the documentation; can it do anything beyond what the interface of a std::allocator provides?

Thanks for posting this, very interesting.

typically with lock-free structures it is hard to truly free anything, because some thread may still be lingering on the node you are freeing. So you need a free list that may recycle but never actually frees. IIRC that is the case with this library, unless it has changed since I first saw it. Tony

well, there are bigger projects, like the linux kernel, which use mailing lists quite successfully ... imo, nothing is wrong with mailing lists, but the boost lists are not read well enough by the boost devs :/
something like these?
http://en.wikipedia.org/wiki/Non-blocking_synchronization
http://www.audiomulch.com/~rossb/code/lockfree
yes ... in theory it would be possible to copy a stack or fifo (possibly the freelist code would need some changes), but i decided to make them noncopyable, to avoid people using and copying the stack/fifo at the same time ... during a copy constructor, no enqueue/dequeue would be allowed
i am open to naming suggestions ... push/pop or enqueue/dequeue, fifo or queue, all of this is open for discussion. the interface itself should probably stay as it is, since the stl interface doesn't really match the interface of lock-free data structures
these stack/fifo implementations use a freelist to avoid returning the internal memory to the os. this is required since boost.lockfree doesn't use any safe memory reclamation algorithm such as hazard pointers or pass-the-buck. so freelists are not really a replacement for allocators, but rather a wrapper ...

there are two freelist implementations provided, static_freelist and caching_freelist (names are open for discussion). static_freelist allocates a number of nodes from the allocator up front, and `enqueue' only hits the freelist, not the allocator (to avoid blocking in the allocator code), while caching_freelist may allocate from the allocator if no node is available from the freelist. the static_freelist is especially useful for real-time systems.

using a stateful allocator doesn't improve the situation, since two threads may concurrently call enqueue, so some synchronization mechanism for the allocator would still be required.

cheers, tim
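To make the caching-freelist idea above concrete, here is a minimal, hypothetical sketch (not the library's actual code) of a lock-free freelist that recycles nodes instead of returning them to the OS. It assumes a C++0x-style <atomic> header and glosses over the ABA problem, which a real implementation would have to solve with a tag/counter on the head pointer:

    #include <atomic>   // assumed C++0x/C++11 atomic operations
    #include <cstdlib>

    struct node
    {
        node * next;
        // ... payload ...
    };

    class caching_freelist_sketch
    {
        std::atomic<node*> pool_;   // Treiber-stack of recycled nodes

    public:
        caching_freelist_sketch() : pool_(0) {}

        node * allocate()
        {
            node * head = pool_.load();
            // try to pop a recycled node; a production version needs an ABA
            // tag here, otherwise a concurrent pop/push can corrupt the list
            while (head && !pool_.compare_exchange_weak(head, head->next))
                /* retry with the updated head */;
            if (head)
                return head;                                        // recycled node
            return static_cast<node*>(std::malloc(sizeof(node)));   // may block
        }

        void deallocate(node * n)   // never returns memory to the OS
        {
            node * head = pool_.load();
            do {
                n->next = head;
            } while (!pool_.compare_exchange_weak(head, n));
        }
    };

In this scheme a static_freelist would simply fill the pool in its constructor and return NULL instead of falling back to malloc when the pool is empty.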

On Tuesday, November 24, 2009, <strasser@uni-bremen.de> wrote:
Speaking for myself, nobody has ever explained to me why a message board would be significantly different, so I don't see any upside in switching.

-- Dave Abrahams, BoostPro Computing, http://www.boostpro.com

On Friday 27 November 2009 06:35:48, Dave Abrahams wrote:
Most importantly, visibility of important threads over a longer period of time. There would be a subforum "Proposed libraries", with threads with new messages at the top. I understand that there is no difference from a technical viewpoint and that some mail readers can display a mailing list exactly like that, but I don't think a proposal like lockfree would have been forgotten on a board. You rarely see replies to threads more than two weeks old on this mailing list, so people either read it on a day-to-day basis or not at all.

strasser@uni-bremen.de wrote:
[off-topic] I don't know about any specific decisions, but message boards (internet forums) are often terrible for following threads of conversation (at least technical ones). Using the mailing list directly might not be the best alternative either, but accessing mailing lists over NNTP using Gmane works really, really well for me. HTH / Johan

Tim Blechmann wrote:
<snip> Sorry for doing it in stages, but reviewing lock-free algorithms is not easy. I just finished reviewing your fifo's "enqueue" function.

First, I've noticed that to alleviate the ABA problem on platforms where sizeof(void*) is 64 bits, you're using "packed" pointers, i.e. using some of the bits of the pointer returned by malloc to store the ABA tag, with the assumption that those bits are always 0. It is not safe to do this with every libc implementation - some libc implementations meddle with those bits themselves, and if you change them, then malloc/free will not function properly.

Let me write your code down here so that I can enumerate the lines:

    bool enqueue(T const & t)
    {
        node * n = alloc_node(t);
        if (n == NULL)
            return false;
        //Write barrier should be placed here
        for (;;)
        {
            atomic_node_ptr tail (tail_);            //1
            read_memory_barrier();                   //2
            atomic_node_ptr next (tail->next);       //3
            if (likely(tail == tail_))               //4
            {
                if (next.get_ptr() == 0)             //5
                {
                    if ( tail->next.cas(next, n) )   //6
                    {
                        tail_.cas(tail, n);          //7
                        return true;                 //8
                    }
                }
                else
                    tail_.cas(tail, next.get_ptr()); //9
            }
        }
    }

First, you're missing a write barrier after the call to alloc_node(). That call writes "NULL" to the node's next pointer, but that write may not happen until after you've inserted your node, which is too late. Second, I think you must declare head_ and tail_ as volatiles. Also, I believe the next pointer should point to a volatile object, not a regular object. I've implemented the Michael-Scott queue myself and remember that getting the volatiles and barriers right was the toughest part.

I'll look into dequeue later.

HTH, Andy.
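As an aside, for readers unfamiliar with the packed-pointer technique Andy refers to, here is a generic, hypothetical illustration (not the library's actual tagged pointer) assuming an x86-64-style address space where only the low 48 bits of a user-space pointer are significant:

    #include <cstdint>

    // Packs a 16-bit ABA tag into the otherwise-unused high bits of a 64-bit
    // pointer. This only works if those bits really are zero, which is exactly
    // the assumption Andy is questioning.
    template <typename T>
    class packed_ptr_sketch
    {
        std::uint64_t value_;   // [16-bit tag | 48-bit pointer]

        static const unsigned tag_shift = 48;
        static const std::uint64_t ptr_mask =
            (std::uint64_t(1) << tag_shift) - 1;

    public:
        packed_ptr_sketch(T * ptr, std::uint16_t tag)
            : value_((std::uint64_t(tag) << tag_shift) |
                     (reinterpret_cast<std::uint64_t>(ptr) & ptr_mask))
        {}

        T * get_ptr() const
        {
            return reinterpret_cast<T *>(value_ & ptr_mask);
        }

        std::uint16_t get_tag() const
        {
            return static_cast<std::uint16_t>(value_ >> tag_shift);
        }
    };

The whole 64-bit word can then be updated with a single CAS, and the tag is incremented on every modification so that a node which is popped and later pushed back at the same address no longer compares equal to a stale snapshot.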

Tim Blechmann wrote:
I have been using this library for the past three weeks and have found it incredibly useful. Thank you for your excellent work, Tim! I have a pre-review question though. Is it possible to decouple the containers from the predefined free lists, so that a user may use his own free list type? I have a use case which requires various container instances to share a single freelist instance, and I would like to implement this via my own free list type. -Christopher

i do understand the need of sharing freelists between different stacks/fifos, however i am not sure what the best way to implement it is. currently, the freelist implementation can be selected via a template argument using a tag class. i chose this design to hide the actual implementation of the freelist from the user, so that no user would get the idea to implement his own freelist which returns nodes to the os.

otoh, it would be possible to publish the freelist interface so that people can implement their own freelist types. with a more generic interface, the freelists may be adapted to use array indices instead of pointers, which would help with the pointer compression problem pointed out by andrew venikov.

i would be curious to hear what other people think about the freelist implementation, whether the interface should be public or not, and how freelists could be shared.

thanks, tim

hi christopher,
i reworked the freelist implementations, changing the api to:

    template <typename T,
              bool fixed_size = false,
              typename Alloc = std::allocator<T> >
    class internal_freelist;

the fixed_size template argument specifies whether `allocate' may hit the os memory allocator when the freelist is empty. if it is set to `true', all memory needs to be allocated in the constructor, and calling the default constructor is impossible (it raises a static assertion failure).

the fifo class has a freelist_t template argument for specifying the freelist implementation:

    template <typename T,
              typename freelist_t = internal_freelist<T, false, std::allocator<T> > >
    class fifo;

as a helper, a small wrapper class exists for supporting external (or shared) freelists:

    template <typename FreelistType>
    class external_freelist;

with this api, you can use a shared freelist like:

    typedef fifo<int>::freelist freelist;                  // freelist type
    freelist fl(128);                                      // shared
    typedef external_freelist<freelist> freelist_wrapper;

    // two fifos, sharing the same freelist:
    fifo<int, freelist_wrapper> f1(freelist_wrapper(fl));
    fifo<int, freelist_wrapper> f2(freelist_wrapper(fl));

it is also possible to get a handle to the internal freelist of a fifo:

    fifo<int> f(64);
    typedef fifo<int>::freelist internal_freelist;
    typedef external_freelist<internal_freelist> freelist_wrapper;

    // use the freelist of f for f2
    internal_freelist & fl (f.get_freelist());
    fifo<int, freelist_wrapper> f2(freelist_wrapper(fl));

the code is available from my git repository [1], branch topic/freelist_tweaks. christopher (and others), what do you think about this approach?

thanks, tim

[1] http://tim.klingt.org/git?p=boost_lockfree.git

Tim Blechmann wrote: [snip]
That's exactly what I need. Many thanks for your work, Tim! Just one comment: in the default ctor of the stack you pass 128 to the ctor of the freelist. I think you should rather call the default ctor. If the user really wants a static freelist, the static size should be stated explicitly. For non-static freelists there is, in my opinion, no need to pre-allocate nodes at all. -Christopher
participants (12):
- Alan Manuel Gloria
- Andrew Venikov
- Christopher Schmidt
- Dave Abrahams
- Gottlob Frege
- Johan Nilsson
- Mathias Gaunard
- OvermindDL1
- Ronald Garcia
- Stefan Strasser
- strasser@uni-bremen.de
- Tim Blechmann