Fwd: [sync?] "Distributed" value (or something...)

---------- Forwarded message ----------
From: Klaim - Joël Lamotte <mjklaim@gmail.com>
Date: Tue, Jul 29, 2014 at 9:44 PM
Subject: [boost][sync?] "Distributed" value (or something...)
To: Boost users list <boost-users@lists.boost.org>

I would like to ask Boost developers and users:

1. Am I the only one seeing the following pattern in my concurrent systems code?
2. Are there already tools available in Boost that do what I describe here and that I am not aware of?
3. If not, would there be any interest in the tool I'm describing here, or not? Is it a dumb idea? (I keep getting back to this idea, but I'm surprised not to find anything similar online.)

The pattern that I am seeing occurs when:

1. I have a value, or a set of values in one or several objects, whose value(s) describe some kind of configuration that several other systems will read and use.
2. The object(s) will be updated only by one thread (at a time).
3. When the object's value is modified, all the systems that rely on the object's value have to be notified so that they update themselves using that value.
4. The systems I am referring to can use different threads for updating, and we can assume that they are running concurrently.
5. It is not important that the source object's value is synced with the readers' values at all times; however, it is important that the sequence of changes to the source object's value is signaled to the readers in exactly the same order and "as fast as possible", but delay is OK.

The simplest solution is to have a mutex locking the value and have the systems check the value regularly (that is, using synchronized_value for example). However, the locking and looping reads might not be necessary if it is done in another way (if I am correct; I still lack experience in this domain).

The solution that I am thinking about would look like this:

    /////////
    // The object containing the value used by the other systems.
    class System_A
    {
        distributed_source<int> m_foo_count; // changes to this object will be sent to "readers"...
    public:
        System_A() : m_foo_count(42) ... {}
        //...

        // thread-safe read-only access
        distributed_value<int> make_foo_count_reader() const
        {
            return distributed_value<int>{ m_foo_count };
        }
    };

    /////////
    // The different systems acquire a read access to the value.
    class System_B
    {
        distributed_value<int> m_foo_count; // read-only value reflecting System_A::m_foo_count's value since the last update.
    public:
        System_B( const System_A& system_a, ... )
            : m_foo_count( system_a.make_foo_count_reader() )
        {
            // setup the initial config...
            on_foo_count_changed( *m_foo_count );

            // trigger some processing when System_A::m_foo_count's value has been changed:
            m_foo_count.on_changed( [this]( int new_value ){ // called after the value has been changed
                m_work_queue.push( [=]{ // will be executed on the next update cycle for this system...
                    on_foo_count_changed( new_value );
                });
            });
        }
        //...
    private:
        void update_cycle( const UpdateInfo& info )
        {
            // ...
            // the following is just to clarify:
            const int foo_before = *m_foo_count; // the value is not updated yet, but we can still use it
            m_work_queue.execute();
            const int foo_after = *m_foo_count; // the value might have been updated
            if( foo_before != foo_after )
            {
                // the value has been updated (though you don't need to do this if you used the callback instead).
            }
            //...
        }
    };

    ////////
    // At some point the value is changed by the system owning the source object...
    void System_A::change_foo( int new_value )
    {
        m_work_queue.push( [=]{ // will be executed on the next update cycle of this system...
            // ...
            *m_foo_count = new_value; // assign the new value, then send the new value to the readers...
        });
    }

    void System_A::something_complex( std::function<void(int&)> some_work )
    {
        m_work_queue.push( [=]{
            // ...
            m_foo_count.update( some_work ); // apply some_work to the current value, then send the resulting value to the readers
        });
    }

Some notes:

a. The names are just invented while writing this email.
b. I suppose that distributed_source might have an interface similar to synchronized_value; anyway, I didn't get farther than this idea.
c. I see several different ways to implement this idea; some might be more efficient than others, but I haven't done any tests yet (for lack of time).

Thoughts?
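P.S. To make the idea a bit more concrete, here is a rough, completely untested sketch of one possible implementation of distributed_source/distributed_value: just a mutex-protected value plus a list of callbacks invoked in the writer's thread, with the reader keeping a local copy that it refreshes from its own thread. I only sketched a set() member on the source; operator* and update() would be thin wrappers around it, and all names are still just invented.

    #include <functional>
    #include <memory>
    #include <mutex>
    #include <utility>
    #include <vector>

    template<class T>
    class distributed_source
    {
    public:
        struct shared_state
        {
            std::mutex mutex;
            T value;
            std::vector< std::function<void(T)> > callbacks;
        };

        explicit distributed_source( T initial )
            : m_state( std::make_shared<shared_state>() )
        {
            m_state->value = std::move(initial);
        }

        // Writer side (single writer assumed): store the new value,
        // then notify the registered readers in the writer's thread.
        void set( T new_value )
        {
            std::vector< std::function<void(T)> > callbacks;
            {
                std::lock_guard<std::mutex> lock( m_state->mutex );
                m_state->value = new_value;
                callbacks = m_state->callbacks; // copy so we don't invoke them under the lock
            }
            for( auto& callback : callbacks )
                callback( new_value );
        }

        std::shared_ptr<shared_state> state() const { return m_state; }

    private:
        std::shared_ptr<shared_state> m_state;
    };

    template<class T>
    class distributed_value
    {
    public:
        explicit distributed_value( const distributed_source<T>& source )
            : m_state( source.state() )
        {
            std::lock_guard<std::mutex> lock( m_state->mutex );
            m_cached = m_state->value;
        }

        // Local copy; meant to be used only from the reader's own thread.
        const T& operator*() const { return m_cached; }

        // Refresh the local copy; meant to be called from the reader's own thread,
        // e.g. from a task posted to its work queue by the on_changed callback.
        void refresh()
        {
            std::lock_guard<std::mutex> lock( m_state->mutex );
            m_cached = m_state->value;
        }

        // Register a callback invoked (in the writer's thread) after each change;
        // the callback is expected to marshal the real work to the reader's thread.
        void on_changed( std::function<void(T)> callback )
        {
            std::lock_guard<std::mutex> lock( m_state->mutex );
            m_state->callbacks.push_back( std::move(callback) );
        }

    private:
        std::shared_ptr< typename distributed_source<T>::shared_state > m_state;
        T m_cached;
    };

In the System_B example above, the on_changed callback would then post both m_foo_count.refresh() and on_foo_count_changed(new_value) to the reader's work queue.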

On Wed, Jul 30, 2014 at 1:51 AM, Klaim - Joël Lamotte <mjklaim@gmail.com> wrote:
I would like to ask Boost developers and users:
1. Am I the only one seeing the following pattern in my concurrent systems code?
2. Are there already tools available in Boost that do what I describe here and that I am not aware of?
3. If not, would there be any interest in the tool I'm describing here, or not? Is it a dumb idea? (I keep getting back to this idea, but I'm surprised not to find anything similar online.)
The pattern that I am seeing occurs when:
1. I have a value, or a set of values in one or several objects, whose value(s) describe some kind of configuration that several other systems will read and use.
2. The object(s) will be updated only by one thread (at a time).
3. When the object's value is modified, all the systems that rely on the object's value have to be notified so that they update themselves using that value.
4. The systems I am referring to can use different threads for updating, and we can assume that they are running concurrently.
5. It is not important that the source object's value is synced with the readers' values at all times; however, it is important that the sequence of changes to the source object's value is signaled to the readers in exactly the same order and "as fast as possible", but delay is OK.
I'm not sure I fully understood the idea, but is it implementable in an RCU manner? I think you can have a mutex-protected shared_ptr to the value, which every reader can obtain and use. When the reader is not using the value it demotes it to weak_ptr. Next time the reader wants to use the object it attempts to promote it to shared_ptr, which will only succeed if the value has expired. This approach does not involve any callbacks, which might be a good or bad thing depending on the use case. In any case, I think this would be a fairly high-level primitive (especially if async callbacks are involved), so I'm not sure Boost.Sync is the right place for it.

On Wed, Jul 30, 2014 at 12:45 PM, Andrey Semashev <andrey.semashev@gmail.com> wrote:
On Wed, Jul 30, 2014 at 1:51 AM, Klaim - Joël Lamotte <mjklaim@gmail.com> wrote:
I would like to ask Boost developers and users:
1. Am I the only one seeing the following pattern in my concurrent systems code?
2. Are there already tools available in Boost that do what I describe here and that I am not aware of?
3. If not, would there be any interest in the tool I'm describing here, or not? Is it a dumb idea? (I keep getting back to this idea, but I'm surprised not to find anything similar online.)
The pattern that I am seeing occurs when:
1. I have a value, or a set of values in one or several objects, whose value(s) describe some kind of configuration that several other systems will read and use.
2. The object(s) will be updated only by one thread (at a time).
3. When the object's value is modified, all the systems that rely on the object's value have to be notified so that they update themselves using that value.
4. The systems I am referring to can use different threads for updating, and we can assume that they are running concurrently.
5. It is not important that the source object's value is synced with the readers' values at all times; however, it is important that the sequence of changes to the source object's value is signaled to the readers in exactly the same order and "as fast as possible", but delay is OK.
I'm not sure I fully understood the idea, but is it implementable in an RCU manner? I think you can have a mutex-protected shared_ptr to the value, which every reader can obtain and use. When the reader is not using the value it demotes it to weak_ptr. Next time the reader wants to use the object it attempts to promote it to shared_ptr, which will only succeed if the value has expired.
Read that as "has not expired".
This approach does not involve any callbacks, which might be a good or bad thing depending on the use case. In any case, I think this would be a fairly high level primitive (especially if async callbacks are involved), so I'm not sure Boost.Sync is the right place for it.

On Wed, Jul 30, 2014 at 10:45 AM, Andrey Semashev <andrey.semashev@gmail.com> wrote:
I'm not sure I fully understood the idea, but is it implementable in an RCU manner?
Thanks for the keyword, I think it's indeed related but I wasn't aware of the idea. I'll have to read more (http://en.wikipedia.org/wiki/Read-copy-update) to be sure.
I think you can have a mutex-protected shared_ptr to the value, which every reader can obtain and use. When the reader is not using the value it demotes it to weak_ptr. Next time the reader wants to use the object it attempts to promote it to shared_ptr, which will only succeed if the value has not expired.
That's close to the implementation I was thinking about, but I would also like to avoid any lock if possible. (I lack practice in this domain to say whether it's avoidable.)
This approach does not involve any callbacks, which might be a good or bad thing depending on the use case.
I think that in my case the value changes rarely but each change has a big impact, so I'm assuming it's more efficient to have a callback called once the value is written, so that the systems don't have to poll the value regularly. I might be wrong though; my application is not complete yet.
In any case, I think this would be a fairly high level primitive (especially if async callbacks are involved), so I'm not sure Boost.Sync is the right place for it.
I was assuming it would be close to synchronized_value but it's a bit more high-level indeed.

On Wed, Jul 30, 2014 at 12:58 PM, Klaim - Joël Lamotte <mjklaim@gmail.com> wrote:
On Wed, Jul 30, 2014 at 10:45 AM, Andrey Semashev <andrey.semashev@gmail.com> wrote:
I think you can have a mutex-protected shared_ptr to the value, which every reader can obtain and use. When the reader is not using the value it demotes it to weak_ptr. Next time the reader wants to use the object it attempts to promote it to shared_ptr, which will only succeed if the value has not expired.
That's close to the implementation I was thinking about, but I would also like to avoid any lock if possible. (I lack practice in this domain to say whether it's avoidable.)
To be clear, that mutex would have to be locked only when the value is changed (first by the writer to update the shared_ptr, then by every reader when it discovers that its weak_ptr has expired), or the first time, when the value and readers are initialized. While the value is stable, readers don't lock the mutex and only work with their weak_ptrs/shared_ptrs. If updates are rare, this can be a win.
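Something like this rough, untested sketch (all names are invented here); load() takes the lock only when the reader's cached weak_ptr has expired, i.e. when the value has actually been replaced:

    #include <memory>
    #include <mutex>
    #include <utility>

    template< class T >
    class rcu_like_value
    {
    public:
        explicit rcu_like_value( T initial ) :
            m_current( std::make_shared< const T >( std::move(initial) ) )
        {
        }

        // Writer side: publish a new value. The old object is destroyed as
        // soon as the last reader releases its shared_ptr to it, which is
        // what eventually makes the readers' weak_ptrs expire.
        void store( T new_value )
        {
            auto p = std::make_shared< const T >( std::move(new_value) );
            std::lock_guard< std::mutex > lock( m_mutex );
            m_current = std::move(p);
        }

        // Reader side: 'cache' is the weak_ptr the reader keeps between uses.
        // If it still locks, the reader reuses the value it already had
        // (possibly kept alive by another reader); otherwise it takes the
        // mutex and fetches the current shared_ptr.
        std::shared_ptr< const T > load( std::weak_ptr< const T >& cache ) const
        {
            if ( auto sp = cache.lock() )
                return sp;
            std::lock_guard< std::mutex > lock( m_mutex );
            cache = m_current;
            return m_current;
        }

    private:
        mutable std::mutex m_mutex;
        std::shared_ptr< const T > m_current;
    };

A reader would keep a std::weak_ptr< const T > member, call load() when it needs the value, use the returned shared_ptr and let it go out of scope afterwards, so that only the weak_ptr survives between uses.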
This approach does not involve any callbacks, which might be a good or bad thing depending on the use case.
I think that in my case the value changes rarely but each change has a big impact, so I'm assuming it's more efficient to have a callback called once the value is written, so that the systems don't have to poll the value regularly. I might be wrong though; my application is not complete yet.
The problem with the callbacks is that you have to choose the thread to invoke them in. In the simplest case this could be the writer thread, in other cases a separate thread pool might be more desirable.

On Wed, Jul 30, 2014 at 11:14 AM, Andrey Semashev <andrey.semashev@gmail.com> wrote:
The problem with the callbacks is that you have to choose the thread to invoke them in. In the simplest case this could be the writer thread, in other cases a separate thread pool might be more desirable.
Indeed. In the general case, a nice interface would let the user decide how the callback is called. Maybe an optional Executor-concept object as the first argument to the callback subscription function would be enough (as for future.then()).

In my specific use case, just calling the callbacks in the thread that updates the value is OK, because I always assume (in all my code) that a callback will do its own synchronization work if necessary. That's why in my initial example there is a work queue used in the callback.

Now that I think about it, using an executor is a generalization of the same idea; only the place where we decide how to execute the callback's code changes. Assuming the work queue matches the Executor concept, this code:

    m_foo_count.on_changed( [this]( int new_value ){
        m_work_queue( [=]{
            on_foo_count_changed( new_value );
        });
    });

could be simplified to:

    m_foo_count.on_changed( m_work_queue, [this]( int new_value ){
        on_foo_count_changed( new_value );
    });

(or the equivalent using bind), but the generated code would be like the first snippet; the second form is just a wrapper.
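For illustration, the two-argument overload could be a thin wrapper over the single-argument one, roughly like this (completely untested; it assumes the executor is callable with a nullary callable, like my work queue, and that it outlives the subscription):

    // Inside distributed_value<T> (rough, untested sketch):
    template<class Executor, class Callback>
    void on_changed( Executor& executor, Callback callback )
    {
        // Reuse the single-argument overload; the executor is captured
        // by reference, so it must outlive the subscription.
        on_changed( [&executor, callback]( T new_value ){
            executor( [callback, new_value]{ callback( new_value ); } );
        });
    }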

Andrey Semashev wrote:
The problem with the callbacks is that you have to choose the thread to invoke them in. In the simplest case this could be the writer thread, in other cases a separate thread pool might be more desirable.
I'd say that the problem with the callback architecture is that you still need to communicate the change to the readers, which would still require them to poll an atomic variable at the very least. And if they poll, there's no need for callbacks. Just have the writer increment a version counter. weak_ptrs would be slightly less performant because they do a CAS on lock, instead of just an atomic read. I'm not sure if this would matter that much, though.
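For illustration, the version counter approach could look roughly like this (untested, names invented): readers do a single atomic read per poll and only take the lock to copy the value after a change has been detected.

    #include <atomic>
    #include <cstdint>
    #include <mutex>
    #include <utility>

    template<class T>
    class versioned_value
    {
    public:
        explicit versioned_value( T initial ): m_value( std::move(initial) ) {}

        // Writer side: publish a new value, then bump the version counter.
        void store( T new_value )
        {
            {
                std::lock_guard<std::mutex> lock( m_mutex );
                m_value = std::move(new_value);
            }
            m_version.fetch_add( 1, std::memory_order_release );
        }

        // Reader side: 'last_seen' is the version the reader saw last time;
        // returns true and refreshes 'out' only if the value has changed.
        bool poll( std::uint64_t& last_seen, T& out ) const
        {
            std::uint64_t v = m_version.load( std::memory_order_acquire );
            if( v == last_seen ) return false;
            std::lock_guard<std::mutex> lock( m_mutex );
            out = m_value;
            last_seen = v;
            return true;
        }

    private:
        mutable std::mutex m_mutex;
        T m_value;
        std::atomic<std::uint64_t> m_version{ 0 };
    };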

On Wed, Jul 30, 2014 at 11:36 AM, Peter Dimov <lists@pdimov.com> wrote:
I'd say that the problem with the callback architecture is that you still need to communicate the change to the readers, which would still require them to poll an atomic variable at the very least. And if they poll, there's no need for callbacks. Just have the writer increment a version counter. weak_ptrs would be slightly less performant because they do a CAS on lock, instead of just an atomic read. I'm not sure if this would matter that much, though.
If a copy of the new value is passed as argument to the callbacks, then I don't see the problem - except the cost of the copy of course.
participants (3)
- Andrey Semashev
- Klaim - Joël Lamotte
- Peter Dimov