[signals] sandbox version status

The modified version of Boost.Signals in the sandbox now compiles and passes the tests on my setup. Basic thread-safety is also in there, minus tracking and the combiner/signal-to-signal connection issues that have been discussed here recently. The thread-safe version will not support grouping and naming of slots, so there is a preprocessor flag "BOOST_SIGNALS_NO_LEGACY_SUPPORT" which changes the template signature of signalN. If defined, the Group and GroupCompare parameters are replaced with a ThreadingModel parameter (I chose to put it there and not after SlotFunction, but that may change). It defaults to signals::single_threaded which results in non-thread-safe behaviour. When signals::multi_threaded is given as an argument, the signal should be thread-safe. That is for now completely untested, though ;). As I was curious about the call performance, I ran a simple test, successively connecting up to 4 slots to the signal and calling it 10000 - 10000000 times. Here are the results: # Calls CVS SBL SBNS SBNM 1 10000 0.005 0.003 0.001 0.002 1 100000 0.057 0.037 0.019 0.025 1 1000000 0.582 0.362 0.209 0.269 1 10000000 5.882 3.659 2.073 2.668 2 10000 0.007 0.004 0.002 0.003 2 100000 0.075 0.044 0.027 0.031 2 1000000 0.769 0.461 0.269 0.322 2 10000000 7.668 4.535 2.744 3.274 3 10000 0.009 0.005 0.003 0.003 3 100000 0.095 0.054 0.035 0.038 3 1000000 0.964 0.555 0.360 0.411 3 10000000 9.643 5.538 3.525 4.010 4 10000 0.011 0.006 0.003 0.004 4 100000 0.114 0.061 0.038 0.044 4 1000000 1.155 0.641 0.419 0.451 4 10000000 11.482 6.378 4.124 4.519 CVS = The current CVS HEAD version. SBL = The sandbox version with legacy support (named slots). SBNS = The sandbox version without legacy support, single-threaded. SBNM = The sandbox version without legacy support, thread-safe. I used Visual C++ 2005 Express SP1 with /O2 optimization on an Athlon X2 3800+ for these tests. Results are in seconds. The comparison to the original CVS version is not quite fair, as I used dynamic linking there and static linking for the rest of the tests. I just did it to make sure my code was not significantly slower. The overhead for thread-safety does not grow with the number of connected slots, as I use a per-signal mutex and no per-slot locking. That looks good here, but I'm not yet sure it is the best solution in the long run. It would be nice to not block the entire signal for the whole time. As the new single-threaded implementation is somewhat faster than the one supporting named slots, I will add a signals::no_name type that can be used for the Group template argument when legacy support is enabled. That will result in the faster implementation being used without losing compatibility to existing code with grouped slots. Reports about problems with certain compilers and any kind of other input would be very appreciated. Don't judge too harshly on the code, I know it needs a lot of cleanup ;). Regards Timmo Stange

On Sunday 11 February 2007 01:48 am, Timmo Stange wrote:
The modified version of Boost.Signals in the sandbox now compiles and passes the tests on my setup. Basic thread-safety is also in there, minus tracking and the combiner/signal-to-signal connection issues that have been discussed here recently.
Ack! You're pulling ahead of me :) I've just got to the point where I'm about to start going through the tests with thread_safe_signal. A bit of a redundant effort, I know, but once I got it to where I was planning to drop it, it seemed so close to being done. Better to have two thread-safe implementations than zero, I suppose.
The thread-safe version will not support grouping and naming of slots, so there is a preprocessor flag "BOOST_SIGNALS_NO_LEGACY_SUPPORT" which changes the template signature of signalN. If defined, the Group and GroupCompare parameters are replaced
I didn't really follow exactly why you decided to drop Group and GroupCompare. I was able to implement support for them pretty trivially in thread_safe_signal (see thread_safe_signals/detail/slot_groups.hpp and the connect() functions in thread_safe_signals/detail/signal_template.hpp).
The overhead for thread-safety does not grow with the number of connected slots, as I use a per-signal mutex and no per-slot locking. That looks good here, but I'm not yet sure it is the best solution in the long run. It would be nice to not block the entire signal for the whole time.
thread_safe_signals should be able to do multiple signal invocations concurrently, as well as disconnect and connect concurrently with invocation, due to my much-maligned per-slot locking :) and some other trickery. If you benchmark thread_safe_signals though, be sure to do it with multiple slots connected and multiple threads on a multicore cpu so its benchmark results look nice and inflated :) However, you should be able to get concurrent signal invocation even with a single mutex once boost supports a recursive_read_write_mutex, right? -- Frank

Frank Mori Hess wrote:
Ack! You're pulling ahead of me :) I've just got to the point where I'm about to start going through the tests with thread_safe_signal. A bit of a redundant effort, I know, but once I got it to where I was planning to drop it, it seemed so close to being done. Better to have two thread-safe implementations than zero, I suppose.
It's actually the best way to combine efforts in this case. There's not enough independent functionality to work separately on one implementation, I think, but we've each discovered some problems that the other might have missed. I was not aware that the combiner() member was public for example and would have perhaps missed that loophole for a long time.
I didn't really follow exactly why you decided to drop Group and GroupCompare. I was able to implement support for them pretty trivially in thread_safe_signal (see thread_safe_signals/detail/slot_groups.hpp and the connect() functions in thread_safe_signals/detail/signal_template.hpp).
I'm not so happy with the preprocessor flag solution and I might decide to dispatch to the right implementation based on the Group template parameter alone at compile time, choosing the faster approach for signals::no_name. The thought behind all this is the infamous template code bloat problem, which I carry around more like a religious belief, because my compiler handles it rather well, but as so many bright people have mentioned it, I take it as a given. For every template you use, the compiler may create code for each instantiation potentially in every compilation unit. It gets worse with polymorphism, because you make it harder for the compiler to optimize code redundancy away (as those functions are usually called through a fixed function pointer). I don't know if you looked at the Boost.Function code. They put a lot of effort into avoiding this problem (as opposed to Loki's generalized functor, which I looked at first). To make a long story short, Doug also worked hard on putting the majority of the code in non-templated classes and I wanted to retain that policy as much as possible (I'm usually the kind of guy that is fast to say those people should upgrade their compilers, so this is perhaps a good lesson for me ;)). The named_slot_map in Boost.Signals is non-templated and the only way I see for now to improve its performance is to make it templated. As for your solution: If I understand it correctly, your signal does not support the full ordering of the original version. You can establish an order (through at_back/at_front) within the groups and for the ungrouped slots in one signal. The standard containers don't support that notion. You'd need a (meta-)key with an infinite value range to support at_back/at_front in a std::map.
thread_safe_signals should be able to do multiple signal invocations concurrently, as well as disconnect and connect concurrently with invocation, due to my much-maligned per-slot locking :) and some other trickery. If you benchmark thread_safe_signals though, be sure to do it with multiple slots connected and multiple threads on a multicore cpu so its benchmark results look nice and inflated :)
However, you should be able to get concurrent signal invocation even with a single mutex once boost supports a recursive_read_write_mutex, right?
Yes, that would be possible at the expense of copying the combiner for the invocation, which is acceptable and parameterizable through the ThreadingModel anyway. Regards Timmo Stange

Timmo Stange wrote:
As for your solution: If I understand it correctly, your signal does not support the full ordering of the original version. You can establish an order (through at_back/at_front) within the groups and for the ungrouped slots in one signal. The standard containers don't support that notion. You'd need a (meta-)key with an infinite value range to support at_back/at_front in a std::map.
Sorry, I only looked at your group_key_less comparison here. I saw now that you use a list as container and a linear search in signalN::connect, which ensures the correct order. That works as expected, of course, but significantly changes the complexity of connect() to the worse compared to what the current docs say. Regards Timmo Stange

On Sunday 11 February 2007 06:13 pm, Timmo Stange wrote:
Timmo Stange wrote:
As for your solution: If I understand it correctly, your signal does not support the full ordering of the original version. You can establish an order (through at_back/at_front) within the groups and for the ungrouped slots in one signal. The standard containers don't support that notion. You'd need a (meta-)key with an infinite value range to support at_back/at_front in a std::map.
Sorry, I only looked at your group_key_less comparison here. I saw now that you use a list as container and a linear search in signalN::connect, which ensures the correct order. That works as expected, of course, but significantly changes the complexity of connect() to the worse compared to what the current docs say.
connect() is O(n) anyways, if you do automatic slot cleanup there. -- Frank

On Sunday 11 February 2007 06:36 pm, Frank Mori Hess wrote:
Sorry, I only looked at your group_key_less comparison here. I saw now that you use a list as container and a linear search in signalN::connect, which ensures the correct order. That works as expected, of course, but significantly changes the complexity of connect() to the worse compared to what the current docs say.
connect() is O(n) anyways, if you do automatic slot cleanup there.
I suppose a binary search would work too, since the list is kept ordered. That would reduce the insertion time when a group is specified to O(log n). Connect for ungrouped slots is already O(1). -- Frank

Frank Mori Hess wrote:
connect() is O(n) anyways, if you do automatic slot cleanup there.
I suppose a binary search would work too, since the list is kept ordered. That would reduce the insertion time when a group is specified to O(log n). Connect for ungrouped slots is already O(1).
You need a container with random access iterators in order for that to have the expected complexity. The automatic slot cleanup theoretically only needs to check one slot per connect() in order to avoid the problem we discussed. And it's not used in existing code anyway. Regards Timmo Stange

On Sunday 11 February 2007 07:15 pm, Timmo Stange wrote:
Frank Mori Hess wrote:
connect() is O(n) anyways, if you do automatic slot cleanup there.
I suppose a binary search would work too, since the list is kept ordered. That would reduce the insertion time when a group is specified to O(log n). Connect for ungrouped slots is already O(1).
You need a container with random access iterators in order for that to have the expected complexity.
The automatic slot cleanup theoretically only needs to check one slot per connect() in order to avoid the problem we discussed. And it's not used in existing code anyway.
Good points. But, in any case it seems preferable to support groups even with O(n) insertion time than to provide no support at all. It adds no overhead to the ungrouped connect(). -- Frank

On Sunday 11 February 2007 19:15 pm, Timmo Stange wrote:
Frank Mori Hess wrote:
connect() is O(n) anyways, if you do automatic slot cleanup there.
I suppose a binary search would work too, since the list is kept ordered. That would reduce the insertion time when a group is specified to O(log n). Connect for ungrouped slots is already O(1).
You need a container with random access iterators in order for that to have the expected complexity.
Okay, I've optimized the Group/GroupCompare support so it doesn't take linear time to insert a grouped slot anymore. I added a std::map which maintains a list iterator to the first slot in each group. See the grouped_list class in thread_safe_signals/detail/slot_groups.hpp. -- Frank

On Sunday 11 February 2007 05:58 pm, Timmo Stange wrote:
I didn't really follow exactly why you decided to drop Group and GroupCompare. I was able to implement support for them pretty trivially in thread_safe_signal (see thread_safe_signals/detail/slot_groups.hpp and the connect() functions in thread_safe_signals/detail/signal_template.hpp).
As for your solution: If I understand it correctly, your signal does not support the full ordering of the original version. You can establish an order (through at_back/at_front) within the groups and for the ungrouped slots in one signal. The standard containers don't support that notion. You'd need a (meta-)key with an infinite value range to support at_back/at_front in a std::map.
No, it does. There is no std::map, I don't rely on the container at all for ordering. I did try a std::multimap initially, before discovering that their "insert with hint" is not sufficient to maintain ordering within a group. The slot list is just a std::list and I decide the proper place to put each new connection when inserting into the list.
However, you should be able to get concurrent signal invocation even with a single mutex once boost supports a recursive_read_write_mutex, right?
Yes, that would be possible at the expense of copying the combiner for the invocation, which is acceptable and parameterizable through the ThreadingModel anyway.
I copy a shared_ptr to the combiner in the invocation, so a new full copy only gets made in set_combiner(). -- Frank

Frank Mori Hess wrote:
However, you should be able to get concurrent signal invocation even with a single mutex once boost supports a recursive_read_write_mutex, right? Yes, that would be possible at the expense of copying the combiner for the invocation, which is acceptable and parameterizable through the ThreadingModel anyway.
I copy a shared_ptr to the combiner in the invocation, so a new full copy only gets made in set_combiner().
Yes, I saw that. You probably know that already, but the shared_ptr reference counting is a relatively expensive operation. Copying a last_value<> instance locally is a no-op on the other hand. Regards Timmo Stange
participants (3)
-
Frank Mori Hess
-
Frank Mori Hess
-
Timmo Stange