Ah, that's weird. I'm on an Intel quad core at 2.4 GHz running Vista 32-bit, compiled with MSVC 2008. I'll rebuild with 1.39 tomorrow and do a run with signals2 included.
Rebuilt with Boost 1.39 and signals2, and I'm getting results similar to yours now, so I probably missed some optimization flag when building Boost. Here's the source with signals2 included:

    #include <iostream>
    #include "boost/signals.hpp"
    #include <vector>
    #include "boost/function.hpp"
    #include "boost/timer.hpp"
    #include "boost/signals2/signal.hpp"
    #include <cstdlib>
    #include <algorithm>

    void foo( )
    {
    }

    int main()
    {
        std::vector< boost::function< void ( void ) > > manualSignal;
        boost::signal< void ( void ) > boostSignalFragmented, boostSignalUnfragmented;
        boost::signals2::signal< void ( void ) > boostSignal2Fragmented, boostSignal2Unfragmented;

        typedef std::vector< boost::signals::connection > ConnectionVector;
        typedef std::vector< boost::signals2::connection > ConnectionVector2;
        ConnectionVector connections;
        ConnectionVector2 connections2;

        // Unfragmented case: 10000 slots connected back to back.
        for( unsigned int i = 0; i < 10000; ++i )
        {
            manualSignal.push_back( &foo );
            boostSignal2Unfragmented.connect( &foo );
            boostSignalUnfragmented.connect( &foo );
        }

        // Fragmented case: connect 100000 slots...
        for( unsigned int i = 0; i < 100000; ++i )
        {
            connections.push_back( boostSignalFragmented.connect( &foo ) );
            connections2.push_back( boostSignal2Fragmented.connect( &foo ) );
        }

        // ...then disconnect 90000 of them at random, leaving 10000 connected.
        for( unsigned int i = 0; i < 90000; ++i )
        {
            {
                ConnectionVector::iterator index = connections.begin() + rand() % connections.size();
                index->disconnect();
                // Swap-and-pop: overwrite the dead connection with the last one.
                *index = connections.back();
                connections.pop_back();
            }
            {
                ConnectionVector2::iterator index = connections2.begin() + rand() % connections2.size();
                index->disconnect();
                *index = connections2.back();
                connections2.pop_back();
            }
        }

        {
            boost::timer tm;
            for( unsigned int i = 0; i < 1000; ++i )
            {
                // Call each of the 10000 slots once per outer iteration.
                for( unsigned int j = 0; j < 10000; ++j )
                    manualSignal[ j ]( );
            }
            double elapsed = tm.elapsed();
            std::cout << "vector variant: " << elapsed << std::endl;
        }

        {
            boost::timer tm;
            for( unsigned int i = 0; i < 1000; ++i )
            {
                boostSignalUnfragmented( );
            }
            double elapsed = tm.elapsed();
            std::cout << "boost::signal Unfragmented variant: " << elapsed << std::endl;
        }

        {
            boost::timer tm;
            for( unsigned int i = 0; i < 1000; ++i )
            {
                boostSignalFragmented( );
            }
            double elapsed = tm.elapsed();
            std::cout << "boost::signal Fragmented variant: " << elapsed << std::endl;
        }

        {
            boost::timer tm;
            for( unsigned int i = 0; i < 1000; ++i )
            {
                boostSignal2Unfragmented( );
            }
            double elapsed = tm.elapsed();
            std::cout << "boost::signal2 Unfragmented variant: " << elapsed << std::endl;
        }

        {
            boost::timer tm;
            for( unsigned int i = 0; i < 1000; ++i )
            {
                boostSignal2Fragmented( );
            }
            double elapsed = tm.elapsed();
            std::cout << "boost::signal2 Fragmented variant: " << elapsed << std::endl;
        }
    }

This yields (in seconds):

    Vector:                       0.038
    boost::signal unfragmented:   0.936
    boost::signal fragmented:     1.659
    boost::signal2 unfragmented:  1.092
    boost::signal2 fragmented:    9.793

What's surprising here is the long running time of the fragmented signals2 variant, which makes me believe there's a bug somewhere in it. When I connected 1000000 slots and disconnected 990000 at random, instead of connecting 100000 and disconnecting 90000, the fragmented signals2 invocation time jumped to about 92 seconds for calling a signal with 10000 slots.

I think a lot of the disparity between signal and the vector variant is actually due to the underlying container: a std::map will simply never beat a vector, and in fact signal will most likely perform considerably worse as slots end up further and further apart in memory and paging kicks in. I think the underlying container in signals is too much of a factor to keep private and should, IMO, be a parameter. A lot of people seem to use signal for event systems, where fragmentation arises naturally if many actors enter and leave the system.
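The container effect can be looked at in isolation with something like the following. This is only a rough sketch, not a rigorous benchmark: `callback`, `time_vector` and `time_map` are hypothetical names, `std::map<int, callback>` is just a stand-in for a node-based container and not a claim about the internals of either signal library, and it uses C++11 `<chrono>` rather than boost::timer.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <map>
#include <vector>

// Hypothetical slot type for this sketch only.
typedef std::function<void()> callback;

// Invoke every callback in a contiguous vector `rounds` times;
// returns the elapsed wall-clock time in seconds.
double time_vector(const std::vector<callback>& slots, std::size_t rounds)
{
    auto start = std::chrono::steady_clock::now();
    for (std::size_t r = 0; r < rounds; ++r)
        for (const callback& cb : slots)
            cb();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Same loop over a node-based std::map, whose elements are allocated
// individually and may end up scattered across the heap.
double time_map(const std::map<int, callback>& slots, std::size_t rounds)
{
    auto start = std::chrono::steady_clock::now();
    for (std::size_t r = 0; r < rounds; ++r)
        for (const auto& entry : slots)
            entry.second();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Filling both containers with the same number of callbacks and comparing the two return values gives a first impression of the traversal cost alone. Any real difference will depend on the allocator, the optimization flags and how fragmented the heap already is.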
I understand that the code for connections would need special implementations depending on the traits of the underlying container type, but keeping the same public interface would still be very much possible without too much effort, I think.
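As a rough illustration of what a container parameter could look like, here is a minimal sketch. It is not the Boost API: `basic_signal`, `vector_storage` and `list_storage` are hypothetical names, slots are fixed to `void()`, there is no thread safety or combiner, and it assumes C++11. Each storage policy supplies its own handle type, which is where the container-specific connection code ends up, while connect / disconnect / invoke stay the same.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <utility>
#include <vector>

typedef std::function<void()> slot_type;   // hypothetical, void() slots only

// Contiguous storage. A handle is a stable id, not an index, because
// swap-and-pop removal moves slots around (so call order is not preserved).
struct vector_storage {
    typedef std::size_t handle;
    std::vector<std::pair<handle, slot_type> > slots;
    handle next_id = 0;

    handle add(slot_type s) {
        slots.emplace_back(next_id, std::move(s));
        return next_id++;
    }
    void remove(handle h) {                 // linear search, O(n)
        for (std::size_t i = 0; i < slots.size(); ++i) {
            if (slots[i].first == h) {
                if (i + 1 != slots.size())
                    slots[i] = std::move(slots.back());
                slots.pop_back();
                return;
            }
        }
    }
    template <typename F> void for_each(F f) const {
        for (const auto& e : slots) f(e.second);
    }
};

// Node-based storage. A handle is simply the list iterator, so removal is O(1)
// and call order is preserved, but the nodes may be scattered in memory.
struct list_storage {
    typedef std::list<slot_type>::iterator handle;
    std::list<slot_type> slots;

    handle add(slot_type s) { return slots.insert(slots.end(), std::move(s)); }
    void remove(handle h) { slots.erase(h); }
    template <typename F> void for_each(F f) const {
        for (const auto& s : slots) f(s);
    }
};

// Same public interface regardless of which storage policy is chosen.
template <typename Storage = vector_storage>
class basic_signal {
public:
    typedef typename Storage::handle connection;

    connection connect(slot_type s) {
        assert(s);                          // reject empty slots
        return storage_.add(std::move(s));
    }
    void disconnect(connection c) { storage_.remove(c); }
    void operator()() const {
        storage_.for_each([](const slot_type& s) { s(); });
    }
    std::size_t size() const { return storage_.slots.size(); }

private:
    Storage storage_;
};
```

With this shape, `basic_signal<vector_storage>` and `basic_signal<list_storage>` are used identically by callers; only the `connection` typedef differs. The trade-off shown here (O(n) disconnect and reordering for the vector, O(1) disconnect but scattered nodes for the list) is one possible choice among several.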