[asio] Performance of 1 io_service per core vs. 1 io service run by thread pool

Hi everyone,

Has there been a formal comparison between the performance of an application that has one io_service per core and one that has a single io_service run by a thread pool? I'm currently implementing a server that will run on a machine with multiple processors.

My main concerns are these. With a single io_service run by a thread pool, handler functions are wrapped in a strand, which causes (unnecessary) synchronization between the running threads. With an io_service per core, unbalanced work from connections serviced by one io_service instance rather than the others will cause that one thread (presumed to be running on just one processor) to do most of the work, leaving the other threads idle.

I'm mainly interested in a discussion of the pros and cons of each approach. Currently, my implementation uses a single io_service with a thread pool running handlers synchronized with a strand -- though the performance is OK, I'm looking at whether trying the other approach would be worth it.

Comments and insights would be most appreciated.

--
Dean Michael Berris
Software Engineer, Friendster, Inc.
<dmberris@friendster.com>
+639287291459
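
For readers unfamiliar with the synchronization a strand implies, here is a minimal standard-library sketch of the idea: handlers submitted from several threads are never run concurrently, at the cost of taking a lock on every submission. `SerialExecutor` and `count_through_executor` are hypothetical names made up for this illustration. This is not Boost.Asio's strand implementation, only a simplified model of the guarantee it provides.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical model of a strand-like guarantee: handlers passed to
// dispatch() never run concurrently with each other.
class SerialExecutor {
public:
    void dispatch(std::function<void()> handler) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.push_back(std::move(handler));
            if (running_) return;  // another thread is draining; it will run our handler
            running_ = true;       // this thread becomes the one that drains the queue
        }
        for (;;) {
            std::function<void()> next;
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (pending_.empty()) {
                    running_ = false;
                    return;
                }
                next = std::move(pending_.front());
                pending_.pop_front();
            }
            next();  // run outside the lock so other threads can keep submitting
        }
    }

private:
    std::mutex mutex_;
    std::deque<std::function<void()>> pending_;
    bool running_ = false;
};

// Several threads increment a plain (non-atomic) counter through the executor.
// The result is exact only because the executor serializes the handlers.
std::size_t count_through_executor(std::size_t thread_count,
                                   std::size_t posts_per_thread) {
    SerialExecutor executor;
    std::size_t counter = 0;  // deliberately not atomic
    std::vector<std::thread> threads;
    for (std::size_t t = 0; t < thread_count; ++t) {
        threads.emplace_back([&executor, &counter, posts_per_thread] {
            for (std::size_t i = 0; i < posts_per_thread; ++i) {
                executor.dispatch([&counter] { ++counter; });
            }
        });
    }
    for (auto& th : threads) th.join();
    return counter;
}
```

Every call to `dispatch` takes the mutex at least once, which is the kind of cross-thread synchronization the question is about. Whether that cost matters in a real server depends on how much work each handler does.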

Dean Michael C. Berris wrote:
In the case of an io service per core design, unbalanced work from connections being serviced by one io service instance over the others will cause that one thread (presumed to be running on just one processor) to do most of the work leaving the other threads idle.
But in that case, you run n demultiplexers (select, epoll, whatever) instead of one...

On Fri, Feb 29, 2008 at 3:52 AM, Mathias Gaunard <mathias.gaunard@ens-lyon.org> wrote:
Dean Michael C. Berris wrote:
In the case of an io service per core design, unbalanced work from connections being serviced by one io service instance over the others will cause that one thread (presumed to be running on just one processor) to do most of the work leaving the other threads idle.
But in that case, you run n demultiplexers (select, epoll, whatever) instead of one...
This should exploit the multiple available processors better (assuming that each thread runs on a different processor), and even improve responsiveness of the system when the demultiplexers 'sleep' or wait on a condition to happen.

The only problem I see in this design arises when a number of connections are bound to one io_service and almost every one of them has pending work (read, write, process, etc.), while the connections on the other io_service instances are not doing much. For instance, sockets 1, 2, 3 are bound to io_service A and sockets 4, 5, 6 are bound to io_service B -- if 1, 2, 3 have a lot of activity, then the thread running io_service A's run method would be swamped trying to deal with the work 1, 2, 3 need done, while the thread running io_service B's run method would technically be idle.

I'm curious whether the cost of synchronization between threads all running the same io_service's run() method is greater than the possible performance hit of having multiple sockets (de)multiplexed in a single thread.

Any ideas from the threading experts as to whether locking a mutex across multiple threads is worse than having a single thread have exclusive access to the resource (in this case a dispatch queue inside boost::asio::io_service)?

--
Dean Michael C. Berris
Software Engineer, Friendster, Inc.
[http://blog.cplusplus-soup.com]
[mikhailberis@gmail.com]
[+63 928 7291459]
[+1 408 4049523]
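
The socket example above can be modeled with the standard library alone. The sketch below is only an illustration under stated assumptions: `DispatchQueue`, `run_queue_per_thread` and `run_shared_queue` are hypothetical names, the queue is a plain mutex-protected deque rather than boost::asio::io_service, and handlers are empty. It shows how the work is distributed in each design; it does not measure the cost of lock contention.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a handler queue. It is NOT boost::asio::io_service.
class DispatchQueue {
public:
    void post(std::function<void()> handler) {
        std::lock_guard<std::mutex> lock(mutex_);
        handlers_.push_back(std::move(handler));
    }

    // Runs handlers until the queue is empty; returns how many this caller ran.
    std::size_t run() {
        std::size_t executed = 0;
        for (;;) {
            std::function<void()> handler;
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (handlers_.empty()) return executed;
                handler = std::move(handlers_.front());
                handlers_.pop_front();
            }
            handler();  // run outside the lock
            ++executed;
        }
    }

private:
    std::mutex mutex_;
    std::deque<std::function<void()>> handlers_;
};

// Design 1: one queue per thread. Queue 0 models io_service A (busy sockets
// 1, 2, 3); queue 1 models io_service B (quiet sockets 4, 5, 6).
std::vector<std::size_t> run_queue_per_thread(std::size_t busy_events,
                                              std::size_t quiet_events) {
    DispatchQueue queues[2];
    for (std::size_t i = 0; i < busy_events; ++i) queues[0].post([] {});
    for (std::size_t i = 0; i < quiet_events; ++i) queues[1].post([] {});

    std::vector<std::size_t> executed(2, 0);
    std::vector<std::thread> threads;
    for (std::size_t t = 0; t < 2; ++t) {
        threads.emplace_back(
            [&queues, &executed, t] { executed[t] = queues[t].run(); });
    }
    for (auto& th : threads) th.join();
    return executed;
}

// Design 2: a single shared queue drained by a pool of threads.
std::vector<std::size_t> run_shared_queue(std::size_t busy_events,
                                          std::size_t quiet_events,
                                          std::size_t pool_size) {
    assert(pool_size > 0);
    DispatchQueue shared;
    for (std::size_t i = 0; i < busy_events + quiet_events; ++i) {
        shared.post([] {});
    }

    std::vector<std::size_t> executed(pool_size, 0);
    std::vector<std::thread> threads;
    for (std::size_t t = 0; t < pool_size; ++t) {
        threads.emplace_back(
            [&shared, &executed, t] { executed[t] = shared.run(); });
    }
    for (auto& th : threads) th.join();
    return executed;
}
```

With `run_queue_per_thread(3000, 30)`, the first thread always runs all 3000 busy handlers and the second only its 30, which is the imbalance described above. With `run_shared_queue(3000, 30, 2)`, the per-thread split varies from run to run, but any thread may pick up any handler, and every dequeue goes through the one shared mutex.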

Dean Michael C. Berris wrote:
I'm mainly interested in a pro-cons discussion on what the best approach should be. Currently, my implementation uses a single io service with a thread pool running handlers synchronized with a strand -- though the performance is OK, I'm looking at whether trying the other approach would be worth it.
Comments and insights would be most appreciated.
I think this thread might be of interest from the boost.asio.user group, "io_service-per-CPU design in HTTP server 2"
<http://thread.gmane.org/gmane.comp.lib.boost.asio.user/1300>

Jamie

Hi Jamie!

On Fri, Feb 29, 2008 at 10:44 AM, Jamie Allsop <ja11sop@yahoo.co.uk> wrote:
Dean Michael C. Berris wrote:
I'm mainly interested in a pro-cons discussion on what the best approach should be. Currently, my implementation uses a single io service with a thread pool running handlers synchronized with a strand -- though the performance is OK, I'm looking at whether trying the other approach would be worth it.
Comments and insights would be most appreciated.
I think this thread might be of interest from the boost.asio.user group, "io_service-per-CPU design in HTTP server 2"
<http://thread.gmane.org/gmane.comp.lib.boost.asio.user/1300>
Yes, definitely interesting. I may take my questions specific to Asio there then. Thanks for the link!

--
Dean Michael C. Berris
Software Engineer, Friendster, Inc.
[http://blog.cplusplus-soup.com]
[mikhailberis@gmail.com]
[+63 928 7291459]
[+1 408 4049523]