"[gsoc] Thread scheduler support for boost"

Hi Phil and Hartmut, I am following up on the earlier thread about extending the current threadpool library in Boost. The current approach implements an N:M model by internally using the pthread scheduling APIs (pthread_setschedparam() and related functions). However, for high-performance, latency-sensitive application code we would not only like to parallelize, but also to (i) reduce load variance across the threads, (ii) create asymmetric scheduling, where a set of threads is forced to run on a subset of processors, and (iii) prioritize tasks among the threads. One approach I am considering is to create a userland scheduler through which all scheduling decisions are made. For OS-specific cases, the userland scheduler can be made to use features exposed by the OS, such as different scheduling algorithms, processor affinity, and so on. Please let me know if you think this direction is admissible as a plausible extension of the Boost library. I am working on this and will send you more details on how I plan to implement it. Regards, Indradip.
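Purely for illustration, a rough sketch of the kind of priority-aware front-end I mean; none of the names below (task_scheduler, submit, priority) exist in Boost or in the threadpool library, they only show the shape of point (iii):

// Illustrative sketch only -- not existing Boost or threadpool API.
#include <boost/thread/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/thread/condition_variable.hpp>
#include <boost/function.hpp>
#include <boost/bind.hpp>
#include <queue>
#include <cstddef>

class task_scheduler
{
public:
    enum priority { low = 0, normal = 1, high = 2 };

    explicit task_scheduler(std::size_t workers) : shutdown_(false)
    {
        for (std::size_t i = 0; i != workers; ++i)
            threads_.create_thread(boost::bind(&task_scheduler::run, this));
    }

    ~task_scheduler()
    {
        {
            boost::mutex::scoped_lock lock(mtx_);
            shutdown_ = true;
        }
        cond_.notify_all();
        threads_.join_all();
    }

    // All scheduling decisions are funnelled through here:
    // higher priority tasks are handed to the workers first.
    void submit(boost::function<void()> task, priority p = normal)
    {
        {
            boost::mutex::scoped_lock lock(mtx_);
            queue_.push(entry(p, task));
        }
        cond_.notify_one();
    }

private:
    struct entry
    {
        entry(priority p, boost::function<void()> f) : prio(p), fn(f) {}
        priority prio;
        boost::function<void()> fn;
        bool operator<(entry const& rhs) const { return prio < rhs.prio; }
    };

    void run()
    {
        for (;;)
        {
            boost::function<void()> task;
            {
                boost::mutex::scoped_lock lock(mtx_);
                while (queue_.empty() && !shutdown_)
                    cond_.wait(lock);
                if (queue_.empty())
                    return;            // shutting down, nothing left to run
                task = queue_.top().fn;
                queue_.pop();
            }
            task();                    // run the task outside the lock
        }
    }

    boost::thread_group threads_;
    boost::mutex mtx_;
    boost::condition_variable cond_;
    std::priority_queue<entry> queue_;
    bool shutdown_;
};

Load variance (i) and the asymmetric case (ii) would then be handled inside run(), for example by pinning a subset of the workers to specific cores, which is where processor affinity comes in.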
Message: 4
Date: Wed, 25 Mar 2009 07:36:12 -0500
From: "Hartmut Kaiser" <hartmut.kaiser@gmail.com>
Subject: Re: [boost] "[gsoc] Thread scheduler support for boost"
To: <boost@lists.boost.org>
Message-ID: <49ca2540.47c1f10a.61b0.ffffdbac@mx.google.com>
Content-Type: text/plain; charset="us-ascii"
However, I have thought about your suggested extension of the threadpool class to exploit processor-affinity techniques, and it sounds very interesting as well. I guess, though, that we would run into the same portability issues regarding the availability of lower-level APIs exposing the hooks needed to build such a facility. Another issue would be to qualify and justify how much improvement this kind of parallelism would yield on top of the instruction-level parallelism already present in all modern processors. Still, I would like to find out if you have any concrete lead in mind.
FWIW, processor affinity is supported on many operating systems, so it shouldn't be problematic to come up with a portable implementation. Moreover, I believe Oliver already included something like that into his thread_pool library (see http://tinyurl.com/cqgt5u, file boost-threadpool.v24.tar.gz).
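Just to make the portability point concrete, a minimal sketch of what a portable affinity primitive could look like, wrapping the two most common native calls (pthread_setaffinity_np on Linux/glibc, SetThreadAffinityMask on Windows). The function name pin_current_thread_to is made up for this example; error reporting, topology discovery and Windows machines with more than 64 logical processors are deliberately ignored:

#if defined(_WIN32)
#  include <windows.h>
#else
#  define _GNU_SOURCE              // for pthread_setaffinity_np on glibc
#  include <pthread.h>
#  include <sched.h>
#endif

// Pin the calling thread to a single logical processor.
// Returns true on success, false if the native call failed.
bool pin_current_thread_to(unsigned core)
{
#if defined(_WIN32)
    // Windows expresses affinity as a bit mask of allowed processors.
    return SetThreadAffinityMask(GetCurrentThread(),
                                 static_cast<DWORD_PTR>(1) << core) != 0;
#else
    // glibc/Linux uses a cpu_set_t with one bit per logical processor.
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set) == 0;
#endif
}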
I know for sure that there are hard engineering problems in the high-performance computing domain around ensuring QoS guarantees for distributed real-time embedded systems. They look into optimizing processor utilization in the face of uncertainty about the computational load on the system, and usually require some kind of mathematical validation of the scheme for the system under construction. I guess that would be beyond the scope of the GSoC project, given the time frame.
I'm sure that controlling processor affinity is something to consider not only for real-time applications or high-end computing. With the rising number of cores per chip we need to be able to influence this for everyday tasks as well. But that's only IMHO.
M:N models would also have the same portability issue, as they require changing kernel code as well as the userland code in the thread library.
You shouldn't shy away from a feature just because Boost requires portable implementations. Most of the time portability is achievable, as modern OSes have very similar functionality, just exposed through different APIs.
Regards Hartmut

Hi, I'm wondering whether this shouldn't be considered an extension of the Boost.Thread library rather than of the Boost.ThreadPool library. The ThreadPool library can be improved once these extensions are in place, but so can other applications or libraries using threads. Let me know if I am missing something. Best, Vicente
----- Original Message ----- From: "Indradip Ghosh" <indraghosh2k@gmail.com> To: <boost@lists.boost.org> Sent: Monday, March 30, 2009 8:14 AM Subject: [boost] "[gsoc] Thread scheduler support for boost" [...]

One approach I am considering is to create a userland scheduler through which all scheduling decisions are made. For OS-specific cases, the userland scheduler can be made to use features exposed by the OS, such as different scheduling algorithms, processor affinity, and so on. Please let me know if you think this direction is admissible as a plausible extension of the Boost library. [...]
Sounds good, but very ambitious. As long as you don't want to allow for yielding in user land, everything is straightforward. But often yielding is required, which adds quite a bit of complexity (you need to start fiddling with the stack, maintain task dependencies, etc. - think Cilk (http://www.cilk.com)). An additional source of complexity is synchronization. Once in user land, you probably don't want to use OS synchronization primitives any more. It seems appropriate to use (user land) yielding instead of (OS-based) blocking. This allows the allotted (OS thread) time slice to be filled up even if a user land thread needs to yield. That means such a library needs a proper set of synchronization primitives to go with it (preferably interface-compatible with boost::thread::mutex et al.). Regards Hartmut
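To illustrate that last point, here is a minimal sketch of such a primitive. The class and the this_task::yield() hook are made up for this example (neither exists in Boost.Thread); the only idea it shows is that the contended path hands control back to the userland scheduler instead of blocking the OS thread, while keeping the usual lock()/try_lock()/unlock() interface so that boost::lock_guard and friends still work with it:

#include <boost/atomic.hpp>   // std::atomic would do equally well with C++11

namespace this_task {
    // Placeholder: a real scheduler would switch to another ready task
    // here; with an empty body the lock degrades to a plain spin lock.
    inline void yield() {}
}

class yielding_mutex
{
public:
    yielding_mutex() : locked_(false) {}

    bool try_lock()
    {
        // True only if we flipped the flag from false to true ourselves.
        return !locked_.exchange(true, boost::memory_order_acquire);
    }

    void lock()
    {
        while (!try_lock())
            this_task::yield();   // yield the task, not the OS thread
    }

    void unlock()
    {
        locked_.store(false, boost::memory_order_release);
    }

private:
    boost::atomic<bool> locked_;
};

In a real library the contended path would of course park the task on a wait list owned by the lock rather than spin-yield, but the interface stays the same.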
participants (3)
- Hartmut Kaiser
- Indradip Ghosh
- vicente.botet