
However, I thought about your suggested extension of the threadpool class that exploits processor affinity techniques, and it sounds very interesting as well. I guess we would run into the same portability issues, though, regarding the availability of a lower-level API that exposes the hooks needed to develop such a facility. Another issue would be to qualify and justify how much improvement this type of parallelism would enable on top of the instruction-level parallelism already in place in all modern processors. Still, I would like to find out whether you have any concrete lead in mind.
FWIW, processor affinity is supported on many operating systems, so it shouldn't be problematic to come up with a portable implementation. Moreover, I believe Oliver already included something like that in his thread_pool library (see http://tinyurl.com/cqgt5u, file boost-threadpool.v24.tar.gz).
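To make the portability point a bit more concrete, here is a rough sketch of how a thin wrapper over the native affinity calls might look. It is only an illustration under my own assumptions: the names `allowed_cpus` and `pin_current_thread` are made up for this mail and are not taken from Oliver's library. It covers only Linux (`sched_getaffinity`/`sched_setaffinity`) and Windows (`GetProcessAffinityMask`/`SetThreadAffinityMask`), and on any other platform it simply reports failure.

```cpp
#include <cassert>
#include <vector>

#if defined(_WIN32)
#include <windows.h>
#elif defined(__linux__)
#include <sched.h>
#endif

// Hypothetical helper: lists the CPUs the calling thread (Linux) or the
// current process (Windows) may run on. Returns an empty vector on failure
// or on platforms this sketch does not cover.
std::vector<unsigned> allowed_cpus() {
    std::vector<unsigned> cpus;
#if defined(_WIN32)
    DWORD_PTR process_mask = 0, system_mask = 0;
    if (!GetProcessAffinityMask(GetCurrentProcess(), &process_mask, &system_mask))
        return cpus;
    for (unsigned i = 0; i < sizeof(DWORD_PTR) * 8; ++i)
        if (process_mask & (DWORD_PTR(1) << i))
            cpus.push_back(i);
#elif defined(__linux__)
    cpu_set_t set;
    CPU_ZERO(&set);
    // A pid of 0 refers to the calling thread.
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return cpus;
    for (unsigned i = 0; i < CPU_SETSIZE; ++i)
        if (CPU_ISSET(i, &set))
            cpus.push_back(i);
#endif
    return cpus;
}

// Hypothetical helper: restricts the calling thread to a single CPU.
// Returns true on success, false on failure or on an unsupported platform.
bool pin_current_thread(unsigned cpu) {
#if defined(_WIN32)
    if (cpu >= sizeof(DWORD_PTR) * 8)
        return false;  // does not fit into a single affinity mask
    return SetThreadAffinityMask(GetCurrentThread(), DWORD_PTR(1) << cpu) != 0;
#elif defined(__linux__)
    if (cpu >= CPU_SETSIZE)
        return false;  // outside the range a fixed-size cpu_set_t can hold
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set) == 0;
#else
    (void)cpu;
    return false;  // no affinity support in this sketch
#endif
}
```

A thread pool could, for example, call `pin_current_thread` as the first statement of each worker's thread function, picking the CPU from `allowed_cpus()`. Whether that actually helps would still have to be measured for the workload at hand.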
I know for sure that there are hard engineering problems in the high-performance computing domain around ensuring QoS guarantees for distributed real-time embedded systems. People there look into processor utilization optimizations in the face of uncertainty about the computational load on the system. These usually require some kind of mathematical validation of the scheme for the system under construction. I guess that would be beyond the scope of the GSoC project, given the time frame.
I'm sure that controlling processor affinity is something to consider not only for real-time applications or high-end computing. With the rising number of cores on a chip, we need to be able to influence this for everyday tasks as well. But that's only IMHO.
M:N threading models would have the same portability issue, as they require changes to kernel code as well as to the userland code in the thread library.
You shouldn't shy away from a feature just because Boost requires portable implementations. Most of the time this is possible to achieve, as modern OSes have very similar functionality, just exposed through different APIs.
Regards Hartmut