Typically the way things work is that as long as a lock (or whatever it is you're using -- you didn't actually say) goes uncontended, or minimally contended such that a spin resolves it, then it will allocate no kernel handles. A kernel handle is only required when the lock is contended for sufficiently long that it needs to put the thread to sleep.
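To make that concrete, here is a rough sketch of the spin-then-sleep idea in portable C++. It is only an illustration under stated assumptions: `LazyLock`, `Waiter`, `kSpinLimit` and `has_wait_object` are made-up names, and the `std::mutex`/`std::condition_variable` pair merely stands in for the kernel handle. It is not how Windows critical sections or boost's mutexes are actually implemented.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical illustration: a lock that spins first and only creates its
// "wait object" (a stand-in for a kernel handle) under sustained contention.
class LazyLock {
public:
    LazyLock() : locked_(false), waiter_(nullptr) {}
    ~LazyLock() { delete waiter_.load(); }

    void lock() {
        // Fast path: bounded spinning, no wait object is touched or created.
        for (int i = 0; i < kSpinLimit; ++i) {
            if (try_acquire()) return;
            std::this_thread::yield();
        }
        // Slow path: contention lasted too long, so create the wait object
        // on first use and sleep on it until the lock can be taken.
        Waiter& w = waiter();
        std::unique_lock<std::mutex> guard(w.m);
        w.cv.wait(guard, [this] { return try_acquire(); });
    }

    void unlock() {
        locked_.store(false);
        // Wake a sleeper only if a wait object was ever created.
        if (Waiter* w = waiter_.load()) {
            std::lock_guard<std::mutex> guard(w->m);
            w->cv.notify_one();
        }
    }

    // True once the slow path has been taken at least once.
    bool has_wait_object() const { return waiter_.load() != nullptr; }

private:
    struct Waiter {
        std::mutex m;
        std::condition_variable cv;
    };

    static constexpr int kSpinLimit = 100;  // arbitrary value for the sketch

    bool try_acquire() { return !locked_.exchange(true); }

    Waiter& waiter() {
        Waiter* w = waiter_.load();
        if (!w) {
            Waiter* fresh = new Waiter;
            if (waiter_.compare_exchange_strong(w, fresh)) {
                w = fresh;
            } else {
                delete fresh;  // another thread won the race; w now points to its object
            }
        }
        return *w;
    }

    std::atomic<bool> locked_;
    std::atomic<Waiter*> waiter_;
};
```

In this sketch, locking and unlocking from a single thread never creates the wait object. It only appears once a second thread has spun through `kSpinLimit` attempts while the lock is still held, and it then stays around until the lock itself is destroyed.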
Thank you very much for your response. I'll try to describe the issue in more detail.

The original software is a concurrent transactional server which preallocates a certain number of threads (say 100 ~ 130) and then accepts and serves TCP/IP connections in a request-reply fashion. Obviously, it makes heavy use of boost::shared_ptr to handle these connections, but in the end I managed to use other shared pointers (for testing purposes only), the boost::thread object alone, and a single shared static mutex object. So the sample program starts and creates some threads, which simply do some memory allocation loops and s.

In the main function I enclosed all the code between brackets, and I print the handle count before and after, and it's not the same. Something like:

int main()
{
    GetProcessHandleCount()....
    {
        [creates threads and waits for their termination through ->join]
    }
    GetProcessHandleCount()....
}

The difference between the first and the last count is not always the same; it depends on how many concurrent threads you create and how many iterations of the same loop each thread does.

Thank you in advance,
L. Trivelli