Mutexes claim a Windows event handle

Hi! I have a problem with boost::recursive_mutex "sometimes" keeping a handle to a Windows Event. My application uses a lot of objects, each containing a recursive_mutex. The objects are accessed from different threads using a scoped_lock (see code below). In Boost 1.34 there was no problem, but after we switched to 1.39 it seems that the process collects more and more handles to Windows Events. Only when the objects that contain the recursive_mutex are destroyed are the handles released again. The problem is that I must support 100,000 objects containing multiple mutexes, so if each mutex can cost a handle, the total number of handles in use may become huge.

I tried the latest Boost version (1.43), but that makes no difference. The strange thing is that not every object keeps a handle, i.e. the number of handles collected is less than the number of objects created. When I increase nrActions, the number of handles collected comes closer to the number of objects.

I could understand a recursive_mutex costing a handle, except that in Boost 1.34 it didn't, so I wonder whether this was an accident or a deliberate change. And most importantly: is there a way to get the same behavior as in 1.34, where a mutex did not cost me a handle each?

Thanks for your advice!

Code example:

    #include <cstdlib>
    #include <list>
    #include <boost/bind.hpp>
    #include <boost/shared_ptr.hpp>
    #include <boost/thread.hpp>
    #include <boost/thread/recursive_mutex.hpp>
    #include <boost/enable_shared_from_this.hpp>

    class ActiveObject : public boost::enable_shared_from_this<ActiveObject>
    {
    private:
        boost::recursive_mutex mActionLock;

        void action()
        {
            boost::recursive_mutex::scoped_lock asyncLock(mActionLock);
        }

    public:
        void startAction()
        {
            boost::recursive_mutex::scoped_lock syncLock(mActionLock);
            boost::thread th(boost::bind(&ActiveObject::action, shared_from_this()));
        }
    };

    int main()
    {
        {
            std::list< boost::shared_ptr<ActiveObject> > jobList;
            for (int nrObjects = 0; nrObjects < 1000; ++nrObjects)
            {
                boost::shared_ptr<ActiveObject> pActiveObject( new ActiveObject() );
                jobList.push_back(pActiveObject);
                for (int nrActions = 0; nrActions < 3; ++nrActions)
                {
                    pActiveObject->startAction();
                }
            }
            // At this point, a lot of handles are in use.
        }
        // Here, the handles are released again.
        exit(0);
    }

I tried the latest Boost version (1.43), but that makes no difference. The strange thing is that not every object keeps a handle, i.e. the number of handles collected is less than the number of objects created. When I increase nrActions, the number of handles collected comes closer to the number of objects.
I could understand a recursive_mutex costing a handle, except that in Boost 1.34 it didn't, so I wonder whether this was an accident or a deliberate change. And most importantly: is there a way to get the same behavior as in 1.34, where a mutex did not cost me a handle each?
Typically, implementations lazily create the Event object when the mutex cannot be acquired. I guess that Boost.Thread's implementation is doing this in the same fashion as Windows' CRITICAL_SECTION. I don't have Boost 1.34 around, but I would be surprised if there never was an Event object related to the mutex. Has anything else changed in your application that increases the chance of thread contention? / Christian

"Peters, Richard" <richard.peters@oce.com> writes:
I have a problem with boost::recursive_mutex "sometimes" keeping a handle to a Windows Event. My application uses a lot of objects, each containing a recursive_mutex. The objects are accessed from different threads using a scoped_lock (see code below). In Boost 1.34 there was no problem, but after we switched to 1.39 it seems that the process collects more and more handles to Windows Events. Only when the objects that contain the recursive_mutex are destroyed are the handles released again. The problem is that I must support 100,000 objects containing multiple mutexes, so if each mutex can cost a handle, the total number of handles in use may become huge.
The old boost::mutex implementation used CRITICAL_SECTION on Windows; the new version uses atomics and a lazy-allocated Event, which is only allocated in the case of actual contention on the mutex. CRITICAL_SECTION objects have similar behaviour.
I tried the latest Boost version (1.43), but that makes no difference. The strange thing is that not every object keeps a handle, i.e. the number of handles collected is less than the number of objects created. When I increase nrActions, the number of handles collected comes closer to the number of objects.
That's the lazy allocation at work --- you only need the Event if there's contention for the mutex.
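Roughly, the idea looks like this. This is a deliberately simplified sketch, not Boost's actual code; the class name is made up, and the short timeout in the wait loop is a shortcut around the wake-up races that a real implementation handles precisely:

    #include <windows.h>

    // Sketch of a lazy-event mutex: the uncontended path touches only an
    // interlocked flag; the kernel Event handle is created the first time a
    // thread actually has to wait, and is closed only when the mutex dies.
    class lazy_event_mutex
    {
        LONG volatile   locked_;    // 0 = free, 1 = held
        HANDLE volatile event_;     // NULL until first contention

        HANDLE get_event()
        {
            HANDLE h = event_;
            if (h == NULL)
            {
                HANDLE fresh = CreateEvent(NULL, FALSE, FALSE, NULL); // auto-reset
                h = InterlockedCompareExchangePointer(
                        (PVOID volatile*)&event_, fresh, NULL);
                if (h == NULL)
                    h = fresh;          // we published our Event
                else
                    CloseHandle(fresh); // another thread beat us to it
            }
            return h;
        }

    public:
        lazy_event_mutex() : locked_(0), event_(NULL) {}

        ~lazy_event_mutex()
        {
            if (event_ != NULL)
                CloseHandle(event_);    // handle lives as long as the mutex
        }

        void lock()
        {
            // Fast path: an uncontended acquisition needs no kernel object.
            while (InterlockedExchange(&locked_, 1) != 0)
            {
                // Contention: now (and only now) we need an Event to block on.
                WaitForSingleObject(get_event(), 1);  // bounded wait; see note above
            }
        }

        void unlock()
        {
            InterlockedExchange(&locked_, 0);
            if (event_ != NULL)
                SetEvent(event_);       // wake a waiter, if any
        }
    };

This is why the handle count tracks contention rather than the number of mutexes: objects whose mutex is never contended never pay for an Event.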
I could understand a recursive_mutex costing a handle, except that in Boost 1.34 it didn't, so I wonder whether this was an accident or a deliberate change. And most importantly: is there a way to get the same behavior as in 1.34, where a mutex did not cost me a handle each?
No, there is not.

Why do you need so many mutexes?

Anthony
--
Author of C++ Concurrency in Action http://www.stdthread.co.uk/book/
just::thread C++0x thread library http://www.stdthread.co.uk
Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk
15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK. Company No. 5478976

At Tue, 01 Jun 2010 07:42:27 +0100, Anthony Williams wrote:
Richard Peters wrote:
I could understand a recursive_mutex costing a handle, except that in Boost 1.34 it didn't, so I wonder whether this was an accident or a deliberate change. And most importantly: is there a way to get the same behavior as in 1.34, where a mutex did not cost me a handle each?
No, there is not.
Why do you need so many mutexes?
Yeah, as a threading non-expert but somewhat educated bystander, what Richard described smells like a design with many locks at object granularity, which is almost always a mistake. Have you considered revisiting the architecture?

--
Dave Abrahams
Meet me at BoostCon: http://www.boostcon.com
BoostPro Computing
http://www.boostpro.com

Why do you need so many mutexes?
Yeah, as a threading non-expert but somewhat educated bystander, what Richard described smells like a design with many locks at object granularity, which is almost always a mistake. Have you considered revisiting the architecture?
What is the alternative to placing the mutex at object level? Having some kind of mutex pool? / Christian

At Tue, 1 Jun 2010 12:30:20 -0500, Christian Holmquist wrote:
Why do you need so many mutexes?
Yeah, as a threading non-expert but somewhat educated bystander, what Richard described smells like a design with many locks at object granularity, which is almost always a mistake. Have you considered revisiting the architecture?
What is the alternative to placing the mutex at object level? Having some kind of mutex pool?
“at object level” != “object granularity”

By the latter I mean a system where every object has its own mutex, without regard to the invariants *between* objects in the system that must be protected.

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
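To make the distinction concrete, here is a small hypothetical sketch (Job and move_sheet are made-up names, not from the poster's code): an operation that must preserve an invariant spanning two objects cannot be protected by either object's own mutex alone; both must be locked together, deadlock-free, e.g. with boost::lock.

    #include <list>
    #include <boost/thread/mutex.hpp>
    #include <boost/thread/locks.hpp>

    // Each Job owns a mutex, but moving a sheet from one job to another
    // must preserve an invariant that spans BOTH jobs (the sheet exists
    // in exactly one of them), so both locks are needed at once.
    struct Job
    {
        boost::mutex   m;
        std::list<int> sheets;   // stand-in for real sheet objects
    };

    void move_sheet(Job& from, Job& to)
    {
        boost::lock(from.m, to.m);   // acquire both without risking deadlock
        boost::mutex::scoped_lock l1(from.m, boost::adopt_lock);
        boost::mutex::scoped_lock l2(to.m,   boost::adopt_lock);

        if (!from.sheets.empty())
        {
            to.sheets.push_back(from.sheets.front());
            from.sheets.pop_front();
        }
    }   // both locks released; the cross-job invariant was never visibly broken

Per-object mutexes only help when every operation that touches a cross-object invariant is written this carefully; otherwise the fine granularity buys nothing but resource usage.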

Why do you need so many mutexes?
Yeah, as a threading non-expert but somewhat educated bystander, what Richard described smells like a design with many locks at object granularity, which is almost always a mistake. Have you considered revisiting the architecture?
What is the alternative to placing the mutex at object level? Having some kind of mutex pool?
“at object level” != “object granularity”
By the latter I mean a system where every object has its own mutex, without regard to the invariants *between* objects in the system that must be protected.
I re-read the original post and see what you mean: "The problem is that I must support 100,000 objects containing *multiple* mutexes". Indeed, that will lead to more problems than the waste of resources. / Christian

Anthony Williams wrote:
Richard Peters wrote:
I could understand a recursive_mutex costing a handle, except that in Boost 1.34 it didn't, so I wonder whether this was an accident or a deliberate change. And most importantly: is there a way to get the same behavior as in 1.34, where a mutex did not cost me a handle each?
No, there is not.
Why do you need so many mutexes?
Yeah, as a threading non-expert but somewhat educated bystander, what Richard described smells like a design with many locks at object granularity, which is almost always a mistake. Have you considered revisiting the architecture?
I'll explain our architecture at a high level, and see if we can come up with a better way of locking.

Those 100,000 'objects' are in fact print jobs in a print controller. Each job contains the structure describing what the job looks like (sheets and pages and such, along with their attributes, like stapling or how the bitmaps should be rotated), stuff for maintaining progress and accounting, a pipeline of objects doing all sorts of manipulations like making it a duplex job or creating booklets of it, etc. A job is constructed by either ripping a document (for instance PostScript) or by scanning a document. This ripping or scanning is more or less repeated after a user edits the job in the user interface. All the while, the printer wants to get information from the job, in order to be able to print the job while it is still ripping.

Now in this job we have a few mutexes, but a single mutex for all jobs together doesn't cut it either: a job which is scanning might be busy building a booklet (which is a time-consuming operation), and all the while printing of another job must be able to continue.

Anyway, we ran tests with Boost 1.34, and indeed, that version also consumes just as many handles. We also found that we overestimated our maximum job capacity: at about 10,000 jobs our memory is full, and at that point the number of handles is not yet critically high. Any ideas on how to reduce our number of mutexes would be welcome, though.

Best regards,
Richard

On Thursday, June 03, 2010 1:50 PM, "Peters, Richard" <richard.peters@oce.com> wrote:
I'll explain our architecture at a high level, and see if we can come up with a better way of locking.
Those 100,000 'objects' are in fact print jobs in a print controller. Each job contains the structure describing what the job looks like (sheets and pages and such, along with their attributes, like stapling or how the bitmaps should be rotated), stuff for maintaining progress and accounting, a pipeline of objects doing all sorts of manipulations like making it a duplex job or creating booklets of it, etc. A job is constructed by either ripping a document (for instance PostScript) or by scanning a document. This ripping or scanning is more or less repeated after a user edits the job in the user interface. All the while, the printer wants to get information from the job, in order to be able to print the job while it is still ripping.
Can you confirm that you really want to have up to 100,000 printer jobs concurrently?

Best,
Vicente

"Peters, Richard" <richard.peters@oce.com> writes:
Those 100,000 'objects' are in fact print jobs in a print controller. Each job contains the structure describing what the job looks like (sheets and pages and such, along with their attributes, like stapling or how the bitmaps should be rotated), stuff for maintaining progress and accounting, a pipeline of objects doing all sorts of manipulations like making it a duplex job or creating booklets of it, etc. A job is constructed by either ripping a document (for instance PostScript) or by scanning a document. This ripping or scanning is more or less repeated after a user edits the job in the user interface. All the while, the printer wants to get information from the job, in order to be able to print the job while it is still ripping.
So both the printer and the code creating the job need access to the same job object. OK.
Now in this job we have a few mutexes, but a single mutex for all jobs together doesn't cut it either: a job which is scanning might be busy building a booklet (which is a time-consuming operation), and all the while printing of another job must be able to continue.
You shouldn't hold a mutex lock across a time-consuming operation, so the time-consuming nature of building a booklet shouldn't matter.
Anyway, we ran tests with Boost 1.34, and indeed, that version also consumes just as many handles. We also found that we overestimated our maximum job capacity: at about 10,000 jobs our memory is full, and at that point the number of handles is not yet critically high. Any ideas on how to reduce our number of mutexes would be welcome, though.
Firstly, the code preparing the data should do as much as possible without holding any locks. When it has prepared a chunk of data that the printer should see (e.g. a complete page), it can then acquire the relevant lock, publish the data and release the lock. This may happen multiple times for a single job.

Likewise on the printer side --- it should do as much as possible without holding the lock. When it needs more data it should acquire the lock, get the data and then release the lock before processing the data. If you do this then the time spent holding each lock should be short. You may then be able to get away with a single lock on the printer.

If you still need more mutexes (possibly because there are lots of jobs being produced at the same time by multiple threads) then you can create a table of mutexes. You choose a mutex by hashing a unique job identifier (e.g. its memory address) to get an index into the table. This limits the number of mutexes to the size of the mutex table, but potentially means that on occasion multiple jobs executing concurrently will use the same mutex. (A rough sketch of such a table follows after this message.)

A similar scheme is to use one mutex per producer thread. The mutex from the thread that is producing a job is somehow associated with the job, so the printer can lock that mutex when it needs to access the job data. Once the job creation has finished, the mutex association can be broken, and the printer need never lock the mutex again. This way producer threads won't ever contend for the same mutex, so you get all the benefits of the per-job mutex without the need for such a plethora of mutexes.

Anthony
--
Author of C++ Concurrency in Action http://www.stdthread.co.uk/book/
just::thread C++0x thread library http://www.stdthread.co.uk
Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk
15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK. Company No. 5478976
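A minimal sketch of the hashed mutex table Anthony suggests; the class name, the table size and the hash are illustrative choices, not anything taken from Boost:

    #include <cstddef>
    #include <boost/thread/mutex.hpp>

    // Striped-lock table: jobs share a fixed pool of mutexes, selected by
    // hashing the job's address. Two jobs may occasionally map to the same
    // mutex, which is safe but can add a little contention.
    class mutex_table
    {
    public:
        static const std::size_t table_size = 64;   // tunable

        boost::mutex& mutex_for(const void* job)
        {
            // Shift away the low bits, which are usually zero because of
            // allocation alignment, then index into the table.
            std::size_t h = reinterpret_cast<std::size_t>(job) >> 4;
            return mutexes_[h % table_size];
        }

    private:
        boost::mutex mutexes_[table_size];
    };

    // Usage sketch: lock the job's slot instead of a per-job member mutex.
    // void publish_page(mutex_table& table, PrintJob& job, const Page& page)
    // {
    //     boost::mutex::scoped_lock lock(table.mutex_for(&job));
    //     job.pages.push_back(page);   // short critical section: publish only
    // }

The table bounds the number of mutexes (and hence lazily created Event handles) at table_size, no matter how many jobs exist; the price is that unrelated jobs occasionally share a mutex.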
participants (5)
- Anthony Williams
- Christian Holmquist
- David Abrahams
- Peters, Richard
- vicente.botet