[thread] thread_specific_ptr leaks on VS2012 when using std::async()

Hi, The title says it all. I think I found a memory leak with the help of boost.log author Andrei Semashev. I was using Boost.Log in Visual Studio 2012 (released one) and tried to use Visual Leak Detector because Visual Studio basic leak detector was reporting leaks. I isolated the leaks to usage of log and switched to VLD to check if the basic detector wasn't giving me false positive. I contacted Andrei to see if he could check it but he don't have VS2012 available so I setup some tests to isolate the problem. In the end of our exchange we concluded it was a false positive (it only leak on end of program). However it is still an annoying noise in the leak report logs, so people not understanding the source of this problem might be interested in this email. I attach the whole email exchange to this email, tell me if leak logs are necessary too. The last version of my test is at the end of the email (first program). Basically, the leak appear only when using std::async() in combination of boost::thread_specific_ptr : - using exclusively boost threading libraries (both with dynamic and static linking) result in no leak - using exclusively std threading libraries but not std::async() result in no leak - using std::async() result in a leak The second test program shows that using Intel Threading Building Blocks, I get leaks only if : - I use std::async() in combination with boost::thread_specific_ptr, as before (I mean I use tbb without a leak, then I add a std::async() call to the same task function and there is a leak) - I don't use std::async() but I don't manage the lifetime of tbb task scheduler We think the threads spawned by both tbb (that is supposed to be a superset of ppl) and std::async() (that is supposed to use ppl/concrt task scheduler) are not ended yet when the program reach the end of main (which seems logical as the task scheduler is then managed as a static global), making boost::thread_specific_ptr not release some data on thread destruction, as it is notified too late and any leak detection tool willl generate noise due to this. A way to fix this is to explicitely end the task scheduler, being tbb or ppl/concrt. I will use tbb task scheduler explicitely initialized/terminated in my project so it will solve the problem on my side. Joel Lamotte ------------------------------------ // this program was used to see the memory leak #define TEST_ASYNC // comment this to make it linear and not leak //#define WITH_BOOST_THREADS #define WITH_STD_ASYNC // uncomment this to leak //#define WITH_BOOST_LOG // boost log also leak if std::async is used #include <boost/thread.hpp> #ifdef WITH_BOOST_THREADS # include <boost/thread/future.hpp> # include <boost/chrono.hpp> # define THIS_THREAD boost::this_thread # define THREAD_LIB boost::thread # define CHRONO_LIB boost::chrono # define MY_FUTURE boost::future #else # include <future> # include <thread> # include <chrono> # define THIS_THREAD std::this_thread # define THREAD_LIB std::thread # define CHRONO_LIB std::chrono # define MY_FUTURE std::future #endif #ifdef WITH_BOOST_LOG # include <boost/log/trivial.hpp> # define MY_LOG BOOST_LOG_TRIVIAL(info) #else # include <iostream> # define MY_LOG std::cout << '\n' #endif #define BASIC_VS_LEAK_DETECTOR 0 #define VISUAL_LEAK_DETECTOR 1 #define LEAK_DETECTOR VISUAL_LEAK_DETECTOR #if LEAK_DETECTOR == BASIC_VS_LEAK_DETECTOR # define _CRTDBG_MAP_ALLOC # include <stdlib.h> # include <crtdbg.h> #else # include "vld.h" #endif struct my_data { char data[1024]; my_data() { MY_LOG << "CONSTRUCTOR my_data - " << THIS_THREAD::get_id(); } ~my_data() { MY_LOG << "DESTRUCTOR my_data - " << THIS_THREAD::get_id(); } }; boost::thread_specific_ptr< my_data > ts_ptr; void my_thread_foo() { ts_ptr.reset(new my_data()); } int work() { for( int i = 0; i < 42; ++i ) { MY_LOG << "THIS IS A MESSAGE - I = " << i; MY_LOG << "ts_ptr = " << ts_ptr.get(); my_thread_foo(); THIS_THREAD::sleep_for( CHRONO_LIB::milliseconds(10) ); } return 0; } int main(int argc, char *argv[]) { #if LEAK_DETECTOR == BASIC_VS_LEAK_DETECTOR // First simple memory leak detection for Visual Studio // use _crtBreakAlloc to put a breakpoint on the provided memory leak id allocation //_crtBreakAlloc = 1149; _CrtSetDbgFlag ( _CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF ); #endif //int* k = new int[42]; // to check that memory leak detection does work MY_LOG << "Hello, World!"; #ifdef TEST_ASYNC # ifndef WITH_BOOST_THREADS std::vector< std::future<int> > work_in_progress(2); # ifdef WITH_STD_ASYNC for( auto& ftr : work_in_progress ) { ftr = std::async( work ); } # else for( auto& ftr : work_in_progress ) { std::packaged_task< int() > task(work); ftr = task.get_future(); std::thread thread( std::move(task) ); thread.detach(); } # endif for( auto& ftr : work_in_progress ) { ftr.get(); } # else std::vector< boost::unique_future<int> > work_in_progress(42); for( auto& ftr : work_in_progress ) { boost::packaged_task<int> task(work); // boost's packaged_task don't take the full signature as parameter, it takes the return type only ftr = task.get_future(); boost::thread thread( boost::move(task) ); thread.detach(); } for( auto& ftr : work_in_progress ) { ftr.get(); } # endif #else for( int i = 0; i < 42; ++i ) { work(); } #endif MY_LOG << "END"; return EXIT_SUCCESS; } -------------------------------------- // this program tests if the memory leaks appear with Threading Building Blocks #define BASIC_VS_LEAK_DETECTOR 0 #define VISUAL_LEAK_DETECTOR 1 #define LEAK_DETECTOR VISUAL_LEAK_DETECTOR #define WITH_ROOT_TASK //#define WITH_BOOST_LOG //#define WITH_STD_ASYNC // uncomment to leak #ifdef WITH_STD_ASYNC # include <future> #endif #include <thread> #include <atomic> #include <boost/thread.hpp> #include <tbb/tbb.h> #include <tbb/task_scheduler_init.h> #ifdef WITH_BOOST_LOG # include <boost/log/trivial.hpp> # define MY_LOG BOOST_LOG_TRIVIAL(info) #else # include <iostream> # define MY_LOG std::cout << '\n' #endif #if LEAK_DETECTOR == BASIC_VS_LEAK_DETECTOR # define _CRTDBG_MAP_ALLOC # include <stdlib.h> # include <crtdbg.h> #else # include "vld.h" #endif namespace test { struct my_data { char data[1024]; }; boost::thread_specific_ptr< my_data > ts_ptr; void my_thread_foo() { ts_ptr.reset(new my_data()); } class UpdateTask : public tbb::task { public: UpdateTask() : idx( s_idx++ ) { MY_LOG << "[UpdateTask " << idx << " ] " << std::endl; } ~UpdateTask() { MY_LOG << "[/UpdateTask " << idx << " ]" << std::endl; } tbb::task* execute() { MY_LOG << "[" << std::this_thread::get_id() << "] Tick "<< m_count <<std::endl; my_thread_foo(); ++m_count; if( m_count < 100 ) { // make sure this is re-executed this->increment_ref_count(); this->recycle_as_safe_continuation(); } return nullptr; } private: static std::atomic<int> s_idx; std::atomic<int> m_count; const int idx; }; std::atomic<int> UpdateTask::s_idx = 0; class TaskScheduler // using TBB { public: #ifdef WITH_ROOT_TASK TaskScheduler() : m_root_task( *new (tbb::task::allocate_root()) tbb::empty_task ) { m_root_task.increment_ref_count(); } ~TaskScheduler() { m_root_task.destroy(m_root_task); } #endif void register_update_task() { #ifdef WITH_ROOT_TASK m_root_task.increment_ref_count(); UpdateTask& updater = *new(m_root_task.allocate_child()) UpdateTask(); #else UpdateTask& updater = *new(tbb::task::allocate_root()) UpdateTask(); #endif m_task_list.push_back( updater ); } void run_and_wait() { #ifdef WITH_ROOT_TASK m_root_task.spawn( m_task_list ); m_root_task.wait_for_all(); #else tbb::task::spawn_root_and_wait( m_task_list ); #endif } private: tbb::task_list m_task_list; // uncomment the following line to not leak tbb //tbb::task_scheduler_init m_engine_instance; // We explicitly create and destroy the TBB task scheduler to avoid memory leak warning noises. #ifdef WITH_ROOT_TASK tbb::task& m_root_task; #endif }; void tbb_test() { TaskScheduler task_scheduler; for( int i = 0; i < 50; ++i ) { task_scheduler.register_update_task(); } task_scheduler.run_and_wait(); } } int main(int argc, char *argv[]) { #if LEAK_DETECTOR == BASIC_VS_LEAK_DETECTOR // First simple memory leak detection for Visual Studio // use _crtBreakAlloc to put a breakpoint on the provided memory leak id allocation //_crtBreakAlloc = 148; _CrtSetDbgFlag ( _CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF ); #endif #ifdef WITH_STD_ASYNC std::atomic<bool> finished = false; std::async( [&]{ while( !finished ) { MY_LOG << "POP"; test::my_thread_foo(); std::this_thread::sleep_for( std::chrono::milliseconds(500) ); } } ); #endif test::tbb_test(); #ifdef WITH_STD_ASYNC finished = true; #endif MY_LOG << "\nEND"; std::this_thread::sleep_for( std::chrono::seconds(4) ); return 0; }

Le 05/09/12 10:33, Klaim - Joël Lamotte a écrit :
I attach the whole email exchange to this email, tell me if leak logs are necessary too.
The last version of my test is at the end of the email (first program). Basically, the leak appear only when using std::async() in combination of boost::thread_specific_ptr : - using exclusively boost threading libraries (both with dynamic and static linking) result in no leak - using exclusively std threading libraries but not std::async() result in no leak - using std::async() result in a leak
The second test program shows that using Intel Threading Building Blocks, I get leaks only if : - I use std::async() in combination with boost::thread_specific_ptr, as before (I mean I use tbb without a leak, then I add a std::async() call to the same task function and there is a leak) - I don't use std::async() but I don't manage the lifetime of tbb task scheduler
We think the threads spawned by both tbb (that is supposed to be a superset of ppl) and std::async() (that is supposed to use ppl/concrt task scheduler) are not ended yet when the program reach the end of main (which seems logical as the task scheduler is then managed as a static global),
Shouldn't the future destructor synchronize with the hidden thread used by std::async()? Best, Vicente

Before answering, I need to point this: I sent this problem MS and apparently there is a bug that will be fixed in future STL versions of Visual Studio. See: http://connect.microsoft.com/VisualStudio/feedback/details/761209/boost-thre... "Hi, Thanks for reporting this bug. We've fixed it, and the fix will be available in the next release of our C++ Standard Library implementation. The problem was that constructing a std::thread initialized an "at_thread_exit_mutex", which was never destroyed. We've added an atexit() call to clean up this mutex, and we've additionally changed std::mutex and std::condition_variable's internal allocations to be marked as "CRT blocks", which prevents them from being reported as leaks by the CRT's leak-tracking machinery. Note: Connect doesn't notify me about comments. If you have any further questions, please E-mail me. Stephan T. Lavavej Senior Developer - Visual C++ Libraries stl@microsoft.com" On Mon, Mar 4, 2013 at 11:46 PM, Vicente J. Botet Escriba < vicente.botet@wanadoo.fr> wrote:
Shouldn't the future destructor synchronize with the hidden thread used by std::async()?
This document ( http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3451.pdf ) says: "Practical field experience with futures has demonstrated one major problem, which was discussed in May 2012 at the SG1 meeting in Bellevue. ~future/~shared_future Must Never Block “Blocking is evil.” – Various The good news is that the Standard’s specification of future and shared_future destructors specifies that they never block. This is vital. The bad news is that the Standard specifies that the associated state of an operation launched by std::async (only!) does nevertheless cause future destructors to block. This is very bad for several reasons, three of which are summarized below in rough order from least to most important. Example 1: Consistency First, consider these two pieces of code: // Example 1 // (a) // (b) { { async( []{ f(); } ); auto f1 = async( []{ f(); } ); async( []{ g(); } ); auto f2 = async( []{ g(); } ); } } Users are often surprised to discover that (a) and (b) do not have the same behavior, because normally we ignore (and/or don’t look at) return values if we end up deciding we’re not interested in the value and doing so does not change the meaning of our program. The two cannot have the same behavior if ~future joins." So I think you are right, the future's destructor should have been blocking when using async(). However, Herb Sutter commented on this document and the Visual Studio implementation on his blog: Question: http://herbsutter.com/2013/01/15/videos-panel-and-c-concurrency/#comment-810... "In September 2012 you posted a paper discussing the problems in the present standard, with a blocking ~future(). In my experiments I have found that it does no block! (VS 2012 – update 1) Has the standard already been revised? Nice, but I would’ve thought that the standard was already carved into stone? Shed some light, anyone? " Answer: http://herbsutter.com/2013/01/15/videos-panel-and-c-concurrency/#comment-813... "@Adam H: Right, VS 2012 is knowingly (temporarily) non-conforming on the ~future blocking issue in part because we knew this question is being actively reviewed in the committee and we feel the status quo is harmful to our customers. Of course, if the committee’s decision is to stay with the C++11 rules we’ll change our implementation to conform and have to do more to educate customers about the issue. If the view that this should be fixed prevails in the committee, we’ll look like geniuses for getting it “right” sooner, otherwise we’ll conform and look like doofuses for shipping with another bug we had to go back and fix. (We don’t do this often! Usually the right thing to do is just conform to the standard, perceived bugs and all. But this bug is bad and under discussion at the last two SG1 meetings…)" Which basically mean it's VS2012 not following the standard on this specific rule. I had to find again these data I read before to answer your question, but I might have missed data or understanding. Does it help? Joel Lamotte

[Klaim - Joël Lamotte]
Before answering, I need to point this: I sent this problem MS and apparently there is a bug that will be fixed in future STL versions of Visual Studio. See: http://connect.microsoft.com/VisualStudio/feedback/details/761209/boost-thre...
Yep. There's a separate bug that std::mutex shouldn't dynamically allocate memory at all, but we're not going to have time to fix that in the next release. STL

Le 05/03/13 00:38, Klaim - Joël Lamotte a écrit :
Before answering, I need to point this: I sent this problem MS and apparently there is a bug that will be fixed in future STL versions of Visual Studio. See: http://connect.microsoft.com/VisualStudio/feedback/details/761209/boost-thre... "Hi,
Thanks for reporting this bug. We've fixed it, and the fix will be available in the next release of our C++ Standard Library implementation.
The problem was that constructing a std::thread initialized an "at_thread_exit_mutex", which was never destroyed. We've added an atexit() call to clean up this mutex, and we've additionally changed std::mutex and std::condition_variable's internal allocations to be marked as "CRT blocks", which prevents them from being reported as leaks by the CRT's leak-tracking machinery.
Note: Connect doesn't notify me about comments. If you have any further questions, please E-mail me.
Stephan T. Lavavej Senior Developer - Visual C++ Libraries stl@microsoft.com"
On Mon, Mar 4, 2013 at 11:46 PM, Vicente J. Botet Escriba < vicente.botet@wanadoo.fr> wrote:
Shouldn't the future destructor synchronize with the hidden thread used by std::async()?
This document ( http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3451.pdf ) says:
"Practical field experience with futures has demonstrated one major problem, which was discussed in May 2012 at the SG1 meeting in Bellevue.
~future/~shared_future Must Never Block “Blocking is evil.” – Various
The good news is that the Standard’s specification of future and shared_future destructors specifies that they never block. This is vital. The bad news is that the Standard specifies that the associated state of an operation launched by std::async (only!) does nevertheless cause future destructors to block. This is very bad for several reasons, three of which are summarized below in rough order from least to most important.
Example 1: Consistency
First, consider these two pieces of code: // Example 1 // (a) // (b) { { async( []{ f(); } ); auto f1 = async( []{ f(); } ); async( []{ g(); } ); auto f2 = async( []{ g(); } ); } }
Users are often surprised to discover that (a) and (b) do not have the same behavior, because normally we ignore (and/or don’t look at) return values if we end up deciding we’re not interested in the value and doing so does not change the meaning of our program. The two cannot have the same behavior if ~future joins."
So I think you are right, the future's destructor should have been blocking when using async(). However, Herb Sutter commented on this document and the Visual Studio implementation on his blog:
Question: http://herbsutter.com/2013/01/15/videos-panel-and-c-concurrency/#comment-810...
"In September 2012 you posted a paper discussing the problems in the present standard, with a blocking ~future(). In my experiments I have found that it does no block! (VS 2012 – update 1) Has the standard already been revised? Nice, but I would’ve thought that the standard was already carved into stone? Shed some light, anyone? "
Answer: http://herbsutter.com/2013/01/15/videos-panel-and-c-concurrency/#comment-813...
"@Adam H: Right, VS 2012 is knowingly (temporarily) non-conforming on the ~future blocking issue in part because we knew this question is being actively reviewed in the committee and we feel the status quo is harmful to our customers. Of course, if the committee’s decision is to stay with the C++11 rules we’ll change our implementation to conform and have to do more to educate customers about the issue.
If the view that this should be fixed prevails in the committee, we’ll look like geniuses for getting it “right” sooner, otherwise we’ll conform and look like doofuses for shipping with another bug we had to go back and fix.
(We don’t do this often! Usually the right thing to do is just conform to the standard, perceived bugs and all. But this bug is bad and under discussion at the last two SG1 meetings…)"
Which basically mean it's VS2012 not following the standard on this specific rule.
I had to find again these data I read before to answer your question, but I might have missed data or understanding. Does it help?
Yes, I'm aware of Herb desiderata. Anyway C++11 has been released with these kind of futures blocking on the destructor. IMHO, MSVC should fix this issue. Best, Vicente

On Tue, Mar 5, 2013 at 1:04 PM, Vicente J. Botet Escriba < vicente.botet@wanadoo.fr> wrote:
Yes, I'm aware of Herb desiderata. Anyway C++11 has been released with these kind of futures blocking on the destructor. IMHO, MSVC should fix this issue.
Indeed at least for now it would make code using async easier to port (currently, after several tries of different multi-threading patterns, I ended up just never relying on async at all). However I don't know what is the current state of discussion of the issue is and if it's going to make future's destructor non-blocking whatever the use, then it seems that MS will not be willing to fix the issue at all even if it takes again tons of years to get the next standard. Joel Lamotte
participants (3)
-
Klaim - Joël Lamotte
-
Stephan T. Lavavej
-
Vicente J. Botet Escriba