[test] Malloc exceptions when using test framework on 64bit OS X 10.6.1

Hi, I've been making my first forays in to using Boost on the new OS X release and have encountered some strange behaviour on 64bit machines. Testing with Boost 1.37.0 and 1.40.0 I've found that everything compiles as expected, but attempting to run any unit tests results in an error of the form: *** No errors detected test_exceptions(92517) malloc: *** error for object 0x3000100200460: pointer being freed was not allocated *** set a breakpoint in malloc_error_break to debug Running the test through gdb gives the following stack trace: Program received signal SIGABRT, Aborted. 0x00007fff83f98ff6 in __kill () (gdb) bt #0 0x00007fff83f98ff6 in __kill () #1 0x00007fff8403a072 in abort () #2 0x00007fff83f51095 in free () #3 0x000000010003668a in boost::unit_test::framework_impl::clear (this=0x1000698a0) at framework.ipp:133 #4 0x0000000100036788 in boost::unit_test::framework_impl::~framework_impl (this=0x1000698a0) at framework.ipp:122 #5 0x00000001000386f2 in __tcf_1 () at framework.ipp:222 #6 0x00007fff83f5d274 in __cxa_finalize () #7 0x00007fff83f5d18c in exit () #8 0x0000000100006e93 in start () at unit_test_parameters.ipp:76 This is using the vanilla GCC build that comes with the operating system (see below), default compile flags and strictly single threaded test applications. Using built-in specs. Target: i686-apple-darwin10 Configured with: /var/tmp/gcc/gcc-5646~6/src/configure --disable-checking --enable-werror --prefix=/usr --mandir=/share/man --enable-languages=c,objc,c++,obj-c++ --program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ --with-slibdir=/usr/lib --build=i686-apple-darwin10 --with-gxx-include-dir=/include/c++/4.2.1 --program-prefix=i686-apple-darwin10- --host=x86_64-apple-darwin10 --target=i686-apple-darwin10 Thread model: posix gcc version 4.2.1 (Apple Inc. build 5646) Some Googling suggests it might be problem with a missing virtual destructor but irritatingly, none of the threads I've found reach any kind of conclusion. I was hoping the collective genius of the list would be able to inform me whether: 1. This is a known bug; I'm happy to put my debugging hat on and dive in, but not if it's already being investigated by people more qualified to do so. 2. There are any work arounds for the problem. 3. There's no problem at all; I'm being a colossal idiot and have overlooked something obvious. Thanks, vrai.

On Sep 17, 2009, at 1:02 AM, Vrai Stacey wrote:
Hi,
I've been making my first forays in to using Boost on the new OS X release and have encountered some strange behaviour on 64bit machines. Testing with Boost 1.37.0 and 1.40.0 I've found that everything compiles as expected, but attempting to run any unit tests results in an error of the form:
*** No errors detected test_exceptions(92517) malloc: *** error for object 0x3000100200460: pointer being freed was not allocated *** set a breakpoint in malloc_error_break to debug
<snip> I am seeing the same thing using Boost 1.38.0 built for 64-bit Intel on OS X 10.6.1. The test program worked fine on OS X 10.5.7, but I'm seeing the same malloc errors as you. For me, the error occurs in the file framework.ipp: class framework_impl : public test_tree_visitor { public: framework_impl() : m_master_test_suite( 0 ) , m_curr_test_case( INV_TEST_UNIT_ID ) , m_next_test_case_id( MIN_TEST_CASE_ID ) , m_next_test_suite_id( MIN_TEST_SUITE_ID ) , m_is_initialized( false ) , m_test_in_progress( false ) {} ~framework_impl() { clear(); } void clear() { while( !m_test_units.empty() ) { test_unit_store::value_type const& tu = *m_test_units.begin(); // the delete will erase this element from map if( test_id_2_unit_type( tu.second->p_id ) == tut_suite ) delete (test_suite const*)tu.second; else delete (test_case const*)tu.second; <<<<------------------------- This statement generates the error } } GDB won't let me put a breakpoint anywhere in this file. Maybe I can hack it to see something... - Rush

On Sep 18, 2009, at 4:57 PM, Rush Manbert wrote:
On Sep 17, 2009, at 1:02 AM, Vrai Stacey wrote:
Hi,
I've been making my first forays in to using Boost on the new OS X release and have encountered some strange behaviour on 64bit machines. Testing with Boost 1.37.0 and 1.40.0 I've found that everything compiles as expected, but attempting to run any unit tests results in an error of the form:
*** No errors detected test_exceptions(92517) malloc: *** error for object 0x3000100200460: pointer being freed was not allocated *** set a breakpoint in malloc_error_break to debug
<snip>
I am seeing the same thing using Boost 1.38.0 built for 64-bit Intel on OS X 10.6.1. The test program worked fine on OS X 10.5.7, but I'm seeing the same malloc errors as you.
For me, the error occurs in the file framework.ipp:
class framework_impl : public test_tree_visitor { public: framework_impl() : m_master_test_suite( 0 ) , m_curr_test_case( INV_TEST_UNIT_ID ) , m_next_test_case_id( MIN_TEST_CASE_ID ) , m_next_test_suite_id( MIN_TEST_SUITE_ID ) , m_is_initialized( false ) , m_test_in_progress( false ) {}
~framework_impl() { clear(); }
void clear() { while( !m_test_units.empty() ) { test_unit_store::value_type const& tu = *m_test_units.begin();
// the delete will erase this element from map if( test_id_2_unit_type( tu.second->p_id ) == tut_suite ) delete (test_suite const*)tu.second; else delete (test_case const*)tu.second; <<<<------------------------- This statement generates the error } }
GDB won't let me put a breakpoint anywhere in this file. Maybe I can hack it to see something...
Okay, I was able to debug it by doing this:
The main unit test file is called UnitTestMain.cpp, and just includes

Rush Manbert wrote:
On Sep 18, 2009, at 4:57 PM, Rush Manbert wrote:
On Sep 17, 2009, at 1:02 AM, Vrai Stacey wrote:
Hi,
I've been making my first forays in to using Boost on the new OS X release and have encountered some strange behaviour on 64bit machines. Testing with Boost 1.37.0 and 1.40.0 I've found that everything compiles as expected, but attempting to run any unit tests results in an error of the form:
*** No errors detected test_exceptions(92517) malloc: *** error for object 0x3000100200460: pointer being freed was not allocated *** set a breakpoint in malloc_error_break to debug
<snip>
I am seeing the same thing using Boost 1.38.0 built for 64-bit Intel on OS X 10.6.1. The test program worked fine on OS X 10.5.7, but I'm seeing the same malloc errors as you.
For me, the error occurs in the file framework.ipp:
class framework_impl : public test_tree_visitor { public: framework_impl() : m_master_test_suite( 0 ) , m_curr_test_case( INV_TEST_UNIT_ID ) , m_next_test_case_id( MIN_TEST_CASE_ID ) , m_next_test_suite_id( MIN_TEST_SUITE_ID ) , m_is_initialized( false ) , m_test_in_progress( false ) {}
~framework_impl() { clear(); }
void clear() { while( !m_test_units.empty() ) { test_unit_store::value_type const& tu = *m_test_units.begin();
// the delete will erase this element from map if( test_id_2_unit_type( tu.second->p_id ) == tut_suite ) delete (test_suite const*)tu.second; else delete (test_case const*)tu.second; <<<<------------------------- This statement generates the error } }
GDB won't let me put a breakpoint anywhere in this file. Maybe I can hack it to see something...
Okay, I was able to debug it by doing this:
The main unit test file is called UnitTestMain.cpp, and just includes
, and that file includes all of the *.ipp files. I preprocessed UnitTestMain.cpp in Xcode and deleted all of the preprocessor lines with sed: sed '/^#/ d'
UnitTestMain.cpp and I substituted the new UnitTestMain.cpp (the preprocessed file) for the original one.
After that I could set a breakpoint in framework_impl::register_test_unit(). What I see is that the test case being registered is the "this" pointer from an object that was created with new, so it looks like the error is a bug in Snow Leopard.
That's about all I know right now. I have never used boost unit test, so don't really know how to put together a reduced test case that can be submitted to Apple. Any takers? :-)
- Rush
This seems to be once again this bug in Boost.Test: http://www.nabble.com/-Boost.Test--valgrind-complains-about-invalid-reads-td... http://www.nabble.com/-Boost.Test--Error:-Non-aligned-pointer-being-freed-td... Try making the test_unit destructor virtual and your problem should be resolved. Regards, Peter.

This seems to be once again this bug in Boost.Test:
http://www.nabble.com/-Boost.Test--valgrind-complains-about-invalid-reads-td... http://www.nabble.com/-Boost.Test--Error:-Non-aligned-pointer-being-freed-td...
Try making the test_unit destructor virtual and your problem should be resolved.
Changing the test_unt destructor (in boost/test/unit_test_suite_impl.hpp) fixes the problem in both 1.37 and 1.40. However the issue remains unfixed in SVN as of the current time. Anyway, in the short term I can at least start finding 64bit problems that are my fault. Thanks for the help, vrai.

On Sep 19, 2009, at 12:42 AM, Peter Klotz wrote:
Rush Manbert wrote:
On Sep 18, 2009, at 4:57 PM, Rush Manbert wrote:
On Sep 17, 2009, at 1:02 AM, Vrai Stacey wrote:
Hi,
I've been making my first forays in to using Boost on the new OS X release and have encountered some strange behaviour on 64bit machines. Testing with Boost 1.37.0 and 1.40.0 I've found that everything compiles as expected, but attempting to run any unit tests results in an error of the form:
*** No errors detected test_exceptions(92517) malloc: *** error for object 0x3000100200460: pointer being freed was not allocated *** set a breakpoint in malloc_error_break to debug
<snip>
I am seeing the same thing using Boost 1.38.0 built for 64-bit Intel on OS X 10.6.1. The test program worked fine on OS X 10.5.7, but I'm seeing the same malloc errors as you.
For me, the error occurs in the file framework.ipp:
class framework_impl : public test_tree_visitor { public: framework_impl() : m_master_test_suite( 0 ) , m_curr_test_case( INV_TEST_UNIT_ID ) , m_next_test_case_id( MIN_TEST_CASE_ID ) , m_next_test_suite_id( MIN_TEST_SUITE_ID ) , m_is_initialized( false ) , m_test_in_progress( false ) {}
~framework_impl() { clear(); }
void clear() { while( !m_test_units.empty() ) { test_unit_store::value_type const& tu = *m_test_units.begin();
// the delete will erase this element from map if( test_id_2_unit_type( tu.second->p_id ) == tut_suite ) delete (test_suite const*)tu.second; else delete (test_case const*)tu.second; <<<<------------------------- This statement generates the error } }
GDB won't let me put a breakpoint anywhere in this file. Maybe I can hack it to see something...
Okay, I was able to debug it by doing this: The main unit test file is called UnitTestMain.cpp, and just includes
, and that file includes all of the *.ipp files. I preprocessed UnitTestMain.cpp in Xcode and deleted all of the preprocessor lines with sed: sed '/^#/ d' UnitTestMain.cpp and I substituted the new UnitTestMain.cpp (the preprocessed file) for the original one. After that I could set a breakpoint in framework_impl::register_test_unit(). What I see is that the test case being registered is the "this" pointer from an object that was created with new, so it looks like the error is a bug in Snow Leopard. That's about all I know right now. I have never used boost unit test, so don't really know how to put together a reduced test case that can be submitted to Apple. Any takers? :-) - Rush This seems to be once again this bug in Boost.Test:
http://www.nabble.com/-Boost.Test--valgrind-complains-about-invalid-reads-td... http://www.nabble.com/-Boost.Test--Error:-Non-aligned-pointer-being-freed-td...
Try making the test_unit destructor virtual and your problem should be resolved.
Thank you, Peter. That does resolve the problem. Do you have any idea whether 1.40 still has this problem? - Rush

Rush Manbert wrote:
On Sep 19, 2009, at 12:42 AM, Peter Klotz wrote:
Rush Manbert wrote:
On Sep 18, 2009, at 4:57 PM, Rush Manbert wrote:
On Sep 17, 2009, at 1:02 AM, Vrai Stacey wrote:
Hi,
I've been making my first forays in to using Boost on the new OS X release and have encountered some strange behaviour on 64bit machines. Testing with Boost 1.37.0 and 1.40.0 I've found that everything compiles as expected, but attempting to run any unit tests results in an error of the form:
*** No errors detected test_exceptions(92517) malloc: *** error for object 0x3000100200460: pointer being freed was not allocated *** set a breakpoint in malloc_error_break to debug
<snip>
I am seeing the same thing using Boost 1.38.0 built for 64-bit Intel on OS X 10.6.1. The test program worked fine on OS X 10.5.7, but I'm seeing the same malloc errors as you.
For me, the error occurs in the file framework.ipp:
class framework_impl : public test_tree_visitor { public: framework_impl() : m_master_test_suite( 0 ) , m_curr_test_case( INV_TEST_UNIT_ID ) , m_next_test_case_id( MIN_TEST_CASE_ID ) , m_next_test_suite_id( MIN_TEST_SUITE_ID ) , m_is_initialized( false ) , m_test_in_progress( false ) {}
~framework_impl() { clear(); }
void clear() { while( !m_test_units.empty() ) { test_unit_store::value_type const& tu = *m_test_units.begin();
// the delete will erase this element from map if( test_id_2_unit_type( tu.second->p_id ) == tut_suite ) delete (test_suite const*)tu.second; else delete (test_case const*)tu.second; <<<<------------------------- This statement generates the error } }
GDB won't let me put a breakpoint anywhere in this file. Maybe I can hack it to see something...
Okay, I was able to debug it by doing this: The main unit test file is called UnitTestMain.cpp, and just includes
, and that file includes all of the *.ipp files. I preprocessed UnitTestMain.cpp in Xcode and deleted all of the preprocessor lines with sed: sed '/^#/ d' UnitTestMain.cpp and I substituted the new UnitTestMain.cpp (the preprocessed file) for the original one. After that I could set a breakpoint in framework_impl::register_test_unit(). What I see is that the test case being registered is the "this" pointer from an object that was created with new, so it looks like the error is a bug in Snow Leopard. That's about all I know right now. I have never used boost unit test, so don't really know how to put together a reduced test case that can be submitted to Apple. Any takers? :-) - Rush This seems to be once again this bug in Boost.Test:
http://www.nabble.com/-Boost.Test--valgrind-complains-about-invalid-reads-td...
http://www.nabble.com/-Boost.Test--Error:-Non-aligned-pointer-being-freed-td...
Try making the test_unit destructor virtual and your problem should be resolved.
Thank you, Peter. That does resolve the problem.
Do you have any idea whether 1.40 still has this problem?
Yes, 1.40.0 also suffers from this bug. Maybe we can finally convince Gennadiy to fix this issue in SVN. This would stop forcing people to use their own patches each time a new version of Boost is released. Regards, Peter.

On Mon, Sep 21, 2009 at 22:06, Peter Klotz
Rush Manbert wrote:
On Sep 19, 2009, at 12:42 AM, Peter Klotz wrote:
Rush Manbert wrote:
On Sep 18, 2009, at 4:57 PM, Rush Manbert wrote:
On Sep 17, 2009, at 1:02 AM, Vrai Stacey wrote:
Hi,
I've been making my first forays in to using Boost on the new OS X release and have encountered some strange behaviour on 64bit machines. Testing with Boost 1.37.0 and 1.40.0 I've found that everything compiles as expected, but attempting to run any unit tests results in an error of the form:
*** No errors detected test_exceptions(92517) malloc: *** error for object 0x3000100200460: pointer being freed was not allocated *** set a breakpoint in malloc_error_break to debug
<snip>
I am seeing the same thing using Boost 1.38.0 built for 64-bit Intel on OS X 10.6.1. The test program worked fine on OS X 10.5.7, but I'm seeing the same malloc errors as you.
For me, the error occurs in the file framework.ipp:
class framework_impl : public test_tree_visitor { public: framework_impl() : m_master_test_suite( 0 ) , m_curr_test_case( INV_TEST_UNIT_ID ) , m_next_test_case_id( MIN_TEST_CASE_ID ) , m_next_test_suite_id( MIN_TEST_SUITE_ID ) , m_is_initialized( false ) , m_test_in_progress( false ) {}
~framework_impl() { clear(); }
void clear() { while( !m_test_units.empty() ) { test_unit_store::value_type const& tu = *m_test_units.begin();
// the delete will erase this element from map if( test_id_2_unit_type( tu.second->p_id ) == tut_suite ) delete (test_suite const*)tu.second; else delete (test_case const*)tu.second; <<<<------------------------- This statement generates the error } }
GDB won't let me put a breakpoint anywhere in this file. Maybe I can hack it to see something...
Okay, I was able to debug it by doing this: The main unit test file is called UnitTestMain.cpp, and just includes
, and that file includes all of the *.ipp files. I preprocessed UnitTestMain.cpp in Xcode and deleted all of the preprocessor lines with sed: sed '/^#/ d' UnitTestMain.cpp and I substituted the new UnitTestMain.cpp (the preprocessed file) for the original one. After that I could set a breakpoint in framework_impl::register_test_unit(). What I see is that the test case being registered is the "this" pointer from an object that was created with new, so it looks like the error is a bug in Snow Leopard. That's about all I know right now. I have never used boost unit test, so don't really know how to put together a reduced test case that can be submitted to Apple. Any takers? :-) - Rush This seems to be once again this bug in Boost.Test:
http://www.nabble.com/-Boost.Test--valgrind-complains-about-invalid-reads-td...
http://www.nabble.com/-Boost.Test--Error:-Non-aligned-pointer-being-freed-td...
Try making the test_unit destructor virtual and your problem should be resolved.
Thank you, Peter. That does resolve the problem.
Do you have any idea whether 1.40 still has this problem?
Yes, 1.40.0 also suffers from this bug.
Maybe we can finally convince Gennadiy to fix this issue in SVN.
This would stop forcing people to use their own patches each time a new version of Boost is released.
+1 I reported this before too. *everyone* (on 64bit?) is seeing this, and the fix is trivial. Peter

On Sep 21, 2009, at 1:20 PM, Peter Soetens wrote:
On Mon, Sep 21, 2009 at 22:06, Peter Klotz
wrote: Rush Manbert wrote:
On Sep 19, 2009, at 12:42 AM, Peter Klotz wrote:
Rush Manbert wrote:
On Sep 18, 2009, at 4:57 PM, Rush Manbert wrote:
On Sep 17, 2009, at 1:02 AM, Vrai Stacey wrote:
> Hi, > > I've been making my first forays in to using Boost on the new > OS X > release and have encountered some strange behaviour on 64bit > machines. > Testing with Boost 1.37.0 and 1.40.0 I've found that everything > compiles as expected, but attempting to run any unit tests > results in > an error of the form: > > *** No errors detected > test_exceptions(92517) malloc: *** error for object > 0x3000100200460: > pointer being freed was not allocated > *** set a breakpoint in malloc_error_break to debug > <snip>
I am seeing the same thing using Boost 1.38.0 built for 64-bit Intel on OS X 10.6.1. The test program worked fine on OS X 10.5.7, but I'm seeing the same malloc errors as you.
For me, the error occurs in the file framework.ipp:
class framework_impl : public test_tree_visitor { public: framework_impl() : m_master_test_suite( 0 ) , m_curr_test_case( INV_TEST_UNIT_ID ) , m_next_test_case_id( MIN_TEST_CASE_ID ) , m_next_test_suite_id( MIN_TEST_SUITE_ID ) , m_is_initialized( false ) , m_test_in_progress( false ) {}
~framework_impl() { clear(); }
void clear() { while( !m_test_units.empty() ) { test_unit_store::value_type const& tu = *m_test_units.begin();
// the delete will erase this element from map if( test_id_2_unit_type( tu.second->p_id ) == tut_suite ) delete (test_suite const*)tu.second; else delete (test_case const*)tu.second; <<<<------------------------- This statement generates the error } }
GDB won't let me put a breakpoint anywhere in this file. Maybe I can hack it to see something...
Okay, I was able to debug it by doing this: The main unit test file is called UnitTestMain.cpp, and just includes
, and that file includes all of the *.ipp files. I preprocessed UnitTestMain.cpp in Xcode and deleted all of the preprocessor lines with sed: sed '/^#/ d' UnitTestMain.cpp and I substituted the new UnitTestMain.cpp (the preprocessed file) for the original one. After that I could set a breakpoint in framework_impl::register_test_unit(). What I see is that the test case being registered is the "this" pointer from an object that was created with new, so it looks like the error is a bug in Snow Leopard. That's about all I know right now. I have never used boost unit test, so don't really know how to put together a reduced test case that can be submitted to Apple. Any takers? :-) - Rush This seems to be once again this bug in Boost.Test:
http://www.nabble.com/-Boost.Test--valgrind-complains-about-invalid-reads-td...
http://www.nabble.com/-Boost.Test--Error:-Non-aligned-pointer-being-freed-td...
Try making the test_unit destructor virtual and your problem should be resolved.
Thank you, Peter. That does resolve the problem.
Do you have any idea whether 1.40 still has this problem?
Yes, 1.40.0 also suffers from this bug.
Maybe we can finally convince Gennadiy to fix this issue in SVN.
This would stop forcing people to use their own patches each time a new version of Boost is released.
+1
I reported this before too. *everyone* (on 64bit?) is seeing this, and the fix is trivial.
Now I have gone back and read the original thread about valgrind complaints. I see that Gennadiy is rather adamant about not changing the code. However, under Snow Leopard the deletion of test cases fails completely and aborts the program. The C-style cast in framework_impl::clear() does not work under Snow Leopard, but it does work correctly if test_unit has a virtual destructor. I suspect that the noise level over this is going to increase rapidly. - Rush

Rush Manbert
Now I have gone back and read the original thread about valgrind complaints. I see that Gennadiy is rather adamant about not changing the code.
I am not adamant. I just want to understand what is wrong and need some time to look deeper. I plan to look into this Friday/Saturday. Genandiy P.S. Does anyone have vmware image for the system where I can easily reproduce this?

Hi All, Section "Test organization or the house that Jack built" of Boost.Test documentation (http://www.boost.org/doc/libs/1_40_0/libs/test/doc/html/utf/user-guide/test-...) says The single test module may mix both automated and manual test case registration. In other words, within the same test module you can have both test cases implemented remotely and registered manually in the test module initialization function and test cases that are registered automatically at implementation point. I would like to do exactly that but I couldn't find any example of how to do this. Usually test initialization happens in init_unit_test_suite but if I use BOOST_TEST_MODULE this one is generated for me automatically and as I understand is in charge of automated tests registration. If I can't define my own init_unit_test_suite what is the proper place to register test cases manually while keeping automated registration logics in place? WBR Oleg V. Zhylin ovz@yahoo.com

Oleg V. Zhylin
I would like to do exactly that but I couldn't find any example of how to do this. Usually test initialization happens in init_unit_test_suite, but if I use BOOST_TEST_MODULE this one is generated for me automatically
Do not define this.
and as I understand is in charge of automated tests registration.
No. It's not. If you using dynamic library variant of UTF you'll be required to implement main as well. Gennadiy

Gennadiy Rozental wrote:
Rush Manbert
writes: Now I have gone back and read the original thread about valgrind complaints. I see that Gennadiy is rather adamant about not changing the code.
I am not adamant. I just want to understand what is wrong and need some time to look deeper. I plan to look into this Friday/Saturday.
Genandiy
P.S. Does anyone have vmware image for the system where I can easily reproduce this?
Ok. I was able to get hold of freebsd 64 bit image and reproduce this. Further investigation uncovered that the issue with statement: delete (test_case*)tu.second; is not the non-virtual destructor of class test_unit, but the fact that compiler decides to evaluate the expression tu.second twice: once before calling the destructor and once before calling free. By the type the destructor is called tu is invalid. Not sure what is so specific about 64 bit, but i believe the issues should be resolved (by caching the value before delete statement). Gennadiy

Gennadiy Rozental wrote:
Gennadiy Rozental wrote:
Rush Manbert
writes: Now I have gone back and read the original thread about valgrind complaints. I see that Gennadiy is rather adamant about not changing the code.
I am not adamant. I just want to understand what is wrong and need some time to look deeper. I plan to look into this Friday/Saturday.
Genandiy
P.S. Does anyone have vmware image for the system where I can easily reproduce this?
Ok. I was able to get hold of freebsd 64 bit image and reproduce this.
Further investigation uncovered that the issue with statement:
delete (test_case*)tu.second;
is not the non-virtual destructor of class test_unit, but the fact that compiler decides to evaluate the expression tu.second twice: once before calling the destructor and once before calling free. By the type the destructor is called tu is invalid.
Not sure what is so specific about 64 bit, but i believe the issues should be resolved (by caching the value before delete statement).
Hello Gennadiy Great that you finally had the time to investigate this issue. I have two remarks about your analysis: 1) Why are you insisting on casting tu.second to test_case*? Actually you want your derived class destroyed but tu.second is a base class pointer of type test_unit*. You can simply call "delete tu.second" if the test_unit destructor is virtual. The whole casting thing is in my opinion just a workaround for not making the destructor virtual. 2) I am seeing the problem on 32Bit Intel Linux as well, so it is not restricted to 64Bit. Regards, Peter.

Peter Klotz
1) Why are you insisting on casting tu.second to test_case*?
Actually you want your derived class destroyed but tu.second is a base class pointer of type test_unit*. You can simply call "delete tu.second" if the test_unit destructor is virtual. The whole casting thing is in my opinion just a workaround for not making the destructor virtual.
I disagree. The whole idea is that I do NOT use these pointers polymorphically. And never intended to. This is just a storage for proper memory management, where I've decided to use one array instead of two and thus keep them as test_unit*, knowing that I have correct type at runtime anyway. Making the destructor virtual is only make sense if there are other virtual function (and even in this case there is exception). test_unit has none. Also don't forget that making needlessly class destructor virtual you immediately increase instance size by 4 or 8
2) I am seeing the problem on 32Bit Intel Linux as well, so it is not restricted to 64Bit.
This makes more sense. Gennadiy

On Sep 29, 2009, at 1:56 AM, Gennadiy Rozental wrote:
Peter Klotz
writes: 1) Why are you insisting on casting tu.second to test_case*?
Actually you want your derived class destroyed but tu.second is a base class pointer of type test_unit*. You can simply call "delete tu.second" if the test_unit destructor is virtual. The whole casting thing is in my opinion just a workaround for not making the destructor virtual.
I disagree. The whole idea is that I do NOT use these pointers polymorphically. And never intended to. This is just a storage for proper memory management, where I've decided to use one array instead of two and thus keep them as test_unit*, knowing that I have correct type at runtime anyway.
Making the destructor virtual is only make sense if there are other virtual function (and even in this case there is exception). test_unit has none. Also don't forget that making needlessly class destructor virtual you immediately increase instance size by 4 or 8
I see your point, but by not using the pointer polymorphically to delete the derived class objects, you now need to write a workaround for how some compilers are implemented. But if the base class destructor were virtual, the delete would be guaranteed to work by the language definition, regardless of the compiler implementation. You wouldn't care how the compiler implemented it, it would just work. And it will still just work as compilers and runtime libraries for C++ evolve. That seems to make more sense to me. I think the size argument is less important today than it might have been in the past. Any user of Boost has bought into a ton of template classes and methods that generate a lot of code and produce large programs. We've all got lots of memory. And even if I had a million test cases, that just costs me an extra 4-8 Mb, and even then it's probably only in my test environment, not my released product. I'll trade that to gain reliability any day. - Rush

AMDG Rush Manbert wrote:
I see your point, but by not using the pointer polymorphically to delete the derived class objects, you now need to write a workaround for how some compilers are implemented. But if the base class destructor were virtual, the delete would be guaranteed to work by the language definition, regardless of the compiler implementation. You wouldn't care how the compiler implemented it, it would just work. And it will still just work as compilers and runtime libraries for C++ evolve. That seems to make more sense to me.
As Gennadiy has already said, the fact that making the destructor virtual fixes the problem is an artifact of the way that compilers implement virtual destructors. with a non virtual destructor, the compiler generates read the pointer call the destructor read the pointer call operator delete. The problem is the second read after the destructor has run, With a virtual destructor the compiler generates read the pointer use the v-table to call the destructor/delete. This is a single function call so the pointer is only looked up once. In Christ, Steven Watanabe

Steven Watanabe wrote:
With a virtual destructor the compiler generates read the pointer use the v-table to call the destructor/delete. This is a single function call so the pointer is only looked up once.
Is it necessary has to be single function call? What is so special about this case? Don't we just added search through virtual functions table? Gennadiy

AMDG Gennadiy Rozental wrote:
Steven Watanabe wrote:
With a virtual destructor the compiler generates read the pointer use the v-table to call the destructor/delete. This is a single function call so the pointer is only looked up once.
Is it necessary has to be single function call?
There is no reason it has to be done this way. I was just describing what I've seen compilers generate.
What is so special about this case? Don't we just added search through virtual functions table?
The only thing special about the virtual destructor is that two function calls (destructor and operator delete) are combined in a single virtual function call. In Christ, Steven Watanabe

On Wed, Sep 30, 2009 at 20:48, Steven Watanabe
AMDG
Rush Manbert wrote:
I see your point, but by not using the pointer polymorphically to delete the derived class objects, you now need to write a workaround for how some compilers are implemented. But if the base class destructor were virtual, the delete would be guaranteed to work by the language definition, regardless of the compiler implementation. You wouldn't care how the compiler implemented it, it would just work. And it will still just work as compilers and runtime libraries for C++ evolve. That seems to make more sense to me.
As Gennadiy has already said, the fact that making the destructor virtual fixes the problem is an artifact of the way that compilers implement virtual destructors.
Great! I can point to 10000 lines of code in Boost that accommodate for compiler quirks, especially for MSVS and Borland compilers. So for once, do the GNU people the same honor and fix this single line of code for our automated testing happiness. I don't see any reason here why it shouldn't be fixed for the GNU compiler, giving the portability/non-discriminality that Boost claims to have over all compilers. Just #ifdef the virtual around the destructor for GNU compilers and we can all happily carry on. Peter

AMDG Peter Soetens wrote:
On Wed, Sep 30, 2009 at 20:48, Steven Watanabe
wrote: Rush Manbert wrote:
I see your point, but by not using the pointer polymorphically to delete the derived class objects, you now need to write a workaround for how some compilers are implemented. But if the base class destructor were virtual, the delete would be guaranteed to work by the language definition, regardless of the compiler implementation. You wouldn't care how the compiler implemented it, it would just work. And it will still just work as compilers and runtime libraries for C++ evolve. That seems to make more sense to me.
As Gennadiy has already said, the fact that making the destructor virtual fixes the problem is an artifact of the way that compilers implement virtual destructors.
Great! I can point to 10000 lines of code in Boost that accommodate for compiler quirks, especially for MSVS and Borland compilers. So for once, do the GNU people the same honor and fix this single line of code for our automated testing happiness.
Compiler specific workarounds should only be used when we're dealing with compiler bugs or compiler specific features. This is neither.
I don't see any reason here why it shouldn't be fixed for the GNU compiler, giving the portability/non-discriminality that Boost claims to have over all compilers.
Because there's a better fix that guarantees that it will work correctly on all compilers...
Just #ifdef the virtual around the destructor for GNU compilers and we can all happily carry on.
IMO, it's a bad idea to throw in a hack to fix a problem without understanding why it works. In Christ, Steven Watanabe

On Mon, Oct 12, 2009 at 20:18, Steven Watanabe
AMDG
Peter Soetens wrote:
On Wed, Sep 30, 2009 at 20:48, Steven Watanabe
wrote: Rush Manbert wrote:
I see your point, but by not using the pointer polymorphically to delete the derived class objects, you now need to write a workaround for how some compilers are implemented. But if the base class destructor were virtual, the delete would be guaranteed to work by the language definition, regardless of the compiler implementation. You wouldn't care how the compiler implemented it, it would just work. And it will still just work as compilers and runtime libraries for C++ evolve. That seems to make more sense to me.
As Gennadiy has already said, the fact that making the destructor virtual fixes the problem is an artifact of the way that compilers implement virtual destructors.
Great! I can point to 10000 lines of code in Boost that accommodate for compiler quirks, especially for MSVS and Borland compilers. So for once, do the GNU people the same honor and fix this single line of code for our automated testing happiness.
Compiler specific workarounds should only be used when we're dealing with compiler bugs or compiler specific features. This is neither.
I don't see any reason here why it shouldn't be fixed for the GNU compiler, giving the portability/non-discriminality that Boost claims to have over all compilers.
Because there's a better fix that guarantees that it will work correctly on all compilers...
Which I must have missed. What better fix has been confirmed to solve this problem ?
Just #ifdef the virtual around the destructor for GNU compilers and we can all happily carry on.
IMO, it's a bad idea to throw in a hack to fix a problem without understanding why it works.
I'm not sure what you're aiming at (my understanding or yours), but this is not an argument. What we do know is that making the destructor virtual and dropping the cast will produce the correct behavior for virtually every C++ compiler out there. Peter

Couldn't we just use shared_ptr to manage the object? That won't need
a virtual destructor and in all likelihood will work fine since we
don't have similar bug reports for shared_ptr.
Emil Dotchevski
Reverge Studios, Inc.
http://www.revergestudios.com/reblog/index.php?n=ReCode
On Mon, Oct 12, 2009 at 1:05 PM, Peter Soetens
On Mon, Oct 12, 2009 at 20:18, Steven Watanabe
wrote: AMDG
Peter Soetens wrote:
On Wed, Sep 30, 2009 at 20:48, Steven Watanabe
wrote: Rush Manbert wrote:
I see your point, but by not using the pointer polymorphically to delete the derived class objects, you now need to write a workaround for how some compilers are implemented. But if the base class destructor were virtual, the delete would be guaranteed to work by the language definition, regardless of the compiler implementation. You wouldn't care how the compiler implemented it, it would just work. And it will still just work as compilers and runtime libraries for C++ evolve. That seems to make more sense to me.
As Gennadiy has already said, the fact that making the destructor virtual fixes the problem is an artifact of the way that compilers implement virtual destructors.
Great! I can point to 10000 lines of code in Boost that accommodate for compiler quirks, especially for MSVS and Borland compilers. So for once, do the GNU people the same honor and fix this single line of code for our automated testing happiness.
Compiler specific workarounds should only be used when we're dealing with compiler bugs or compiler specific features. This is neither.
I don't see any reason here why it shouldn't be fixed for the GNU compiler, giving the portability/non-discriminality that Boost claims to have over all compilers.
Because there's a better fix that guarantees that it will work correctly on all compilers...
Which I must have missed. What better fix has been confirmed to solve this problem ?
Just #ifdef the virtual around the destructor for GNU compilers and we can all happily carry on.
IMO, it's a bad idea to throw in a hack to fix a problem without understanding why it works.
I'm not sure what you're aiming at (my understanding or yours), but this is not an argument. What we do know is that making the destructor virtual and dropping the cast will produce the correct behavior for virtually every C++ compiler out there.
Peter _______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users

Emil Dotchevski wrote:
Couldn't we just use shared_ptr to manage the object? That won't need a virtual destructor and in all likelihood will work fine since we don't have similar bug reports for shared_ptr.
The only thing shared_ptr would do to us here is to add level of indirection leading to the destructor and free being called with the same address. And I already achieved this ;) Gennadiy

Peter Soetens wrote:
On Mon, Oct 12, 2009 at 20:18, Steven Watanabe
wrote: AMDG
Because there's a better fix that guarantees that it will work correctly on all compilers...
Which I must have missed. What better fix has been confirmed to solve this problem ?
Yes. I've fixed the issue some time ago.
Just #ifdef the virtual around the destructor for GNU compilers and we can all happily carry on.
IMO, it's a bad idea to throw in a hack to fix a problem without understanding why it works.
I'm not sure what you're aiming at (my understanding or yours), but this is not an argument. What we do know is that making the destructor virtual and dropping the cast will produce the correct behavior for virtually every C++ compiler out there.
The virtual destructor produces virtually correct behavior on virtually every C++ compiler virtually every time ;) Gennadiy

Peter Soetens wrote:
As Gennadiy has already said, the fact that making the destructor virtual fixes the problem is an artifact of the way that compilers implement virtual destructors.
Great! I can point to 10000 lines of code in Boost that accommodate for compiler quirks, especially for MSVS and Borland compilers. So for once, do the GNU people the same honor and fix this single line of code for our automated testing happiness.
I don't see any reason here why it shouldn't be fixed for the GNU compiler, giving the portability/non-discriminality that Boost claims to have over all compilers.
Just #ifdef the virtual around the destructor for GNU compilers and we can all happily carry on.
Not sure I follow what the fuss is about. The fix already checked in. And it does not require ifdef. Gennadiy

Gennadiy Rozental wrote:
Gennadiy Rozental wrote:
Rush Manbert
writes: Now I have gone back and read the original thread about valgrind complaints. I see that Gennadiy is rather adamant about not changing the code.
I am not adamant. I just want to understand what is wrong and need some time to look deeper. I plan to look into this Friday/Saturday.
Genandiy
P.S. Does anyone have vmware image for the system where I can easily reproduce this?
Ok. I was able to get hold of freebsd 64 bit image and reproduce this.
Further investigation uncovered that the issue with statement:
delete (test_case*)tu.second;
is not the non-virtual destructor of class test_unit, but the fact that compiler decides to evaluate the expression tu.second twice: once before calling the destructor and once before calling free. By the type the destructor is called tu is invalid.
Not sure what is so specific about 64 bit, but i believe the issues should be resolved (by caching the value before delete statement).
Hello Gennadiy It seems that your fix did not make it into 1.41.0. Which release of Boost will contain the fix? Regards, Peter.

Peter Klotz
Hello Gennadiy
It seems that your fix did not make it into 1.41.0.
Which release of Boost will contain the fix?
Is release branch open for merges? I can do this today. Gennadiy

Peter Soetens
I reported this before too. *everyone* (on 64bit?) is seeing this, and the fix is trivial.
Can you explain why the issue occur? It seems I'll not be able to get to this hardware till next weekend, but any input is appreciated. Gennadiy
participants (9)
-
Emil Dotchevski
-
Gennadiy Rozental
-
Oleg V. Zhylin
-
Peter Klotz
-
Peter Soetens
-
Rush Manbert
-
Steven Watanabe
-
Vrai Stacey
-
Vrai Stacey