[asio] need help with timer crash please
I incorrectly posted this to the Boost developers' list and am reposting it here to rectify that.
I'm seeing two distinct crashes (gdb output below) on expiration of a deadline timer, and could use some help identifying possible causes. The timer has not been deleted. It expired once before and was later rescheduled using expires_from_now() and async_wait(). The timer is only accessed by threads in io_service::run(), and is protected by a mutex.
From the gdb output, I'm guessing the timer is being removed on expiration and the internal structures have somehow been trashed. Any pointers to possible explanations would be appreciated.
crash 1...
Program received signal SIGSEGV, Segmentation fault.
[Switching to thread 3500.0x83c]
0x5ae9a800 in ?? ()
(gdb) bt full
#0  0x5ae9a800 in ?? ()
No symbol table info available.
#1  0x005ae5f7 in boost::asio::detail::timer_queue<boost::asio::time_traits<boost::posix_time::ptime> >::complete_timers (this=0x5387f74)
    at /usr/local/include/boost-1_38/boost/asio/detail/timer_queue.hpp:205
        this = (timer_queue<boost::asio::time_traits<boost::posix_time::ptime> > * const) 0x5387f74
        this_timer = (timer_queue<boost::asio::time_traits<boost::posix_time::ptime> >::timer_base *) 0x53a4faf
#2  0x053a4fb7 in ?? ()
No symbol table info available.
#3  0x1dcdcb20 in ?? ()
No symbol table info available.
#4  0x005bf1c0 in boost::asio::detail::select_reactor<false>::complete_operations_and_timers (this=0x658, lock=@0x1dcd9498)
    at /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/stl_vector.h:462
        this = (class select_reactor<false> * const) 0x658
        this = (scoped_lock<boost::asio::detail::posix_mutex> * const) 0xa
        i = 500011328
#5  0x00000000 in ?? ()
No symbol table info available.
(gdb)
crash 2...
Program received signal SIGSEGV, Segmentation fault.
[Switching to thread 1320.0xa08]
0x005adff1 in boost::asio::detail::timer_queue<boost::asio::time_traits<boost::posix_time::ptime> >::remove_timer (this=0x5387f74, t=0x53aac70)
    at /usr/local/include/boost-1_38/boost/asio/detail/timer_queue.hpp:399
399         t->prev_->next_ = t->next_;
(gdb) bt full
#0  0x005adff1 in boost::asio::detail::timer_queue<boost::asio::time_traits<boost::posix_time::ptime> >::remove_timer (this=0x5387f74, t=0x53aac70)
    at /usr/local/include/boost-1_38/boost/asio/detail/timer_queue.hpp:399
        this = (timer_queue<boost::asio::time_traits<boost::posix_time::ptime> > * const) 0x5387f74
        index = 87735592
        it = {_M_node = 0x53aad20}
#1  0x00000002 in ?? ()
No symbol table info available.
#2  0x00000000 in ?? ()
No symbol table info available.
(gdb)
Thanks, Brian
<bnv <at> nc.rr.com> writes:
> I'm seeing two distinct crashes (gdb output below) on expiration of a deadline timer, and could use some help identifying possible reasons.
> The timer has not been deleted. The timer expired once before and was later rescheduled using expires_from_now() and async_wait(). The timer is only accessed by threads in io_service::run(), and is protected by a mutex.
I've eliminated multi-threading as a candidate by having just one thread in io_service::run(), and am still seeing the crash. One additional bit of information is that the timer is scheduled for the very first time by the main process, before the worker thread enters io_service::run(). Is there any problem with scheduling a timer with one thread and then re-scheduling it with another?
-Brian
> I'm seeing two distinct crashes (gdb output below) on expiration of a deadline timer, and could use some help identifying possible reasons.
> The timer has not been deleted. The timer expired once before and was later rescheduled using expires_from_now() and async_wait(). The timer is only accessed by threads in io_service::run(), and is protected by a mutex.
If the crash occurs during the time-out handler invocation, then probably the handler is bound to a dead object or tries to access a dead object.
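One common way to rule out the dead-object case is to have the handler hold only a weak reference to the object it touches and check it before use. The following is a minimal sketch using only the standard library, not Asio itself; `Session` and `make_guarded_handler` are hypothetical names introduced for illustration, and the returned callable stands in for whatever would be passed to async_wait().

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical object that a time-out handler would access.
struct Session {
    int timeouts = 0;
    void on_timeout() { ++timeouts; }
};

// Build a handler that keeps only a weak reference to the session.
// Returns true if the handler ran against a live object, false if the
// object had already been destroyed and the work was skipped.
std::function<bool()> make_guarded_handler(const std::shared_ptr<Session>& s) {
    std::weak_ptr<Session> weak = s;
    return [weak]() {
        if (std::shared_ptr<Session> alive = weak.lock()) {
            alive->on_timeout();  // safe: 'alive' keeps the object valid here
            return true;
        }
        return false;             // target is gone; do not touch it
    };
}
```

Calling the handler while the `shared_ptr` is still held increments `timeouts`; calling it after the last `shared_ptr` has been reset returns false instead of dereferencing freed memory.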
Igor R <boost.lists <at> gmail.com> writes:
> > I'm seeing two distinct crashes (gdb output below) on expiration of a deadline timer, and could use some help identifying possible reasons.
> > The timer has not been deleted. The timer expired once before and was later rescheduled using expires_from_now() and async_wait(). The timer is only accessed by threads in io_service::run(), and is protected by a mutex.
> If the crash occurs during the time-out handler invocation, then probably the handler is bound to a dead object or tries to access a dead object.
Hi Igor,
Thanks for the reply. However, I've confirmed that the boost::asio::deadline_timer object still exists.
In the examples I've found for repeating timers, the timer is re-scheduled within the handler itself. In my case, the timer is re-scheduled some time after the handler has run. I wonder whether a timer cannot be re-used this way?
-Brian
> Thanks for the reply. However, I've confirmed that the boost::asio::deadline_timer object still exists.
The timer object exists, but what about the objects which are accessed from the handler? Can you post some code snippet?
> In the examples I've found for repeating timers, the timer is re-scheduled within the handler itself. In my case, the timer is re-scheduled some time after the handler has run. I wonder whether a timer cannot be re-used this way?
The timer can be used in any way, as long as you don't try to access it simultaneously from multiple threads.
> One additional bit of information is that the timer is scheduled for the very first time by the main process, before the worker thread enters io_service::run(). Is there any problem with scheduling a timer with one thread and then re-scheduling it with another?
If you mean something like this:
    int main()
    {
        io_service io;
        deadline_timer timer(io);
        timer.expires(...);
        timer.async_wait(...);
        thread t(&io_service::run, &io);
        t.join();
    }
then it's OK, because there are no race conditions here: when you access the timer from the main thread, the timer is not running (the worker thread has not been started yet).
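The same ordering argument can be tried out without Asio. The sketch below is only an analogy built on the standard library: `MiniService` is a hypothetical stand-in for io_service (a plain queue of handlers), and `schedule_then_run` is an illustrative name. The first handler is queued by the calling thread before the worker exists, and the follow-up is queued by the worker from inside run(). No lock is needed for the initial setup because construction of a std::thread synchronizes with the start of the thread function.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for an io_service: a queue of pending handlers.
struct MiniService {
    std::queue<std::function<void()>> pending;

    // Run handlers until none are left. A handler may queue more work.
    void run() {
        while (!pending.empty()) {
            std::function<void()> h = std::move(pending.front());
            pending.pop();
            h();
        }
    }
};

// Schedule once from the calling thread, reschedule once from the worker.
// Returns how many handlers fired.
int schedule_then_run() {
    MiniService svc;
    int fired = 0;

    // Calling thread: first "timer" is set up before any worker exists.
    svc.pending.push([&svc, &fired] {
        ++fired;
        // Worker thread: "reschedule" from inside run().
        svc.pending.push([&fired] { ++fired; });
    });

    std::thread worker(&MiniService::run, &svc);
    worker.join();  // join makes the worker's writes visible here
    return fired;
}
```

With this setup, `schedule_then_run()` returns 2: one handler queued by the calling thread and one queued by the worker. Touching `svc` from the calling thread while the worker is still inside `run()` would be a data race, which matches the restriction on simultaneous access described above.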
Igor R <boost.lists <at> gmail.com> writes:
> > Thanks for the reply. However, I've confirmed that the boost::asio::deadline_timer object still exists.
> The timer object exists, but what about the objects which are accessed from the handler? Can you post some code snippet?
Yes, objects used by the handler also exist, but the crash occurs before the handler is invoked. It's happening within asio just as it's processing the expired timer. We appear to have ruled out threading issues as well. The application is large, so it's difficult to post a snippet. I will try to write a test program that simulates the same behavior to see if I can reproduce the crash. -Brian
<bnv <at> nc.rr.com> writes:
> I'm seeing two distinct crashes (gdb output below) on expiration of a deadline timer, and could use some help identifying possible reasons.
> The timer has not been deleted. The timer expired once before and was later rescheduled using expires_from_now() and async_wait(). The timer is only accessed by threads in io_service::run(), and is protected by a mutex.
After restructuring some code, the problem moved and manifested itself differently. This continues to point to some type of corruption problem, but I could find no cause. On a hunch, I increased the stack size (this is under cygwin) and the problem appears to have gone away. Thought it worth mentioning in case others run into a similar issue.
-Brian
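The message does not say how the stack size was increased. For anyone trying the same experiment, one possible approach, assuming the worker threads are created through POSIX threads, is to request a larger stack with pthread_attr_setstacksize() before creating the thread. The sketch below is illustrative only: `run_on_larger_stack`, `deep_work`, and the 1 MiB buffer size are hypothetical choices, not taken from the original application.

```cpp
#include <cassert>
#include <cstddef>
#include <pthread.h>

// Illustrative amount of stack the worker consumes (1 MiB).
static const std::size_t kBufferBytes = 1024 * 1024;

// Thread function that uses a large local buffer, touching every page.
void* deep_work(void* arg) {
    volatile char buffer[kBufferBytes];
    for (std::size_t i = 0; i < kBufferBytes; i += 4096) {
        buffer[i] = 1;
    }
    *static_cast<bool*>(arg) = (buffer[0] == 1);
    return nullptr;
}

// Create a thread with an explicitly requested stack size and wait for it.
// Returns true if the thread was created and completed its work.
bool run_on_larger_stack(std::size_t stack_bytes) {
    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0) {
        return false;
    }
    bool ok = false;
    if (pthread_attr_setstacksize(&attr, stack_bytes) == 0) {
        pthread_t tid;
        bool result = false;
        if (pthread_create(&tid, &attr, &deep_work, &result) == 0) {
            pthread_join(tid, nullptr);
            ok = result;
        }
    }
    pthread_attr_destroy(&attr);
    return ok;
}
```

Calling `run_on_larger_stack(4 * 1024 * 1024)` requests a 4 MiB stack, comfortably above what `deep_work` consumes. If a larger stack makes a crash disappear, that suggests stack exhaustion, but it can also simply move memory corruption somewhere less visible.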