Hello,
I'm using the Asio library (from Boost 1.35) in a network daemon (running on
a Debian Etch system). The code structure is almost the same as the one
described in the HTTP Server example
(http://tenermerx.com/Asio/boost_asio_1_3_1/doc/html/boost_asio/examples.html...).
One io_service is used; async_accept() creates new "sessions", and so on.
The daemon can run flawlessly for weeks (or only a few hours) and then
crashes randomly (segmentation fault). The "network" load isn't really high:
every 5 minutes, a few megabytes of data (characters) are sent to the
daemon. I haven't noticed any memory leaks or null pointer accesses.
Among the many crashes, I found 2 kinds.
Since core files are dumped, I tried to debug them, but I don't know how to
interpret the result. The binary wasn't linked against debug
libraries, so I can only get debug data from the binary itself.
Here's the gdb output of the "bt full" command. The daemon has 4 threads;
here's the output for the one that caused the crash (I assume it
is this one).
First kind of crash (only the relevant part is pasted; the output is
huge). It seems to be related to the way I handle the timeout on a timer.
In my daemon, the io_service thread may call close() on the
socket object while another thread calls cancel() on the same socket
object. Could this lead to a crash? (I can post source code if needed.)
What is the best way to handle a receive timeout *and* closing the socket
and timer from another thread?
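
For reference, the Asio documentation describes a single shared socket object as unsafe to use from several threads at once, so one commonly suggested approach is to let only one thread touch the socket and timer and have every other thread post its close or cancel request to that thread (in Asio, io_service::post() or an io_service::strand serves this purpose). The following is a minimal sketch of that idea using only the standard library, not Asio code and not the daemon's actual source: SerialQueue, FakeSession and run_scenario are hypothetical names standing in for the io_service, a session's socket and timer, and the daemon's threads.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for io_service: one thread drains a queue of handlers.
class SerialQueue {
public:
    // Safe to call from any thread; the task runs later on the run() thread.
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push_back(std::move(task));
        }
        ready_.notify_one();
    }

    // Ask run() to return once the queue has been drained.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopped_ = true;
        }
        ready_.notify_all();
    }

    // Executes posted tasks one at a time on the calling thread.
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopped_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopped and nothing left to do
                task = std::move(tasks_.front());
                tasks_.pop_front();
            }
            task();  // run outside the lock so tasks may post more work
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<std::function<void()>> tasks_;
    bool stopped_ = false;
};

// Hypothetical stand-in for a session's socket + timer.
// Deliberately NOT thread-safe: it relies on being used from one thread only.
struct FakeSession {
    bool open = true;
    int close_calls = 0;
    int cancel_calls = 0;
    void cancel() { if (open) ++cancel_calls; }               // no-op once closed
    void close()  { if (open) { open = false; ++close_calls; } }  // idempotent
};

// Several threads request cancel() and close() at the same time, but every
// request is posted to the queue, so only the queue thread touches `session`.
FakeSession run_scenario() {
    SerialQueue queue;
    FakeSession session;
    std::thread io_thread([&] { queue.run(); });

    std::vector<std::thread> others;
    for (int i = 0; i < 3; ++i)
        others.emplace_back([&] { queue.post([&] { session.cancel(); }); });
    for (int i = 0; i < 2; ++i)  // e.g. a timeout and an external shutdown
        others.emplace_back([&] { queue.post([&] { session.close(); }); });

    for (auto& t : others) t.join();
    queue.stop();
    io_thread.join();
    return session;
}
```

Calling run_scenario() always ends with the session closed exactly once, whatever order the requests arrive in, because close() and cancel() never run concurrently. Making close() idempotent and cancel() a no-op after close is a design choice of this sketch, so that a late timeout or cancel request is harmless.
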
Thread 1 (process 27120):
Program terminated with signal 11, Segmentation fault.
#0 0xb7c7c024 in pthread_mutex_lock () from /lib/tls/i686/cmov/libpthread.so.0
No symbol table info available.
#1 0xb7d5f0c6 in pthread_mutex_lock () from /lib/tls/i686/cmov/libc.so.6
No symbol table info available.
#2 0x08060438 in boost::asio::detail::posix_mutex::lock (this=0x16) at /usr/include/boost/asio/detail/posix_mutex.hpp:71
error = 0
#3 0x08060559 in scoped_lock (this=0xb796807c, m=@0x16) at /usr/include/boost/asio/detail/scoped_lock.hpp:36
No locals.
#4 0x08060ad0 in boost::asio::detail::epoll_reactor<false>::close_descriptor (this=0x2, descriptor=-1291829448) at /usr/include/boost/asio/detail/epoll_reactor.hpp:297
lock = {boost::noncopyable_::noncopyable = {<No data fields>}, mutex_ = @0x16, locked_ = 212}
ev = {events = 6, data = {ptr = 0x25, fd = 37, u32 = 37, u64 = 8589934629}}
#5 0x0809b189 in
boost::asio::detail::reactive_socket_service