Tracking down a race condition
In the logging example I have been working on to learn more about boost
threads I have run across what I think is a race condition. If I compile
and run this sample program it will sometimes write out an error message
to a file. For example:
2010-01-23_16:2:23 (pid=10948 thread_id=0x92ce448) (II) funky
Other times I see the file is blank. So can I say conclusively that its
a race condition? No. Either what is happening is a race condition where
the file is closed before the contents are written to the file or
something else is causing the logging message to be written sometimes
and not at others.
So to recap:
--- How can you track down a race condition?
--- Can I use 'strace' or 'gdb'?
Stephen
-------------------------------------------
#include
Personally I will first try to output few lines of simple message to std
err, see if if key codes (i.e. add_Trace(), processTraces(), constructor and
destructor, ...) was executed AND in the correct order, providing that the
problem is not too race to replicate. If it does not help then only I will
use gdb for stepping through it (
http://www.google.com.sg/search?q=gdb+tutorial). Honestly I'm relatively
green in using gdb especially in debugging multi-threaded code so please
share your approach :)
On Sun, Jan 24, 2010 at 5:09 AM, Stephen Torri
In the logging example I have been working on to learn more about boost threads I have run across what I think is a race condition. If I compile and run this sample program it will sometimes write out an error message to a file. For example:
2010-01-23_16:2:23 (pid=10948 thread_id=0x92ce448) (II) funky
Other times I see the file is blank. So can I say conclusively that its a race condition? No. Either what is happening is a race condition where the file is closed before the contents are written to the file or something else is causing the logging message to be written sometimes and not at others.
So to recap:
--- How can you track down a race condition? --- Can I use 'strace' or 'gdb'?
Stephen
------------------------------------------- #include
#include #include #include #include #include #include #include #include #include #include #include #include #include <fstream> #include <sstream> #include <vector>
namespace myproject { namespace api {
/*! * \brief Trace level constants * \date 2003 * \author Stephen Torri */ class TraceLevel { public: enum Values { NONE = 0, WARN = 10, ERROR = 20, INFO = 30, DEBUG = 40, DETAIL = 50, DATA = 60 }; };
} // namespace api } // namespace myproject
namespace myproject { namespace interface {
class ITraceable { public:
typedef boost::shared_ptr<ITraceable> ptr_t;
virtual std::string get_Formatted_String() = 0;
virtual boost::uint32_t getLevel() = 0;
virtual boost::uint32_t getPID() = 0;
virtual boost::thread::id getThreadID() = 0; }; } // namespace interface } // namespace myproject
namespace myproject { namespace trace {
class StringTraceable : public interface::ITraceable { public:
StringTraceable ( boost::uint32_t area, boost::uint32_t level, std::string name ) : m_area ( area ), m_level ( level ), m_name ( name ) { #ifndef WIN32 m_pid = getpid(); #else m_pid = GetCurrentProcessId(); #endif /* WIN32 */
m_thread_id = boost::this_thread::get_id(); }
virtual ~StringTraceable(){}
virtual std::string get_Formatted_String() { return m_name; }
virtual boost::uint32_t getLevel() { return m_level; }
virtual boost::uint32_t getPID() { return m_pid; }
virtual boost::thread::id getThreadID() { return m_thread_id; }
private:
boost::uint32_t m_area; boost::uint32_t m_level; boost::uint32_t m_pid; boost::thread::id m_thread_id; std::string m_name; };
template <typename UnsignedType> class NumberTraceable : public interface::ITraceable { public:
NumberTraceable ( boost::uint32_t area, boost::uint32_t level, std::string name, UnsignedType line ) : m_area ( area ), m_level ( level ), m_name ( name ), m_line ( line ) { #ifndef WIN32 m_pid = getpid(); #else m_pid = GetCurrentProcessId(); #endif /* WIN32 */
m_thread_id = boost::this_thread::get_id(); }
virtual ~NumberTraceable(){}
virtual std::string get_Formatted_String() { return m_name; }
virtual boost::uint32_t getLevel() { return m_level; }
virtual boost::uint32_t getPID() { return m_pid; }
virtual boost::thread::id getThreadID() { return m_thread_id; }
private:
boost::uint32_t m_area; boost::uint32_t m_level; boost::uint32_t m_pid; boost::thread::id m_thread_id; std::string m_name; UnsignedType m_line; };
template <typename T> struct storage_for { boost::aligned_storage
data; }; template <typename T> class Singleton : boost::noncopyable { public:
static T& Instance() { boost::call_once ( init, m_flag ); return *m_obj; }
static void init () { m_obj.reset ( new T() ); }
protected:
~Singleton(){} Singleton(){}
private:
static boost::scoped_ptr<T> m_obj; static boost::once_flag m_flag; };
} // namespace trace } // namespace trace
template <typename T> boost::scoped_ptr<T> myproject::trace::Singleton<T>::m_obj ( 0 ); template <typename T> boost::once_flag myproject::trace::Singleton<T>::m_flag = BOOST_ONCE_INIT;
namespace myproject { namespace trace {
class Trace_State_Utils { public:
static std::string get_Level_String ( boost::uint32_t level ) { // Write message prefix if ( level == api::TraceLevel::WARN ) { return "(WW) "; } else if ( level == api::TraceLevel::ERROR ) { return "(EE) "; } else if ( level == api::TraceLevel::INFO ) { return "(II) "; } else if ( level == api::TraceLevel::DEBUG ) { return "(DEBUG) "; } else if ( level == api::TraceLevel::DETAIL ) { return "(DETAIL) "; } else if ( level == api::TraceLevel::DATA ) { return "(DATA) "; } else { // We should not be here abort(); } } };
/* Idea taken from http://www.codeproject.com/debug/xytrace.asp * (28 Jan 2002 - Xiangyang Liu) * * I have modified this so that we don't use varargs and use constant types */ class Trace_State { public:
typedef boost::shared_ptr
ptr_t; Trace_State() : m_file_prefix ( "Trace" ) {}
virtual ~Trace_State();
Trace_State ( std::string prefix );
void set_Trace_File_Prefix ( std::string name );
void set_Trace_Level ( boost::uint32_t level );
void set_Trace_Area_Mask ( boost::uint32_t mask );
void open_Trace_File ( void );
std::string get_ID_String ( void );
void close_Trace_File ( void );
boost::uint32_t get_Trace_Level ( void ) const;
boost::uint32_t get_Trace_Area_Mask ( void ) const;
bool is_Valid_Level ( boost::uint32_t lvl );
bool is_Valid_Area_Mask ( boost::uint32_t mask );
void write_Message ( interface::ITraceable::ptr_t trace ) { // Write ID m_log_stream << boost::format("%s (pid=%d thread_id=" ) % this->get_ID_String() % trace->getPID() << trace->getThreadID() << ") " << std::flush;
m_log_stream << Trace_State_Utils::get_Level_String(trace->getLevel()) << trace->get_Formatted_String() << std::endl << std::flush; }
private:
std::string m_file_prefix;
std::ofstream m_log_stream; };
} /* namespace trace */ } /* namespace trace */
namespace myproject { namespace trace {
Trace_State::Trace_State( std::string prefix ) : m_file_prefix ( prefix ) {}
Trace_State::~Trace_State() { this->close_Trace_File(); }
void Trace_State::set_Trace_File_Prefix ( std::string name ) { }
bool Trace_State::is_Valid_Level ( boost::uint32_t lvl ) { return true; }
bool Trace_State::is_Valid_Area_Mask ( boost::uint32_t ) { return true; }
void Trace_State::set_Trace_Level ( boost::uint32_t level ) { }
void Trace_State::set_Trace_Area_Mask ( boost::uint32_t mask ) { }
void Trace_State::open_Trace_File ( void ) { if ( ! m_log_stream.is_open() ) { // Create file name std::stringstream name;
name << boost::format("%s_%s.txt") % m_file_prefix % this->get_ID_String();
std::string file_name = name.str();
m_log_stream.open ( file_name.c_str() ); } }
std::string Trace_State::get_ID_String ( void ) { // Create id string std::stringstream name;
// Get current time boost::posix_time::ptime now = boost::posix_time::second_clock::local_time();
std::tm tm_ref = boost::posix_time::to_tm ( now );
boost::gregorian::date today = now.date();
name << boost::format ( "%1%_%2%:%3%:%4%" ) % boost::gregorian::to_iso_extended_string ( today ) % tm_ref.tm_hour % tm_ref.tm_min % tm_ref.tm_sec;
return name.str(); }
void Trace_State::close_Trace_File ( void ) { if ( m_log_stream.is_open() ) { m_log_stream << std::flush;
m_log_stream.close(); } }
boost::uint32_t Trace_State::get_Trace_Level ( void ) const { return 0; }
boost::uint32_t Trace_State::get_Trace_Area_Mask ( void ) const { return 0; }
} /* namespace trace */ } /* namespace trace */
namespace myproject { namespace trace {
// // Turn Trace into a singleton class that contains a worker thread // - Wait for data // - If data exists then process // - Else wait // Add method for adding a ITraceable object. This will lock the logging vector, add // element then signal the boost condition variable class Trace : public Singleton<Trace> {
friend class Singleton<Trace>;
public:
~Trace();
static void write_Trace ( boost::uint32_t area, boost::uint32_t level, std::string name ) { interface::ITraceable::ptr_t string_ptr ( new StringTraceable ( area, level, name ) ); Trace::Instance().add_Trace ( string_ptr ); }
template <typename UnsignedType> static void write_Trace ( boost::uint32_t area, boost::uint32_t level, std::string format, UnsignedType value ) { interface::ITraceable::ptr_t unsigned_ptr ( new NumberTraceable<UnsignedType> ( area, level, format, value ) ); Trace::Instance().add_Trace ( unsigned_ptr ); }
void add_Trace ( interface::ITraceable::ptr_t trace );
void set_Trace_Level ( boost::uint32_t level );
void set_Trace_Area_Mask ( boost::uint32_t mask );
void open_Trace_File ();
void close_Trace_File();
private:
Trace();
// Worker thread functions void threadMain(); void processTraces(boost::unique_lockboost::mutex& lock);
// Synchronization variables boost::mutex m_lock; boost::condition_variable m_condition;
// Data variables typedef std::vector < myproject::interface::ITraceable::ptr_t > QueueType; QueueType m_queue;
// Thread variables boost::shared_ptrboost::thread m_thread; bool m_hasWork;
// Logging Trace_State m_state; };
} /* namespace trace */ } /* namespace myproject */
namespace myproject { namespace trace {
Trace::Trace () : m_hasWork ( true ), m_state ( "Trace" ) { // start thread m_thread = boost::shared_ptrboost::thread ( new boost::thread ( boost::bind ( &Trace::threadMain, this ) ) ); }
Trace::~Trace() { // stop thread { boost::lock_guardboost::mutex lock ( m_lock ); m_hasWork = false; }
m_condition.notify_one();
m_thread->join(); }
void Trace::add_Trace ( interface::ITraceable::ptr_t trace ) { // lock queue { boost::lock_guardboost::mutex lock ( m_lock ); m_queue.push_back ( trace ); }
m_condition.notify_one(); }
void Trace::threadMain() { boost::unique_lockboost::mutex lock ( m_lock );
while ( m_hasWork ) { if ( m_queue.size() == 0 ) { m_condition.wait ( lock ); } else { this->processTraces(lock); } } }
void Trace::processTraces(boost::unique_lockboost::mutex& lock) { assert ( lock.owns_lock() );
QueueType local_queue ( m_queue.begin(), m_queue.end() ); m_queue.clear();
lock.unlock();
for ( QueueType::iterator pos = local_queue.begin(); pos != local_queue.end(); pos++ ) { myproject::interface::ITraceable::ptr_t trace_obj = (*pos); m_state.write_Message ( trace_obj ); }
lock.lock(); }
void Trace::set_Trace_Level ( boost::uint32_t level ) { m_state.set_Trace_Level ( level ); }
void Trace::set_Trace_Area_Mask ( boost::uint32_t mask ) { m_state.set_Trace_Area_Mask ( mask ); }
void Trace::open_Trace_File() { m_state.open_Trace_File(); }
void Trace::close_Trace_File() { m_state.close_Trace_File(); }
} /* namespace trace */ } /* namespace myproject */
int main ( int, char** ) { namespace trace = myproject::trace; namespace api = myproject::api;
trace::Trace::Instance().open_Trace_File();
myproject::interface::ITraceable::ptr_t tmp_ptr ( new trace::StringTraceable( 0, api::TraceLevel::INFO, "funky" ) );
trace::Trace::Instance().add_Trace ( tmp_ptr );
trace::Trace::Instance().close_Trace_File(); }
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
Stephen Torri wrote:
In the logging example I have been working on to learn more about boost threads I have run across what I think is a race condition. If I compile and run this sample program it will sometimes write out an error message to a file. For example:
2010-01-23_16:2:23 (pid=10948 thread_id=0x92ce448) (II) funky
Other times I see the file is blank. So can I say conclusively that its a race condition? No. Either what is happening is a race condition where the file is closed before the contents are written to the file or something else is causing the logging message to be written sometimes and not at others.
You can see what is happening by making the writing of logs slower: void Trace::threadMain() { boost::this_thread::sleep(boost::posix_time::seconds(1)); // Insert this boost::unique_lockboost::mutex lock ( m_lock ); ... I'd say you need to join the threads before closing the file... HTH Roland
On Mon, 2010-01-25 at 19:38 +0100, Roland Bock wrote:
You can see what is happening by making the writing of logs slower:
void Trace::threadMain() { boost::this_thread::sleep(boost::posix_time::seconds(1)); // Insert this boost::unique_lockboost::mutex lock ( m_lock ); ...
I made it slower adding in the line. I did not see anything in the
output. How did you observe the program after adding that line? Did you
use a debugger or just execute it?
I ran it in gdb with the following commands:
(gdb) b myproject::trace::Trace::threadMain
(gdb) r
This got me into the threadMain when it is started for the first time.
Where I noticed that m_hasWork was set to false. I was expecting the
initalizers to be executed before calling the thread creation. I moved
the initialization of m_hasWork to just before the thread creation but
it appears to be still false when I ran it again. So I put it inside the
threadMain to set m_hasWork to true. Well it appears that there is a one
copy of m_hasWork in the thread memory space which is different from the
main program memory space.
I moved the file opening to just before the creating the thread and
closed if after the join in the destructor.
Now it writes but fails to exit. I believe this is related problem to
the m_hasWork since the thread never exits because its version of
m_hasWork is always true.
Am I creating the thread incorrectly? The following code shows the
problem.
Stephen
---------------------------
#include
Stephen Torri wrote:
On Mon, 2010-01-25 at 19:38 +0100, Roland Bock wrote:
You can see what is happening by making the writing of logs slower:
void Trace::threadMain() { boost::this_thread::sleep(boost::posix_time::seconds(1)); // Insert this boost::unique_lockboost::mutex lock ( m_lock ); ...
I made it slower adding in the line. I did not see anything in the output. How did you observe the program after adding that line? Did you use a debugger or just execute it?
Hi, if I read your original code correctly, you have two threads: one closes the file, one wants to write to it. If you have a racing condition, you can often see a change by making one thread slower. And voila, there is a change: Instead of something getting an empty log file, you now always get an empty log file (the close always wins the race). No debugger needed for this. Regarding your new problem: Please try to isolate the problem in code that compiles and consists of a few dozen lines (the less, the more probable that someone will take the time to read it). Regards, Roland
participants (3)
-
Roland Bock
-
Shiou Ming Lee
-
Stephen Torri