[BOOST THREAD] Threads Spawning Unexpectedly


Terrimane Pritchett

26 Apr 2009

11:52 p.m.

Hello,

I have taken an interest in using Boost Threads in my project. I have a strange issue that I am trying to debug.

I spawn N boost::thread instances because I know exactly how many I need. My program crashes after a time on a boost::bad_lexical_cast exception. I have placed a wrapper around boost::lexical_cast which uses a boost::recursive_mutex and boost::lock_guard combination to prevent multiple threads from calling boost::lexical_cast at the same time. Still, my project invariably chokes on boost::bad_lexical_cast being thrown when I can verify the cast should work, i.e. everything works as expected in my single-threaded implementation.

My real cause for alarm is this: when I debug my project I always discover there are more active boost::thread instances than I have explicitly created. I only create boost::thread instances in one location. I have perused the documentation but as of yet cannot find an explanation for this behavior. My assumption is that unless I explicitly instantiate a boost::thread it should not exist. However, I want to ask if this is indeed the case. If not, how so, and what is the rationale behind this behavior?

Here is what my code more or less looks like. This is all I do with boost::threads currently. How are more than N boost::threads spawning, where N is the number of elements in an arbitrary STL container which itself is not allowed to change in size?

    typedef boost::shared_ptr<boost::thread> THREADHANDLE;
    typedef std::vector<THREADHANDLE> THREADHANDLECONTAINER;

    THREADHANDLECONTAINER workerThreads;
    workerThreads.reserve(some_container.size());
    ...
    workerThreads.push_back(
        THREADHANDLE(new THREADHANDLE::element_type(
            Callable, arg1, boost::ref(arg2))));
    ...
    BOOST_FOREACH(const THREADHANDLE &workerThread, workerThreads)
    {
        workerThread->join();
    }//end loop

------------------------------------------

Here is my attempt at using boost::lexical_cast in a generic thread-safe manner. I am expecting exactly one thread may call boost::lexical_cast at a time with this wrapper, yet my application blows up solely on boost::bad_lexical_cast whenever I introduce Boost Threads. Am I missing the elephant in the room somehow?

    //LexicalCaster.h
    #include <boost/thread.hpp>

    struct LexicalCaster
    {
        template<typename TargetType,typename SourceType>
        inline static TargetType LexicalCast(const SourceType &source);
    private:
        static boost::recursive_mutex& GetMutex();
        static int& GetCount();
    };

    //LexicalCaster.cpp
    #include "LexicalCaster.h"

    boost::recursive_mutex _mtx;
    int count(0); //here for debugging purposes

    boost::recursive_mutex& LexicalCaster::GetMutex() { return _mtx; }
    int& LexicalCaster::GetCount() { return count; }

    //LexicalCaster.inl
    #include <iostream>
    #include <boost/lexical_cast.hpp>

    template<typename TargetType,typename SourceType>
    TargetType LexicalCaster::LexicalCast(const SourceType &source)
    {
        boost::lock_guard<boost::recursive_mutex> guardian(LexicalCaster::GetMutex());
        //the count will display sequentially if only
        //one thread of execution has access at a time
        std::cerr<<"Count: "<<LexicalCaster::GetCount()++<<std::endl;
        return boost::lexical_cast<TargetType>(source);
    }
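The spawn-and-join pattern described in this post can be sketched in a compilable form as follows. This is only an illustration under stated assumptions: it uses std::thread as a stand-in for boost::thread, and the names process_file and run_workers are hypothetical, not taken from the poster's project.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical worker: stands in for the real per-file processing.
void process_file(const std::string& path, std::atomic<std::size_t>& done) {
    (void)path;  // a real worker would open and parse the file here
    ++done;      // count completed workers so the caller can check the total
}

// Spawns exactly one thread per input file, joins them all, and returns
// how many workers ran to completion.
std::size_t run_workers(const std::vector<std::string>& files) {
    std::atomic<std::size_t> done(0);
    std::vector<std::thread> workers;
    workers.reserve(files.size());

    for (const std::string& f : files) {
        // std::cref/std::ref play the role of boost::ref in the post:
        // without them the arguments would be copied into the thread.
        workers.emplace_back(process_file, std::cref(f), std::ref(done));
    }
    for (std::thread& w : workers) {
        w.join();  // every spawned thread is joined exactly once
    }
    return done.load();
}
```

Calling run_workers with a vector of three file names should report three completed workers. Code of this shape creates one thread per element and nothing more, so any additional threads seen in a debugger would have to come from somewhere else in the process.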


Nigel Rantor

27 Apr

1:01 a.m.

Terrimane Pritchett wrote:

...

Hello,

I have taken interest in using Boost Threads in my project. I have a strange issue that I am trying to debug.

I spawn N boost::thread instances because I know exactly how many I need. My program crashes after a time on a boost::bad_lexical_cast exception. I have placed a wrapper around boost::lexical_cast which uses a boost::recursive_mutex and boost::lock_guard combination to prevent multiple threads from calling boost::lexical_cast at the same time.

What made you think that the exception was being thrown because the program is multithreaded?

...

Still, my project invariably chokes on boost::bad_lexical_cast being thrown when I can verify the cast should work i.e. everything works as expected in my single threaded implementation.

Have you checked the information that the exception is returning to you? Have you got the data that caused the exception?

...

My real cause for alarm is this....when I act to debug my project I always discover there are more active boost::thread instances than I have explicitly created. I only create boost::thread instances in one location.

Could you please elaborate as to why you think you have more threads than expected? How many? What other libraries are you using that may create threads?

...

Here is what my code more or less looks like - this is all I do with boost::threads currently - how are more than N boost::threads spawning where N is the number of elements in an arbitrary stl container which itself is not allowed to change in size?

Posting code that won't compile is worse than not posting anything at all.

A cursory glance at the lexical_cast source leads me to think that tracking down the data that caused the bad_lexical_cast to be thrown should be your first job.

Until you do that I wouldn't waste any time trying to create wrappers around libraries that may already be thread-safe.

Someone who is more familiar with the lexical_cast code may be able to say one way or another. If you had hard evidence it was a threading issue I'd spend some more time on it, but I'd put my money on the bad_lexical_cast being thrown because it could not perform the requested conversion, and nothing to do with threads.

Let us know how you get on with trying to track down the data that caused the exception to be thrown.

Regards,

Nigel

Terrimane Pritchett

3:27 a.m.

On Sun, Apr 26, 2009 at 6:01 PM, Nigel Rantor <wiggly@wiggly.org> wrote:

...

Terrimane Pritchett wrote:

...

*What made you think that the exception was being thrown because the program is multithreaded?*

Because I have a single-threaded implementation that uses boost::lexical_cast extensively without throwing any exceptions or generating any warnings/errors during compilation. Secondly, boost::lexical_cast does not guarantee thread safety, just as std::stringstream does not guarantee thread safety.

*Have you checked the information that the exception is returning to you?*

Yes, but if there is something specific you suggest I look for I would like input about that.

...

*Have you got the data that caused the exception?*

Yes. The data originates in a Collada document that has been vetted as sound and that I have used for testing purposes elsewhere.

...

*Could you please elaborate as to why you think you have more threads than expected? How many? What other libraries are you using that may create threads?*

The number of new threads that are generated is not predictable. I am using MSVS 2008. I can see every active thread in my application at any breakpoint. I can count how many threads are active and see what type of threads they are. I explicitly create N threads and count M threads, where N is less than M. Immediately after I spawn my threads I halt execution and count how many are active and what state they are in. At that point the only threads present are those I have spawned. I then let the app run and new threads appear which I did not explicitly create. I am using no other threading libraries in this project. I am testing Boost Threads for the purpose of becoming my base multi-threading API as opposed to, say, pthreads. There is one and only one point at which child threads of the parent process are allowed to spawn; that is my intention.

...


*A cursory glance at the lexical_cast source leads me to think that tracking down the data that caused the bad_lexical_cast to be thrown should be your first job.

Until you do that I wouldn't waste any time trying to create wrappers around libraries that may already be thread-safe.

Someone who is more familiar with the lexical_cast code may be able to say one way or another. If you had hard evidence it was a threading issue I'd spend some more time on it, but I'd put my money on the bad_lexical_cast being thrown because it could not perform the requested conversion, and nothing to do with threads.*

I have a complete single-threaded implementation which executes over the exact same data, and boost::lexical_cast performs exactly as it should using that same input data. I suppose I will need you to qualify what you would consider to be hard evidence here.

...

*Let us know how you get on with trying to track down the data that caused the exception to be thrown.*

I already have done this. I get data from Collada documents as std::strings. I print them out. I have boost::tokenizer tokenize the strings. I print out the tokens. I have boost::lexical_cast convert the tokens to plain old data types. In the single-threaded implementation every cast can be checked. It is more difficult with the multithreaded application but generally the same thing is done. Perhaps I misunderstand what data it is that you are referring to. If so, I'll find it out after I correct my misunderstanding.

...

Best,

Shon

...

_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users

Igor R

8:26 a.m.

...

Secondly, boost::lexical_cast does not guarantee thread safety - just as std::stringstream does not guarantee thread safety.

What do you mean by "thread-safety"? It is safe to use *distinct* std::stringstream objects in different threads without synchronization. It is not safe to use a *shared* std::stringstream without synchronization. Thus, lexical_cast may be used in different threads without synchronization, because every lexical_cast call uses its own local streambuf.
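The distinction between distinct and shared stream objects can be illustrated with a small sketch. It assumes std::istringstream and std::thread as stand-ins (it does not use boost::lexical_cast itself), and the names to_int and count_mismatches are hypothetical. Each call builds its own local stream, so no stream object is ever shared between threads and no mutex is used.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Converts text to int through a stream that is local to this call.
// Returns false when the text is not a complete integer.
bool to_int(const std::string& text, int& out) {
    std::istringstream in(text);  // distinct object per call, never shared
    in >> out;
    return !in.fail() && in.eof();
}

// Runs the same conversions on several threads with no mutex and counts
// how many results differ from the expected value.
std::size_t count_mismatches(std::size_t thread_count, int iterations) {
    std::vector<std::size_t> mismatches(thread_count, 0);
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < thread_count; ++t) {
        workers.emplace_back([t, iterations, &mismatches] {
            for (int i = 0; i < iterations; ++i) {
                int value = 0;
                if (!to_int(std::to_string(i), value) || value != i) {
                    ++mismatches[t];  // each thread writes only its own slot
                }
            }
        });
    }
    for (std::thread& w : workers) {
        w.join();
    }
    std::size_t total = 0;
    for (std::size_t m : mismatches) {
        total += m;
    }
    return total;
}
```

A run that reports zero mismatches is consistent with the point made above, but it is not a proof: whether per-call streams are safe to use concurrently ultimately depends on the guarantees of the standard library implementation in use.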

...

Yes, but if there is something specific you suggest I look for I would like input about that.

Do you use MSVC? If yes, please do as follows:

1) Go to menu Debug-->Exceptions...
2) In the exception tree open the C++ exception branch and add the "std::bad_cast" exception, then enable its check box.

Now run your application. It will stop when lexical_cast throws the exception, but before its handler is found (i.e. before the stack is unwound). Use the "Call Stack" window to see what parameter was passed to lexical_cast.

...

The number of new threads that are generated is not predictable. I am using MSVS 2008. I can see every active thread in my application at any breakpoint. I can count how many threads are active and see what type of threads they are. I explicitly create N threads and count M threads, where N is less than M. Immediately after I spawn my threads I halt execution and count how many are active and what state they are in. At that point the only threads present are those I have spawned. I then let the app run and new threads appear which I did not explicitly create.

You see *all* the threads in the application, not only those you started with boost::thread(). You probably use some third-party libraries or COM, or some other facilities that launch threads implicitly. In the "Threads" window you can see the type of every thread (like pthread, RPC thread, Win32 thread etc.) and its call stack, so you can figure out where the thread comes from.

Nigel Rantor

10:55 a.m.

Terrimane Pritchett wrote:

...

On Sun, Apr 26, 2009 at 6:01 PM, Nigel Rantor <wiggly@wiggly.org> wrote:

Terrimane Pritchett wrote:

/What made you think that the exception was being thrown because the program is multithreaded?/

Because I have a single-threaded implementation that uses boost::lexical_cast extensively without throwing any exceptions or generating any warnings/errors during compilation. Secondly, boost::lexical_cast does not guarantee thread safety, just as std::stringstream does not guarantee thread safety.

See Igor's comments on thread-safety.

The main situation where a function may not be thread-safe is where it accesses shared data. I don't believe this is the case with lexical_cast, so I'm having a problem thinking how it could cause a problem.

If it *was* using shared data then I would expect to see data errors rather than a consistent exception being thrown.

...

/Have you checked the information that the exception is returning to you?/

Yes, but if there is something specific you suggest I look for I would like input about that.

I would wrap the call to lexical_cast in a try block that catches the exception and writes out the data that was passed in and perhaps some of the type information that the exception contains and is telling you.

e.g.

-------------------------------------------------------
#include <iostream>
#include <string>
#include <boost/lexical_cast.hpp>

using namespace std;
using namespace boost;

int main( int argc, char** argv )
{
    if( argc < 2 )
    {
        cout << "please supply an argument to convert to an int\n";
        return 0;
    }

    string data( argv[1] );

    try
    {
        int i = lexical_cast<int>( data );

        cout << "data str : " << data << "\n";
        cout << "data int : " << i << "\n";
    }
    catch( bad_lexical_cast& e )
    {
        // report the exact input and the types involved in the failed cast
        cout << "bad lexical cast : " << e.what() << "\n";
        cout << "data : " << data << "\n";
        cout << "source : " << e.source_type().name() << "\n";
        cout << "target : " << e.target_type().name() << "\n";
    }

    return 0;
}
-------------------------------------------------------

...

/Have you got the data that caused the exception?/

Yes. The data originates in a Collada document that has been vetted as sound and that I have used for testing purposes elsewhere.

See above. I actually meant the specific data that caused the exception to be thrown. i.e. The exact data that was passed to the call to lexical_cast that threw.

...

/Could you please elaborate as to why you think you have more threads than expected? How many? What other libraries are you using that may create threads?/

The number of new threads that are generated is not predictable. I am using MSVS 2008. I can see every active thread in my application at any breakpoint. I can count how many threads are active and see what type of threads they are. I explicitly create N threads and count M threads, where N is less than M. Immediately after I spawn my threads I halt execution and count how many are active and what state they are in. At that point the only threads present are those I have spawned. I then let the app run and new threads appear which I did not explicitly create.

Okay, this is a different problem (I think).

If you're sure that your code where you spawn threads is not being called again then I would suggest you do have an external library spawning threads behind your back.

There is no reason that the boost threading library would be doing this.

I suppose the most thorough way of figuring this out would be to use a debugger after all of your threads are spawned and set breakpoints on your system's thread creation calls.

...

I have a complete single threaded implementation which executes over the exact same data - boost::lexical_cast performs exactly as it should using that same input data.

I suppose I will need you to qualify what you would consider to be hard evidence here.

Okay. Sorry if I sound like I doubt you, but I have no idea who you are and skepticism is my default stance.

Just because a single-threaded program iterates over the same data without error does not mean that lexical_cast is the culprit here; it may simply be that you're not synchronizing some other piece of code, which eventually means lexical_cast gets fed some bad data.

...

/Let us know how you get on with trying to track down the data that caused the exception to be thrown./

I already have done this. I get data from Collada documents as std::strings. I print them out. I have boost::tokenizer tokenize the strings. I print out the tokens. I have boost::lexical_cast convert the tokens to plain old data types. In the single-threaded implementation every cast can be checked. It is more difficult with the multithreaded application but generally the same thing is done.

This is exactly what you need to do. You have to get the code to print out exactly what it was trying to convert when the exception was thrown. See above.

...

Perhaps I misunderstand what data it is that you are referring to. If so, I'll find it out after I correct my misunderstanding.

Keep us updated, I'm interested.

n

Terrimane Pritchett

6 p.m.

I suspect my continued problem lies with boost::tokenizer. For more on that, skip to the bottom of this post.

First I want to respond to some of the valid concerns presented to me.

The input data *shouldn't* be shared. It shouldn't be, but as there is something wrong here I must validate all assumptions.

I am essentially using a boost::thread to process a file, for 1 to N files. After I get it working I'll control how many files are processed at once etc. What is relevant here is that each file is processed in a separate thread and there is no communication or sharing of data between threads.

(Don't misinterpret this as criticism...I LOVE all things boost, believe me!) If boost::lexical_cast were 100% thread safe I would have no need for guarding access to calling it. Igor R mentioned boost::lexical_cast is thread safe because the function has its own local streambuf instance. I need to do more research but my understanding is that this isn't the case. I was led to believe boost::lexical_cast was not thread safe mostly by reading this discussion:

http://lists.boost.org/Archives/boost/2006/09/109907.php

Basically, if locale, stringstream etc. can access global data (which they may) and boost::lexical_cast is built on top of them without protections in place to guarantee thread safety (which I cannot readily see in the documentation or code), then boost::lexical_cast is not thread safe. Before I'm taken out back and beaten, I admit I haven't spent much time exploring how boost::lexical_cast is implemented. I try to stay away from implementation details when implementations may change and I may be tempted to leverage something I shouldn't. I am open to knowledge, so by all means impart some upon me if you have some to offer.

Back to my problem. It's been suggested that some other library is spawning threads behind my back. I can report that is not the case. I have moved back to the single-threaded implementation and *ONLY* my main thread is a part of my application's process. When I move to the multi-threaded implementation that leverages boost::thread I now have new threads appearing...some I spawned...others I did not. The only difference between both implementations is the inclusion of Boost Threads in the project. If some library were spawning threads behind my back I would see them in the single-threaded application at some point and also be able to decipher what kind of threads they were. Neither MSVC nor the OS detects anything other than my main thread running in my single-threaded implementation.

Something must explain where the extra threads are coming from.

I have full confidence that boost::lexical_cast is getting bad data. The question for me is how is that possible? Data *shouldn't* be shared. I *shouldn't* be seeing more threads than I explicitly create. I may be able to answer how boost::lexical_cast is getting bad data...but I don't have any clues as to where my extra threads are coming from.

My suspicion is that boost::tokenizer is supplying my calls to boost::lexical_cast with bad data. I looked at the documentation for boost::tokenizer and saw no mention of thread safety anywhere. I then took a look at the source of the boost token_iterator class since it is really what does the heavy lifting in tokenizing strings. Specifically I had a look at token_functions.hpp. Here it can be seen that the standard cctype library is leveraged. This library may not be thread safe because implementors are allowed to leverage global/static data (particularly locales, if memory serves) without regard for shared access by multiple threads. While I have guarded boost::lexical_cast, I have not guarded access to boost::tokenizer or its iterator. I will attempt to add proper protections while I await any incoming thoughts and input on the matter.

Shon

P.S. I did catch boost::bad_lexical_cast...it never suspected the cast was getting bad data. I wanted to know why. Still, if anyone is interested in what sort of bad data is coming in and causing the exception to be thrown, here is a sample.

    try
    {
        std::cout<<"string input: "<<"<begin>"<<source<<"<end>"<<"\n";
        TargetType convertedValue = boost::lexical_cast<TargetType>(source);
        std::cout<<"float output: "<<convertedValue<<std::endl;
        return convertedValue;
    }
    catch (boost::bad_lexical_cast& e)
    {
        std::cerr<<"bad lexical cast: "<<e.what()<<"\n";
        std::cerr<<"data: "<<source<<"\n";
        std::cerr<<"source: "<<e.source_type().name()<<"\n";
        std::cerr<<"target: "<<e.target_type().name()<<std::endl;
    }

    Call Count: 14073
    string input: <begin>80.8047<end>
    float output: 80.8047
    Call Count: 14074
    string input: <begin>348657 <end>
    bad lexical cast: bad lexical cast: source type value could not be interpreted as target
    data: 348657
    source: class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >
    target: float

*There is a trailing space at the end of the input string to the casting function*

*Other random anomalies cause problems but they are expected when the data is being stomped somewhere*


Igor R

6:46 p.m.

...

If boost::lexical_cast were 100% thread safe I would have no need for guarding access to calling it. Igor R mentioned boost::lexical_cast is thread safe because the function has its own local streambuf instance. I need to do more research but my understanding is that this isn't the case. I was led to believe boost::lexical_cast was not thread safe mostly by reading this discussion:

http://lists.boost.org/Archives/boost/2006/09/109907.php

Well, of course it cannot be "more thread-safe" than the underlying C++ standard library. My remark primarily related to the case of MSVC (http://msdn.microsoft.com/en-us/library/c9ceah3b.aspx), but you're right, it's not 100% portable thread safety.

Nigel Rantor

7:25 p.m.

Terrimane Pritchett wrote:

...

I suspect my continued problem lies with boost::tokenizer...for more on that skip to the bottom of this post.

First I want to respond to some of the valid concerns presented to me.

The input data *shouldn't* be shared. It shouldn't be, but as there is something wrong here I must validate all assumptions.

I am essentially using a boost::thread to process a file, for 1 to N files. After I get it working I'll control how many files are processed at once etc. What is relevant here is that each file is processed in a separate thread and there is no communication or sharing of data between threads.

Okay, great. I agree, if you've got a one thread per file then that sounds like they can't stomp on each other.

...

(Don't misinterpret this as criticism...I LOVE all things boost, believe me!) If boost::lexical_cast were 100% thread safe I would have no need for guarding access to calling it. Igor R mentioned boost::lexical_cast is thread safe because the function has its own local streambuf instance. I need to do more research but my understanding is that this isn't the case. I was led to believe boost::lexical_cast was not thread safe mostly by reading this discussion:

http://lists.boost.org/Archives/boost/2006/09/109907.php

Okay, fair enough. As I said my opinion was based on a cursory glance at the lexical_cast source and the assumption that it didn't use any shared state.

...

Back to my problem. It's been suggested that some other library is spawning threads behind my back. I can report that is not the case. I have moved back to the single-threaded implementation and *ONLY* my main thread is a part of my application's process. When I move to the multi-threaded implementation that leverages boost::thread I now have new threads appearing...some I spawned...others I did not. The only difference between both implementations is the inclusion of Boost Threads in the project. If some library were spawning threads behind my back I would see them in the single-threaded application at some point and also be able to decipher what kind of threads they were. Neither MSVC nor the OS detects anything other than my main thread running in my single-threaded implementation.

Something must explain where the extra threads are coming from.

I agree. Something must explain it.

My next suggestion for your threading issue would be to take your current program and keep on cutting it down to a smaller and smaller program that exhibits the problem.

When you've got it as small as possible then it will either:

a) be clear what you removed to make it work as expected

OR

b) be a small enough program that you can post it here for others to test

...

I have full confidence that boost::lexical_cast is getting bad data. The question for me is how is that possible? Data *shouldn't* be shared. I *shouldn't* be seeing more threads than I explicitly create.

Well, it's somewhere in the program of course.

I think your suggestion that the tokeniser may be to blame is possibly the right one. If you have the ability to guard that with a mutex as a quick test, that would be quite cool. Otherwise I'd suggest doing the same as the above and sending us a working program that exhibits the problem.

If that is not possible for some reason, and there are many reasons why you may not be able to, then I'd suggest playing with guarding the tokeniser first and then seeing how it goes. It's hard to make suggestions without a fair amount of context about the actual code.
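The quick test suggested here, guarding the tokenising step with a mutex, might look roughly like the sketch below. It is an illustration under assumptions: split_tokens is a hypothetical whitespace splitter standing in for the real boost::tokenizer call, and std::mutex with std::lock_guard stands in for the Boost equivalents.

```cpp
#include <mutex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the real tokenizing step: splits on whitespace.
std::vector<std::string> split_tokens(const std::string& line) {
    std::istringstream in(line);
    std::vector<std::string> tokens;
    std::string token;
    while (in >> token) {
        tokens.push_back(token);
    }
    return tokens;
}

// Serialises every call to split_tokens behind a single mutex, so at most
// one thread is inside the tokenizing code at any time.
std::vector<std::string> guarded_split(const std::string& line) {
    static std::mutex guard;                  // shared by all callers
    std::lock_guard<std::mutex> lock(guard);  // released when lock leaves scope
    return split_tokens(line);
}
```

If the exceptions disappear once every worker goes through a guarded call like this, that would point at the tokenising step; if they persist, the bad data is more likely being produced elsewhere. Either outcome is only a hint, not a diagnosis.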

...

*There is a trailing space at the end of the input string to the casting function* *Other random anomalies cause problems but they are expected when the data is being stomped somewhere*

Well, the extra space is why it's throwing lexical cast...the question of course is why is the tokeniser stripping the space in your single-threaded version (you already said you were using the same data set in both runs) but not in the multi-threaded version.

Good luck, I can't wait to see what happens with your next experiment.

Regards,

Nigel
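The effect of the trailing space can be reproduced with a small sketch. It assumes a strict, all-or-nothing conversion built on std::istringstream that imitates the behaviour shown in the log above; it is not boost::lexical_cast itself, and the names strict_to_float and trim are hypothetical.

```cpp
#include <sstream>
#include <string>

// Strict conversion: succeeds only if the whole string is consumed.
bool strict_to_float(const std::string& text, float& out) {
    std::istringstream in(text);
    in >> std::noskipws >> out;  // do not silently skip leading whitespace
    // Any character left over after the number, including a space, is a failure.
    return !in.fail() && in.get() == std::char_traits<char>::eof();
}

// Removes leading and trailing whitespace from a token.
std::string trim(const std::string& text) {
    const char* ws = " \t\r\n";
    const std::string::size_type first = text.find_first_not_of(ws);
    if (first == std::string::npos) {
        return std::string();  // the token was empty or all whitespace
    }
    const std::string::size_type last = text.find_last_not_of(ws);
    return text.substr(first, last - first + 1);
}
```

With these helpers, "80.8047" converts, "348657 " is rejected, and trim("348657 ") converts again. Trimming would hide the symptom, though; it does not explain why the token carries a trailing space only in the multi-threaded run.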

John Wilkinson

7:38 p.m.

Perhaps I missed something, but have you looked at the call stack of the "extra" threads in the debugger? IIRC, if you link to the multithreaded Windows runtime, you will get background threads as a matter of course. You should be able to look at the call stack of each thread, and determine whether or not it is one created in your 1...N loop. You might try setting N to 2 before you do that. Does the number of "extra" threads stay constant if you vary N?

John

Terrimane Pritchett

10:05 p.m.

I have sprinkled some boost::lock_guards around my instantiations of boost::tokenizer and the subsequent iteration over tokens. I am currently running a debug run of my multi-threaded implementation. It has been running for about 3 hours without crashing - bad_lexical_cast or otherwise. I have been pretty quick and dirty with my fixes, and while I realize correlation is not causation...I may be on the right path to fixing my problems. If the debug run succeeds it will take several hours to complete, so anyone paying attention please bear with me.

Nigel- I will follow your suggestions and report back as soon as I can. I can upload code but I will need to remove the dependency on the Collada DOM library, which is used to parse Collada documents. I will also need to find/create some suitable test input. It will take me a bit to write a test app anyone can compile. Be aware I won't go so far as to bother anyone to look at my code unless my theory about boost::tokenizer issues proves to be false or unwieldy to contain.

John- What you say about the Windows runtime may be a possibility but I am given pause by one thing...I don't change any compiler options for the single-threaded or multi-threaded implementation. Both implementations are linking against the same runtime. I will have to research what goes on with boost::threads a bit further but for now at least there is a starting point for investigation.

On Mon, Apr 27, 2009 at 12:38 PM, John Wilkinson <jwilkinson@tsystem.com> wrote:

...

Perhaps I missed something, but have you looked at the call stack of the “extra” threads in the debugger? IIRC, if you link to the multithreaded Windows runtime, you will get background threads as a matter of course. You should be able to look at the call stack of each thread, and determine whether or not it is one created in your 1…N loop. You might try setting N to 2 before you do that. Does the number of “extra” threads stay constant if you vary N?

John

------------------------------

*From:* boost-users-bounces@lists.boost.org [mailto: boost-users-bounces@lists.boost.org] *On Behalf Of *Terrimane Pritchett *Sent:* Monday, April 27, 2009 1:00 PM *To:* boost-users@lists.boost.org *Subject:* Re: [Boost-users] [BOOST THREAD] Threads Spawning Unexpectedly

I suspect my continued problem lies with boost::tokenizer...for more on that skip to the bottom of this post.

First I want to respond to some of the valid concerns presented to me.

The input data *shouldn't* be shared. It shouldn't be but as there is something wrong here I must validate all assumptions.

I am essentially using a boost::thread to process a file for 1 to N files. After I get it working I'll control how many files are processed at once etc. What is relevant here is that each file is processed in a separate thread and there is no communication or sharing of data between threads.

(don't misinterpret this as criticism...I LOVE all things boost believe me!) If boost::lexical_cast were 100% thread safe I would have no need for guarding access to calling it. Igor R mentioned boost::lexical_cast is thread safe because the function has its own local streambuf instance. I need to do more research but my understanding is that this isn't the case. I was led to believe boost::lexical_cast was not thread safe mostly by reading this discussion:

http://lists.boost.org/Archives/boost/2006/09/109907.php

Basically, if locale,stringstream etc can access global data (which they may) and boost::lexical_cast is built on top of them without protections in place to guarantee thread safety (which I cannot readily see in the documentation or code) then boost::lexical_cast is not thread safe.

Before I'm taken out back and beaten I admit I haven't spent much time exploring how boost::lexical_cast is implemented. I try to stay away from implementation details when implementations may change and I may be tempted to leverage something I shouldn't. I am open to knowledge so by all means impart some upon me if you have some to offer.

Back to my problem. It's been suggested that some other library is spawning threads behind my back. I can report that is not the case. I have moved back to the single-threaded implementation and *ONLY* my main thread is a part of my application's process. When I move to the multi-threaded implementation that leverages boost::thread I now have new threads appearing...some I spawned...others I did not. The only difference between both implementations is the inclusion of Boost Threads to the project. If some library were spawning threads behind my back I would see them in the single threaded application at some point and also be able to decipher what kind of threads they were. Neither MSVC nor the OS detects anything other than my Main thread running in my single threaded implementation.

Something must explain where the extra threads are coming from.

I have full confidence that boost::lexical_cast is getting bad data. The question for me is how is that possible? Data *shouldn't* be shared. I *shouldn't* be seeing more threads than I explicitly create.

I may be able to answer how boost::lexical_cast is getting bad data...but I don't have any clues as to where my extra threads are coming from.

My suspicion is that boost::tokenizer is supplying my calls to boost::lexical_cast with bad data. I looked at the documentation for boost::tokenizer and saw no mention of thread safety anywhere. I then took a look at the source of the boost token_iterator class since it is really what does the heavy lifting in tokenizing strings. Specifically I had a look at token_functions.hpp. Here it can be seen that the standard cctype library is leveraged. This library may not be thread safe because implementors are allowed to leverage global/static data (particularly locales if memory serves) without regard for shared access by multiple threads.

While I have guarded boost::lexical_cast I have not guarded access to boost::tokenizer or its iterator. I will attempt to add proper protections while I await any incoming thoughts and input on the matter.

Shon

P.S. I did catch boost::bad_lexical_cast...it never suspected the cast was getting bad data. I wanted to know why. Still, if anyone is interested in what sort of bad data is coming in and causing the exception to be thrown, here is a sample.

try
{
    std::cout<<"string input: "<<"<begin>"<<source<<"<end>"<<"\n";

    TargetType convertedValue = boost::lexical_cast<TargetType>(source);

    std::cout<<"float output: "<<convertedValue<<std::endl;

    return convertedValue;
}
catch (boost::bad_lexical_cast& e)
{
    std::cerr<<"bad lexical cast: "<<e.what()<<"\n";
    std::cerr<<"data: "<<source<<"\n";
    std::cerr<<"source: "<<e.source_type().name()<<"\n";
    std::cerr<<"target: "<<e.target_type().name()<<std::endl;
}

Call Count: 14073
string input: <begin>80.8047<end>
float output: 80.8047
Call Count: 14074
string input: <begin>348657 <end>
bad lexical cast: bad lexical cast: source type value could not be interpreted as target
data: 348657 
source: class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >
target: float

*There is a trailing space at the end of the input string to the casting function* *Other random anomalies cause problems but they are expected when the data is being stomped somewhere*

On Mon, Apr 27, 2009 at 3:55 AM, Nigel Rantor <wiggly@wiggly.org> wrote:

Terrimane Pritchett wrote:

On Sun, Apr 26, 2009 at 6:01 PM, Nigel Rantor <wiggly@wiggly.org> wrote:

Terrimane Pritchett wrote:

/What made you think that the exception was being thrown because the program is multithreaded?/

Because I have a single threaded implementation that uses boost::lexical_cast extensively without throwing any exceptions or generating any warning/errors during compilation. Secondly, boost::lexical_cast does not guarantee thread safety - just as std::stringstream does not guarantee thread safety.

See Igor's comments on thread-safety.

The main situation where a function may not be thread-safe is where it accesses shared data. I don't believe this is the case with lexical_cast, so I'm having a problem thinking how it could cause a problem.

If it *was* using shared data then I would expect to see data errors rather than a consistent exception being thrown.

/Have you checked the information that the exception is returning to you?/

Yes, but if there is something specific you suggest I look for I would like input about that.

I would wrap the call to lexical_cast in a try block that catches the exception and writes out the data that was passed in and perhaps some of the type information that the exception contains and is telling you.

e.g.

-------------------------------------------------------
#include <iostream>
#include <boost/lexical_cast.hpp>

using namespace std;
using namespace boost;

int main( int argc, char** argv )
{
    if( argc < 2 )
    {
        cout << "please supply an argument to convert to an int\n";
        return 0;
    }

    string data( argv[1] );

    try
    {
        int i = lexical_cast<int>( data );

        cout << "data str : " << data << "\n";
        cout << "data int : " << i << "\n";
    }
    catch( bad_lexical_cast& e )
    {
        cout << "bad lexical cast : " << e.what() << "\n";
        cout << "data : " << data << "\n";
        cout << "source : " << e.source_type().name() << "\n";
        cout << "target : " << e.target_type().name() << "\n";
    }

    return 0;
}
-------------------------------------------------------

/Have you got the data that caused the exception?/

Yes. The data originates in a Collada document that has been vetted as sound and that I have used for testing purposes elsewhere.

See above. I actually meant the specific data that caused the exception to be thrown. i.e. The exact data that was passed to the call to lexical_cast that threw.

/Could you please elaborate as to why you think you have more threads than expected? How many? What other libraries are you using that may create threads?/

The number of new threads that are generated is not predictable. I am using MSVS 2008. I can see every active thread in my application at any breakpoint. I can count how many threads are active and see what type of threads they are. I explicitly create N threads and count M threads where N is less than M. Immediately after I spawn my threads I halt execution and count how many are active and what state they are in. At that point the only threads present are those I have spawned. I then let the app run and new threads appear which I did not explicitly create.

Okay, this is a different problem (I think).

If you're sure that your code where you spawn threads is not being called again then I would suggest you do have an external library spawning threads behind your back.

There is no reason that the boost threading library would be doing this.

I suppose the most thorough way of figuring this out would be to use a debugger after all of your threads are spawned and set breakpoints on your system's thread creation calls.

I have a complete single threaded implementation which executes over the exact same data - boost::lexical_cast performs exactly as it should using that same input data.

I suppose I will need you to qualify what you would consider to be hard evidence here.

Okay. Sorry if I sound like I doubt you, but I have no idea who you are and skepticism is my default stance.

Just because a single-threaded program iterates over the same data without error does not mean that lexical_cast is the culprit here; it may simply be that you're not synchronizing some other piece of code, which eventually means lexical_cast gets fed some bad data.

/Let us know how you get on with trying to track down the data that caused the exception to be thrown./

I already have done this. I get data from Collada documents as std::strings. I print them out. I have boost::tokenizer tokenize the strings. I print out the tokens. I have boost::lexical_cast convert the tokens to plain ole data types. In the single threaded implementation every cast can be checked. It is more difficult with the multithreaded application but generally the same thing is done.

This is exactly what you need to do. You have to get the code to print out exactly what it was trying to convert when the exception was thrown. See above.

Perhaps I misunderstand what data it is that you are referring to. If so, I'll find it out after I correct my misunderstanding.

Keep us updated, I'm interested.

n

_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users


John Wilkinson

28 Apr

7:39 p.m.

________________________________________ From: boost-users-bounces@lists.boost.org [mailto:boost-users-bounces@lists.boost.org] On Behalf Of Terrimane Pritchett Sent: Monday, April 27, 2009 5:06 PM To: boost-users@lists.boost.org Subject: Re: [Boost-users] [BOOST THREAD] Threads Spawning Unexpectedly

...

What you say about the Windows runtime may be a possibility but I am given pause by one thing...I don't change any compiler options for the single-threaded or multi-threaded implementation. Both implementations are linking against the same runtime. I will have to research what goes on with boost::threads a bit further but for now at least there is a starting point for investigation.

That may be part of your problem. Assuming that you are using Visual Studio, go to the project properties dialog. In the tree at the left, select Configuration Properties - C/C++ - Code Generation. About halfway down the list on the right, you will see a "Runtime Library" property. Set it to the correct value for your program. The choice should be obvious. John

Igor R

7:56 p.m.

...

That may be part of your problem. Assuming that you are using Visual Studio, go to the project properties dialog. In the tree at the left, select Configuration Properties - C/C++ - Code Generation. About halfway down the list on the right, you will see a "Runtime Library" property. Set it to the correct value for your program. The choice should be obvious.

The single-threaded CRT is only available in MSVC7 and below, so if the OP uses a more modern version, an invalid CRT threading choice should not be a problem.

Terrimane Pritchett

9:28 p.m.

Hello everyone,

Sorry for the delay in coming back to you. I managed to cause my workstation to overheat and die yesterday and had to wait for a replacement.

I have managed to get my multi-threaded implementation to run without any obvious problems. Basically I have to make the following actions atomic by wrapping them in a function call:

1. Invoke an external library to parse my file and give me some data in one big blob
2. Instantiate boost::tokenizer to tokenize the data
3. Convert the data to usable types with boost::lexical_cast
4. Return the usable data to the caller of the function

I really cannot say with certainty what went wrong but I will offer my best estimation. I use an external library to parse my files. The library relies on std::stringstream to convert its internal types to byte strings. I had multiple threads calling the same functions from the library, and given std::stringstream is not thread safe there was opportunity for the data I requested to get corrupted before my code ever touched it.

I use boost::lexical_cast and boost::tokenizer extensively, both of which are not guaranteed to be thread safe - because this is dependent upon the thread safety guarantees, or lack thereof, of the standard C/C++ library implementations they are built on top of. In particular, the standard does not require thread safety for std::stringstream, std::locale or the cctype library. So potentially multiple threads all using boost::lexical_cast or boost::tokenizer could have caused my data to be corrupted. These are all only preliminary findings - so take them for what you will.

The obvious solution was to make getting raw data and converting it to something usable as atomic as possible and then impose my own thread safety measures. I've been able to complete several passes of my test units without any problems after making these changes, for both debug and release builds.
On the issue of my extra threads - I can only say that for every boost::thread I spawn there is another thread spawned. The companion thread is something that spins around in the function "free" until the boost::thread completes. I'm sure it's an implementation detail I shouldn't concern myself about, since the behavior is predictable and doesn't appear to be causing my application any adverse effects. Now that my application runs to completion without crashing I see these "extra" threads disappear as my boost::threads disappear, so I've probably been fretting over nothing.

If I've obviously misunderstood, misstepped, or misrepresented something please correct me.
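The four steps listed in the message above can be sketched as a single function that holds one lock for its whole duration. This is only an illustration under stated assumptions, not the poster's actual code: std::mutex and std::lock_guard stand in for the Boost equivalents, whitespace splitting and conversion with std::istringstream stand in for boost::tokenizer and boost::lexical_cast, and read_blob is a hypothetical placeholder for the external parsing library.

```cpp
#include <mutex>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

namespace {
// One process-wide lock: coarse, but it makes the whole sequence atomic.
std::mutex g_conversion_mutex;

// Hypothetical stand-in for the external parsing library (step 1).
// Here it simply hands back the text it was given.
std::string read_blob(const std::string& source) { return source; }
}  // namespace

// Fetch, tokenize, convert, and return under a single lock so that no
// other thread can interleave with any of the four steps.
std::vector<float> load_floats(const std::string& source) {
    std::lock_guard<std::mutex> guard(g_conversion_mutex);

    const std::string blob = read_blob(source);   // step 1: get raw data
    std::istringstream tokens(blob);              // step 2: split on whitespace
    std::vector<float> values;
    std::string token;
    while (tokens >> token) {
        std::istringstream converter(token);      // step 3: convert one token
        float value = 0.0f;
        char leftover;
        if (!(converter >> value) || converter.get(leftover)) {
            throw std::invalid_argument("cannot convert token: " + token);
        }
        values.push_back(value);
    }
    return values;                                // step 4: hand back results
}
```

Holding one global lock serializes all parsing, so this trades parallelism for safety. If only the external library turns out to need protection, the lock could later be narrowed to step 1.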

...



participants (4)

Igor R
John Wilkinson
Nigel Rantor
Terrimane Pritchett