deep copying reference_wrapper<>s in a bind()


Edd Dawson

11 Jul 2007

7:29 p.m.

Hi folks,

I'm currently attempting to create a little facility that will allow me to easily kick off a function call in another thread and get the result through a "future". Something like:

    unsigned count_primes_upto(std::size_t max);

    async::layered_cage
    <
        std::exception,
        std::domain_error,
        async::catch_all
    > cage; // to catch exceptions

    // call count_primes_upto(9876543210) in another thread
    async::future<unsigned> n = async::call(cage, &count_primes_upto, 9876543210);

    // do other stuff in the mean time
    // ...

    // Block until done. Get result or propagate exception
    std::cout << "there are " << n.value() << " primes less than " << 9876543210 << '\n';

To achieve this I'm using boost::bind and boost::function quite heavily. I've discovered that by the time the final boost::function<void ()> is composed for the boost::thread constructor, I've copied each of the arguments (such as 9876543210 in the above) about 30 times!

Now, I can use boost::ref() internally in such a way that no copying of arguments happens at all. However, this is also undesirable; I would like the thread that is created to have its own copy of each argument unless the client explicitly wraps an argument in boost::ref/cref in order to avoid dangling cross-thread references.

So what I'd like is a way to copy the result of my final boost::bind() call into a new functor where all the ref() arguments are deep-copied once and only once at the end.

Is there some way of doing this? Perhaps the "as yet undocumented" visit_each is relevant here?

Kind regards,

Edd
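A rough way to see where copies like these come from is to count copy constructions of a small instrumented type as it passes through a couple of layers of binders and function wrappers. The sketch below is not the poster's library: it uses std::bind and std::function as stand-ins for boost::bind and boost::function, and Tracked, use, run and the two copies_when_* functions are hypothetical names made up for illustration. The exact count depends on the standard library, so no figure is claimed.

```cpp
#include <functional>

// Instrumented argument type: counts every copy construction.
// It has no move constructor, so moves fall back to copies as well.
struct Tracked {
    static int copies;
    int value;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& other) : value(other.value) { ++copies; }
};
int Tracked::copies = 0;

// The "real" function that is eventually called.
int use(const Tracked& t) { return t.value; }

// A helper layer that takes the inner task by value, as nested wrappers often do.
void run(std::function<int()> task) { task(); }

// Bind the argument by value and wrap it in two layers.
int copies_when_bound_by_value() {
    Tracked::copies = 0;
    Tracked arg(7);
    std::function<int()> inner = std::bind(&use, arg);     // arg copied into the binder
    std::function<void()> outer = std::bind(&run, inner);  // binder (and arg) copied again
    outer();                                               // run() copies the task once more
    return Tracked::copies;
}

// Same layering, but only a reference wrapper travels through the layers.
int copies_when_bound_by_reference() {
    Tracked::copies = 0;
    Tracked arg(7);
    std::function<int()> inner = std::bind(&use, std::ref(arg));
    std::function<void()> outer = std::bind(&run, inner);
    outer();
    return Tracked::copies;
}
```

Calling both functions from main and printing the results shows the trade-off described above: the by-value version copies the argument at every layer, while the by-reference version copies nothing but requires arg to outlive the call.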


Kirit Sælensminde

12 Jul

4:51 a.m.

Edd Dawson wrote:

> Hi folks,
>
> I'm currently attempting to create a little facility that will allow me to easily kick off a function call in another thread and get the result through a "future". Something like:

[snip]

> To achieve this I'm using boost::bind and boost::function quite heavily. I've discovered that by the time the final boost::function<void ()> is composed for the boost::thread constructor, I've copied each of the arguments (such as 9876543210 in the above) about 30 times!

I've done what you're describing. I've not looked to see how many times the arguments are copied though. It does sound a bit extreme. I wouldn't start a new thread for each invocation though. Normally you'd be better off re-using an earlier thread. Unless you're talking about very large parameters the thread start up and shut down will probably still take longer than the argument copying, but you'll have to take timings to be sure.

> Now, I can use boost::ref() internally in such a way that no copying of arguments happens at all. However, this is also undesirable; I would like the thread that is created to have its own copy of each argument unless the client explicitly wraps an argument in boost::ref/cref in order to avoid dangling cross-thread references.
>
> So what I'd like is a way to copy the result of my final boost::bind() call into a new functor where all the ref() arguments are deep-copied once and only once at the end.

Why not copy them once at the beginning into a structure with a pointer? I've not looked at how I might do this in mine, but I imagine our architectures aren't that dissimilar -- I use multiply nested functions that wrap the functions to execute at the higher levels into the new function signatures needed by the lower levels.

Note that the refs you're talking about are probably the same number of bits as the integer. What you really need to do is to decide whether to just copy the argument or not depending on what the argument is.

One thought might be to create a mismatch between the function signature and what you're passing in. Say your prime number function took a string as an argument. Normally you would do this:

    long primes( const std::string &bignum );
    std::string limit( "9876543210" );
    async::call(cage, &count_primes_upto, limit );

This will copy the string a load of times. But wouldn't this copy it once at the end when passed into primes?

    long primes( std::string bignum );
    std::string limit( "9876543210" );
    async::call(cage, &count_primes_upto, boost::ref( limit ) );

Another alternative that might work for strings is this:

    long primes( const std::string &bignum );
    std::string limit( "9876543210" );
    async::call(cage, &count_primes_upto, limit.c_str() );

That should force a string constructor only at the end of the chain, the rest of the time passing a pointer.

I'm not sure if any of those will work, or if they will really address your problem though. Maybe something to try?

K
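The second variant (a by-value parameter fed through a reference wrapper) can be checked in isolation with a copy-counting type. This is only a sketch under stated assumptions: std::bind, std::function and std::ref stand in for the Boost equivalents, async::call is not modelled at all, and Tracked, by_value, CopyCounts and bind_ref_to_by_value_param are hypothetical names.

```cpp
#include <functional>

// Instrumented stand-in for the std::string argument: counts copy constructions.
struct Tracked {
    static int copies;
    int value;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& other) : value(other.value) { ++copies; }
};
int Tracked::copies = 0;

// The target takes its parameter by value, like primes( std::string bignum ).
int by_value(Tracked t) { return t.value; }

struct CopyCounts {
    int before_call;  // copies made while building and copying the functors
    int after_call;   // copies made once the target has actually been invoked
};

CopyCounts bind_ref_to_by_value_param() {
    Tracked::copies = 0;
    Tracked limit(42);

    // Only the reference wrapper is stored and copied through the layers.
    auto bound = std::bind(&by_value, std::ref(limit));
    std::function<int()> layer1 = bound;
    std::function<int()> layer2 = layer1;

    CopyCounts counts;
    counts.before_call = Tracked::copies;
    layer2();  // the wrapper yields Tracked&, and the by-value parameter copies it
    counts.after_call = Tracked::copies;
    return counts;
}
```

With plain std::bind the single copy is made at the moment of the call, not when the functor is built, so the referenced object still has to be alive when the call finally runs.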

Edd Dawson

5:57 p.m.

Hi Kirit!

Kirit Sælensminde wrote:

> Edd Dawson wrote:
>
>> I'm currently attempting to create a little facility that will allow me to easily kick off a function call in another thread and get the result through a "future". Something like:
>
> [snip]
>
>> To achieve this I'm using boost::bind and boost::function quite heavily. I've discovered that by the time the final boost::function<void ()> is composed for the boost::thread constructor, I've copied each of the arguments (such as 9876543210 in the above) about 30 times!
>
> I wouldn't start a new thread for each invocation though. Normally you'd be better off re-using an earlier thread.

That is indeed something I've considered and I imagine that I'll experiment with this in due course.

> Unless you're talking about very large parameters the thread start up and shut down will probably still take longer than the argument copying, but you'll have to take timings to be sure.

I'm sure that's true, but I don't want to be copying say 3 strings of bloaty XML 30 times each. It's wasteful. I wouldn't mind if it was 2 or 3. But as it stands, I'm not particularly happy.

>> So what I'd like is a way to copy the result of my final boost::bind() call into a new functor where all the ref() arguments are deep-copied once and only once at the end.
>
> Why not copy them once at the beginning into a structure with a pointer? I've not looked at how I might do this in mine, but I imagine our architectures aren't that dissimilar -- I use multiply nested functions that wrap the functions to execute at the higher levels into the new function signatures needed by the lower levels.

Yes, that's exactly what I'm doing, too (http://www.mr-edd.co.uk/?p=54). I may have to create some custom argument-copying code, as you suggest.

> Note that the refs you're talking about are probably the same number of bits as the integer. What you really need to do is to decide whether to just copy the argument or not depending on what the argument is.

The refs are small, yes. But it's not the small objects I'm worried about. It's the containers of data. By default, I feel it's safest that the function call gets its own local copy of all the data. The last thing I want is for the thread to continue to run while the data it references has gone out of scope in the "parent" thread. But if the client knows that the data will hang around for long enough, they can say so by explicitly wrapping a boost::ref() around the argument. I imagine that similar reasoning was used to decide that boost::bind should copy its arguments by default, rather than using references/pointers. IMHO, reference-semantics-by-default is fraught with danger.

> One thought might be to create a mismatch between the function signature and what you're passing in. Say your prime number function took a string as an argument. Normally you would do this:
>
>     long primes( const std::string &bignum );
>     std::string limit( "9876543210" );
>     async::call(cage, &count_primes_upto, limit );
>
> This will copy the string a load of times. But wouldn't this copy it once at the end when passed into primes?

If by "this" in the second sentence above you're referring to the code below, then no; no copy is made at the end. The reference_wrapper provides a conversion to a std::string&, so no copy is required.

>     long primes( std::string bignum );
>     std::string limit( "9876543210" );
>     async::call(cage, &count_primes_upto, boost::ref( limit ) );
>
> Another alternative that might work for strings is this:
>
>     long primes( const std::string &bignum );
>     std::string limit( "9876543210" );
>     async::call(cage, &count_primes_upto, limit.c_str() );
>
> That should force a string constructor only at the end of the chain, the rest of the time passing a pointer.

That's an interesting thought! It may be possible to create some kind of template helper to this end, one that forces a conversion and therefore a copy.

The twist is that I *think* I need the argument copies to be made before the other thread has started and not at the point where the "real" function is finally called, else I could end up with dangling references. But I'll have a think about that -- thanks!

Otherwise, I guess I'm going to have to roll my own argument-packing code. My fear is that it will be painfully close to some of the stuff I'm already using from Boost.

Kind regards,

Edd

Edd Dawson

13 Jul

12:31 a.m.

Hi again, Kirit!

Kirit Sælensminde wrote:

> That should force a string constructor only at the end of the chain, the rest of the time passing a pointer.

Based on your suggestion I've come up with a nice and simple solution. Rather than doing a single copy at the end I can do one at the beginning, using this crazy looking contraption:

    template<typename T>
    class ref_once_copied
    {
    public:
        ref_once_copied(const T &obj) : obj_(new T(obj)), referenced_(false) { }
        ~ref_once_copied() { if (referenced_) delete obj_; }
        operator T &() { referenced_ = true; return *obj_; }

    private:
        T *obj_;
        bool referenced_;
    };

It's a bit like boost::reference_wrapper<>, except that it makes a copy of the object given to the constructor. Copies are cheap because I'm only copying a pointer and a bool. I'm not *too* bothered about copying these ~30 times :)

We know that the conversion-to-T&-operator will be called exactly once, and after it's been called we know this ref_once_copied object won't be used again or copied anywhere else. So, we can flag the internal pointer for deletion in the destructor.

I started out using a boost::shared_ptr<> for the obj_ member, but I realised that the approach outlined above would work just as well and also avoid the extra overhead that the reference counting entails.

So, in async::call(), for each argument I find the value of:

    boost::is_reference_wrapper<ArgType>::value || boost::is_scalar<ArgType>::value

If this value is true, then I don't wrap the argument in a ref_once_copied<>, otherwise I do. Thus arguments that aren't explicitly wrapped by boost::ref() are copied once and only once.

So I just wanted to say thanks! I wouldn't have gone down this road if it wasn't for your suggestion. Perhaps you'll find a use for this in your implementation?

Kind regards,

Edd

Kirit Sælensminde

9:33 a.m.

Edd Dawson wrote:

> Hi again, Kirit!
>
> Kirit Sælensminde wrote:
>
>> That should force a string constructor only at the end of the chain, the rest of the time passing a pointer.
>
> Based on your suggestion I've come up with a nice and simple solution. Rather than doing a single copy at the end I can do one at the beginning, using this crazy looking contraption:
>
>     template<typename T>
>     class ref_once_copied
>     {
>     public:
>         ref_once_copied(const T &obj) : obj_(new T(obj)), referenced_(false) { }
>         ~ref_once_copied() { if (referenced_) delete obj_; }
>         operator T &() { referenced_ = true; return *obj_; }
>
>     private:
>         T *obj_;
>         bool referenced_;
>     };

That looks like a pretty smart move. The first question I had was about using types so as to eliminate the bool. I.e. arrange for it to return a slightly different type, maybe via a function used to wrap this class. Now that I'm looking at it again though I'm not sure that I completely understand your implementation. It looks to me like you're always forcing a copy (in the constructor) so what job does referenced_ do? I can see that it has value if you move the copy from the constructor into the type cast (the operator T &()), but as it is in the constructor if the reference isn't taken for some reason then it will leak a T. It looks like you've changed your mind about the semantics half way through implementing it.
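The leak described here can be made visible with an instance-counting type. The sketch below reuses ref_once_copied as posted in the thread; Counted and leaked_copies are hypothetical names added for illustration, and the extra wrapper called first exists only so that the demonstration can free the heap object afterwards instead of really leaking it.

```cpp
// ref_once_copied as posted in the thread.
template <typename T>
class ref_once_copied {
public:
    ref_once_copied(const T& obj) : obj_(new T(obj)), referenced_(false) {}
    ~ref_once_copied() { if (referenced_) delete obj_; }
    operator T&() { referenced_ = true; return *obj_; }
private:
    T* obj_;
    bool referenced_;
};

// Counts how many instances are currently alive.
struct Counted {
    static int live;
    Counted() { ++live; }
    Counted(const Counted&) { ++live; }
    ~Counted() { --live; }
};
int Counted::live = 0;

// Returns the number of heap copies still alive after the last wrapper copy
// has been destroyed, with or without the conversion having been used.
int leaked_copies(bool convert) {
    Counted original;
    const int before = Counted::live;

    ref_once_copied<Counted> first(original);  // makes the one heap copy
    {
        ref_once_copied<Counted> last(first);  // cheap copy sharing the same pointer
        if (convert) {
            Counted& used = last;              // sets referenced_ on `last`
            (void)used;
        }
    }   // `last` is destroyed here and deletes the heap copy only if it was converted

    const int leaked = Counted::live - before;

    if (!convert) {
        Counted& cleanup = first;  // let `first` free the object so the demo itself is clean
        (void)cleanup;
    }
    return leaked;
}
```

In this sketch leaked_copies(true) reports nothing left behind, while leaked_copies(false) reports one surviving T, which is the case where the reference is never taken.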

> I started out using a boost::shared_ptr<> for the obj_ member, but I realised that the approach outlined above would work just as well and also avoid the extra overhead that the reference counting entails.

I agree with this. In a multi-threaded environment those interlocked increments and decrements aren't cheap. A policy-based implementation, where we can provide the right threading semantics to the shared_ptr, would be cool.

> So I just wanted to say thanks! I wouldn't have gone down this road if it wasn't for your suggestion. Perhaps you'll find a use for this in your implementation?

I certainly will! :-) You're the same Edd that has a C++ JPEG library as well aren't you? I've bookmarked it and will have a play with that once I get a chance. K

Edd Dawson

5:36 p.m.

Hi Kirit!

Kirit Sælensminde wrote:

> Edd Dawson wrote:
>
>> Hi again, Kirit!
>>
>> Kirit Sælensminde wrote:
>>
>>     template<typename T>
>>     class ref_once_copied
>>     {
>>     public:
>>         ref_once_copied(const T &obj) : obj_(new T(obj)), referenced_(false) { }
>>         ~ref_once_copied() { if (referenced_) delete obj_; }
>>         operator T &() { referenced_ = true; return *obj_; }
>>
>>     private:
>>         T *obj_;
>>         bool referenced_;
>>     };
>
> That looks like a pretty smart move. The first question I had was about using types so as to eliminate the bool. I.e. arrange for it to return a slightly different type, maybe via a function used to wrap this class.

I did spend a while thinking along those lines, but I failed to come up with anything :(

> Now that I'm looking at it again though I'm not sure that I completely understand your implementation. It looks to me like you're always forcing a copy (in the constructor) so what job does referenced_ do?

I'm forcing a copy at the start. Copies of these objects are made a bunch of times (~30, as I say) before they find their resting place inside some functor, F say, created by boost::bind(). But copies of these ref_once_copied objects are cheap, so they don't bother me. Then, when F is called inside the child thread, the function proper is handed these objects as arguments. At that point the conversion operators kick in. Once that's happened, the referenced_ flag is set so that the destructor knows the obj_ members have been used and can be deleted.

> I can see that it has value if you move the copy from the constructor into the type cast (the operator T &()), but as it is in the constructor if the reference isn't taken for some reason then it will leak a T.

That's true. But the conversion operator is *always* invoked exactly once, right at the point where the actual function is called.

However, I had a nasty thought on the way to work today. By moving away from a reference counted approach, I've lost exception safety. Essentially, I have code that looks very roughly like this inside my async::call():

    template< guff >
    return_type_guff call(const Caller &caller, const Functoid &f,
                          const ArgType0 &a0, ..., const ArgTypeN &aN)
    {
        return make_future(
            caller,
            boost::bind(&some_helper, f, wrap(a0), ..., wrap(aN))
        );
    }

Each call to wrap() returns either its argument unmodified (if it's a boost::reference_wrapper or a scalar) or else an appropriate ref_once_copied<> made from that argument. But if one of those calls to wrap() throws, I'm going to be leaking left, right and center because the ref_once_copied<> destructors won't delete their obj_ members, as the conversion operator won't have been called yet.

So for now I'm going to go back to reference counting. It's the only way I can think of doing this safely.
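One possible shape for the reference-counted fallback and the wrap() dispatch is sketched below. It is a guess at the idea, not the code from the thread: std::shared_ptr, std::reference_wrapper and std::is_scalar stand in for the Boost equivalents, and copied_arg, is_ref_wrapper, pass_through, length and make_task are hypothetical names. Only wrap() is named in the post, and its real signature is not shown there.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <type_traits>

// Reference-counted holder: one deep copy at construction, cheap copies after
// that, and the copy is freed when the last holder goes away, even on a throw.
template <typename T>
class copied_arg {
public:
    explicit copied_arg(const T& obj) : obj_(std::make_shared<T>(obj)) {}
    operator T&() const { return *obj_; }
private:
    std::shared_ptr<T> obj_;
};

// Detects std::reference_wrapper (the role boost::is_reference_wrapper plays).
template <typename T> struct is_ref_wrapper : std::false_type {};
template <typename T>
struct is_ref_wrapper<std::reference_wrapper<T>> : std::true_type {};

// Reference wrappers and scalars are passed through unchanged.
template <typename T>
struct pass_through
    : std::integral_constant<bool, is_ref_wrapper<T>::value || std::is_scalar<T>::value> {};

template <typename T>
typename std::enable_if<pass_through<T>::value, T>::type
wrap(const T& arg) { return arg; }

// Everything else gets copied exactly once into a shared holder.
template <typename T>
typename std::enable_if<!pass_through<T>::value, copied_arg<T>>::type
wrap(const T& arg) { return copied_arg<T>(arg); }

// Example target function and a task that outlives its original argument.
std::size_t length(const std::string& s) { return s.size(); }

std::function<std::size_t()> make_task() {
    std::string local("bloaty XML");
    return std::bind(&length, wrap(local));  // `local` dies here; the task keeps its copy
}
```

The design trade-off is the one discussed in the thread: the shared count costs a little on every copy of the functor, but a throw part-way through building the bind expression no longer leaves heap copies behind.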

> It looks like you've changed your mind about the semantics half way through implementing it.
>
>> I started out using a boost::shared_ptr<> for the obj_ member, but I realised that the approach outlined above would work just as well and also avoid the extra overhead that the reference counting entails.
>
> I agree with this. In a multi-threaded environment those interlocked increments and decrements aren't cheap. A policy-based implementation, where we can provide the right threading semantics to the shared_ptr, would be cool.

Well I'm leaning towards implementing my own very light-weight reference count for use in ref_once_copied. In my implementation, I shouldn't need any atomic/interlocked stuff at the argument-binding stage as far as I can tell. However, I have to confess I'm still really rather new to multithreading and concurrency which is why I felt the need to create an easy to use mechanism such as this. So I may well be wrong.

>> So I just wanted to say thanks! I wouldn't have gone down this road if it wasn't for your suggestion. Perhaps you'll find a use for this in your implementation?
>
> I certainly will! :-)
>
> You're the same Edd that has a C++ JPEG library as well aren't you? I've bookmarked it and will have a play with that once I get a chance.

That's me! I'd be interested to hear how you get on with it! I'm toying with the idea of creating a similar library for PNGs, but that won't be for some time yet.

Edd
