[proto] transforms && rvalue refs

I'm playing with proto on gcc 4.3 with c++0x enabled and trying to see if I can use rvalue refs inside callable transforms, to tell what is/isn't temporary while transforming an expression. I have a different way to do this, involving embedding the 'rvalueness' of an object in its type... yuck. Hopefully something better is out there.

The use case:

  a = exp(exp(b));

(as below), where a and b are terminals and exp is a proto lazy function. This expression should result in only one copy of b. You'd want the inner call 'exp(b)' to see that it is being passed an lvalue, copy the value to a temp, modify it, and return std::move(temp). The outer call to exp() should see that an rvalue has arrived, feel free to overwrite it, and return it by std::move. I think.

The closest I can get is below. I thought I'd ask for a sanity check on this before I go any further... I have the feeling the check will come back negative.

Thanks in advance,

-t

#include <boost/proto/proto.hpp>
#include <boost/ref.hpp>
#include <cassert>
#include <iostream>
#include <string>
#include <utility>

namespace bp = boost::proto;

//
// Tag for calls to exp() in our DSEL
//
struct exp_tag
{
  friend std::ostream& operator<<(std::ostream& os, exp_tag)
  {
    return os << "exp_tag";
  }
};

//
// The terminal.  Assume this thing is very heavy.  Some constructors
// have assert(0) in them to verify they aren't called.
//
struct array_impl
{
  std::string name;

  array_impl() : name("NONAME") { }

  array_impl(const std::string& name_) : name(name_)
  {
    std::cout << "name=" << name << "\n";
  }

  //
  // constructors
  //
  array_impl(array_impl& rhs) : name(rhs.name + "_copied") { }
  array_impl(const array_impl& rhs) : name(rhs.name + "_copied") { }
  array_impl(array_impl&& rhs)
  {
    name.swap(rhs.name);
    name += "_moved";
  }

  array_impl& operator=(array_impl&& rhs)
  {
    name.swap(rhs.name);
    name += "_moved";
    return *this;
  }

  //
  // not called
  //
  array_impl& operator=(array_impl& rhs)
  {
    assert(0); // just to be sure what is and isn't getting called
    return *this; // not reached while asserts are enabled
  }
  array_impl& operator=(const array_impl& rhs)
  {
    assert(0); // just to be sure what is and isn't getting called
    return *this; // not reached while asserts are enabled
  }
  array_impl& operator=(const array_impl&& rhs)
  {
    assert(0); // just to be sure what is and isn't getting called
    return *this; // not reached while asserts are enabled
  }

  friend std::ostream& operator<<(std::ostream& os, const array_impl&)
  {
    return os << "array_impl";
  }
};

//
// The transform for calls to exp.  You'd like to be able to see if
// this is getting called with a movable object.
//
struct UnaryFnCall : bp::callable
{
  typedef array_impl result_type;

  template <typename Tag>
  result_type operator()(Tag, const array_impl& t)
  {
    std::cout << "transform a const ref\n";
    array_impl tmp(t);
    return std::move(tmp);
  }

  template <typename Tag>
  result_type operator()(Tag, array_impl&& t)
  {
    // this guy doesn't get called, but I'd like him to.
    assert(0); // if this trips, something good happened.
    return std::move(t);
  }
};

struct Grammar
  : bp::or_<
      bp::when<bp::terminal<array_impl>, bp::_value>,
      bp::when<bp::unary_expr<exp_tag, Grammar>,
               UnaryFnCall(exp_tag(), Grammar(bp::_child0))>
    >
{ };

template <typename T>
struct Expression;

struct Domain
  : bp::domain<bp::pod_generator<Expression>, Grammar>
{ };

template <typename Expr>
struct Expression
{
  BOOST_PROTO_EXTENDS(Expr, Expression<Expr>, Domain);
};

template<class T>
typename bp::result_of::make_expr<
  exp_tag,
  Domain,
  T&
>::type
exp (T&& t)
{
  return bp::make_expr<exp_tag, Domain>(boost::ref(t));
}

//
// Our expression type that
//
struct array : public Expression<bp::terminal<array_impl>::type>
{
  typedef array_impl impl_t;

  impl_t& self() { return boost::proto::value(*this); }

  array() { assert(0); }
  array(const std::string& name) { self().name = name; }
  array(const array&) { assert(0); }
  array(array&&) { assert(0); }

  template <typename Expr>
  array& operator=(Expr const& expr)
  {
    bp::display_expr(expr, std::cout);
    BOOST_MPL_ASSERT(( bp::matches<Expr, Grammar> ));
    typedef typename boost::result_of<Grammar(Expr const&)>::type result_t;
    result_t r = Grammar()(expr);
    self().operator=(std::move(r));
    return *this;
  }
};

int main(int, char**)
{
  array a("a"), b("b");

  // Here, only one copy is made
  a = exp(b);
  std::cout << "a's name:" << a.self().name << "\n\n";

  // Here, two copies are made
  a = exp(exp(exp(b)));
  std::cout << "a's name:" << a.self().name << "\n";
}
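The overload behaviour the post is hoping for can be sketched in plain C++11, outside Proto. This is only an illustration of the lvalue/rvalue overload pair described above; `heavy_array`, `eager_exp`, and `copies_for_nested_exp` are hypothetical names, and the sketch says nothing about whether Proto's callable-transform machinery forwards rvalues into such overloads, which is the open question in the post.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

namespace rvalue_sketch {

// Hypothetical stand-in for a heavy array; counts deep copies.
struct heavy_array {
    std::vector<double> data;
    static int copies;  // number of deep copies made so far

    explicit heavy_array(std::size_t n, double v = 0.0) : data(n, v) {}
    heavy_array(const heavy_array& rhs) : data(rhs.data) { ++copies; }
    heavy_array(heavy_array&& rhs) noexcept : data(std::move(rhs.data)) {}
};
int heavy_array::copies = 0;

// Overwrite every element with its exponential.
inline void exp_in_place(heavy_array& a) {
    for (double& x : a.data) x = std::exp(x);
}

// Lvalue argument: the caller still owns it, so copy once and work on the copy.
inline heavy_array eager_exp(const heavy_array& a) {
    heavy_array tmp(a);
    exp_in_place(tmp);
    return tmp;  // moved or elided, never deep-copied
}

// Rvalue argument: nobody else can observe it, so overwrite it and hand it back.
inline heavy_array eager_exp(heavy_array&& a) {
    exp_in_place(a);
    return std::move(a);
}

// Returns how many deep copies eager_exp(eager_exp(eager_exp(b))) makes.
inline int copies_for_nested_exp() {
    heavy_array b(4, 0.0);
    heavy_array::copies = 0;
    heavy_array a = eager_exp(eager_exp(eager_exp(b)));
    assert(a.data.size() == 4);
    return heavy_array::copies;
}

}  // namespace rvalue_sketch
```

Only the innermost call binds to the `const&` overload, because `b` is a named object; each outer call receives a temporary and selects the `&&` overload, so the chain makes a single deep copy.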

troy d. straszheim wrote:
> I'm playing with proto on gcc 4.3 with c++0x enabled and trying to see if I can use rvalue refs inside callable transforms, to tell what is/isn't temporary while transforming an expression. I have a different way to do this, involving embedding the 'rvalueness' of an object in its type... yuck. Hopefully something better is out there.

Using expression templates to eliminate unnecessary temporaries is a well understood idiom; it's why expression templates were invented in the first place. You don't need rvalue references at all. Could you say more about what you are trying to do and why you think rvalue references are needed?

--
Eric Niebler
BoostPro Computing
http://www.boostpro.com

Eric Niebler wrote:
> Using expression templates to eliminate unnecessary temporaries is a well understood idiom; it's why expression templates were invented in the first place. You don't need rvalue references at all.
> Could you say more about what you are trying to do and why you think rvalue references are needed?

Maybe I've missed something stupid or have organized things so as to cause myself trouble. In an expression like

  // these contain pointers to memory out on the GPU
  array<float> A, B(10,10), C(10,10);
  A = exp(B) * C;

When it is time to evaluate the expression, the grammar fires the transform for exp(B) first. It creates a temporary T with the same dimensions as B, and a function is called to compute the result into T. Now 'T * C' happens. In this case there is an available temporary, 'T', so the result can go right into T. T is returned and moved into A. Done... at least, that's the idea.

So how should these transforms know whether their arguments are overwritable or not? I didn't see how to get the 'slicing through the expression' approach, as in the calc examples, with contexts and overloaded operator[], to work here: each step of the calculation has to get done all at once, via a function call. But maybe it is time to go back and look again.

-t

troy d. straszheim wrote:
> When it is time to evaluate the expression, the grammar fires the transform for exp(B) first. It creates a temporary T with the same dimensions as B, and a function is called to compute the result into T. Now 'T * C' happens. In this case there is an available temporary, 'T', so the result can go right into T. T is returned and moved into A. Done... at least, that's the idea.

No, when you evaluate the expression, you iterate via a for-loop nest over the elements of all the arrays, recursively calling the elementwise versions of exp and *. The context just evaluates one element at a time. In the end, the only temporaries created are those returned by the elementwise functions. Expression templates never copy their terminal objects unless requested.

--
___________________________________________
Joel Falcou - Assistant Professor
PARALL Team - LRI - Universite Paris Sud XI
Tel : (+33)1 69 15 66 35
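The elementwise evaluation described in this reply can be sketched with a few hand-rolled expression templates. This is a minimal illustration in plain C++ rather than Proto; `vec`, `exp_node`, `mul_node`, and `lazy_exp` are hypothetical names, and a Proto context or transform would play the role of the `operator[]` members here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

namespace et_sketch {

// Node for exp(arg): holds a reference and computes one element on demand.
template <typename Arg>
struct exp_node {
    const Arg& arg;
    double operator[](std::size_t i) const { return std::exp(arg[i]); }
    std::size_t size() const { return arg.size(); }
};

// Node for lhs * rhs, also evaluated one element at a time.
template <typename L, typename R>
struct mul_node {
    const L& lhs;
    const R& rhs;
    double operator[](std::size_t i) const { return lhs[i] * rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// The terminal: owns its storage and is never copied by the nodes above.
struct vec {
    std::vector<double> data;
    explicit vec(std::size_t n, double v = 0.0) : data(n, v) {}
    double operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }

    // Assignment walks the whole tree once per element; no array temporaries.
    template <typename Expr>
    vec& operator=(const Expr& e) {
        for (std::size_t i = 0; i < data.size(); ++i) data[i] = e[i];
        return *this;
    }
};

template <typename Arg>
exp_node<Arg> lazy_exp(const Arg& a) { return exp_node<Arg>{a}; }

template <typename L, typename R>
mul_node<L, R> operator*(const L& l, const R& r) { return mul_node<L, R>{l, r}; }

// Evaluates A = exp(B) * C with B filled with 0 and C with 3; returns A[0].
inline double first_element_of_example() {
    vec A(3), B(3, 0.0), C(3, 3.0);
    A = lazy_exp(B) * C;
    return A[0];
}

}  // namespace et_sketch
```

The nodes store references only, so they must not outlive the full expression in which they are built. The scheme also assumes that reading a single element is cheap, which is the assumption the next message in the thread questions.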

Joel Falcou wrote:
> troy d. straszheim wrote:
>> When it is time to evaluate the expression, the grammar fires the transform for exp(B) first. It creates a temporary T with the same dimensions as B, and a function is called to compute the result into T. Now 'T * C' happens. In this case there is an available temporary, 'T', so the result can go right into T. T is returned and moved into A. Done... at least, that's the idea.
> No, when you evaluate the expression, you iterate via a for-loop nest over the elements of all the arrays, recursively calling the elementwise versions of exp and *. The context just evaluates one element at a time. In the end, the only temporaries created are those returned by the elementwise functions.

That's the point... you can't do that here. The elements behind A, B, and C are on the other end of a very slow bus.

-t

troy d. straszheim wrote:
> That's the point... you can't do that here. The elements behind A, B, and C are on the other end of a very slow bus.

That wasn't clear, sorry. What kind of slow bus?

--
Joel Falcou

Joel Falcou wrote:
> troy d. straszheim wrote:
>> That's the point... you can't do that here. The elements behind A, B, and C are on the other end of a very slow bus.
> That wasn't clear, sorry. What kind of slow bus?

A bus that has a CUDA-capable video card on the other end. Transforms fire CUDA kernels that do the math. Before I started playing with a c++0x-capable compiler, I dealt with this by putting 'rvalueness' into the type system. This works with proto expression templates:

  template <typename T, typename IsRValue = mpl::false_>
  class array : public Expression<bp::terminal<array_impl<T, IsRValue> > >
  {
    // ...

Transforms return array_impl<T, mpl::true_>, and later transforms can tell that their arguments are reusable. It works fine, though it is arguably ugly, and as Eric points out, this isn't what expression templates are 'for'. Anyhow, the other day I decided to see if I could make those extra template parameters go away via rvalue refs </chase_own_tail>

-t
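The 'rvalueness in the type system' approach can be sketched without Proto as a pair of overloads selected by a tag parameter. This is an assumption-laden illustration, not the poster's code: it uses `std::true_type` and `std::false_type` in place of `mpl::true_` and `mpl::false_`, a `std::vector` in place of GPU memory, and the hypothetical names `exp_transform`, `allocations`, and `allocations_for_nested_exp`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

namespace tag_sketch {

// Hypothetical buffer; the second parameter marks it as a reusable temporary.
template <typename T, typename IsRValue = std::false_type>
struct array_impl {
    std::vector<T> data;
    explicit array_impl(std::vector<T> d) : data(std::move(d)) {}
};

int allocations = 0;  // counts how often a fresh result buffer is created

// Argument is a named array: leave it alone and compute into a fresh buffer.
template <typename T>
array_impl<T, std::true_type>
exp_transform(const array_impl<T, std::false_type>& in) {
    ++allocations;
    std::vector<T> out(in.data);
    for (T& x : out) x = std::exp(x);
    return array_impl<T, std::true_type>(std::move(out));
}

// Argument is tagged as a temporary: overwrite it and pass it along.
template <typename T>
array_impl<T, std::true_type>
exp_transform(array_impl<T, std::true_type> in) {
    for (T& x : in.data) x = std::exp(x);
    return in;  // a by-value parameter is moved out on return
}

// Returns how many fresh buffers a three-deep chain of exp_transform needs.
inline int allocations_for_nested_exp() {
    std::vector<double> zeros(4, 0.0);
    array_impl<double> b(zeros);
    allocations = 0;
    auto r = exp_transform(exp_transform(exp_transform(b)));
    assert(r.data.size() == 4);
    return allocations;
}

}  // namespace tag_sketch
```

Every transform result carries the `std::true_type` tag, so only the innermost call allocates. The cost is the extra template parameter on every array type, which is what the post calls ugly.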

troy d. straszheim wrote:
> Eric Niebler wrote:
>> Using expression templates to eliminate unnecessary temporaries is a well understood idiom; it's why expression templates were invented in the first place. You don't need rvalue references at all.
>> Could you say more about what you are trying to do and why you think rvalue references are needed?
> Maybe I've missed something stupid or have organized things so as to cause myself trouble. In an expression like
>
>   // these contain pointers to memory out on the GPU
>   array<float> A, B(10,10), C(10,10);
>   A = exp(B) * C;
>
> When it is time to evaluate the expression, the grammar fires the transform for exp(B) first. It creates a temporary T with the same dimensions as B, and a function is called to compute the result into T. Now 'T * C' happens. In this case there is an available temporary, 'T', so the result can go right into T. T is returned and moved into A. Done... at least, that's the idea.

This is how the expression would be evaluated eagerly -- that is, in the absence of expression templates -- right?

> So how should these transforms know whether their arguments are overwritable or not? I didn't see how to get the 'slicing through the expression' approach, as in the calc examples, with contexts and overloaded operator[], to work here: each step of the calculation has to get done all at once, via a function call. But maybe it is time to go back and look again.

You use expression templates when you can take advantage of domain-specific knowledge combined with a complete description of the expression tree you're trying to evaluate to perform domain-specific optimizations like loop unrolling. You need to ask yourself, "What special features of my domain abstraction can I use to optimize the evaluation of this expression?" If the answer is "rvalues," then that's not an answer that justifies the use of expression templates. If you can find a way to use the full expression to precompute the size of the result and compute it in-place, *now* you're talking. See?

--
Eric Niebler
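One way to read the closing suggestion, sketched under heavy assumptions: the assignment sizes the destination once from the whole tree, and every node then runs its whole-array step directly on the destination, so no intermediate array is needed. The names `buffer`, `exp_node`, `mul_node`, `exp_`, `assign`, and `evaluate_example` are hypothetical, the plain loops stand in for whole-array kernels, and the right operand of `*` is restricted to a terminal to keep the example short.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

namespace inplace_sketch {

// Hypothetical stand-in for a device buffer.
struct buffer {
    std::vector<float> data;
    buffer() {}
    buffer(std::size_t n, float v) : data(n, v) {}
    std::size_t size() const { return data.size(); }

    // A terminal evaluates by copying its contents into the destination.
    void eval_into(buffer& dest) const { dest.data = data; }

    // Size the destination once from the whole tree, then let each node
    // run on the destination itself.
    // Caveat: incorrect if *this also appears as a right-hand operand.
    template <typename Expr>
    buffer& assign(const Expr& e) {
        data.resize(e.size());
        e.eval_into(*this);
        return *this;
    }
};

template <typename Arg>
struct exp_node {
    const Arg& arg;
    std::size_t size() const { return arg.size(); }
    void eval_into(buffer& dest) const {
        arg.eval_into(dest);                         // operand lands in dest
        for (float& x : dest.data) x = std::exp(x);  // whole-array step, in place
    }
};

template <typename L>
struct mul_node {
    const L& lhs;
    const buffer& rhs;  // right operand must be a terminal in this sketch
    std::size_t size() const { return lhs.size(); }
    void eval_into(buffer& dest) const {
        lhs.eval_into(dest);
        for (std::size_t i = 0; i < dest.data.size(); ++i)
            dest.data[i] *= rhs.data[i];
    }
};

template <typename Arg>
exp_node<Arg> exp_(const Arg& a) { return exp_node<Arg>{a}; }

template <typename L>
mul_node<L> operator*(const L& l, const buffer& r) { return mul_node<L>{l, r}; }

// Evaluates A = exp(B) * C with B filled with 0 and C with 2.
inline buffer evaluate_example() {
    buffer A, B(3, 0.0f), C(3, 2.0f);
    A.assign(exp_(B) * C);
    return A;
}

}  // namespace inplace_sketch
```

Each step still processes a whole array at once, as the GPU setting requires, but the only storage written is the destination. A general version would need to handle aliasing between the destination and the operands and non-terminal right operands.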

Eric Niebler wrote:
> troy d. straszheim wrote:
>> Eric Niebler wrote:
>>> Using expression templates to eliminate unnecessary temporaries is a well understood idiom; it's why expression templates were invented in the first place. You don't need rvalue references at all.
>>> Could you say more about what you are trying to do and why you think rvalue references are needed?
>> Maybe I've missed something stupid or have organized things so as to cause myself trouble. In an expression like
>>
>>   // these contain pointers to memory out on the GPU
>>   array<float> A, B(10,10), C(10,10);
>>   A = exp(B) * C;
>>
>> When it is time to evaluate the expression, the grammar fires the transform for exp(B) first. It creates a temporary T with the same dimensions as B, and a function is called to compute the result into T. Now 'T * C' happens. In this case there is an available temporary, 'T', so the result can go right into T. T is returned and moved into A. Done... at least, that's the idea.
> This is how the expression would be evaluated eagerly -- that is, in the absence of expression templates -- right?

Yup.

>> So how should these transforms know whether their arguments are overwritable or not? I didn't see how to get the 'slicing through the expression' approach, as in the calc examples, with contexts and overloaded operator[], to work here: each step of the calculation has to get done all at once, via a function call. But maybe it is time to go back and look again.
> You use expression templates when you can take advantage of domain-specific knowledge combined with a complete description of the expression tree you're trying to evaluate to perform domain-specific optimizations like loop unrolling. You need to ask yourself, "What special features of my domain abstraction can I use to optimize the evaluation of this expression?" If the answer is "rvalues," then that's not an answer that justifies the use of expression templates. If you can find a way to use the full expression to precompute the size of the result and compute it in-place, *now* you're talking. See?

That's definitely where I'm looking to go... I've been using proto to organize things and putting off the fancier stuff until the basics are in place. I just wanted to see if this rval thing would work. Thanks for the help here.

-t
participants (3): Eric Niebler, Joel Falcou, troy d. straszheim