Re: [boost] [move] new rvalue reference emulation

11 Apr 2011

      On Mon, Apr 4, 2011 at 10:36 AM, Jeffrey Lee Hellrung, Jr. <
jeffrey.hellrung@gmail.com> wrote:
[...snipping original description...]

I have a follow-up to my original message regarding a typeless emulated
rvalue reference.  I was disappointed in the lack of feedback (other than
Dave) on the original message, which might mean that those who took the time
to parse the message didn't think it was worth pursuing; maybe C++03 is
already too outdated and we should forget about it?  However, at the moment,
I prefer to think that everyone (except, again, Dave; thanks Dave!) was
simply too busy to delve into what I was trying to convey.  So let me try
again, with a longer, more detailed, and more leisurely explanation, and
this time I additionally present a partial solution to extend the technique
to quasi-polymorphic result types (I'll explain what I mean below).

If you're already plenty familiar with the difficulty of capturing arbitrary
rvalues in C++03, you can skip to "TYPELESS EMULATED RVALUE REFERENCES".
The "new" stuff dealing with quasi-polymorphic result types is even further
down, at "QUASI-POLYMORPHIC RESULT TYPES".

[In the exposition below, whenever I say "lvalue-to-const" or
"lvalue-to-non-const", I usually mean "lvalue-reference-to-const" and
"lvalue-reference-to-non-const", respectively.]

First, the big picture: In C++03, it is notoriously difficult (and, until I
discovered this use of typeless emulated rvalue references, I thought it was
impossible) to capture *arbitrary* rvalues *and* distinguish between
captured rvalues and lvalues-to-const.  To be specific, one generally uses a
"template< class T > void foo(T const &)" overload to capture arbitrary
rvalues, but unfortunately this overload also binds to lvalues-to-const.
This makes it impossible to distinguish between a captured rvalue and an
lvalue-to-const, hence the function body must go the safe route and assume
the argument is const, precluding any kind of move optimizations.

Now, if we restrict the problem to capturing *specific* rvalues instead of
*arbitrary* rvalues, then things look better.  Boost.Move defines a class
template rv<T> and introduces the convention that rv<T>& represents an
(emulated) rvalue reference.  A move-emulation-enabled class X defines a
non-const conversion operator to rv<X>&, which basically just amounts to a
static cast: "return static_cast< rv<X>& >(*this);" (see the Boost.Move
documentation, either in trunk or the sandbox; or past mailing list
messages; if this is unfamiliar).  This means that overloads with argument
rv<X>& are candidates to capture rvalues of type X, and leads to defining
the following 3 overloads for foo:

template< class T >
void foo(T&); // 1
template< class T >
typename boost::disable_if< boost::is_convertible< T&, rv<X>& > >
void foo(T const &); // 2
void foo(rv<X>&); // 3

If one goes through all the various argument binding possibilities, one
finds that: the "foo(T&); // 1" overload captures lvalues-to-non-const, X
const & (due to the use of SFINAE via disable_if in the "foo(T const &); //
2" overload), and explicitly created emulated rvalue references *except*
rv<X>& (these are captured by the "foo(rv<X>&); // 3" overload); the "foo(T
const &); // 2" overload captures lvalues-to-const *except* X const & (these
are captured by the "foo(T&); // 1" overload due to disable_if) and rvalues
*except* X-rvalues (again, thanks to disable_if); finally, the "foo(rv<X>&);
// 3" overload captures X-rvalues and explicitly created
emulated-rvalue-references-to-X.  To be absolutely clear:

template< class T > T make();
// ...
X x;
X const cx;
Y y; // Y != X
Y const cy;
foo(x); // 1, T = X
foo(cx); // 1, T = X const
foo(make<X>()); // 3 (*not* 2 !!!), as desired :)
foo(y); // 1, T = Y
foo(cy); // 2, T = Y
foo(make<Y>()); // 2, T = Y, since we didn't tell the foo overloads about Y
:(
// and if Y is move-emulation-enabled...
foo(static_cast< rv<Y>& >(y)); // 3

Thus, X-rvalues are *accurately* captured, meaning that an X-rvalue argument
to foo is bound to an emulated-rvalue-reference-to-X, rv<X>&, allowing the
argument to be moved rather than copied.  This is better than using only the
first two foo overloads, especially if the class X plays a prominent role in
the specific application of the foo overload set.  For example, the
push_back member function of vector<X> can accurately capture X-rvalues
using this technique, which is highly desirable.

Clearly, this technique can be extended to any finite, compile-time-known
set of "special" move-emulation-enabled types X0, ..., X[n-1] such that all
Xi-rvalues (i = 0, ..., n-1) are accurately captured.  This would be
applicable to, say, variant< X0, ..., X[n-1] > constructor and assignment
overloads.

The ability to accurately capture any rvalue of a type within a specific,
compile-time-known set of "special" types is quite powerful, and is, indeed,
sufficient for many applications.  However, many other applications don't
have such a set of special types; instead, it is desirable to accurately
capture arbitrary rvalues of a priori unknown types.  At first, one might
think that the emulated rvalue reference rv<T>& might come to the rescue and
try to use a "template< class T > void foo2(rv<T>&)" overload.  However,
this overload cannot bind to rvalues:

template< class T >
void foo2(rv<T>&);
// ...
foo2(make<X>()); // will not bind to foo2<T>(rv<T>&)! :(

Unfortunately, the "T" in the "rv<T>&" argument is a deduced template
parameter, so there's no way for the compiler to know what T should be when
looking for conversions from X to rv<T>&.  It might look obvious that T
should be X, especially considering that X provides an implicit conversion
to rv<X>& and this fact is plainly available to the compiler, but, trust me,
things just don't (and can't) work that way.  The problem appears to lie in
the template parameter deduction, and this motivates the introduction of a

TYPELESS EMULATED RVALUE REFERENCE

What we're going to try to do is capture arbitrary rvalues in a typeless
fashion, thereby obviating the need for any template parameter deduction and
allowing the compiler to locate a conversion operator from the rvalue to the
typeless emulated rvalue reference:

template< class T >
void foo3(genrv< ??? > const x)
{ /* ... */ }

genrv< ??? > is basically just a wrapper around a void pointer which points
to the type-erased object ("gen" being short for "generic").  Of course, we
have to be able to recover the original type of the object prior to
type-erasure, which we effect via dynamic dispatch through a function
reference.  We use the function reference together with the visitor pattern
to be able to "do something" with caught object.  A basic formulation of the
pieces looks as follows:

// New typeless emulated rvalue reference class.
template< class Visitor >
struct genrv
{
    // Adding support for non-void result types is easy, but we'll use void
for now to simplify exposition.
    typedef void result_type;

    template< class T >
    explicit genrv(T* const p)
        // Set the function reference to a specific instantiation of the
static apply member function.
        // This way, we can recover the type of the object pointed to by mp
later.
        : m_apply(apply<T>),
          mp(static_cast< void* >(p))
    { }

    result_type operator()(Visitor visitor) const
    { return m_apply(visitor, mp); }

private:
    template< class T >
    static result_type apply(Visitor visitor, void* const p)
    // Explicitly cast the object to an emulated rvalue reference to ensure
the right overload is selected.
    { return visitor(static_cast< rv<T>& >(*static_cast< T* >(p))); }

    result_type (&m_apply)(Visitor, void* const);
    void* const mp;
};

// ...

// Assume X is just a typical move-emulation-enabled class...
struct X
{
    // ...
    // The usual conversion to rv<X>&, as introduced by Boost.Move.
    operator rv<X>&() { return static_cast< rv<X>& >(*this); }
    // New conversion operator to genrv<V>!
    template< class V >
    operator genrv<V>() { return genrv<V>(this); }
};

// ...

// Typical function object...
struct foo4
{
    template< class T >
    void operator()(T&) const; // 1

    typedef genrv< foo4 > genrv_type;

    template< class T >
    typename boost::disable_if< boost::is_convertible< T&, genrv_type >
...
::type
    operator()(T const &) const; // 2
// The disable_if in "operator()(T const &) const; // 2" forces rvalues
of move-emulation-enabled classes to bind here.
    void operator()(genrv_type const x) const // 3
    // Dispatch to overload "operator()(T&) const; // 1", where T = rv<
[whatever type x is wrapping] >.
    { return x(*this); }
};

Let's go through what's happening here.  Suppose foo4 is passed an X-rvalue
as an argument.  This rvalue cannot bind to the "operator()(T&) const; // 1"
overload; the fact that X has an implicit conversion to foo4::genrv_type
means that it cannot bind to "operator()(T const &) const; // 2" due to the
disable_if *and* that it *can* bind to "void operator()(genrv_type const x)
const // 3".  So overload 3 is chosen, requiring the implicit conversion
from the X-rvalue to a genrv_type.  The genrv_type is constructed such that
its m_apply member variable references genrv_type::apply<X>(Visitor, void*
const) and its mp member variable points to the X-rvalue.  Then we *visit*
the genrv_type object with the foo4 function object (this is the "x(*this)"
expression), causing a runtime dispatch to genrv_type::apply<X>(Visitor,
void* const) via the m_apply function reference.
genrv_type::apply<X>(Visitor, void* const) "knows" the "real" type of the
object pointed to be the void pointer is X, so it can safely cast up to the
emulated rvalue reference rv<X>& and pass this to the visitor object.
Execution thus passes to the "void operator()(T&) const; // 1" overload,
with T = rv<X>, which is the desired outcome.  The X-rvalue is (eventually)
bound to an emulated rvalue reference, without foo4 possessing any special
knowledge of X itself!

One can also verify that passing an lvalue or a non-move-emulation-enabled
rvalue as an argument to foo4 will bind to one of the first two overloads,
as desired.

That's the basic idea.  Now, the drawbacks.  First, of course, there's the
dynamic dispatch through the function reference to recover the original type
of the object.  Unfortunately, it looks like MSVC9 and the version of GCC I
tried (a relatively recent version) were unable to eliminate the dispatch
(default release mode settings for MSVC9; only added -O3 to GCC), implying
some runtime overhead.  However, the cost of such an indirection would
likely be compensated many times over by through the avoidance of
unnecessary copying of movable rvalues.  In other words, this is still
better than capturing by "T const &".

The second (and, in my opinion, more serious) drawback is a pretty severe
restriction on the result type.  Fully polymorphic result types (that is,
result types that depend arbitrarily on the argument type(s)) are out of the
question.  In other words, there's no way to do the equivalent of the
following C++0x code in C++03 without result_of::foo5 possessing some kind
of almost trivial "structure", like being a constant metafunction.

namespace result_of
{ template< class T > struct foo5 { typedef /* ... */ type; }; }

template< class T >
typename result_of::foo5<T>::type
foo5(T&& x);

One can support non-void monomorphic result types fairly easily by only a
slight adjustment of the genrv class template and the genrv conversion
operator:

// Adding a Result template parameter...not much else changes.
template< class Visitor, class Result = void >
struct genrv
{
    typedef Result result_type;

    template< class T >
    explicit genrv(T* const p)
        : m_apply(apply<T>),
          mp(static_cast< void* >(p))
    { }

    result_type operator()(Visitor visitor) const
    { return m_apply(visitor, mp); }

private:
    template< class T >
    static result_type apply(Visitor visitor, void* const p)
    { return visitor(static_cast< rv<T>& >(*static_cast< T* >(p))); }

    result_type (&m_apply)(Visitor, void* const);
    void* const mp;
};

// ...

struct X
{
    // ...
    operator rv<X>&() { return static_cast< rv<X>& >(*this); }
    template< class V, class R >
    operator genrv<V,R>() { return genrv<V,R>(this); }
};

// ...

struct foo6
{
    typedef /*...*/ result_type;

    template< class T >
    result_type operator()(T&) const; // 1

    // Just need to specify the Result template parameter in genrv...
    typedef genrv< foo6, result_type > genrv_type;

    template< class T >
    typename boost::disable_if< boost::is_convertible< T&, genrv_type >,
result_type >::type
    operator()(T const &) const; // 2

    result_type operator()(genrv_type const x) const // 3
    { return x(*this); }
};

This was as far as I had gotten before, in the original message.  Today, I
got one step further, figuring out how to support

QUASI-POLYMORPHIC RESULT TYPES

By a quasi-polymorphic result type, I mean that the result type *can* depend
on the argument type, but only in a very limited way.  Specifically, the
following technique allows one to support any finite, compile-time-known set
of result types.  The basic idea here is to be able to "tag" certain genrv
instantiations and limit the implicit conversions to these instantiations.
Let's use a concrete example with just two result types.  Suppose we
partition our argument type space up into 2 sets, those which satisfy predA
and those which satisfy predB, where pred[A,B] are Boost.MPL metafunctions.
Thus, every argument type satisfies either predA or predB, but never both
simultaneously (so, informally, predB(T) = !predA(T)).  Let resultA and
resultB be the result types after application on arguments types satisfying
predA and predB, respectively.  In other words:

struct foo7
{
    template<class> struct result;
    // Boost.ResultOf protocol
    template< class This, class T >
    struct result< This ( T ) >
        : boost::mpl::if_<
              boost::mpl::apply1< predA, T >,
              resultA,
              resultB
          >
    { };
    // ...
};

To support this result structure for foo7, we add a couple extra template
parameters to genrv:

template<
    class Visitor, class Result = void,
    // Add a couple more template parameters to aid the suppression of
certain implicit conversions, as we'll shortly see...
    class Cond = boost::mpl::always< boost::true_type >,
    bool Enable = true
...
struct genrv
{
    typedef Result result_type;

    template< class T >
    explicit genrv(T* const p)
        : m_apply(apply<T>),
          mp(static_cast< void* >(p))
    // Make sure we didn't f*** up.
    { BOOST_MPL_ASSERT((boost::mpl::apply1< Cond, T >)); }

    result_type operator()(Visitor visitor) const
    { return m_apply(visitor, mp); }

private:
    template< class T >
    static result_type apply(Visitor visitor, void* const p)
    { return visitor(static_cast< rv<T>& >(*static_cast< T* >(p))); }

    result_type (&m_apply)(Visitor, void* const);
    void* const mp;
};

We likewise supplement the genrv conversion operator in X:

// MSVC9 ICE's without this helper struct :(
template< class Cond, class T >
struct apply_cond
{ static const bool value = boost::mpl::apply1< Cond, T >::type::value; };

struct X
{
    // ...
    template< class V, class R, class C >
    operator genrv<V,R,C,apply_cond<C,X>::value>()
    { return genrv<V,R,C,apply_cond<C,X>::value>(this); }
};

Now, when the compiler is looking to convert an X to a genrv<V,R,C,true>,
that conversion operator will only exist if X satisfies the Boost.MPL
predicate C.  If X does *not* satisfy C, then X can only be converted to
genrv<V,R,C,false>.  Now, our foo7 overloads with genrv arguments will
always use the default Enable parameter, which is true, thus ensuring that
anything that binds to genrv<V,R,C> really does satisfy C.  Thus, the foo7
overloads look like the following:

struct foo7
{
    template<class> struct result; // Boost.ResultOf protocol
    template< class This, class T >
    struct result< This ( T ) >
        : boost::mpl::if_<
              boost::mpl::apply1< predA, T >,
              resultA,
              resultB
          >
    { };
    // Add a specialization to transform rv<T>& to plain T.
    template< class This, class T >
    struct result< This ( rv<T>& ) >
        : result< This ( T ) >
    { };

    template< class T >
    typename result< foo7 ( T& ) >::type
    operator()(T&) const; // 1

    typedef genrv< foo7, resultA, predA > genrvA_type;
    typedef genrv< foo7, resultB, predB > genrvB_type;

    template< class T >
    typename boost::lazy_disable_if<
        boost::mpl::or_<
            boost::is_convertible< T&, genrvA_type >,
            boost::is_convertible< T&, genrvB_type >
        >,
        result< foo7 ( T const & ) >
    >::type
    operator()(T const &) const; // 2

    // Captures move-emulation-enabled rvalues satisfying predA.
    resultA operator()(genrvA_type const x) const // 3A
    { return x(*this); }
    // Captures move-emulation-enabled rvalues satisfying predB.
    resultB operator()(genrvB_type const x) const // 3B
    { return x(*this); }
};

This is still pretty far from supporting fully polymorphic result types, but
it is still (maybe only slightly) better than monomorphic result types, and
likely the best we can do with this technique, given that none of the genrv
overloads can have deduced template parameters.

CONCLUSION

Despite the drawbacks of this technique (the probable runtime overhead,
limitations on the result type, and additional complexity), I think this
would be a worthwhile addition to the Boost.Move library.  I look forward to
hearing what others think about it...

- Jeff