Re: [boost] Formal Review Request: TypeErasure

On Jun 8, 2012, at 9:47 AM, Hite, Christopher wrote:
In a nutshell, concept-based-runtime-polymorphism/type-erasure allows you to have runtime polymorphism while preserving value semantics. The value of that win is very commonly underestimated.
I think you're overselling it a bit here. You can have value semantics with interfaces: just pass these around, heap-allocate in clone, and maintain single points of ownership or use shared_ptr. It has about the same cost as boost::function.
struct interface{ virtual ~interface()=0; virtual interface* clone()=0; };
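A minimal sketch of the clone-based approach described above, assuming a hypothetical `shape` interface and a hypothetical owning handle `shape_value` (neither name comes from the thread). Copying the handle calls `clone()`, so each handle is the single owner of its own heap object:

```cpp
#include <cassert>
#include <memory>

// Hypothetical interface in the style of the one-line `interface` above.
struct shape {
    virtual ~shape() {}
    virtual shape* clone() const = 0;   // heap-allocates a copy
    virtual int sides() const = 0;
};

struct triangle : shape {
    shape* clone() const { return new triangle(*this); }
    int sides() const { return 3; }
};

// Hypothetical owning handle: copying the handle clones the object,
// so the handle behaves like a value with a single owner.
class shape_value {
public:
    explicit shape_value(const shape& s) : impl_(s.clone()) {}
    shape_value(const shape_value& other) : impl_(other.impl_->clone()) {}
    shape_value& operator=(const shape_value& other) {
        if (this != &other) impl_.reset(other.impl_->clone());
        return *this;
    }
    int sides() const { return impl_->sides(); }
private:
    std::unique_ptr<shape> impl_;
};

// Copies are independent objects, each made through clone().
inline int demo_clone_value() {
    triangle t;
    shape_value a(t);
    shape_value b = a;   // deep copy
    assert(a.sides() == 3 && b.sides() == 3);
    return a.sides() + b.sides();
}
```

Every copy costs one allocation, which is the "heap all the time in clone" trade-off mentioned above.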
To be fair, consider the version of any<..,_self&> which doesn't tackle storage and ownership, and compare that with an interface with virtual functions. With the any<...,_self&> I've got a virtual table pointer and a pointer to the type-erased object. With interface& I've got a pointer to an object with a virtual table pointer embedded in it.
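The two-pointer layout described above can be sketched by hand. This is only an illustration of the idea, not how Boost.TypeErasure is implemented; `adder_vtable`, `adder_ref`, and `make_adder_ref` are hypothetical names:

```cpp
#include <cassert>

// Hypothetical hand-written "vtable": one function pointer per operation.
struct adder_vtable {
    int (*add)(void* obj, int i);
};

// One static table per concrete type T.
template<class T>
struct adder_vtable_for {
    static int add(void* obj, int i) { return static_cast<T*>(obj)->add(i); }
    static const adder_vtable table;
};
template<class T>
const adder_vtable adder_vtable_for<T>::table = { &adder_vtable_for<T>::add };

// Non-owning erased reference: a table pointer plus an object pointer.
class adder_ref {
public:
    adder_ref(const adder_vtable* table, void* obj) : table_(table), obj_(obj) {}
    int add(int i) const { return table_->add(obj_, i); }
private:
    const adder_vtable* table_;
    void* obj_;
};

template<class T>
adder_ref make_adder_ref(T& t) {
    return adder_ref(&adder_vtable_for<T>::table, &t);
}

// The erased type needs no base class and no virtual functions.
struct plus_five { int add(int i) { return i + 5; } };

inline int demo_adder_ref() {
    plus_five p;
    adder_ref r = make_adder_ref(p);
    assert(r.add(5) == 10);
    return r.add(5);
}
```

With `interface&`, by contrast, the table pointer lives inside the object itself, which is why the type has to opt in by inheriting from the interface.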
I suspect that you could achieve generic 'value semantics' for any pure virtual interface via a very simple macro + other C++11 features. I think the cost of making a call with the following code would be the same as using pointer semantics via a virtual call.
Consider the following to define value semantics for my_interface:
    class my_interface {
    public:
        virtual ~my_interface() {};
        virtual int add( int ) = 0;
        virtual int add( double, std::string ) = 0;
        virtual int sub( int ) = 0;
    };
    BOOST_VALUE( my_interface, (add)(sub) )
    struct test : virtual public my_interface {
        int add( int i ) { return i + 5; }
        int add( double, std::string ) { return 6; }
        int sub( int i ) { return i - 5; }
    };
    int main( int argc, char** argv ) {
        value<my_interface> v;
        v = test();
        std::cerr<<v.add(int(5))<<" == 10\n";
        std::cerr<<v.add(double(5.5), "hello")<<" == 6\n";
        std::cerr<<v.sub(int(5))<<" == 0\n";
        return 0;
    }
The macro would expand to:
    template<>
    struct value<my_interface> : public value_base<my_interface> {
        using value_base<my_interface>::value_base<my_interface>;
        using value_base<my_interface>::operator=;
        // this is what the macro would most expand
        template<typename... Args>
        auto add(Args&&... args) -> decltype( ((my_interface*)0)->add( std::forward<Args>(args)... ) ) {
            return impl->add( std::forward<Args>(args)... );
        }
        // this is what the macro would most expand
        template<typename... Args>
        auto sub(Args&&... args) -> decltype( ((my_interface*)0)->sub( std::forward<Args>(args)... ) ) {
            return impl->sub( std::forward<Args>(args)... );
        }
    };
Granted the 'only' downside to this approach is that it is 'intrusive'.
I have implemented as much here: https://github.com/bytemaster/mace/blob/master/libs/stub/examples/value.cpp
You can achieve a non-intrusive version with a slightly more verbose interface definition:
    template<typename T=void,bool impl=false>
    struct my_interface{
        virtual ~my_interface() {}
        virtual int add( int ) = 0;
        virtual int add( double, std::string ) = 0;
        virtual int sub( int ) = 0;
    };
    template<typename T>
    struct my_interface<T,true> : public T, virtual public my_interface<>{
        template<typename... Args>
        my_interface(Args&& ...args):T( std::forward<Args>(args)... ){}
        virtual int add( int i) { return T::add(i); }
        virtual int add( double d, std::string s) { return T::add( d, s); }
        virtual int sub( int i) { return T::sub( i ); }
    };
    BOOST_VALUE( my_interface, (add)(sub) )
    struct test { /** no intrusive base class required **/
        int add( int i ) { return i + 5; }
        int add( double, std::string ) { return 6; }
        int sub( int i ) { return i - 5; }
    };
    int main( int argc, char** argv ) {
        value<my_interface> v;
        v = test();
        std::cerr<<v.add(int(5))<<" == 10\n";
        std::cerr<<v.add(double(5.5), "hello")<<" == 6\n";
        std::cerr<<v.sub(int(5))<<" == 0\n";
        return 0;
    }
https://github.com/bytemaster/mace/blob/master/libs/stub/examples/value2.cpp
Other than the repetitive nature of the interface spec, it is very straightforward, clear, and easy to read and understand compared to how TypeErasure currently achieves such a feat.
Does this have value?? Could you integrate this technique with TypeErasure? Is it good enough for most use cases (non operator overloading)? It probably wouldn't be difficult to convert this to reference semantics and avoid heap alloc!
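The snippets above rely on `value_base` and `BOOST_VALUE`, which are not shown in the message. As a rough, macro-free sketch of the same non-intrusive idea, the following uses a hand-written adapter that holds the user's type by member rather than inheriting from it. The names `model`, `value`, and `plain_test`, and the added `clone()` member, are assumptions made for this illustration and are not taken from the linked code:

```cpp
#include <cassert>
#include <memory>
#include <string>

namespace value_sketch {

// Interface seen by callers; clone() is an addition assumed here for copying.
struct my_interface {
    virtual ~my_interface() {}
    virtual my_interface* clone() const = 0;
    virtual int add(int) = 0;
    virtual int add(double, std::string) = 0;
    virtual int sub(int) = 0;
};

// Hypothetical adapter: forwards each virtual call to a stored T.
template<class T>
struct model : my_interface {
    explicit model(const T& t) : held(t) {}
    my_interface* clone() const { return new model(held); }
    int add(int i) { return held.add(i); }
    int add(double d, std::string s) { return held.add(d, s); }
    int sub(int i) { return held.sub(i); }
    T held;
};

// Hypothetical value wrapper: owns a heap-allocated model and deep-copies it.
class value {
public:
    value() {}
    value(const value& o) : impl_(o.impl_ ? o.impl_->clone() : nullptr) {}
    value& operator=(const value& o) {
        if (this != &o) impl_.reset(o.impl_ ? o.impl_->clone() : nullptr);
        return *this;
    }
    // Assigning any type with matching members wraps it in a model<T>.
    template<class T>
    value& operator=(const T& t) { impl_.reset(new model<T>(t)); return *this; }
    int add(int i) { return impl_->add(i); }
    int add(double d, std::string s) { return impl_->add(d, s); }
    int sub(int i) { return impl_->sub(i); }
private:
    std::unique_ptr<my_interface> impl_;
};

// No base class required of the stored type.
struct plain_test {
    int add(int i) { return i + 5; }
    int add(double, std::string) { return 6; }
    int sub(int i) { return i - 5; }
};

inline int demo_value() {
    value v;
    v = plain_test();
    value w = v;   // deep copy through clone()
    assert(v.add(5) == 10);
    assert(w.add(5.5, "hello") == 6);
    assert(w.sub(5) == 0);
    return v.add(5) + w.add(5.5, "hello") + w.sub(5);
}

} // namespace value_sketch
```

The forwarding members in `model` are the "delegator" that someone has to write for each interface, by hand or by macro.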

Daniel Larimer:
I suspect that you could achieve generic 'value semantics' for any pure virtual interface via a very simple macro + other C++11 features. I think the cost of making a call with the following code would be the same as using pointer semantics via a virtual call.
Wow, thanks for proving my point. So value<T> requires T to be copyable.
Why do you use virtual inheritance? It does cost something: I think an extra word in the object, which gets added to this before calling stuff on the interface.
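The size cost can be checked on a given compiler with a small comparison. This is a sketch with hypothetical types; exact object sizes are implementation-defined, so the code only checks that the virtually derived class is not smaller than the plainly derived one:

```cpp
#include <cassert>

struct iface {
    iface() : tag(0) {}
    virtual ~iface() {}
    virtual int get() const = 0;
    int tag;   // a data member, so the base is not just a vtable pointer
};

// Ordinary (non-virtual) inheritance.
struct plain_impl : iface {
    plain_impl() : payload(0) {}
    int get() const { return 1; }
    int payload;
};

// Virtual inheritance: the object also has to locate its iface subobject.
struct virtual_impl : virtual iface {
    virtual_impl() : payload(0) {}
    int get() const { return 2; }
    int payload;
};

inline int call_through_base(const iface& i) { return i.get(); }

inline bool virtual_impl_not_smaller() {
    return sizeof(virtual_impl) >= sizeof(plain_impl);
}

inline void demo_virtual_inheritance() {
    assert(virtual_impl_not_smaller());
    assert(call_through_base(plain_impl()) == 1);
    assert(call_through_base(virtual_impl()) == 2);
}
```

Printing `sizeof(plain_impl)` and `sizeof(virtual_impl)` on the target platform shows the actual difference there.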
Other than the repetitive nature of the interface spec, it is very straightforward, clear, and easy to read and understand compared to how TypeErasure currently achieves such a feat.
Yes, I agree. You're using the language's built-in type erasure.
Does this have value?? Could you integrate this technique with TypeErasure? Is it good enough for most use cases (non operator overloading)? It probably wouldn't be difficult to convert this to reference semantics and avoid heap alloc!
I'm not sure; your non-intrusive version is so great, because the user is forced to write the delegator.
I seriously doubt that people will use any<> to replace complex interfaces. (see next post)
Chris

On Jun 15, 2012, at 10:36 AM, Hite, Christopher wrote:
Daniel Larimer:
I suspect that you could achieve generic 'value semantics' for any pure virtual interface via a very simple macro + other C++11 features. I think the cost of making a call with the following code would be the same as using pointer semantics via a virtual call.
Wow, thanks for proving my point. So value<T> requires T to be copyable.
Why do you use virtual inheritance? It does cost something: I think an extra word in the object, which gets added to this before calling stuff on the interface.
I have updated my example and taken it a bit beyond proto-type code. In the example below I demonstrate interface 'composition', added operator (ostream,+,etc) support and completely eliminated the need for a macro to make things work. I also demonstrated overloading operator() to create a 'boost function' type erasure. It will now store 'non class types' as it no longer inherits from the stored type so it could store non-class functions as well. With a bit more work I think I could have any< function<Signature> > working. The code no longer depends on variadic templates and should compile with VC2010.
Virtual inheritance is needed to solve diamond inheritance problem and to support interface composition (you only want one instance to actually hold the data)
I think that the header (any.hpp) just about covers value semantics. A slightly modified version could provide pointer and reference semantics. Pointer and reference semantics would not need to heap allocate and I believe I can avoid heap allocation for all 'small' types with a little bit of template logic.
https://github.com/bytemaster/mace/blob/master/libs/stub/include/mace/stub/a...
https://github.com/bytemaster/mace/blob/master/libs/stub/include/mace/stub/a...
If I went back to the MACRO based approach combined with variadic templates I could eliminate one virtual indirection, but I don't know if that is enough to justify loss of portability and the use of a MACRO. I suppose there is room for both with/without a MACRO for 'fast' versions.
With my new header I don't think I would ever use the TypeErasure library under consideration because of its three fatal flaws:
1) Lack of readability... I could hardly follow the examples and without documentation your typical coder would never be able to follow the code or know what interface they need to expose.
2) Lack of writability... Not only is the TypeErasure library hard to read, it is hard to figure out how to use it to do what you want.
3) Compile times go to hell.
Now that I have a better alternative (linked example) I think I will use it for my needs. I even think I could achieve the following:
    any<addable<int>,ostreamable,pushbackable<double>> x;
Without drastically increasing the compile times or complexity of defining addable, ostreamable, or pushbackable.
Does this have value?? Could you integrate this technique with TypeErasure? Is it good enough for most use cases (non operator overloading)? It probably wouldn't be difficult to convert this to reference semantics and avoid heap alloc!
I'm not sure; your non-intrusive version is so great, because the user is forced to write the delegator.
I seriously doubt that people will use any<> to replace complex interfaces. (see next post)
I have been putting a lot of thought into the various arguments against type erasure of this kind and think I have a better rationale.
Why do we like templates so much? Because they allow us to program 'generically' and write algorithms that will work with 'anything' that supports some concept.
Inheritance hierarchies create fragile/interdependent code. Often there is no one interface that applies perfectly to 'every use case'. If I have an interface that will work with any 'Number', why should I force my Integer class to expose a method "virtual double get()const" just so I can use it with my particular interface?
The reality is that types exist outside of where they might be used. Interfaces are 'imposed' on types and as soon as you have two or more different 'users of type' each imposing a different interface on type you get conflicts.
If you were to chart the natural relationships among types you would probably end up with a graph and not a nice hierarchy. It is the same reason why using 'tags' to group content rather than 'folders' is more natural. You have to force types into a hierarchy and often 'make compromises' that don't feel right just to 'get the inheritance right'. I have spent hours agonizing over how to resolve inheritance hierarchy conflicts.
Interfaces should be defined by the user of the object, not by the object itself. When I write a library I shouldn't force the users of my library to modify how they organize *their* objects just so that it fits how I organize *my* objects.
As a result, I think that the only reasons to use the 'traditional' interface pattern are:
1) If you want the compiler to 'enforce' interface implementation and generate compile errors for every object that needs to be changed when you change the interface. In this case the 'interface' owns the objects.
2) If you want to 'extend' the functionality of an existing type. This is more normal polymorphic inheritance than actual 'interfaces' though.
With my approach you still 'define' your pure virtual interfaces and so your code is still "self-documenting"; you just avoid making the users of your library adopt your INTRUSIVE interface, and instead your library naturally adapts to the user.
Clearly type erasure has value or we wouldn't be using boost::function, boost::any, void*, or any number of other techniques to erase types. It is also particularly useful for migrating from heavily templated code to non-templated implementations.
So perhaps we should change the nature of the discussion on this library from "should we even have or use such a tool?" to "If you are going to use such a technique, what is the best library for the job?"
If there is enough interest and we can establish some more thorough requirements, I would gladly polish up my any<> interface to complete a simple and small library for doing this.
Chris

AMDG On 06/15/2012 07:19 PM, Daniel Larimer wrote:
On Jun 15, 2012, at 10:36 AM, Hite, Christopher wrote:
Daniel Larimer:
I suspect that you could achieve generic 'value semantics' for any pure virtual interface via a very simple macro + other C++11 features. I think the cost of making a call with the following code would be the same as using pointer semantics via a virtual call.
Wow, thanks for proving my point. So value<T> requires T to be copyable.
Why do you use virtual inheritance? It does cost something: I think an extra word in the object, which gets added to this before calling stuff on the interface.
I have updated my example and taken it a bit beyond proto-type code. In the example below I demonstrate interface 'composition', added operator (ostream,+,etc) support and completely eliminated the need for a macro to make things work. I also demonstrated overloading operator() to create a 'boost function' type erasure. It will now store 'non class types' as it no longer inherits from the stored type so it could store non-class functions as well. With a bit more work I think I could have any< function<Signature> > working. The code no longer depends on variadic templates and should compile with VC2010.
Virtual inheritance is needed to solve diamond inheritance problem and to support interface composition (you only want one instance to actually hold the data)
I think that the header (any.hpp) just about covers value semantics. A slightly modified version could provide pointer and reference semantics. Pointer and reference semantics would not need to heap allocate and I believe I can avoid heap allocation for all 'small' types with a little bit of template logic.
https://github.com/bytemaster/mace/blob/master/libs/stub/include/mace/stub/a...
https://github.com/bytemaster/mace/blob/master/libs/stub/include/mace/stub/a...
If I went back to the MACRO based approach combined with variadic templates I could eliminate one virtual indirection, but I don't know if that is enough to justify loss of portability and the use of a MACRO. I suppose there is room for both with/without a MACRO for 'fast' versions.
With my new header I don't think I would ever use the TypeErasure library under consideration because of its three fatal flaws:
1) Lack of readability... I could hardly follow the examples and without documentation your typical coder would never be able to follow the code or know what interface they need to expose.
2) Lack of writability... Not only is the TypeErasure library hard to read, it is hard to figure out how to use it to do what you want.
Okay. Consider:
    template<class T = _self>
    struct member_add1 {
        static int apply(T& t, int i) { return t.add(i); }
    };
    template<class T = _self>
    struct member_add2 {
        static int apply(T& t, double d, std::string s) { return t.add(d, s); }
    };
    template<class T = _self>
    struct member_sub {
        static int apply(T& t, int i) { return t.sub(i); }
    };
    namespace boost { namespace type_erasure {
    template<class T, class Base>
    struct concept_interface<member_sub<T>, Base, T> : Base {
        int sub(int i) { return call(member_sub<T>(), *this, i); }
    };
    }}
    template<class T = _self>
    struct my_interface : mpl::vector<
        member_sub<T>,
        member_add1<T>,
        member_add2<T>,
        ostreamable<T>,
        callable<int(int), T>
    > {};
    namespace boost { namespace type_erasure {
    template<class T, class Base>
    struct concept_interface<my_interface<T>, Base, T> : Base {
        int add(int i) { return call(member_add1<T>(), *this, i); }
        int add(double d, std::string s) { return call(member_add2<T>(), *this, d, s); }
    };
    }}
This is about the same amount of code as your example.
What do you have to keep track of here:
- primitive concept defs
- Use of the placeholder _self
- The static apply function
- concept_interface:
  - Inheritance from Base is required
  - The argument of the concept must match the placeholder
- boost::type_erasure::call
- Use of MPL Sequences to combine concepts
With your code:
- virtual inheritance
- Use of forward_interface
- inheritance from any_store<T>
- The implementation is a specialization of the interface
- this->val
With some macros my code becomes the following. (This macro is fairly easy to write, since it's just a slightly more generic version of the macros I used to define my own operators.)
    BOOST_TYPE_ERASURE_MEMBER((member_add1), add, 1);
    BOOST_TYPE_ERASURE_MEMBER((member_add2), add, 2);
    BOOST_TYPE_ERASURE_MEMBER((member_sub), sub, 1);
    template<class T = _self>
    struct sub_inter : mpl::vector<member_sub<T> > {};
    template<class T = _self>
    struct my_interface : mpl::vector<
        member_sub<T, int>,
        member_add1<T, int>,
        member_add2<T, double, std::string>,
        ostreamable<T>,
        callable<int(int), T>
    > {};
<snip> So perhaps we should change the nature of the discussion on this library from
"should we even have or use such a tool?"
to
"If you are going to use such a technique, what is the best library for the job?"
If there is enough interest and we can establish some more thorough requirements, I would gladly polish up my any<> interface to complete a simple and small library for doing this.
Here's why I believe that my library is superior:
- You don't have a good way to represent boost::type_erasure::equality_comparable<>, and you have no way to represent equality_comparable<_a, _b>. The former is critical, the latter less so. (Concepts involving multiple types or associated types appear all the time in generic programming. Try implementing a type erased version of your favorite STL algorithm.)
- The type erased ostream operator in your example has its arguments backwards. Doing it right adds extra complexity.
- Virtual inheritance makes your objects bigger and increases the cost of dispatching.
- You can't beat a macro interface for keeping the base concept definitions simple.
In Christ,
Steven Watanabe
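To illustrate why equality is awkward for an interface of virtual functions: a virtual `equals` only knows the dynamic type of the object it is called on, so the wrapper has to check that both operands hold the same type before comparing. The sketch below is hand-rolled for illustration, with hypothetical names (`eq_holder_base`, `eq_holder`, `eq_any`); it is not the mechanism used by Boost.TypeErasure or by the any.hpp header under discussion:

```cpp
#include <cassert>
#include <memory>
#include <typeinfo>

struct eq_holder_base {
    virtual ~eq_holder_base() {}
    virtual const std::type_info& type() const = 0;
    // Precondition: `other` holds the same dynamic type as *this.
    virtual bool equals(const eq_holder_base& other) const = 0;
};

template<class T>
struct eq_holder : eq_holder_base {
    explicit eq_holder(const T& v) : value(v) {}
    const std::type_info& type() const { return typeid(T); }
    bool equals(const eq_holder_base& other) const {
        // Safe only because the caller compared type() first.
        return value == static_cast<const eq_holder<T>&>(other).value;
    }
    T value;
};

// Immutable erased value; copies share the held object.
class eq_any {
public:
    template<class T>
    explicit eq_any(const T& v) : impl_(new eq_holder<T>(v)) {}
    friend bool operator==(const eq_any& a, const eq_any& b) {
        return a.impl_->type() == b.impl_->type() && a.impl_->equals(*b.impl_);
    }
private:
    std::shared_ptr<const eq_holder_base> impl_;
};

inline void demo_eq_any() {
    assert(eq_any(1) == eq_any(1));
    assert(!(eq_any(1) == eq_any(2)));
    assert(!(eq_any(1) == eq_any(1.0)));   // int vs double: types differ
}
```

Treating mismatched types as unequal is a design choice made for this sketch; a library could just as well treat the mismatch as an error.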

On 06/15/12 21:19, Daniel Larimer wrote: [snip]
Clearly type erasure has value or we wouldn't be using boost::function, boost::any, void*, or any number of other techniques to erase types. It is also particularly useful for migrating from heavily templated code to non-templated implementations.
Spirit programs take a long time to compile because they are heavily templated:
http://article.gmane.org/gmane.comp.parsers.spirit.devel/3746
http://article.gmane.org/gmane.comp.parsers.spirit.general/23421
Could type_erasure be used to speed up the compile time (with, I assume, some slowdown at run time)? [snip]

Hi Steven,
Here are a few questions and suggestions:
- I can create an any of a "smaller" concept from an any of a "bigger" concept, but I have to do it explicitly, why ? Couldn't the conversion be automatic when dealing with references ?
- Why the _self keyword for references ? Wouldn't any<mpl::vector<...>&> work and be more intuitive ?
- Even better, I'd rather have the ref any type has a typedef for the value any type.
    typedef any<mpl::vector<...>> MyAnyT;
    MyAnyT value(x);
    MyAnyT::RefT ref(y);
  (and reciprocally MyAnyRefT::ValueT)
- _self and the placeholders have their underscore at the beginning while typeid_ has its own at the end. I guess this may be uniformized.
Regards,
Julien

On Sun, Jun 17, 2012 at 8:18 AM, Julien Nitard <julien.nitard@m4tp.org> wrote:
Hi Steven,
Here are a few questions and suggestions:
[...]
- _self and the placeholders have their underscore at the beginning while typeid_ has its own at the end. I guess this may be uniformized.
AFAIK, Boost has adopted the convention that placeholders (whatever that means in a given context) have a leading underscore while library identifiers use a trailing underscore if they would otherwise have the same name as a standard C++ identifier. - Jeff

AMDG On 06/17/2012 08:18 AM, Julien Nitard wrote:
Here are a few questions and suggestions:
- I can create an any of a "smaller" concept from an any of a "bigger" concept, but I have to do it explicitly, why ? Couldn't the conversion be automatic when dealing with references ?
The constructors like template<class Concept2, class Tag2> any(const any<Concept2, Tag2>&); look implicit to me.
- Why the _self keyword for references ? Wouldn't any<mpl::vector<...>&> work and be more intuitive ?
It might be more convenient, but it isn't logically consistent. any<mpl::vector<...> > is really any<mpl::vector<...>, _self>, because of the default argument.
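The point about the default argument can be shown with stand-in types. In this sketch `any`, `_self`, and `printable` are empty placeholders declared in a local namespace, not the library's definitions; only the shape of the template parameter list (a concept plus a placeholder defaulting to `_self`) is taken from the reply above:

```cpp
#include <type_traits>

namespace default_arg_sketch {

struct _self {};                           // stand-in placeholder
template<class Concept, class T = _self>
class any {};                              // stand-in for the library's any
struct printable {};                       // stand-in concept

// Omitting the second argument is the same as writing _self.
static_assert(std::is_same<any<printable>, any<printable, _self> >::value,
              "any<C> is any<C, _self>");

// The reference form changes the second argument, not the concept.
static_assert(!std::is_same<any<printable>, any<printable, _self&> >::value,
              "any<C, _self&> is a distinct type");

} // namespace default_arg_sketch
```

Writing `any<mpl::vector<...>&>` would instead attach the reference to the concept argument, which is the inconsistency being pointed out.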
- Even better, I'd rather have the ref any type has a typedef for the value any type. typedef any<mpl::vector<...>> MyAnyT; MyAnyT value(x); MyAnyT::RefT ref(y); (and reciprocally MyAnyRefT::ValueT)
By design, the only public members of any are constructors, destructors, and assignment operators. Any other name could potentially conflict with a user-defined member.
- _self and the placeholders have their underscore at the beginning while typeid_ has its own at the end. I guess this may be uniformized.
This is following normal boost conventions. Only placeholders get a leading underscore. In Christ, Steven Watanabe

Hi Steven,
- I can create an any of a "smaller" concept from an any of a "bigger" concept, but I have to do it explicitly, why ? Couldn't the conversion be automatic when dealing with references ?
The constructors like
template<class Concept2, class Tag2> any(const any<Concept2, Tag2>&);
look implicit to me.
Indeed, it works. My bad, it was a stupid mistake: if you want the conversion to work, you obviously need to take a ref to const or a value. Unless I am mistaken, the "default" way for a function to work with "any" should be to take a "reference any" by value. It is cheap to copy from another "reference any" and cheap to create from a "value any". Taking a "reference any" by reference is not that convenient because, unless it is const, it can't be bound to a "value any". Could you confirm this?
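The binding rule behind this question is ordinary C++ and can be shown without any type erasure. In the sketch below, `counter_value` and `counter_ref` are hypothetical stand-ins for a "value any" and a "reference any"; they are not library types:

```cpp
#include <cassert>

namespace ref_by_value_sketch {

class counter_ref;

// Stand-in for a "value any": owns its data.
class counter_value {
public:
    explicit counter_value(int n) : n_(n) {}
    int get() const { return n_; }
private:
    friend class counter_ref;
    int n_;
};

// Stand-in for a "reference any": a single pointer, cheap to copy,
// implicitly constructible from the value type.
class counter_ref {
public:
    counter_ref(counter_value& v) : n_(&v.n_) {}
    void increment() const { ++*n_; }
private:
    int* n_;
};

// Takes the reference type by value: accepts either a counter_ref
// or a counter_value lvalue (through the implicit conversion).
inline void bump(counter_ref r) { r.increment(); }

// A signature `void bump(counter_ref& r)` could not be called with a
// counter_value: the converted temporary cannot bind to a non-const
// lvalue reference.

inline int demo_bump() {
    counter_value v(1);
    bump(v);            // implicit conversion to counter_ref
    counter_ref r(v);
    bump(r);            // copying the reference copies one pointer
    assert(v.get() == 3);
    return v.get();
}

} // namespace ref_by_value_sketch
```

Whether the same reasoning carries over unchanged to the library's own reference form is the question being asked here.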
- Why the _self keyword for references ? Wouldn't any<mpl::vector<...>&> work and be more intuitive ?
It might be more convenient, but it isn't logically consistent.
any<mpl::vector<...> > is really any<mpl::vector<...>, _self>, because of the default argument.
I am not entirely convinced that it would be a big problem that _self would refer to remove_reference<concept>::type instead of just concept, but I get your point.
- Even better, I'd rather have the ref any type has a typedef for the value any type. typedef any<mpl::vector<...>> MyAnyT; MyAnyT value(x); MyAnyT::RefT ref(y); (and reciprocally MyAnyRefT::ValueT)
By design, the only public members of any are constructors, destructors, and assignment operators. Any other name could potentially conflict with a user-defined member.
Well yes, it could. I think I should have mentioned the problem instead of trying to solve it by myself, sorry. My main issue is that when I am going to use an any type, I'll need the reference version as often as, or more often than, the value version, and I'd like to save the second typedef and have the relationship between the two appear clearly in the code. Maybe a metafunction would do the job better: AsRef<MyAnyType>::type, but it's starting to become verbose. Any ideas?
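A metafunction of the kind suggested could look roughly like the following. This is a sketch over stand-in types: `any`, `_self`, and `printable` are empty placeholders in a local namespace, and `as_ref` / `as_value` are hypothetical names, not part of the library under review:

```cpp
#include <type_traits>

namespace as_ref_sketch {

struct _self {};                           // stand-in placeholder
template<class Concept, class T = _self>
class any {};                              // stand-in for the library's any
struct printable {};                       // stand-in concept

// as_ref: maps any<C, T> to any<C, T&>; a reference any maps to itself.
template<class Any> struct as_ref;         // primary template left undefined
template<class Concept, class T>
struct as_ref<any<Concept, T> > { typedef any<Concept, T&> type; };
template<class Concept, class T>
struct as_ref<any<Concept, T&> > { typedef any<Concept, T&> type; };

// as_value: the reverse mapping.
template<class Any> struct as_value;
template<class Concept, class T>
struct as_value<any<Concept, T> > { typedef any<Concept, T> type; };
template<class Concept, class T>
struct as_value<any<Concept, T&> > { typedef any<Concept, T> type; };

typedef any<printable> my_any;
typedef as_ref<my_any>::type my_any_ref;

static_assert(std::is_same<my_any_ref, any<printable, _self&> >::value,
              "as_ref adds the reference to the placeholder");
static_assert(std::is_same<as_value<my_any_ref>::type, my_any>::value,
              "as_value undoes as_ref");

} // namespace as_ref_sketch
```

This keeps `any` itself free of extra member names, at the cost of the verbosity already noted.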
- _self and the placeholders have their underscore at the beginning while typeid_ has its own at the end. I guess this may be uniformized.
This is following normal boost conventions. Only placeholders get a leading underscore.
Duly noted. Many thanks for your answers and time, Julien
participants (6)
- Daniel Larimer
- Hite, Christopher
- Jeffrey Lee Hellrung, Jr.
- Julien Nitard
- Larry Evans
- Steven Watanabe