yomm2 - open methods

Jean-Louis Leroy

19 Jan 2018

6:23 p.m.

Hi,

In 2013 I posted a proposal for an "open multi-methods" library, see here: https://lists.boost.org/Archives/boost/2013/07/205383.php

Due to insufficient interest, I did not go forward with a formal submission, and I published my work on GitHub, under the name yomm11.

I have now completely rewritten the library. The new version - yomm2 - is available here: https://github.com/jll63/yomm2

This iteration is much better because it does not require any modifications to classes involved in method dispatch, yet the speed of a method call with one virtual argument is within 15% of the equivalent virtual function call.

Over the years I realized that the value of such a library resides more in the 'open' than in the 'multi' aspect. Open methods are virtual functions that live outside classes, like free functions. They solve several problems, notably the Expression Problem (http://wiki.c2.com/?ExpressionProblem) and Cross-Cutting Concerns (http://wiki.c2.com/?CrossCuttingConcern). They provide an elegant replacement for the Visitor pattern. yomm2 also implements multiple dispatch, which can (occasionally) come in handy.

Well, I am just touching base to see if things have evolved and if there might now be an interest in seeing this in Boost.

Jean-Louis Leroy
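For readers unfamiliar with the idea, here is a minimal hand-rolled sketch of what "a virtual function that lives outside the class" means. It is not yomm2's API or implementation: the `kick_method` class, its `define` member and `make_kick` are hypothetical names invented for this illustration, and the lookup only matches the exact dynamic type (a real open-method library also resolves overriders through base classes).

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

// An existing hierarchy. Note that it has no kick() member function.
struct Animal { virtual ~Animal() = default; };
struct Dog : Animal {};
struct Cat : Animal {};

// Hypothetical "open method": the overriders are stored outside the classes,
// in a table keyed by the dynamic type of the argument.
class kick_method {
public:
    template <typename T>
    void define(std::function<std::string(const T&)> body) {
        // Wrap the overrider so that it receives the derived type.
        table_[std::type_index(typeid(T))] = [body](const Animal& a) {
            return body(static_cast<const T&>(a));
        };
    }

    std::string operator()(const Animal& a) const {
        // typeid on a polymorphic reference yields the dynamic type.
        auto it = table_.find(std::type_index(typeid(a)));
        if (it == table_.end()) throw std::logic_error("no overrider for this type");
        return it->second(a);
    }

private:
    std::unordered_map<std::type_index,
                       std::function<std::string(const Animal&)>> table_;
};

// Overriders are added without touching Animal, Dog or Cat.
inline kick_method make_kick() {
    kick_method kick;
    kick.define<Dog>([](const Dog&) { return std::string("bark"); });
    kick.define<Cat>([](const Cat&) { return std::string("hiss"); });
    return kick;
}
```

A call such as `make_kick()(some_animal)` then dispatches on the dynamic type of `some_animal`, which is the behavior a virtual member function would give, except that new "methods" can be added later without editing the classes.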


Steven Watanabe

19 Jan

7:24 p.m.

AMDG On 01/19/2018 11:23 AM, Jean-Louis Leroy via Boost wrote:

...

in 2013 I posted a proposal for an "open multi-methods" library, see here: https://lists.boost.org/Archives/boost/2013/07/205383.php Due to insufficient interest, I did not go forward with a formal submission, and I published my work on github, under the name yomm11.

I have now completely rewritten the library. The new version - yomm2 - is available here: https://github.com/jll63/yomm2 This iteration is much better because it does not require any modifications to classes involved in method dispatch, yet the speed of a method call with one virtual argument is within 15% of the equivalent virtual function call.

So, if it's fully non-intrusive, can it be made to work with boost::any or Boost.TypeErasure? In Christ, Steven Watanabe

Zach Laine

8:26 p.m.

On Fri, Jan 19, 2018 at 1:24 PM, Steven Watanabe via Boost < boost@lists.boost.org> wrote:

...

AMDG

On 01/19/2018 11:23 AM, Jean-Louis Leroy via Boost wrote:

...
in 2013 I posted a proposal for an "open multi-methods" library, see here: https://lists.boost.org/Archives/boost/2013/07/205383.php Due to insufficient interest, I did not go forward with a formal submission, and I published my work on github, under the name yomm11.

I have now completely rewritten the library. The new version - yomm2 - is available here: https://github.com/jll63/yomm2 This iteration is much better because it does not require any modifications to classes involved in method dispatch, yet the speed of a method call with one virtual argument is within 15% of the equivalent virtual function call.

So, if it's fully non-intrusive, can it be made to work with boost::any or Boost.TypeErasure?

This was the first thing that occurred to me, too. A graceful way to do dynamic re-binding of erased types is something I want. zach

Jean-Louis Leroy

8:50 p.m.

I have zero experience using TypeErasure so what I am going to say here may miss the point entirely.

A priori it's two different worlds because open methods suppose that there is an is-a relationship between the virtual parameters in the method declaration and the corresponding arguments in the method definitions (aka overrides or specializations). It also supposes that virtual parameters are polymorphic types.

But wait! yomm2 supports virtual_<std::shared_ptr<Animal>> parameters in addition to virtual_<Animal&> and virtual_<Animal*> - and shared_ptrs are not polymorphic types (neither is Animal* in fact) so there is some flexibility here.

Do you mean something like (inspired from my synopsis):

using AnyAnimal = any< mpl::vector< ... > >;

declare_method(string, kick, (virtual_<AnyAnimal>));

define_method(string, kick, (Dog* dog)) { return "bark"; }

define_method(string, kick, (Bulldog* dog)) { return "bark and bite"; }

...or even (following Steven's Basic Usage example):

using AnyCounter = any< mpl::vector< copy_constructible<>, typeid_<>, incrementable<>, ostreamable<> > >;

declare_method(string, describe, (virtual_<AnyCounter>));

define_method(string, describe, (int* num)) { return "it's an integer"; }

define_method(string, describe, (double* dog)) { return "it's a double"; }

If I'm off the mark maybe you can provide an example to illustrate your question?

On Fri, Jan 19, 2018 at 3:27 PM Zach Laine via Boost <boost@lists.boost.org> wrote:

...

On Fri, Jan 19, 2018 at 1:24 PM, Steven Watanabe via Boost < boost@lists.boost.org> wrote:

...
AMDG

On 01/19/2018 11:23 AM, Jean-Louis Leroy via Boost wrote:

...
in 2013 I posted a proposal for an "open multi-methods" library, see here: https://lists.boost.org/Archives/boost/2013/07/205383.php Due to insufficient interest, I did not go forward with a formal submission, and I published my work on github, under the name yomm11.

I have now completely rewritten the library. The new version - yomm2 - is available here: https://github.com/jll63/yomm2 This iteration is much better because it does not require any modifications to classes involved in method dispatch, yet the speed of a method call with one virtual argument is within 15% of the equivalent virtual function call.

So, if it's fully non-intrusive, can it be made to work with boost::any or Boost.TypeErasure?

This was the first thing that occurred to me, too. A graceful way to do dynamic re-binding of erased types is something I want.

zach


Steven Watanabe

9:25 p.m.

AMDG On 01/19/2018 01:50 PM, Jean-Louis Leroy via Boost wrote:

...

I have zero experience using TypeErasure so what I am going to say here may miss the point entirely.

A priori it's two different worlds because open methods suppose that there is an is-a relationship between the virtual parameters in the method declaration and the corresponding arguments in the method definitions (aka overrides or specializations). It also supposes that virtual parameters are polymorphic types.

But why exactly do you require it?

a) You can do a (static|dynamic)_cast from the base to the derived type. Replace this with any_cast<Derived*>(&a) or static_cast<Derived*>(any_cast<void*>(&a)) for any.

b) You can determine the dynamic type of the object with typeid. Replace this with typeid_of(a) for any.

Plain boost::any has similar features, although I don't think it supports any_cast<void*>.

...

But wait! yomm2 supports virtual_<std::shared_ptr<Animal>> parameters in addition to virtual_<Animal&> and virtual_<Animal*> - and shared_ptrs are not polymorphic types (neither is Animal* in fact) so there is some flexibility here.

Do you mean something like (inspired from my synopsis):

Yes, that's exactly what I mean. (Note that if you were to use virtual_<any<typeid_<> >, _self&>>, it would essentially be able to accept anything.)

...

using AnyAnimal = any< mpl::vector< ... > >;

declare_method(string, kick, (virtual_<AnyAnimal>));

define_method(string, kick, (Dog* dog)) { return "bark"; }

define_method(string, kick, (Bulldog* dog)) { return "bark and bite"; }

...or even (following Steven's Basic Usage example):

using AnyCounter = any< mpl::vector< copy_constructible<>, typeid_<>, incrementable<>, ostreamable<> > >;

declare_method(string, describe, (virtual_<AnyCounter>));

define_method(string, describe, (int* num)) { return "it's an integer"; }

define_method(string, describe, (double* dog)) { return "it's a double"; }

If I'm off the mark maybe you can provide an example to illustrate your question?

In Christ, Steven Watanabe
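The two substitutions Steven lists, (a) recovering the concrete object and (b) finding the dynamic type, can be illustrated with the standard library's std::any (C++17) instead of Boost.TypeErasure. This is only a sketch under that substitution: `describe_method`, `define` and `make_describe` are hypothetical names, and nothing here reflects how yomm2 or Boost.TypeErasure are implemented.

```cpp
#include <any>
#include <cassert>
#include <functional>
#include <string>
#include <typeindex>
#include <unordered_map>

// Hypothetical dispatcher over a type-erased argument.
class describe_method {
public:
    template <typename T>
    void define(std::function<std::string(const T&)> body) {
        table_[std::type_index(typeid(T))] = [body](const std::any& a) {
            // (a) any_cast replaces the static_cast/dynamic_cast that a
            // class hierarchy would use to reach the concrete object.
            return body(*std::any_cast<T>(&a));
        };
    }

    std::string operator()(const std::any& a) const {
        // (b) a.type() replaces typeid applied to a polymorphic object.
        auto it = table_.find(std::type_index(a.type()));
        if (it == table_.end()) return std::string("unknown");
        return it->second(a);
    }

private:
    std::unordered_map<std::type_index,
                       std::function<std::string(const std::any&)>> table_;
};

inline describe_method make_describe() {
    describe_method describe;
    describe.define<int>([](const int&) { return std::string("it's an integer"); });
    describe.define<double>([](const double&) { return std::string("it's a double"); });
    return describe;
}
```

Since int and double have no is-a relationship with the erased type, the lookup has to be driven entirely by the stored type_info, which is the point of the question.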

Jean-Louis Leroy

9:59 p.m.

OK I have been thinking about this in the meantime (hoping I understood the problem, which you confirmed).

...

But why exactly do you require it? a) You can do a (static|dynamic)_cast from the base to the derived type. Replace this with any_cast<Derived*>(&a) or static_cast<Derived*>(any_cast<void*>(&a)) for any. b) You can determine the dynamic type of the object with typeid. Replace this with typeid_of(a) for any.

Yes, I have a detail::virtual_traits<> that handles that sort of thing. For the time being it is undocumented, so I can experiment.

The other difficulty is the mechanism used to match declarations and definitions. Methods can be overloaded (see example here: https://github.com/jll63/yomm2/blob/master/examples/matrix.cpp#L51). Method declarations declare (but do not define) an additional function that has the same name and parameters as the dispatcher function, plus an extra one. It returns the matching method, which is fed to decltype(). That's how times(double, dense_matrix&) is matched with times(double, virtual_<matrix&>) and times(diagonal_matrix&, diagonal_matrix&) is matched with times(virtual_<matrix&>, virtual_<matrix&>).

I will experiment during the weekend but there is some hope.
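The matching trick described above (a declared-but-undefined overload set whose return type, obtained through decltype, identifies the method) can be sketched in isolation. The descriptor types, the `discriminator` tag and the `matched_method` alias are hypothetical names for this illustration; the actual yomm2 macros generate something more elaborate.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

struct matrix { virtual ~matrix() = default; };
struct dense_matrix : matrix {};
struct diagonal_matrix : matrix {};

// Hypothetical descriptors, one per declared method named "times".
struct times_scalar_matrix {
    static const char* name() { return "times(double, virtual_<matrix&>)"; }
};
struct times_matrix_matrix {
    static const char* name() { return "times(virtual_<matrix&>, virtual_<matrix&>)"; }
};

// The extra parameter keeps these guide functions apart from real calls.
struct discriminator {};

// Declared, never defined: they only take part in overload resolution.
times_scalar_matrix times(discriminator, double, matrix&);
times_matrix_matrix times(discriminator, matrix&, matrix&);

// Ordinary overload resolution on the definition's parameter types picks a
// guide function; decltype then reads off which method it belongs to.
template <typename... Args>
using matched_method =
    decltype(times(discriminator{}, std::declval<Args>()...));

static_assert(std::is_same<matched_method<double, dense_matrix&>,
                           times_scalar_matrix>::value,
              "scalar * matrix definition matches the first declaration");
static_assert(std::is_same<matched_method<diagonal_matrix&, diagonal_matrix&>,
                           times_matrix_matrix>::value,
              "matrix * matrix definition matches the second declaration");
```

Because the call only appears inside decltype, the guide functions never need a definition, so nothing is emitted for them at link time.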

Jean-Louis Leroy

21 Jan

12:54 a.m.

OK, after adding a bit of flexibility, and then taking advantage of it, I could make this work:

using Any = any<mpl::vector<copy_constructible<>, typeid_<>, relaxed> >;

declare_method(string, type, (virtual_<Any>));

begin_method(string, type, (int value)) {
    return "int " + std::to_string(value);
} end_method;

register_class(Any);
register_class(int, Any); // I guess...

struct Animal {};

BOOST_AUTO_TEST_CASE(type_erasure) {
    //yorel::yomm2::detail::log_on(&std::cerr);
    yorel::yomm2::update_methods();
    Any x(10);
    BOOST_TEST(type(x) == "int 10");
}

(complete code here: https://github.com/jll63/yomm2/blob/boost-type-erasure/tests/boost_type_eras...)

I feel a bit uneasy though. If yomm2 ever becomes a part of Boost, I do see the value of making it play ball with other Boost libraries. On the other hand...one of my major sources of inspiration is the paper by Pirkelbauer, Solodkyy and Stroustrup (the other being CLOS). I try to emulate the proposed syntax and functionality that they describe, in the hope that people get to try open methods, start liking them and, eventually, they become part of the language. So this is quite a stretch.

That being said, I already drift from the paper by supporting std::shared_ptr virtual parameters...and my Matrix example shows that open methods without support for smart pointers would miss some important use cases.

Steven Watanabe

3:43 a.m.

AMDG On 01/19/2018 11:23 AM, Jean-Louis Leroy via Boost wrote:

...

in 2013 I posted a proposal for an "open multi-methods" library, see here: https://lists.boost.org/Archives/boost/2013/07/205383.php Due to insufficient interest, I did not go forward with a formal submission, and I published my work on github, under the name yomm11.

<snip>> Well I am just touching base to see if things have evolved and if there might now be an interest in seeing this in Boost.

A couple of random comments:

- I feel like it should be possible to get rid of YOMM2_END by moving the function body out-of-line or using a lambda or something.

- The documentation isn't clear on how the one-definition-rule applies (unless I missed it). My default assumption would be that YOMM2_DEFINE for a given overload should appear in exactly one translation unit, and that YOMM2_DECLARE and YOMM2_CLASS may appear in multiple translation units so long as the declarations are identical, but I notice that they're /all/ using unnamed namespaces. All of your examples only use a single source file, so I can't really guess your intentions from usage either.

- I'm curious why I would ever want to use the "cute" names when they're actually longer than the YOMM2 names.

In Christ, Steven Watanabe

Jean-Louis Leroy

8:22 p.m.

Thanks for the comments.

I am still hesitating about some of the names and APIs. The library is pre-version 1.0.0, and it's the time to get things right.

I am still hesitating about the syntax for defining a method (overrider). I could easily support:

// no return type here
// ---------V
YOMM2_BEGIN(kick, (Dog& dog)) {
    return "bark";
} YOMM2_END;

It breaks the symmetry with YOMM2_DECLARE though.

Now onto your points...

...

- I feel like it should be possible to get rid of YOMM2_END by moving the function body out-of-line or using a lambda or something.

Of course I would like to get rid of YOMM2_END, but I haven't yet found a way of doing that that also ensures that the body of the method can be inlined inside the wrapper that casts the virtual arguments. I tried with lambdas back in 2014. Then again, maybe I've overlooked something; if you have an idea...

...

- The documentation isn't clear on how the one-definition-rule applies (unless I missed it). My default assumption would be that YOMM2_DEFINE for a given overload should appear in exactly one translation unit, and that YOMM2_DECLARE and YOMM2_CLASS may appear in multiple translation units so long as the declarations are identical, but I notice that they're /all/ using unnamed namespaces. All of your examples only use a single source file, so I can't really guess your intentions from usage either.

Indeed I should be more explicit about this. As you guessed, YOMM2_DECLARE is header-file worthy - it just defines an inline function and declares a hidden function that is only used to guide the overload resolution. YOMM2_DEFINE is for *definitions* and as such only one is allowed per process; they're for .cpp files.

YOMM2_CLASS is borderline. Normally it should be called once, wherever that happens - but it tolerates being called any number of times. That's for coping with situations where the same class may have several associated type_info objects - what I have in mind here is dynamically loaded Windows DLLs.

I will improve the documentation...

...

- I'm curious why I would ever want to use the "cute" names when they're actually longer than the YOMM2 names.

I would use them ;-) They're more readable, and not loud like uppercase. YOMM2 is an obscure acronym (Yorel's Open Multi Methods) that I picked to avoid name clashes. Lowercase names look more like keywords. But indeed maybe I should write the entire doc and examples in terms of YOMM2* macros and de-emphasize the "cute" macros.

J-L

Steven Watanabe

23 Jan

5:59 p.m.

AMDG On 01/21/2018 01:22 PM, Jean-Louis Leroy via Boost wrote:

...

Thanks for the comments.

I am still hesitating about some of the names and APIs. The library is pre-version 1.0.0, and it's the time to get things right.

I am still hesitating about the syntax for defining a method (overrider). I could easily support:

// no return type here
// ---------V
YOMM2_BEGIN(kick, (Dog& dog)) {
    return "bark";
} YOMM2_END;

It breaks the symmetry with YOMM2_DECLARE though.

I think I prefer having the return type present. (Does auto work?)

...

Now onto your points...

...
- I feel like it should be possible to get rid of YOMM2_END by moving the function body out-of-line or using a lambda or something.

Of course I would like to get rid of YOMM2_END, but I haven't yet found a way of doing that that also ensures that the body of the method can be inlined inside the wrapper that casts the virtual arguments. I tried with lambdas back in 2014. Then again, maybe I've overlooked something; if you have an idea...

Can't you just rearrange it to:

...
struct _YOMM2_SPEC { static RETURN_T body ARGS; };
...
register_spec<> init(); }}
inline RETURN_T _YOMM2_NS::_YOMM2_SPEC::body ARGS

There's nothing here that specifically prevents inlining. Whether the compiler actually inlines it is another question but that's already very compiler-specific and not guaranteed.

Also,

- "Each name that ... begins with an underscore followed by a capital letter is reserved to the implementation for any use" [global.names]

- update_methods looks like it's totally thread-unsafe. You can probably get away with this if you only call it at the beginning of main, but it seems quite dangerous if you load or unload shared libraries.

In Christ, Steven Watanabe

Jean-Louis Leroy

24 Jan

2:21 p.m.

Sorry for all the typos and mistakes in my previous post.

...

...
I am still hesitating about the syntax for defining a method (overrider). I could easily support:

// no return type here
// ---------V
YOMM2_BEGIN(kick, (Dog& dog)) {
    return "bark";
} YOMM2_END;

It breaks the symmetry with YOMM2_DECLARE though.

I think I prefer having the return type present. (Does auto work?)

It does now! That's a good compromise.

...

Can't you just rearrange it to:

...
struct _YOMM2_SPEC { static RETURN_T body ARGS; };
...
register_spec<> init(); }}
inline RETURN_T _YOMM2_NS::_YOMM2_SPEC::body ARGS

This wouldn't work as is, because _YOMM2_NS generates a new namespace name each time it is called (using __COUNTER__) but I found out about BOOST_PP_SUB and changed _YOMM2_NS so it can re-generate a previous namespace. So now YOMM2_END is gone.

...

There's nothing here that specifically prevents inlining. Whether the compiler actually inlines it is another question but that's already very compiler-specific and not guaranteed.

Not only does clang inline, but LLVM is smart: since the data needed for dispatch comes from three locations (the hash table, the method and the class), and the first two can be acquired independently, LLVM orders the instructions so they can be executed in parallel. That's probably what explains the (pleasantly) surprising speed of a 1-method call - within 15% of the equivalent compiler-generated virtual member function call.

...

- "Each name that ... begins with an underscore followed by a capital letter is reserved to the implementation for any use" [global.names]

I'm working on it.

...

- update_methods looks like it's totally thread-unsafe. You can probably get away with this if you only call it at the beginning of main, but it seems quite dangerous if you load or unload shared libraries.

That's a complex question.

For starters, is dlopen thread safe? GNU dlopen is explicitly documented as "MT-Safe", but this SunOS page https://docs.oracle.com/cd/E26502_01/html/E26507/chapter3-7.html does not say anything on the subject. And then there are bug reports circulating about dlopen in multi-threaded contexts.

And what of dlclose? Better make sure that a thread does not call dlclose while another is still executing the library's code. Or that each thread that uses the library calls dlopen itself (and increments the library's ref count).

In the light of this, I have so far left it to "the application" to manage its calls to dlopen, dlclose and update_methods.

But this is just the beginning. Adding a mutex to serialize calls to update_methods (and the static ctors and dtors that are generated by the macros) is not enough, because a thread may be executing the method dispatch code while update_methods is running. I would need a read/write mutex, with the dispatch code acquiring a read lock until it has fetched the pointer to the appropriate function. But that would be too penalizing.

I wonder if Pirkelbauer, Solodkyy and Stroustrup addressed this problem when they worked on open methods. Their paper mentions dynamic loading but it doesn't say much except that it's important to support it. I'll ask Solodkyy - we exchanged emails in the past.

Steven Watanabe

3:29 p.m.

AMDG On 01/24/2018 07:21 AM, Jean-Louis Leroy via Boost wrote:

...

<snip>

...
Can't you just rearrange it to:

...
struct _YOMM2_SPEC { static RETURN_T body ARGS; };
...
register_spec<> init(); }}
inline RETURN_T _YOMM2_NS::_YOMM2_SPEC::body ARGS

This wouldn't work as is, because _YOMM2_NS generates a new namespace name each time it is called (using __COUNTER__) but I found out about BOOST_PP_SUB and changed _YOMM2_NS so it can re-generate a previous namespace. So now YOMM2_END is gone.

A better way is to define an implementation macro that takes the namespace as a parameter, and then force __COUNTER__ to be expanded once by the outer macro. (BOOST_PP_SUB has a pretty low upper limit.)
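A sketch of that two-level macro, with made-up names (the `SKETCH_*` macros, `registrar` and `registry` are hypothetical and stand in for the library's real bookkeeping; `__COUNTER__` is a common compiler extension, not standard C++). The outer macro mentions `__COUNTER__` exactly once; by the time the implementation macro sees it, the namespace name is a plain identifier that can be repeated, so the body can be defined out of line and no closing macro is needed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical registry of method bodies, fixed to one signature for brevity.
using body_fn = std::string (*)(int);
inline std::vector<body_fn>& registry() {
    static std::vector<body_fn> bodies;
    return bodies;
}
struct registrar {
    explicit registrar(body_fn f) { registry().push_back(f); }
};

// Two-step concatenation so that __COUNTER__ is expanded before pasting.
#define SKETCH_CAT_(a, b) a##b
#define SKETCH_CAT(a, b) SKETCH_CAT_(a, b)

// Implementation macro: NS arrives fully expanded, so using it several times
// always names the same namespace.
#define SKETCH_DEFINE_IMPL(NS, RETURN_T, ARGS)        \
    namespace NS {                                    \
    struct spec { static RETURN_T body ARGS; };       \
    static registrar init(&spec::body);               \
    }                                                 \
    inline RETURN_T NS::spec::body ARGS

// Outer macro: the only place where __COUNTER__ is written.
#define SKETCH_DEFINE(RETURN_T, ARGS) \
    SKETCH_DEFINE_IMPL(SKETCH_CAT(sketch_ns_, __COUNTER__), RETURN_T, ARGS)

// Usage: the braces after the macro are the out-of-line body of spec::body.
SKETCH_DEFINE(std::string, (int n)) { return "first " + std::to_string(n); }
SKETCH_DEFINE(std::string, (int n)) { return "second " + std::to_string(n); }
```

Each use lands in its own namespace, and the two registrar objects are initialized in definition order within the translation unit.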

...

<snip>

...
- update_methods looks like it's totally thread-unsafe. You can probably get away with this if you only call it at the beginning of main, but it seems quite dangerous if you load or unload shared libraries.

That's a complex question.

For starters, is dlopen thread safe? GNU dlopen is explicitly documented as "MT-Safe", but this SunOS page https://docs.oracle.com/cd/E26502_01/html/E26507/chapter3-7.html does not say anything on the subject. And then there are bug reports circulating about dlopen in multi-threaded contexts.

And what of dlclose? Better make sure that a thread does not call dlclose while another is still executing the library's code. Or that each thread that uses the library calls dlopen itself (and increments the library's ref count).

That's no different from normal functions. My opinion is that dlopen/dlclose should be exactly as safe for multimethods as they are for normal functions.

...

In the light of this, I have so far left it to "the application" to manage its calls to dlopen, dlclose and update_methods.

But this is just the beginning. Adding a mutex to serialize calls to update_methods (and the static ctors and dtors that are generated by the macros) is not enough, because a thread may be executing the method dispatch code while update_methods is running. I would need a read/write mutex, with the dispatch code acquiring a read lock until it has fetched the pointer to the appropriate function. But that would be too penalizing.

It might be better to assume that rebuilding the tables is rare. Then you can build a completely new table and swap it in atomically.

Assumptions required for validity:

- No call that would match any new overload is made until after update_methods returns.

- All calls into the dll have completed before it is unloaded.
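A minimal sketch of the build-then-swap idea, assuming a single table per program and deliberately never freeing old tables (so a reader that loaded the previous pointer can keep using it). The `method_tables` class and the `int_body`/`double_body` overriders are hypothetical; a real implementation would need a reclamation strategy and would use the library's own table layout rather than std::unordered_map.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <typeindex>
#include <unordered_map>
#include <vector>

using overrider = std::string (*)();
using dispatch_table = std::unordered_map<std::type_index, overrider>;

class method_tables {
public:
    // Reader side (the dispatch path): one acquire load, no lock.
    overrider lookup(std::type_index type) const {
        const dispatch_table* table = current_.load(std::memory_order_acquire);
        if (!table) return nullptr;
        auto it = table->find(type);
        return it == table->end() ? nullptr : it->second;
    }

    // Writer side (the update path): assumed rare, so copying is acceptable.
    void update(const dispatch_table& additions) {
        std::lock_guard<std::mutex> lock(writer_mutex_);  // serialize writers
        const dispatch_table* old = current_.load(std::memory_order_relaxed);
        auto fresh = std::make_unique<dispatch_table>(old ? *old : dispatch_table{});
        for (const auto& entry : additions) (*fresh)[entry.first] = entry.second;
        // Publish the fully built table with a single atomic store.
        current_.store(fresh.get(), std::memory_order_release);
        // Keep every published table alive: readers may still hold old ones.
        tables_.push_back(std::move(fresh));
    }

private:
    std::atomic<const dispatch_table*> current_{nullptr};
    std::mutex writer_mutex_;
    std::vector<std::unique_ptr<dispatch_table>> tables_;
};

// Hypothetical overriders used to exercise the tables.
inline std::string int_body() { return "int"; }
inline std::string double_body() { return "double"; }
```

Readers never block, and a reader that raced with an update simply sees the previous table, which is safe under the two assumptions listed above.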

...

I wonder if Pirkelbauer, Solodkyy and Stroustrup addressed this problem when they worked on open methods. Their paper mentions dynamic loading but it doesn't say much except that it's important to support it. I'll ask Solodkyy - we exchanged emails in the past.

In Christ, Steven Watanabe

Jean-Louis Leroy

4:30 p.m.

...

A better way is to define an implementation macro that takes the namespace as a parameter, and then force __COUNTER__ to be expanded once by the outer macro. (BOOST_PP_SUB has a pretty low upper maximum.)

Ah, good idea, thanks.

...

My opinion is that dlopen/dlclose should be exactly as safe for multimethods as they are for normal functions.

I prefer to call them "open methods", unless the context requires insisting on multiple dispatch. A study (see here https://en.wikipedia.org/wiki/Multiple_dispatch#Use_in_practice) shows that multiple dispatch is very rarely needed. I don't want potential users to think "multimethods, nice, but I don't have any need for that", and miss out on the value of *open* methods with one virtual argument. But of course you are free to use the words you want ;-)

...

It might be better to assume that rebuilding the tables is rare. Then you can build a completely new table and swap it in atomically.

I expect it to be rare and I *know* that it is costly. I may be able to pull this off at the cost of some redundancy in the dispatch data.

As of now, there is a global structure, shared by all methods, which contains just three words: a multiplier, a shift factor and a vector of "words" (either pointers or ints) that is a mixed bag of things (a collision-free hash table, the equivalent of a vtable for each class, dispatch tables and strides for *multi* methods, slots occupied by each method). Those three words can be folded together and accessed via a pointer (to be atomically swapped).

Each method contains a pointer inside that global vector, to the first slot (followed by the second slot, a stride, a slot, etc for multi-methods). That's the problem: one pointer per method. But I think I can rearrange the dispatcher to work with just the pointer from the method. I will have to replicate the content of that global structure for each method, before the first slot. Then a method could be swapped atomically.

All the methods would *not* flip to the new table atomically, but I think that this may still give the correct behavior...
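To make the "multiplier and shift" part concrete, here is a sketch of how a collision-free hash of that shape could be found for a fixed set of keys, by trying random odd multipliers until no two keys share a slot. The post does not describe the search yomm2 actually uses, so the search itself is an assumption, and `hash_params`, `slot` and `find_perfect_hash` are hypothetical names; the keys stand in for whatever per-class identifiers the library hashes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <random>
#include <vector>

struct hash_params {
    std::uint64_t mult;  // multiplier
    unsigned shift;      // right shift that keeps the top bits
};

// Slot of a key: multiply, then keep the highest (64 - shift) bits.
inline std::size_t slot(const hash_params& p, std::uint64_t key) {
    return static_cast<std::size_t>((key * p.mult) >> p.shift);
}

// Try random odd multipliers until every key gets its own slot in a table
// of 2^bits entries. Requires 1 <= bits <= 63 and distinct keys.
inline std::optional<hash_params> find_perfect_hash(
    const std::vector<std::uint64_t>& keys, unsigned bits,
    int attempts = 100000) {
    std::mt19937_64 rng(42);  // fixed seed so the sketch is reproducible
    const unsigned shift = 64 - bits;
    std::vector<char> used(std::size_t(1) << bits);
    for (int i = 0; i < attempts; ++i) {
        const hash_params p{rng() | 1, shift};
        std::fill(used.begin(), used.end(), 0);
        bool collision = false;
        for (std::uint64_t key : keys) {
            const std::size_t s = slot(p, key);
            if (used[s]) { collision = true; break; }
            used[s] = 1;
        }
        if (!collision) return p;
    }
    return std::nullopt;  // give up; a caller could retry with more bits
}
```

Once the parameters are found, a lookup costs one multiplication, one shift and one indexed load, which fits the small dispatch overhead reported earlier in the thread.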
