a new release of the Join asynchronous concurrency library, please schedule a review

Hello,
I have just uploaded a new release of the Join library to the Boost Vault under Concurrent Programming.
For people who are not familiar with "Join", here is a short description: Join is an asynchronous, message-based C++ concurrency library. It is applicable both to multi-threaded applications and to the orchestration of asynchronous, event-based applications. It follows Comega's design and implementation and is built on Boost facilities. It provides a high-level concurrency API with asynchronous methods, synchronous methods, and chords, which are "join patterns" defining the synchronization, asynchrony, and concurrency.
The major changes in this release are for simplification and efficiency (executables are smaller and faster), summarized as follows:
1. Drop guard functions in chord definitions. Guards introduce complexity and overhead to the overall design. In their current form, guards do not add much to the expressiveness of the original join calculus, which is based solely on names / channels. From my initial experience their usage is error-prone, and I did not see a good solution in discussions with researchers at Microsoft Research and Moscova INRIA.
2. Drop support for multiple synch methods in a chord. A chord can now have at most one synch method, which is consistent with Comega and Join Java. From my experience of implementing over a dozen samples, more than one synch method per chord is rarely needed (and researchers at Microsoft and Moscova expressed the same feeling). Allowing a single synch method per chord really simplifies the design and implementation. Synch method calls now behave like normal function calls: arguments are kept on the stack, with no need for queues; inside the chord body, a normal "return" statement returns the result to the caller of the only synch method; and exceptions are thrown inside the chord body with a normal C++ "throw", so no tricks are needed to pass results and exceptions among threads. In case we do need to synchronize two synchronous calls, we can implement a rendezvous similar to what the Cw paper describes.
3. Many changes have been made to reduce copying / buffering.
The documentation has been updated with the design changes and now has more information about the implementation and integration with other libraries.
To do next:
- Better, earlier error reporting (compile time). Currently many errors related to chord definitions are thrown at runtime as exceptions. Since the type information is already available, we should be able to report them during compilation using some template programming. In Boost, what should I use that is similar to Andrei Alexandrescu's STATIC_CHECK()?
- More optimization, profiling and benchmarking. With the same source code and a simple Jamfile, the tutorials are much slower on Windows/VC++ than on Linux/g++ (my box has a Centrino Duo). I did not find any profiling tools in the downloaded VC++ 2005 Express. Any suggestions about profiling tools available on Windows, especially free or open source tools?
- Design revisit: considering a possible rewrite of async / synch methods using the "pimpl" idiom so that they integrate better with the STL and other libraries.
More info can be found at:
Source code: http://sourceforge.net/project/showfiles.php?group_id=157583
Documentation: http://channel.sourceforge.net/boost_join/libs/join/doc/boost_join_design.ht...
Website: http://channel.sourceforge.net
I'd really like (need) suggestions and corrections from more experts on this small library to make sure I am on the right track. Could an initial review be arranged for the Join library?
Regards
Yigong

I have just uploaded a new release of the Join library into boost vault under Concurrent Programming.
For people who are not familiar with "Join", a short description is as follows: Join is an asynchronous, message-based C++ concurrency library. It is applicable both to multi-threaded applications and to the orchestration of asynchronous, event-based applications.
I'm very interested in this library and really appreciate how far you've taken it. A couple of questions:
1) Do you think the library is ready for review for incorporation in boost yet (solid interfaces, reasonable implementation across several key platforms), or are you simply requesting many more people try it and support development more actively?
2) In your conversations and literature studies have you found any intellectual property claims that might limit access to 'join' methodology (Microsoft patents?).

Hello, On 7/19/07, Paul Baxter <pauljbaxter@hotmail.com> wrote:
1) Do you think the library is ready for review for incorporation in boost yet (solid interfaces, reasonable implementation across several key platforms), or are you simply requesting many more people try it and support development more actively?
Yes, what I really meant was to invite more feedback, suggestions and corrections on this small library. That is why I referred to an "initial" review (I remember seeing one here for some other project), not the formal / final one. If no such thing exists, please let me know.
Currently I am developing Join on two platforms: Windows XP/VC++ and Linux/g++. If it can be tried on other platforms, I would really appreciate it.
Designing a good interface is a really hard subject, and everyone has his or her own style. I am trying to present a "mediocre" / ordinary interface and hide most of the template magic inside the library (the STL also has a simple interface), hoping more users feel comfortable with it. Better integration with existing libraries and OO designs is also very important.
2) In your conversations and literature studies have you found any intellectual property claims that might limit access to 'join' methodology (Microsoft patents ?).
The Join calculus / JoCaml was originally developed at Moscova INRIA. From my discussions with researchers at Microsoft and Moscova, all I have received is encouragement. I do not have (or use) their source code; the basic ideas come from the papers. Of course, a generic programming design has its own specific implementation. There are also other Join-based systems, such as Join Java and some Join-based transaction processing.
Regards
Yigong

Hello, On 7/19/07, Paul Baxter <pauljbaxter@hotmail.com> wrote:
2) In your conversations and literature studies have you found any intellectual property claims that might limit access to 'join' methodology (Microsoft patents ?).
I received clear confirmation from researchers at both Microsoft Research and Moscova INRIA that there are no patents on the Join calculus or Cw. Also, I do not have or use any source code from Cw, although my implementation is based on the design and ideas from the Cw paper.
Quote from a researcher at Microsoft: "There was a deliberate decision not to patent anything related to Polyphonic C# or the Comega concurrency constructs (other bits of Comega may have been patented). Nor are there any patents on the Joins library. So I think you are perfectly safe there."
Thanks
Yigong

Yigong Liu wrote:
to do next: . better, earlier error reporting (compile time). now many errors related to chord definitions are thrown at runtime as exceptions. since the type info is already available, we should be able to report them during compilation using some template programming. In boost, what should i use similar to Andrei Alexandrescu's STATIC_CHECK()?
There is a BOOST_STATIC_ASSERT
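To make the compile-time check concrete, here is a minimal sketch of how a "too many synch methods" error could be rejected during compilation instead of thrown at runtime. BOOST_STATIC_ASSERT(condition) from <boost/static_assert.hpp> is the Boost macro; the sketch uses the C++11 `static_assert` keyword instead so that it compiles without Boost. The names `async_tag`, `synch_tag`, `count_synch` and `chord_header` are hypothetical and are not taken from the Join library.

```cpp
#include <cstddef>

// Hypothetical tags standing in for the two kinds of Join methods.
struct async_tag {};
struct synch_tag {};

// Count how many methods in a chord header are synchronous.
template <typename... Methods>
struct count_synch;

template <>
struct count_synch<> {
    static constexpr std::size_t value = 0;
};

template <typename... Rest>
struct count_synch<synch_tag, Rest...> {
    static constexpr std::size_t value = 1 + count_synch<Rest...>::value;
};

template <typename... Rest>
struct count_synch<async_tag, Rest...> {
    static constexpr std::size_t value = count_synch<Rest...>::value;
};

// A chord header that refuses to compile with more than one synch method.
template <typename... Methods>
struct chord_header {
    static_assert(count_synch<Methods...>::value <= 1,
                  "a chord may contain at most one synch method");
    static constexpr std::size_t arity = sizeof...(Methods);
};
```

With these definitions, `chord_header<synch_tag, async_tag>` compiles, while `chord_header<synch_tag, synch_tag>` stops the build with the message above.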
. more optimization, profiling and benchmarking with the same source code and simple Jamfile, the tutorials are much slower in Windows/VC++ than in Linux/g++ (my box has Centrino Duo).
You might want to directly look at the assembler output, it might simply have to do with some inlining that isn't done.
More info can be found: Source code: http://sourceforge.net/project/showfiles.php?group_id=157583 Documentation: http://channel.sourceforge.net/boost_join/libs/join/doc/boost_join_design.ht... Website: http://channel.sourceforge.net
I see that the string is taken by value in your small examples. Shouldn't it be taken by const-reference instead? Same for async_o etc.
I'd really like (need) suggestions and corrections from more experts on this small library to make sure i am on the right track. Could a initial review be arranged for the Join library?
I have a few questions:
- why are you using boost.function? Aren't regular function objects enough?
- how is concurrency handled? Locks? What kind? Is the scheduling fair?

Hello, On 7/19/07, Mathias Gaunard <mathias.gaunard@etu.u-bordeaux1.fr> wrote:
There is a BOOST_STATIC_ASSERT
You might want to directly look at the assembler output, it might simply have to do with some inlining that isn't done.
Thanks for the suggestions.
I see that the string is taken by value in your small examples. Shouldn't it be taken by const-reference instead? Same for async_o etc.
Yes, in real application code it should be a reference. In Join, async / synch methods take whatever signature the user defines (similar to Boost.Signals); users are expected to follow the general rules for normal function calls, such as passing large data sets through pointers and references. The Join library just lets users define the signature as normal.
- why are you using boost.function? Aren't regular function objects enough?
- how is concurrency handled? Locks? What kind? Is the scheduling fair?
These are more involved. Let me get back to them tonight.
Thanks
Yigong

Hello,
I see that the string is taken by value in your small examples.
Shouldn't it be taken by const-reference instead?
Same for async_o etc.
Yes, in real application code, it should be reference. In Join, async / synch methods take whatever signature users defined (similar to Boost.Signals), users are expected to follow general rule when making normal function calls, such as pass large data set thru pointers and references. The Join library just let user define the signature as normal.
Let me clarify argument passing in the Join library a little further (my rushed reply this morning was not accurate). The semantics of async method calls are one-way, no-result, non-blocking, returning immediately; the arguments may be buffered in a queue for future consumption. So callers of async methods could return, unwind the stack and go on with other work long before the arguments get used. In general, client code should not pass pointers or references to data on the stack as arguments to async methods (values, or pointers / references to heap-allocated objects, are fine). Just remember that async methods should become the "owner" of the argument data. For synch methods, since they block waiting for results, normal function arguments can be used: references, pointers, and values.
- why are you using boost.function? Aren't regular function objects enough?
I am not sure which specific places in the code you are referring to. Boost.Function is a generalization of function pointers and callbacks, which could be free functions, methods, results of boost.bind() and of course function objects. For example, in the Join library, the way to spawn a new task in the thread pool is to call the following async method "execute", which takes a boost::function as its argument:
    class executor {
    public:
      async<boost::function<void()> > execute;
      ....
    };
This way we can spawn a task for a free function, a method, the result of bind(), or a function object.
- how is concurrency handled? Locks? What kind? Is the scheduling fair?
these are more involving. let me get back to it during night.
The question is kind of vague, so I can only reply in general. Each object that defines async / synch methods and chords uses only one mutex to protect the internal synchronization status and all data in its async / synch methods. Besides this mutex, each synch method also has a condition variable for possibly blocking calling threads. When messages come in and more than one chord becomes ready to fire, there are three kinds of scheduling to decide which chord will fire and consume the messages: fire the first chord that is ready; fire the ready chord with the most methods, which will consume the most buffered messages; or round-robin scheduling.
You can find all of the above and more details in the design document. Again, I am not sure if this is exactly what your question was about. If not, please clarify your questions a little.
Thanks
Yigong
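The locking scheme and the argument-ownership rule described in this message can be illustrated with a hand-written sketch of a one-chord buffer. This is not the Join library's code or API. It assumes C++11 standard threading primitives, and the class and member names are hypothetical. `put()` plays the role of an async method (it takes its argument by value, so the buffer owns the data after the caller returns) and `get()` plays the role of the synch method (it blocks on a condition variable under the object's single mutex).

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical hand-written equivalent of a one-chord "buffer":
// put() is the asynchronous side, get() the synchronous side.
class buffer {
public:
    // Asynchronous: takes ownership of the value, queues it, returns at once.
    void put(std::string value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.push_back(std::move(value));
        }
        ready_.notify_one();
    }

    // Synchronous: blocks the caller until a queued message is available.
    std::string get() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !pending_.empty(); });
        std::string value = std::move(pending_.front());
        pending_.pop_front();
        return value;
    }

private:
    std::mutex mutex_;                 // the single mutex guarding all state
    std::condition_variable ready_;    // where callers of get() wait
    std::deque<std::string> pending_;  // buffered arguments of put()
};
```

Because `put()` stores a copy (or a moved-from temporary), the caller may unwind its stack immediately. Passing a pointer or reference to a local variable instead would leave the queue holding a dangling address.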

Yigong Liu wrote:
I am not sure which specific places in code you are referring to.
Boost.Function is a generalization of function pointers and callbacks, which could be free functions, methods, result of boost.bind() and of course function objects
boost::function<signature> is simply a polymorphic wrapper that can hold any type that is callable with the signature 'signature'. It's only useful if you really need a single unique type for your function objects. It introduces quite some overhead over using a function object directly. I was simply wondering if it was really needed to have that facility. It seems you need it here though because actor<>::capture_events is actually virtual. (why it is virtual, I do not know)
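The trade-off described here can be sketched as follows. The example uses std::function, the standard successor of boost::function, so that it builds without Boost; all class and function names are hypothetical. A type-erased wrapper gives one concrete type for every callable, which is what a container or a virtual function needs, while a template parameter keeps the callable's own type and avoids the indirection.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Type-erased: one concrete type holds any callable with signature int(int).
// This is what a virtual member function or a container needs.
class task_list {
public:
    void add(std::function<int(int)> task) { tasks_.push_back(std::move(task)); }

    int run_all(int input) const {
        int sum = 0;
        for (const auto& task : tasks_) sum += task(input);  // indirect call
        return sum;
    }

private:
    std::vector<std::function<int(int)>> tasks_;
};

// Templated: the callable keeps its own type, so the call can be inlined,
// but each distinct callable type instantiates a separate function.
template <typename Callable>
int run_once(Callable task, int input) {
    return task(input);
}

// Sample callables (hypothetical): a free function and a function object.
inline int twice(int x) { return 2 * x; }

struct add_offset {
    int offset;
    int operator()(int x) const { return x + offset; }
};
```

A `task_list` can store `&twice`, an `add_offset` and a lambda side by side, which `run_once` cannot do; `run_once` in turn has no wrapper between the caller and the callable.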
The question is kind of vague so i can only reply in general. Each object which define async /synch methods and chords will use only one mutex to protect the internal synchronization status and all data in async / synch methods. Besides this mutex, each synch method also has a condition variable for possible blocking of calling threads. When messages come in and more than one chords become ready to fire, there are 3 kinds of scheduling to decide which chord will fire and consume the messages: fire the first chord which is ready, fire the ready chord with most methods which will consume most buffered messages, and round-robin scheduling.
I see. Shouldn't the scheduling policy be part of the type though? That would allow usage of compile-time polymorphism instead of run-time one.
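A minimal sketch of what "scheduling policy as part of the type" could look like, assuming the three policies named in the quoted text. All names are hypothetical and the round-robin policy is deliberately simplified (it rotates over whatever chords are ready at each call); the Join library's actual dispatch logic is not shown in this thread.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical summary of a chord that is ready to fire.
struct ready_chord {
    std::size_t index;         // position in the actor's chord list
    std::size_t method_count;  // number of methods in the chord header
};

// Policy: fire the first chord that is ready.
struct first_match {
    std::size_t pick(const std::vector<ready_chord>& ready) {
        return ready.front().index;
    }
};

// Policy: fire the ready chord with the most methods.
struct longest_match {
    std::size_t pick(const std::vector<ready_chord>& ready) {
        const ready_chord* best = &ready.front();
        for (const ready_chord& c : ready)
            if (c.method_count > best->method_count) best = &c;
        return best->index;
    }
};

// Policy: rotate through the ready chords on successive calls.
struct round_robin {
    std::size_t next = 0;
    std::size_t pick(const std::vector<ready_chord>& ready) {
        return ready[next++ % ready.size()].index;
    }
};

// The policy is a template parameter, so pick() is resolved at compile time.
template <typename SchedulingPolicy>
class scheduler {
public:
    // Precondition: `ready` is not empty.
    std::size_t choose(const std::vector<ready_chord>& ready) {
        return policy_.pick(ready);
    }

private:
    SchedulingPolicy policy_;
};
```

An actor would then be declared as, say, `scheduler<longest_match>`, and no virtual call or runtime switch is needed to select the policy.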
You can find all the above info and more details in the design document.
Indeed, however it's not so easy to navigate that document. Maybe the documentation could be improved. You might want to look at the latest documentation for some boost libraries, which has started to become quite usable.
Again i am not sure if this is exactly what your question is about. If not, please clarify your questions a little bit.
That was more or less it.
I have a few rather random comments/questions, mostly stylistic, after I browsed the code for a few minutes: (I admit I didn't understand how the code works at all, unfortunately I do not have the time to study it at the moment)
Why isn't executor passed by reference instead of by pointer in actor's constructor?
Why are pointers so much used everywhere in the implementation even when references seem more suited?
It also seems to me like some places lack const. Do you consider your code const-correct?
I saw that line in the code:
    std::vector<boost::shared_ptr<chord_type> > chords_; //actor owns chords_ and will destroy them
shared_ptr is better used for sharing. If actor owns the chords_, boost::ptr_vector may be more suited.
I see that your code does some C-style casts with non-PODs, and quite a few of them. These are great sources of unsafety. Aren't there ways to not need them, or at least to restrict them to one place?

Hello, On 7/20/07, Mathias Gaunard <mathias.gaunard@etu.u-bordeaux1.fr> wrote:
Shouldn't the scheduling policy be part of the type though? That would allow usage of compile-time polymorphism instead of run-time one.
Indeed, I thought about separating the dispatching / scheduling logic out into separate "policy" classes, but was never able to get back to it. Thanks for pointing it out; I'll consider it.
You can find all the above info and more details in the design document.
Indeed, however it's not so easy to navigate that document. Maybe the documentation could be improved. You might want to look at the latest documentations for some boost libraries, which have started to become quite usable.
Could you point out which libraries' documentation you are most fond of, so I can get a clear idea?
I have a a few rather random comments/questions, mostly stylistic, after I browsed the code for a few minutes: (I admit I didn't understood how the code works at all, unfortunately I do not have the time to study it at the moment)
Why isn't executor passed by reference instead of by pointer in actor's constructor?
Not all actors need the executor's thread pool to run their chord bodies. If the chord header contains a synch method, the chord body runs in the calling thread of the synch method; only when all methods in the chord header are async does the chord body have to be dispatched to another thread (maybe in a pool, maybe Asio's completion queue) to execute. So an executor only needs to be specified when the actor class contains a chord all of whose methods are async. That is why I use a pointer to executor in actor's constructor; its default value is NULL, meaning no executor is needed. For example, class buffer has only one chord, which has a synch method, so no executor is specified.
Why are pointers so much used everywhere in the implementation even when references seem more suited?
I have been trying to use references where appropriate; for example, all ports are stored in chords as references. I could have missed something; could you point out the exact places?
It also seems to me like some places lack const. Do you consider your code const-correct?
Again, could you please point out the exact places in the code?
I saw that line in the code:
std::vector<boost::shared_ptr<chord_type> > chords_; //actor owns chords_ and will destroy them
shared_ptr is better used for sharing. If actor owns the chords_, boost::ptr_vector may be more suited.
I'll check out boost::ptr_vector to see if it is more suitable. My other consideration in the implementation is to use (or confine myself to) TR1's libraries as much as possible, since TR1 will become standard first and be more widely available.
I see that your code does some C-style casts with non-PODs, and quite a few of them. These are great sources of unsafety. Aren't there ways to not need them, or at least to restrict them to one place?
I'll look into it and clean them up.
Thanks for your comments.
Yigong
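The ownership and cast points raised in this exchange can be sketched as below. boost::ptr_vector (Boost.Pointer Container) was the option suggested in the thread; this sketch swaps in `std::vector<std::unique_ptr<...>>` from C++11, which expresses the same sole ownership using only the standard library. The classes `chord_base`, `two_port_chord` and this simplified `actor` are hypothetical stand-ins, since the real `chord_type` is not shown here.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical chord base class; the real chord_type is not shown in the thread.
struct chord_base {
    virtual ~chord_base() {}
    virtual std::size_t arity() const = 0;
};

struct two_port_chord : chord_base {
    std::size_t arity() const override { return 2; }
};

// The actor is the sole owner: chords are destroyed with the vector,
// and no reference counting is involved.
class actor {
public:
    void add_chord(std::unique_ptr<chord_base> chord) {
        chords_.push_back(std::move(chord));
    }

    std::size_t chord_count() const { return chords_.size(); }

    // Non-owning access; a named, checked cast replaces a C-style cast
    // and is confined to this one function.
    const two_port_chord* as_two_port(std::size_t i) const {
        return dynamic_cast<const two_port_chord*>(chords_[i].get());
    }

private:
    std::vector<std::unique_ptr<chord_base>> chords_;
};
```

Keeping the conversion in a single accessor means a wrong downcast yields a null pointer in one known place rather than undefined behavior scattered through the code.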
participants (3)
- Mathias Gaunard
- Paul Baxter
- Yigong Liu