Any interests in a template library for distributed message passing and event dispatching?

Hello All,
Is there any interest in libraries for distributed message passing and event dispatching?
There are many aspects involved in the design of these systems, such as choosing proper message/event id types, routing/dispatching algorithms, synchronization strategies for routing data structures, remote connection transports, etc. Different application domains present different requirements for performance and functionality: in embedded systems, integers or sub-int bit-fields are used as ids and routing is based on a registration table, while in e-commerce applications, classes/structs with key attributes can be used as ids to facilitate more advanced routing.
I am working on a C++ template library for generic publish/subscribe called Channel. Channel's major components (message ids, routing algorithms...) are configurable as template parameters. As a namespace shared by peer threads, a channel supports publish/subscribe scope control, message filtering, and translation. Channel is used the same way as we use STL containers: by instantiating the Channel template with a proper id type, id trait and routing algorithm, the user can create a customized message passing and event dispatching facility for a particular application.
More documentation and code can be found at: http://channel.sourceforge.net
Right now Channel is built on top of ACE (Adaptive Communication Environment) to gain platform independence. Apart from the classes related to socket connections and message marshaling, most Channel classes are quite platform independent. I am refactoring the code and moving the platform-dependent code into a few policy classes.
I would like to know whether there is any interest in Boost for such a framework. If so, I'll be glad to present a boostified version of the framework for a review process and use existing Boost classes for platform abstractions.
Thanks
Yigong

Is there any interest in libraries for distributed message passing and event dispatching? There are many aspects involved in the design of these systems ...
Yes, definite interest from me, and I think you'll find a lot of interest from other developers or organizations.
I assume you have ported (or will port) from ACE to Asio. Unless major problems are found with Asio (and I don't think the major review from a few months ago found any), that would be the library to use for the lower-level socket and demuxing encapsulation. I don't see any need to duplicate the functionality of Asio inside your own library or framework (which is a higher-level layer than Asio).
Besides (and probably in parallel with) the issues you're trying to solve, I've written libraries that provided:
1. Incoming and outgoing message filter chaining (functionality that was just mentioned last week in relation to Asio, I think). This allows applications or libraries to easily add functionality such as encryption, compression, validation, recording, monitoring, or other application-specific processing.
2. Customizable "message" encapsulation / boundary processing, in the sense that every distributed processing scheme has a defined protocol that delimits a message. E.g. a text stream message might be delimited by a new-line, a binary message might have the length encoded as the first field, a binary message might be a fixed-size "struct", a text protocol such as XML has specific matched delimiters, etc.
I'd be very much interested in a higher-level layer that solved some of the needs associated with certain kinds of distributed processing:
a. Pub-sub (as you've already mentioned) - is this based on an existing pub-sub spec or API? (E.g. DDS or JMS) There should be a good justification for providing yet another pub-sub model (versus basing it on something like DDS).
b. Distributed state / data store (as provided by the library or framework, not by an external database).
c. Scalability aspects. (I'll be happy to write more details, if needed.)
d. Fault-tolerant aspects. (I'll be happy to write more details, if needed.)
I'll catch up with the rest of the Boost e-mail to see if other people have written anything similar ... Cliff
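To make the message boundary schemes listed in item 2 concrete, here is a minimal C++ sketch of two of them (new-line delimited text and a length-prefixed binary message) written as interchangeable policies. All names (newline_framer, length_prefix_framer, message_reader) and the 2-byte big-endian length field are assumptions made for illustration; they are not taken from Channel, Asio, or any library mentioned in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical framing policies. Each one tries to cut one complete message
// off the front of `buf`; it returns false if more bytes are still needed.

struct newline_framer {
    static bool extract(std::string& buf, std::string& out) {
        const std::size_t pos = buf.find('\n');
        if (pos == std::string::npos) return false;   // line not complete yet
        out = buf.substr(0, pos);
        buf.erase(0, pos + 1);                        // drop the delimiter as well
        return true;
    }
};

struct length_prefix_framer {
    // Assumed wire format: 2-byte big-endian payload length, then the payload.
    static bool extract(std::string& buf, std::string& out) {
        if (buf.size() < 2) return false;             // length field not complete
        const std::size_t len =
            (static_cast<std::size_t>(static_cast<unsigned char>(buf[0])) << 8) |
            static_cast<std::size_t>(static_cast<unsigned char>(buf[1]));
        if (buf.size() < 2 + len) return false;       // payload not complete
        out = buf.substr(2, len);
        buf.erase(0, 2 + len);
        return true;
    }
};

// A reader parameterized on the framing policy: bytes arrive in arbitrary
// chunks via feed(), complete messages come out of next().
template <typename Framer>
class message_reader {
public:
    void feed(const std::string& bytes) { buf_ += bytes; }
    bool next(std::string& out) { return Framer::extract(buf_, out); }

private:
    std::string buf_;
};
```

Because the transport only ever calls feed() and next(), swapping the protocol is a matter of changing one template argument, which is the kind of customization point described above.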

Thanks for the comments and suggestions. I'll definitely look into Boost.Asio and try to build on top of existing Boost facilities as much as possible.
a. Pub-sub (as you've already mentioned) - is this based on an existing pub-sub spec or API? (E.g. DDS or JMS) There should be a good justification for providing yet another pub-sub model (versus basing it on something like DDS).
The design of Channel is based on my own experience with message passing/event dispatching systems. I have worked on switch/router control software and distributed enterprise applications, and have designed and implemented proprietary messaging systems. The designs of these systems share a common set of core aspects. Channel is intended to be a framework that lets users plug and play, mix and match the different parts of a messaging system (message type or id, routing algorithm, connection transport, etc.) to create a best-fit messaging system for a particular application. It is also intended to be as easy to use as specializing STL containers. It is not intended to implement an existing standard.
b. Distributed state / data store (as provided by the library or framework, not by an external database).
d. Fault-tolerant aspects. (I'll be happy to write more details, if needed.)
During my work on telecom backbone switches, I gained experience implementing data replication and high-availability control software. These are really important issues. However, again, Channel is intended to be a framework of basic, core primitives with which programmers can configure a messaging/event facility for their applications. In my experience, high availability and fault tolerance are built on top of messaging (such as heart-beats) and data replication, so Channel is not intended to include all of these itself.
Thanks again.
Yigong

On Feb 27, 2006, at 3:01 AM, Yigong Liu wrote:
Is there any interest in libraries for distributed message passing and event dispatching?
Yes, but I'm afraid that enumerating my own requirements would throw a wrench in the works. I work primarily with the Message Passing Interface (MPI) for message passing in high-performance computing applications. Although we still work with the message-passing paradigm (and could benefit from a higher-level message passing library), our performance requirements make it such that we need to avoid extraneous buffering at all costs. I hope to get a chance to look at Channel in the future, but for now I have a simple question for you: Have you looked at MPI before, and does it fit into your idea of distributed message passing? Doug

Thanks for the comments. I have had some experience with MPI-like systems. I implemented several parallel rendering algorithms on PVM workstation clusters and helped develop CAM geometric algorithms on PVM at one of my former employers. I have been out of this field since then.
Although it bears the same name, "message passing", the Channel framework is for application domains quite different from those targeted by MPI/PVM. Channel is for developing message passing/event dispatching systems in distributed embedded systems, desktop and enterprise applications. During my last eight years of employment, I worked on various telecom switches/routers and distributed enterprise applications, and designed and implemented 3 small proprietary message passing systems. Later I found that they all share a common set of primitives. Channel intends to provide a basic set of "light weight" primitives in a template framework for programmers to construct a message passing facility customized for their applications (in my experience, the message passing systems in the embedded world and the enterprise world have different requirements and tradeoffs). Also, Channel is easy to configure and use - as easy to use as STL containers :-)
In this sense, Channel is in much the same field as GUI event/signal systems. By instantiating the Channel template with a "NULL" SynchPolicy, it can be used as a single-threaded event dispatcher, with the extra benefit of being able to plug in different routing/dispatching algorithms and of being easily transformed into a distributed eventing system later.
Yes, most design considerations of message passing systems still apply, such as avoiding data copying (Channel did it by message reference counting). More suggestions and comments are highly appreciated.
Thanks
Yigong
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
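The remark above about avoiding data copying through message reference counting can be illustrated with a short sketch. It uses std::shared_ptr as a stand-in for whatever counting scheme Channel actually uses; the message, subscriber and publish names are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// An illustrative message type; the payload is what we want to avoid copying.
struct message {
    int id;
    std::string payload;
};

// Receivers share one immutable message through a reference-counted pointer.
typedef std::shared_ptr<const message> message_ptr;

// A hypothetical subscriber that simply queues what it receives.
struct subscriber {
    std::vector<message_ptr> inbox;
    void deliver(const message_ptr& m) { inbox.push_back(m); }
};

// Fan one message out to every subscriber. Only the pointer is copied (and
// the reference count incremented); the payload itself is never duplicated.
inline void publish(const message_ptr& m, std::vector<subscriber>& subs) {
    for (std::size_t i = 0; i < subs.size(); ++i) subs[i].deliver(m);
}
```

The message is destroyed automatically once the publisher and every subscriber have dropped their pointers, so no receiver has to know how many other receivers exist.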

On 02/28/2006 01:09 AM, Yigong Liu wrote:
Yes, most design considerations of message passing systems still apply, such as avoiding data copying (Channel did it by message reference counting). More suggestions and comments are highly appreciated.
Hi, Yigong.
I know almost nothing about distributed systems, but the above mention of reference counting reminded me that "weighted reference counting" is supposed to have some speed advantage in distributed systems. A reference is [lins91c] as shown on: http://www.cs.kent.ac.uk/people/staff/rej/gcbib/gcbibL.html
IIRC, it saves time at the cost of memory. The extra memory is caused by the "weight" of the reference count being stored in the smart pointer instead of a reference count in the object pointed-to. The weight is, I think, log2 of the normal reference count. Each time a copy is made, the weight is evenly divided between the from and to smart pointers. Only when an object is destroyed is communication with the pointee needed. Of course, there must be some provision for the weight dropping below 1 when too many copies are made.
Would that be any use in your library?

On 02/28/2006 06:35 AM, Larry Evans wrote: [snip]
to smart pointers. Only when an object is destroyed is communication with the pointee needed. Of course, there must be some provision for
Correction: Only when a smart pointer changes its pointee or is destructed is communication with the pointee needed.
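A rough single-process C++ sketch of the weighted reference counting idea may help. It is a simplification under stated assumptions: the weight is stored as a plain integer rather than its log2, the "communication" with the pointee is simulated by a counter, and reclamation is simulated by a flag. The names tracked_object and weighted_ref are invented for this illustration and do not come from [lins91c] or from Channel.

```cpp
#include <cassert>

// Pointee side: it only tracks the sum of the weights of all live references.
struct tracked_object {
    unsigned long total_weight;
    int messages;     // how often a reference had to contact the pointee
    bool reclaimed;   // set when all weight has been returned
    tracked_object() : total_weight(0), messages(0), reclaimed(false) {}
};

class weighted_ref {
public:
    static const unsigned long initial_weight = 1UL << 16;  // arbitrary power of two

    // Creating the first reference contacts the pointee once.
    explicit weighted_ref(tracked_object* obj) : obj_(obj), weight_(initial_weight) {
        obj_->total_weight += weight_;
        ++obj_->messages;
    }

    weighted_ref(weighted_ref&& other) : obj_(other.obj_), weight_(other.weight_) {
        other.obj_ = nullptr;
        other.weight_ = 0;
    }

    weighted_ref(const weighted_ref&) = delete;
    weighted_ref& operator=(const weighted_ref&) = delete;

    // Copying splits the local weight in half; the pointee is not contacted.
    weighted_ref clone() {
        if (weight_ == 1) {
            // Provision for exhausted weight (one possible choice): ask the
            // pointee for a fresh allotment. This does cost one message.
            obj_->total_weight += initial_weight - 1;
            ++obj_->messages;
            weight_ = initial_weight;
        }
        const unsigned long half = weight_ / 2;
        weight_ -= half;
        return weighted_ref(obj_, half);
    }

    // Destruction returns the weight to the pointee (one message).
    ~weighted_ref() {
        if (!obj_) return;
        obj_->total_weight -= weight_;
        ++obj_->messages;
        if (obj_->total_weight == 0) obj_->reclaimed = true;  // a real system would free it here
    }

    unsigned long weight() const { return weight_; }

private:
    weighted_ref(tracked_object* obj, unsigned long w) : obj_(obj), weight_(w) {}

    tracked_object* obj_;
    unsigned long weight_;
};
```

The invariant is that the weights held by all live references always add up to total_weight, so the pointee can tell that it is unreferenced without ever having been told about individual copies.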

Hi Larry,
Thanks for the suggestions. I'll definitely look into it.
Yigong

On Feb 28, 2006, at 2:09 AM, Yigong Liu wrote:
Although bearing the same name of message passing, Channel framework is for application domains quite different from those targeted by MPI/PVM. Channel is for developing message passing/event dispatching systems in distributed embedded systems, desktop and enterprise applications. During my last eight years of employment, i worked on various telecom switches/routers, and distributed enterprise applications; designed and implemented 3 small proprietary message passing systems. Later i found that they all share a common set of primitives.
Very interesting. I expected that PVM/MPI was in a different message-passing "space" than what you were proposing, but thought I should ask nonetheless. I still hope to get a chance to look at Channel in depth, but for now I'll say this: don't mangle your design to fit PVM/MPI. If they fit well, great; if not, that's fine too.
Doug

Yigong Liu wrote:
In this sense, Channel is quite in the similar field as GUI event/signal systems. By instantiating Channel template with "NULL" SynchPolicy, it can be used as a single threaded event dispatcher, and gain the extra benefit of being able to plugin different routing/dispatching algorithms and easily transformed into distributed eventing systems later.
I have a great interest in this kind of use, but in a threaded environment. I've developed my own event-driven infrastructure but it's not designed to be super general-purpose. However, I am interested in actively participating in discussions around your proposal wrt this kind of use. Specifically, my needs for event-driven programming arise from designing machine simulators, which have slightly different requirements than a GUI system (tracking simulator time, for example). A policy-based design works well and is what I've designed for my own use.
-Dave

I have a great interest in this kind of use, but in a threaded environment. I've developed my own event-driven infrastructure but it's not designed to be super general-purpose. However, I am interested in actively participating in discussions around your proposal wrt this kind of use.
Specifically, my needs for event-driven programming arise from designing machine simulators, which have slightly different requirements than a GUI system (tracking simulator time, for example). A policy-based design works well and is what I've designed for my own use.
Thanks for the comments. What is the threading requirement in your application?
Channel is a light-weight core for general-purpose pub/sub message passing, and the most common use case is multiple threads publishing/sending messages/events and multiple threads subscribing to/receiving them. Its internal data structures are protected by RW_Mutex, which is defined as a sub-type of SynchPolicy (borrowed from ACE). For multi-threaded applications, the Channel template is specialized with SynchPolicy = MT_Policy. For special single-threaded cases (such as some systems designed with eventlib and X-window/Xlib/Xt, where a single main thread detects external events and dispatches them), the Channel template can be specialized with SynchPolicy = NULL_Policy, whose sub-types are "no-ops", removing the synchronization overhead.
Another important design aspect is its "namespace" concept (borrowed from plan9/inferno). A channel's namespace is the set of all message/event ids published or subscribed to by its members. The namespace can be linear, hierarchical, or associative depending on the routing algorithm (and all of these are easily changed by specializing the Channel template with different template arguments for a particular application). When two channels are connected through "connectors", their namespaces are "merged" to facilitate transparent distributed message passing and event dispatching. This "merge" operation can be controlled by defining filters and translators.
Sorry for all this murmuring; the project website, http://channel.sourceforge.net/, provides more details.
I am studying the existing Boost libraries for system programming (Thread, Shmem, Asio and Serialization). Are there any libraries I missed?
Thanks Yigong
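The SynchPolicy idea described above can be sketched in a few lines of C++. This is not Channel's code: mini_channel, mt_policy and null_policy are hypothetical names, payloads are reduced to an int, and a plain std::mutex stands in for the RW_Mutex mentioned in the message. The point is only that the locking type is chosen by a template argument, so the single-threaded instantiation pays nothing for synchronization.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <utility>
#include <vector>

// Multi-threaded policy: a real mutex protects the routing table.
struct mt_policy {
    typedef std::mutex mutex_type;
};

// Single-threaded policy: lock() and unlock() compile down to nothing.
struct null_policy {
    struct mutex_type {
        void lock() {}
        void unlock() {}
    };
};

template <typename Id, typename SynchPolicy>
class mini_channel {
public:
    typedef std::function<void(int)> handler;
    typedef typename SynchPolicy::mutex_type mutex_type;

    void subscribe(const Id& id, handler h) {
        std::lock_guard<mutex_type> guard(mutex_);
        table_[id].push_back(std::move(h));
    }

    // Delivers `payload` to every handler bound to `id` and returns how many
    // handlers were called. Handlers run outside the lock.
    std::size_t publish(const Id& id, int payload) {
        std::vector<handler> targets;
        {
            std::lock_guard<mutex_type> guard(mutex_);
            typename std::map<Id, std::vector<handler> >::const_iterator it = table_.find(id);
            if (it != table_.end()) targets = it->second;
        }
        for (std::size_t i = 0; i < targets.size(); ++i) targets[i](payload);
        return targets.size();
    }

private:
    mutex_type mutex_;
    std::map<Id, std::vector<handler> > table_;
};
```

A GUI-style event loop would instantiate mini_channel<int, null_policy>, while a threaded application would use mini_channel<int, mt_policy> with the same calling code.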

Hello there,
Is there still any interest in a template library for distributed message passing and event dispatching? I finally got time to redesign and reimplement the library (Channel) based on Boost. The initial design document and code are available from SourceForge:
http://channel.sourceforge.net/boost_channel/doc/design.html
http://sourceforge.net/projects/channel
I have done most of my coding and testing on Linux (Fedora Core 3 & 4) and haven't tried it on Windows yet. I'd like to hear from the community on the design first. FYI, the following is an introduction to the library.
Thanks
Yigong
-------------------------------------------------------------------------------------
In Unix and most OSes, file systems allow applications to identify, bind to and operate on system resources and entities (devices, files,...) using a "name" (path name) in a hierarchical namespace (directory system), which is different from variables and pointers in a flat address space. In Boost.Signal and libsigc++, callback/slot objects can be connected explicitly to specific signal objects to allow synchronous event dispatching. Channel is a C++ template library that provides namespaces for asynchronous, distributed message passing and event dispatching. Message senders and receivers bind to names in a namespace; binding and matching rules decide which senders will bind to which receivers; then message passing and event dispatching can happen among the bound senders and receivers.
Channel's signature:
    template <
        typename idtype,
        typename platform_type = boost_platform,
        typename synchpolicy = mt_synch<platform_type>,
        typename executor_type = abstract_executor,
        typename name_space = linear_name_space<idtype,executor_type,synchpolicy>,
        typename dispatcher = broadcast_dispatcher<name_space,platform_type>
    >
    class channel;
Various namespaces (linear/hierarchical/associative) can be used for different applications. For example, we can use integer ids as names to send messages in a linear namespace, or we can use path name ids to send messages in a hierarchical namespace. The user can configure the namespace easily by setting a channel template parameter.
Channel's other major components are dispatchers, which dispatch messages/events from senders to bound receivers. The dispatcher is also a channel template parameter. The design of dispatchers can vary in several dimensions:
- how msgs move: push or pull;
- how callbacks are executed: synchronously or asynchronously.
Sample dispatchers include: synchronous broadcast dispatcher, buffered asynchronous dispatchers,...
Namespaces and dispatchers are orthogonal; they can be mixed and matched freely. Just as STL algorithms can be used with any STL container by means of the iterator range concept, namespaces and dispatchers can be used together because of the name binding set concept.
By combining different namespace and dispatching policies, we can achieve various models:
- synchronous event dispatching
- associative space model similar to tuple space
- asynchronous messaging model similar to Microsoft CCR (Concurrency Coordination Runtime)
Similar to distributed file systems, distributed channels can be connected or "mounted" to allow transparent distributed message passing. Filters and translators are used to control namespace changes.
Channel is built on top of Boost facilities:
- boost::shared_ptr for message/event data life-time management
- boost::bind, boost::function for callbacks
- boost::thread for synchronization
- boost::serialization for message marshaling/demarshaling
- Boost.Asio and Boost.Shmem are used to build transports among remote channels.
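The orthogonality of namespaces and dispatchers described in the introduction can be illustrated with a toy C++ sketch. None of this is the actual Channel interface: toy_channel, exact_match, path_prefix_match, immediate_dispatch and buffered_dispatch are invented names, and the prefix-based matching rule for hierarchical names is an assumption made for the example, not a statement of Channel's binding rules.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Name-matching policies (stand-ins for linear and hierarchical namespaces).
struct exact_match {
    template <typename Id>
    static bool matches(const Id& bound, const Id& sent) { return bound == sent; }
};

struct path_prefix_match {
    // "/sys" matches "/sys" and "/sys/net/eth0", but not "/system".
    static bool matches(const std::string& bound, const std::string& sent) {
        if (sent.compare(0, bound.size(), bound) != 0) return false;
        return sent.size() == bound.size() || sent[bound.size()] == '/';
    }
};

// Dispatch policies: run the callback at once, or buffer it for later.
struct immediate_dispatch {
    void submit(const std::function<void()>& fn) { fn(); }
    void run_pending() {}
};

struct buffered_dispatch {
    void submit(const std::function<void()>& fn) { pending_.push_back(fn); }
    void run_pending() {
        while (!pending_.empty()) {
            std::function<void()> fn = pending_.front();
            pending_.pop_front();
            fn();
        }
    }
    std::deque<std::function<void()> > pending_;
};

template <typename Id, typename Match = exact_match, typename Dispatch = immediate_dispatch>
class toy_channel {
public:
    typedef std::function<void(const Id&)> receiver;

    void bind(const Id& name, receiver r) { bindings_.push_back(std::make_pair(name, r)); }

    // Hands the name to every receiver whose bound name matches it.
    void send(const Id& name) {
        for (std::size_t i = 0; i < bindings_.size(); ++i) {
            if (!Match::matches(bindings_[i].first, name)) continue;
            receiver r = bindings_[i].second;
            dispatch_.submit([r, name]() { r(name); });
        }
    }

    void run_pending() { dispatch_.run_pending(); }

private:
    std::vector<std::pair<Id, receiver> > bindings_;
    Dispatch dispatch_;
};
```

Any matching policy can be combined with any dispatch policy, so toy_channel<int> gives synchronous dispatch on integer ids, while toy_channel<std::string, path_prefix_match, buffered_dispatch> gives deferred dispatch on path names.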
participants (6)
- Cliff Green
- David Greene
- Doug Gregor
- Douglas Gregor
- Larry Evans
- Yigong Liu