Programming practices for asynchronous operations (asio, etc)
Hello,

I have recently started reworking a network server project from a
thread-per-connection model (with blocking/synchronous network I/O) to asio
and an asynchronous operation model (where the number of threads should
depend on hardware resources and be decoupled from software design
constraints). The problem I have with this is that it makes one write too
much boilerplate code for all those "completion handlers".

Take this very simple example of code using the one-thread-per-connection
synchronous I/O model:

    // synchronously resolves, connects, throws on error
    connection con("1.2.3.4");
    {
        // application "packet", usually known as "record";
        // uses "con" and starts an application packet with a given type
        packet pack(con, PacketTypeRequest);

        // serializes the given data portably and synchronously sends it;
        // throws on error
        pack << uint32(10) << std::string("/home/user") << uint8(3);
    }
    // calls packet::~packet(), which signals end of packet in the "con" stream

Now, to transform even this very simple example (I repeat, this is a very
simple example; much more complex cases write "packets" of dozens of fields,
of which the last one may have "infinite" length, meaning you cannot know
its size in advance to put it in the stream; you just have to send as much
as you can and then signal the end of the "packet") into an asynchronous
version, one would need:

    connection con; // breaks the invariant of "con" always being connected

    // should asynchronously resolve and connect
    con.async_connect("1.2.3.4", handle_connect);
    // break code flow here

    // handle_connect()
    // breaks the invariant of allowing "serialization" only after the
    // packet type has been sent over the network
    packet pack(con);
    pack.async_serialize(uint32(10), handle_serialize1);
    // return to caller

    // handle_serialize1()
    pack.async_serialize(std::string("/home/user"), handle_serialize2);
    // return to caller

    // handle_serialize2()
    pack.async_serialize(uint8(3), handle_serialize3);
    // return to caller

    // handle_serialize3()
    // breaks the RAII-ness of the original packet, which automatically
    // signaled "end" from its dtor
    pack.async_end(handle_endpacket);
    // return to caller

And imagine that the original code was just a small function in a big class;
now each such small function turns into dozens of smaller functions, so the
code explosion is huge. I am curious about the coding practices some of you
employ to solve these issues (to still have code checked at compile time as
much as possible, via strong invariants and RAII idioms, and not have to
write millions of small functions).

Some of the things I have thought of that seem to solve these issues:

- Instead of serializing packets ad hoc, have structures encapsulating the
  network packets, move the serialization code into them, and let them deal
  with all the small functions (they could use a buffer to cache the
  serialization of the fixed fields and async_write that buffer's contents
  in a single operation). This however means an additional copy of the data
  compared to the original code, and it just moves the problem: instead of
  many small functions in the high-level code you have them in the
  lower-level packet serialization code (though the output buffer can
  reduce some of them).

- Using expression templates or some other kind of "lazy evaluation":
  basically keep syntax similar to the synchronous solution, like

      pack << uint32(10) << std::string("/home/user") << uint8(3);

  but instead of doing network I/O this code would enqueue the needed
  serialization actions, keep all those completion handlers internal, and
  completely hide those details from the user; the user just calls
  pack.async_write(user_handler), and "user_handler" is called after all
  the serializations have been asynchronously written.

- If this weren't C++ but C, we could use setjmp/longjmp to simulate
  synchronous operation transparently (to the user) while asynchronous
  operation happens behind the scenes. The user writes code such as

      pack << uint32(10) << std::string("/home/user") << uint8(3);

  but on each op<< (each asynchronous serialization) the code does an
  async_write() with an internal handler, saves the context (setjmp), then
  longjmp()s to the previously saved context in the main event loop; when
  the async_write completes, the handler longjmp()s back to restore the
  original user context and continues with the next op<<, and so on. This
  however does not work in C++, because the code path jumped over by
  longjmp may throw/catch exceptions, which the standard says is UB; not to
  mention that, from what I could gather on the Internet, some C++
  compilers call the dtors of automatic objects when you longjmp "back",
  thus not leaving the original context untouched (which is what I need).

-- 
Mihai RUSU
"Linux is obsolete" -- AST
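The "lazy evaluation" idea above (op<< only enqueues bytes; one async_write
flushes everything and fires a single user handler) can be sketched without
any asio specifics. This is a hypothetical illustration: `packet`, its
members, and the handler signature are invented names, the write is invoked
synchronously here, and byte order is left as the host's (real code would
serialize portably):

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Sketch of the "lazy" packet: operator<< only appends to an internal
// buffer; nothing would touch the network until async_write() is called.
class packet {
public:
    packet& operator<<(std::uint32_t v) { return append(&v, sizeof v); }
    packet& operator<<(std::uint8_t v)  { return append(&v, sizeof v); }
    packet& operator<<(const std::string& s) {
        *this << static_cast<std::uint32_t>(s.size()); // length prefix
        return append(s.data(), s.size());
    }

    // In real code this would issue one asio async_write over buf_ and call
    // handler from its completion handler; here we just call it directly.
    void async_write(std::function<void(std::size_t)> handler) {
        handler(buf_.size());
    }

    const std::vector<char>& data() const { return buf_; }

private:
    packet& append(const void* p, std::size_t n) {
        const char* c = static_cast<const char*>(p);
        buf_.insert(buf_.end(), c, c + n);
        return *this;
    }
    std::vector<char> buf_; // one extra copy, as the mail points out
};
```

Usage keeps the synchronous-looking syntax while collapsing all the handler
plumbing into a single completion:

    packet pack;
    pack << std::uint32_t(10) << std::string("/home/user") << std::uint8_t(3);
    pack.async_write([](std::size_t n) { /* all fields written */ });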
On Sat, Aug 09, 2008 at 03:07:46PM +0300, dizzy wrote:
- if this weren't C++ but C then we could use setjmp/longjmp to transparently (to the user) simulate synchronous operation while asynchronous operation is done behind the scenes; the user writes code such as:
What you describe are coroutines. setjmp/longjmp are unsuitable for context
switching. On UNIX you have the swapcontext() function, which does it
properly (no problems with exceptions), and on Win32 you have the Fiber API.
The downside is that you have to allocate a stack for each coroutine, so you
have the same space overheads as with threads. (Though it is not hard to
make user-level context switching and scheduling more efficient than the
OS's.)

What I would recommend is to transform your program into a producer-consumer
pipeline: have N worker threads, each of which is written in a synchronous
style, and hand work off to them via queues. You can keep a global count of
active (non-sleeping) workers and adjust the concurrency level on the fly.

You have stumbled upon a hot topic (events vs. threads); you can read, for
example:

http://portal.acm.org/citation.cfm?id=1251058
http://portal.acm.org/citation.cfm?doid=502059.502057

(The latter paper might give you an idea about a possible architecture for
your code, along the lines of the paragraph above.)

Just out of curiosity, why are you rewriting thread-based code into
event-based code? It might be easier/cheaper to port[*] it instead to an OS
that handles a large number of threads better..

[*] And if you're developing for UNIX, porting might be just a recompile.
Hello On Saturday 09 August 2008 17:23:39 Zeljko Vrba wrote:
Just out of curiosity, why are you rewriting thread-based code into event-based code? It might be easier/cheaper to port[*] it instead to an OS that handles large number of threads better..
Main platform is Linux, with Windows as a secondary platform. The problem
right now is not OS threading support but rather the complexity of the
program in handling timeouts and errors.

With an asynchronous approach, each async operation is also a point of
failure or timeout. Thus the points of failure/timeout are handled through
the normal flow of the code (the code already "breaks" flow and returns on
each async operation; a timeout is just an event like any other). With a
synchronous approach, a timeout would have to be handled with exceptions or
by supporting breaking of flow and returning (similar to the asynchronous
operation).

Also, the pure multithreaded approach imposes locking and thread
synchronization where usually none is needed (since there is no need for
certain parts to run concurrently). This makes the code more error prone and
IMO more complex than it needs to be.

I suppose a compromise solution that combines both threading and
asynchronous operation might be the best.

-- 
Dizzy
Hi,
On 8/10/08, dizzy wrote:
........................................... I suppose a compromise solution that combines both threading and asynchronous operation might be the best.
You might want to have a look at the following:

- ACE (Adaptive Communication Environment)
- IO Completion Ports (Windows)
- Reactor & Proactor Patterns

-- 
Asif
participants (3)

- Asif Lodhi
- dizzy
- Zeljko Vrba