
Thanks to everyone for the excellent reviews and discussions of asio! Some folks want additional time for the asio review, so the review will be extended until Dec 30th. Note that Christopher may not be able to respond to posts during a significant part of this time due to holiday commitments, but the feedback will eventually be addressed and discussed. By way of reminder, here are the library details again:

Today (Dec 22nd) is the first continuation of the formal review of the Asynchronous I/O library (asio) by Christopher Kohlhoff. The review has been extended until Friday, December 30th. I will be serving as review manager.

From the library synopsis: Boost.Asio is a cross-platform C++ library for network programming that provides developers with a consistent asynchronous I/O model using a modern C++ approach.

Downloads of the library as well as online documentation can be found at:

http://asio.sourceforge.net/
http://asio.sourceforge.net/boost-asio-proposal-0.3.6/libs/asio/doc/

As usual, please state in review comments how you reviewed the library and whether you think the library should be accepted into Boost. Further guidelines for writing reviews can be found on the website at:

http://www.boost.org/more/formal_review_process.htm#Comments

Please review early and often! Thanks, Jeff

*******************************************************
A Few Library Details:
*******************************************************

Supported Platforms

The following platforms and compilers have been tested:

* Win32 using Visual C++ 7.1 and Visual C++ 8.0.
* Win32 using Borland C++Builder 6 patch 4.
* Win32 using MinGW.
* Linux (2.4 or 2.6 kernels) using g++ 3.3 or later.
* Solaris using g++ 3.3 or later.
* Mac OS X 10.4 using g++ 3.3 or later.

Rationale

The Boost.Asio library is intended for programmers using C++ for systems programming, where access to operating system functionality such as networking is often required.
In particular, Boost.Asio attempts to address the following goals:

* Portability. The library should support, and provide consistent behaviour across, a range of commonly used operating systems.
* Scalability. The library should allow, and indeed encourage, the development of network applications that scale to hundreds or thousands of concurrent connections. The library implementation for each operating system should use the mechanism that best enables this scalability.
* Efficiency. The library should support techniques such as scatter-gather I/O, and allow protocol implementations that minimise data copying.
* Model Berkeley sockets. The Berkeley sockets API is widely implemented and understood, as well as being covered in much literature. Other programming languages often use a similar interface for networking APIs.
* Ease of use. Lower the entry barrier for new users by taking a toolkit, rather than framework, approach. That is, try to minimise the up-front investment in time to just learning a few basic rules and guidelines. After that, a library user should only need to understand the specific functions that are being used.
* Basis for further abstraction. The library should permit the development of other libraries that provide higher levels of abstraction. For example, implementations of commonly used protocols such as HTTP.

Although the current incarnation of Boost.Asio focuses primarily on networking, its concepts of asynchronous I/O can be extended to include other operating system resources such as files.

OK. I may or may not have time to finish the review, but here is the experience of my first use of the lib... I started at the beginning (i.e., Getting Started). I clicked on the Tutorial link, and followed it to the first example, timer1. While I have issues with the way timers work (mostly because it is just plain different), I wanted to build and run the first tutorial. I did not want to pollute my pristine boost installation, so I unpacked the stuff into a subdir, boost-asio-proposal-0.3.6, and set an environment variable ASIO_ROOT to point to the top. I then went to the tutorial source and tried to compile...
g++ -I${BOOST_ROOT} -I${ASIO_ROOT} -o timer timer.cpp
but I was rudely surprised to see the compilation fail with the following error...

/home/jody/boost-asio-proposal-0.3.6/boost/asio/detail/mutex.hpp:25:3: #error Thread support is required!

I was able to get it to build and run with this...

g++ -I${BOOST_ROOT} -I${ASIO_ROOT} -D_REENTRANT -o timer timer.cpp -lpthread

I find this highly problematic. I am not writing a multithreaded application, so why do I have to compile with threading enabled? If I am wrong, and there is some way to compile without threading, then you can ignore the following rant, as it applies to asio. However, it is still perfectly valid for the rest of Boost...

This problem is further compounded because of the terrible way some Boost libraries treat threading. Some libraries (e.g., Boost.SmartPtr) embed mutexes in the objects based on compilation flags, and they ALWAYS run through the thread protection mechanisms. This is just crazy. I'm sorry, but there is no other way to put it. One of the objectives of asynchronous I/O is to reduce the dependency on multi-threaded concepts. However, here it is, tightly married to the wench. Maybe it is to do async resolutions or some other sort of system calls, but I still cry foul.

Please do not take it wrongly, but at some point we MUST stop the madness. The Boost philosophy is just plain wrong here. Libraries are saying that just because you compile with the ability to do multithreading, all parts of the code pay the price, even parts that are not multithreaded. This library goes even further, by mandating that the application be built with threads enabled. I certainly hope that somewhere, after "getting started", there is some rationale. However, it would have to be pretty darn good rationale, because I cannot see any reason to allow a library to require threading (except, of course, the threads library).

Hi Jody, --- Jody Hagins <jody-boost-011304@atdesk.com> wrote:
but I was rudely surprised to see the compilation fail with the following error...
/home/jody/boost-asio-proposal-0.3.6/boost/asio/detail/mutex.hpp:25:3:
#error Thread support is required!
I was able to get it to build and run with this...
g++ -I${BOOST_ROOT} -I${ASIO_ROOT} -D_REENTRANT -o timer timer.cpp -lpthread
I find this highly problematic. I am not writing a multithreaded application, so why do I have to compile with threading enabled?
As you mentioned, threads are required for the async DNS lookup. This has never really come up, and to my knowledge you are the first to ask for the ability to build without using thread support. I can check for BOOST_DISABLE_THREADS (in fact I've already made some changes along this line), but when threads are disabled functionality like asynchronous host resolution will have to be made unavailable. Cheers, Chris

On Fri, 30 Dec 2005 00:03:31 +1100 (EST) Christopher Kohlhoff <chris@kohlhoff.com> wrote:
As you mentioned, threads are required for the async DNS lookup.
This has never really come up, and to my knowledge you are the first to ask for the ability to build without using thread support.
I can check for BOOST_DISABLE_THREADS (in fact I've already made some changes along this line), but when threads are disabled functionality like asynchronous host resolution will have to be made unavailable.
That's a start, I guess (though I'm not sure what macros to check). However, this still propagates the current misuse, IMO, of how threads are to be specified. I'm sure there are many alternatives, but the first to mind (yet probably undesirable) is to fork a process and use IPC mechanisms and a very simple protocol in the absence of threading. Maybe you could make it policy based instead of ifdef'd. Of course, if the multithreading policy is chosen, then the code should throw the #error if multithreading is not defined. I certainly do not think threading should be mandatory for this library. In fact, you do not want it to be multithreaded. You are going to be running the demuxer from a single thread. You may do other things (like toss work on a message queue for another thread to process -- especially long term operations that cannot be multiplexed into the asio framework). However, the control flow for the demuxer will be executed in a single thread, and should not be doing anything with synchronization primitives.

First, I should apologize for my tone in the last email. I just read it, and it did not come across very friendly, to say the least. I can try to blame it on being frustrated with something else, but it's most likely due to me just being a grumpy old man. Anyway, I think the current threading practice we use in Boost is just plain wrong. It's no fault of yours that you are simply following that practice. I should be able to specify pieces of code that do not touch multithreading issues. Unfortunately, too much code simply adds multithreading into everything. Part of what bothers me is that Boost is a modern C++ library, and as such has many tools to allow different practices, yet we still use IFDEFs and blindly insert synchronization primitives in places where they are not needed. If certain pieces of my application are known to be single threaded, then I want to compile it that way.
However, I also may have other pieces of my application that are multithreaded, and I want to compile them that way as well. With boost, this is virtually impossible. You have to declare one or the other. I'm not sure of all the places this actually happens, but some of the core libraries do it, so it permeates. Further, other libraries seem to take the same stance. I find this very disturbing. I'm actually home sick, so I don't know how much I'll get done (it also may have something to do with my sour disposition, though I REALLY wish we would address this issue boost-wide).

Jody Hagins wrote:
I'm sure there are many alternatives, but the first to mind (yet probably undesirable), is to fork a process and use IPC mechanisms and a very simple protocol in the absence of threading.
I'm not sure why spawning a process and using IPC is better than using a thread. Is adding -lpthread to the command line that much to ask? Where's the problem?

On Thu, 29 Dec 2005 19:38:31 +0200 "Peter Dimov" <pdimov@mmltd.net> wrote:
Jody Hagins wrote:
I'm sure there are many alternatives, but the first to mind (yet probably undesirable), is to fork a process and use IPC mechanisms and a very simple protocol in the absence of threading.
I'm not sure why spawning a process and using IPC is better than using a thread. Is adding -lpthread to the command line that much to ask? Where's the problem?
If that's all it entailed, no, it would not be too much to ask. However, several libraries in Boost force multithreaded functionality on the user if the code is compiled with threading ability. A chief offender is SmartPointer, made even more heinous because it is used all over the place (no problem there -- I use it all over the place too). Thus, if I compile one piece of code for multithreading, now all my uses of shared_ptr<> and friends are going to require space for mutexes, and worse, they are going to acquire/release locks for every operation. The only way around it is to compile completely without threads, which is not what I want either. One great advantage of C++ is that you can design libraries so that you only pay for what you use. However, several major Boost components do not adhere to this philosophy, at least not enough. Thus, IMO, multithread support in some Boost libs is just plain wrong. Some libs force you to use it all the time or not at all. Very few applications require full MT synchronization primitives for everything. For this library, it even goes beyond that, I'm afraid. The demuxer::run() method always acquires/releases a mutex for each operation. I'm not sure where else it happens.

Jody Hagins wrote:
On Thu, 29 Dec 2005 19:38:31 +0200 "Peter Dimov" <pdimov@mmltd.net> wrote:
Jody Hagins wrote:
I'm sure there are many alternatives, but the first to mind (yet probably undesirable), is to fork a process and use IPC mechanisms and a very simple protocol in the absence of threading.
I'm not sure why spawning a process and using IPC is better than using a thread. Is adding -lpthread to the command line that much to ask? Where's the problem?
If that's all it entailed, no, it would not be too much to ask. However, several libraries in Boost force multithreaded functionality on the user if the code is compiled with threading ability. A chief offender is SmartPointer, made even more heinous because it is used all over the place (no problem there -- I use it all over the place too).
I'm not sure what this has to do with the subject. Anyway...
Thus, if I compile one piece of code for multithreading, now all my uses of shared_ptr<> and friends are going to require space for mutexes, and worse, they are going to acquire/release locks for every operation. The only way around it is to compile completely without threads, which is not what I want either.
... this is not quite correct if you are on a platform where shared_ptr uses atomic operations; let's assume for the sake of argument that this is not the case. The alternative is obviously to supply another version of shared_ptr, one that isn't MT-safe. Interoperability issues aside, this would be OK for most people as long as their ST shared pointers never cross threads, but it won't solve the problem that you describe unless ALL libraries that use shared_ptr ALSO supply two variants of their classes or APIs. This is a maintenance problem that most people would rather avoid; it is not a coincidence that compiler vendors are moving away from ST/MT libraries and towards a single MT-safe library. The resources that are freed by dropping the ST variant are used to improve the performance of the MT library to a competitive level.
One great advantage of C++ is that you can design libraries so that you only pay for what you use. However, several major boost components do not adhere to this philosophy, at least not enough.
Thus, IMO, multithread support in some boost libs is just plain wrong. Some libs force you to use it all the time or not at all. Very few applications require full MT synchronization primitives for everything.
Libraries that have shared state have to protect it somehow. A pure value-semantics library like boost::bind can be blissfully unaware of threading issues, but not everyone can afford to ignore threads.
For this library, it even goes beyond that, I'm afraid. The demuxer::run() method always acquires/releases a mutex for each operation.
... and this is a problem because..?

On Thu, 29 Dec 2005 20:57:38 +0200 "Peter Dimov" <pdimov@mmltd.net> wrote:
If that's all it entailed, no, it would not be too much to ask. However, several libraries in Boost force multithreaded functionality on the user if the code is compiled with threading ability. A chief offender is SmartPointer, made even more heinous because it is used all over the place (no problem there -- I use it all over the place too).
I'm not sure what this has to do with the subject. Anyway...
It is relevant because asio currently requires a multithread build, or it will not compile. Thus, it forces me to set multithreading in my compilations, which in turn means that all my code (or any other Boost code) that uses Boost.SmartPointer or other Boost libs that do the same thing, is going to be forced to go through multithreaded primitives and other MT safe code. Thus, to use asio, I'm being forced to pay for lots of multithreaded stuff that I'm not necessarily using.
... this is not quite correct if you are on a platform where shared_ptr uses atomic operations; let's assume for the sake of argument that this is not the case. The alternative is obviously to supply another version of shared_ptr, one that isn't MT-safe. Interoperability issues aside, this would be OK for most people as long as their ST shared pointers never cross threads, but it won't solve the problem that you describe unless ALL libraries that use shared_ptr ALSO supply two variants of their classes or APIs.
Right. ACE uses synchronization policies, which solves the payment problem, but it does mean that other classes either need to specify their use, or also allow the synch policy in their interface.
This is a maintenance problem that most people would rather avoid; it is not a coincidence that compiler vendors are moving away from ST/MT libraries and towards a single MT-safe library. The resources that are freed by dropping the ST variant are used to improve the performance of the MT library to a competitive level.
I can certainly understand that. However, MT-safe is different than what I'm talking about. Some stuff is MT-safe without synchronization primitives.
Libraries that have shared state have to protect it somehow. A pure value-semantics library like boost::bind can be blissfully unaware of threading issues, but not everyone can afford to ignore threads.
I understand. However, simply by using some libs I am now forced to pay attention, whether I want to or not.
For this library, it even goes beyond that, I'm afraid. The demuxer::run() method always acquires/releases a mutex for each operation.
... and this is a problem because..?
Because every time through the loop, you pay for the synchronization, when it is not needed. The only reason for its existence is so that multiple threads can call demuxer::run() at the same time. Aside from implementation questions brought up earlier about calling OS demux hooks from multiple threads, this is extremely undesirable in the more common case where a single thread calls run(). Maybe I'm not being very clear, or maybe you don't think it is bad to make useless calls to synchronization primitives.

Jody Hagins wrote:
On Thu, 29 Dec 2005 20:57:38 +0200 "Peter Dimov" <pdimov@mmltd.net> wrote:
... this is not quite correct if you are on a platform where shared_ptr uses atomic operations; let's assume for the sake of argument that this is not the case. The alternative is obviously to supply another version of shared_ptr, one that isn't MT-safe. Interoperability issues aside, this would be OK for most people as long as their ST shared pointers never cross threads, but it won't solve the problem that you describe unless ALL libraries that use shared_ptr ALSO supply two variants of their classes or APIs.
Right. ACE uses synchronization policies, which solves the payment problem, but it does mean that other classes either need to specify their use, or also allow the synch policy in their interface.
Right, and this can be a problem if shared_ptr is used only as an implementation detail; implementation details shouldn't affect the interface.
This is a maintenance problem that most people would rather avoid; it is not a coincidence that compiler vendors are moving away from ST/MT libraries and towards a single MT-safe library. The resources that are freed by dropping the ST variant are used to improve the performance of the MT library to a competitive level.
I can certainly understand that. However, MT-safe is different than what I'm talking about. Some stuff is MT-safe without synchronization primitives.
shared_ptr on x86/PPC, for example.
For this library, it even goes beyond that, I'm afraid. The demuxer::run() method always acquires/releases a mutex for each operation.
... and this is a problem because..?
Because every time through the loop, you pay for the synchronization, when it is not needed. The only reason for its existence is so that multiple threads can call demuxer::run() at the same time. Aside from implementation questions brought up earlier about calling OS demux hooks from multiple threads, this is extremely undesirable in the more common case where a single thread calls run().
Maybe I'm not being very clear, or maybe you don't think it is bad to make useless calls to synchronization primitives.
Specific undesirable effects (such as reduced performance) can be bad. Synchronization, by itself, is not. It's merely a way (not the only way) to implement the documented thread safety guarantee of ::run. (One example of a specific undesirable effect that springs to mind is the possibility of deadlock if a callback invokes ::run.)
From a cursory look at epoll_reactor, I'd think that the costs of the lock are negligible for the ST case (barring a pathologically inefficient mutex implementation) since the critical regions are fairly expensive.

On Thu, 29 Dec 2005 22:42:29 +0200 "Peter Dimov" <pdimov@mmltd.net> wrote:
Right, and this can be a problem if shared_ptr is used only as an implementation detail; implementation details shouldn't affect the interface.
Right. However, my main problem is that I don't have a choice if I use shared_ptr (and some others). I'm forced to use the synch prims even when I know it's never needed.
Specific undesirable effects (such as reduced performance) can be bad.
Synchronization, by itself, is not. It's merely a way (not the only way) to implement the documented thread safety guarantee of ::run.
Right. I am not belittling synchronization... only the use of it when it is unnecessary.
(One example of a specific undesirable effect that springs to mind is the possibility of deadlock if a callback invokes ::run.)
Indeed. I am firmly opposed to holding a lock across I/O operations, and the code here holds the lock across many I/O operations. I am almost as opposed to holding them across callback functions.
From a cursory look at epoll_reactor, I'd think that the costs of the lock are negligible for the ST case (barring a pathologically inefficient mutex implementation) since the critical regions are fairly expensive.
The CRs could be made a lot less expensive, and in this case, the sync prims are probably not as bad. However, the pattern of writing code in this manner is what I'm really having problems with... However, when you are processing many thousands of messages per second, ANY extra unnecessary work becomes non-negligible. Note that network applications have inherent delays, so programmers of these beasts try to remove everything possible. It's even more important when your program is "competing" with another program and you are already at a disadvantage because of connection distance. Every microsecond counts. I *really* only want to pay for what is necessary. If there are extra bells, features, whistles, let me pay for them as I go. This is the main reason I do not use Boost.Signals any more. All the additional features have bloated the lean use case so much that I have found its use impractical for high performance applications.

--- Peter Dimov <pdimov@mmltd.net> wrote: <snip>
Specific undesirable effects (such as reduced performance) can be bad. Synchronization, by itself, is not. It's merely a way (not the only way) to implement the documented thread safety guarantee of ::run.
Regarding other ways of implementing the demuxer without synchronisation... I'm no expert (not even close) on lock-free lists and the like, but it is my belief that the task_demuxer_service can be implemented using some sort of lock-free list or queue plus some atomic integer operations. The locking_dispatcher implementation is also a candidate for this treatment. When it comes to the epoll_reactor, I think it could be implemented without locking by having it adopt a one-way message passing interface. It would use a locking_dispatcher internally to ensure that the critical regions are not called concurrently. Alternatively the epoll_reactor might be amenable to an implementation using lock-free lists directly, I'm not sure.
(One example of a specific undesirable effect that springs to mind is the possibility of deadlock if a callback invokes ::run.)
Just to clarify, the task demuxer's lock is not held when invoking callbacks. I'm not sure what it actually means for a program to make a nested call to demuxer.run(), but it should not deadlock. The reactor lock is held when making I/O calls, but these are non-blocking. It is also on my to-do list to investigate separating the reactor wait from the subsequent I/O calls so that it can scale better across multiple CPUs (i.e. the I/O calls on different sockets can be made concurrently). Cheers, Chris

Christopher Kohlhoff wrote:
The reactor lock is held when making I/O calls, but these are non-blocking.
Non-blocking in some sense - but you can't rely on it not being a preemption point/taking another (kernel) lock etc. I do agree it isn't as bad as making a blocking I/O call with a lock held of course.
It is also on my to-do list to investigate separating the reactor wait from the subsequent I/O calls so that it can scale better across multiple CPUs (i.e. the I/O calls on different sockets can be made concurrently).
warning - another rant on need to expose policy follows - stop reading if you are sick of it....

The reactor-based async I/O emulation basically does user/kernel buffer copies in the I/O call. It would seem brave to assume an advantage on all systems from trying to utilize multiple CPUs for that, considering the potential cost of dispatching to another thread + rescheduling when the reactor waits again vs just doing the copy. Clearly this shouldn't be used on a uniprocessor at least. I guess if the dispatching resulted in the I/O and subsequent handler running in the same thread without another crossing of thread boundaries this would be close to performance neutral at worst wrt the current system, but I can't really see how to do that effectively (allowing for locking dispatch etc). Doesn't the current dispatcher use a leader/followers approach so that the I/O will be done by the thread that just woke up from the select/epoll/whatever? Thinking about that - why would that thread need to hold a lock while doing the I/O anyway?

Basically this is another feature that should encourage user selectable policy in the public interface of the library. A hugely complex set of policies/options to get something reasonably scalable to work isn't going to be widely appreciated, but I can't see why some sensible defaults can't be assembled/packaged/selected easily.

On the threading issue in general, I do see considerable merit in an async io system that includes the mechanisms needed for message passing/deferred execution integrated with the io handling so as to allow "background" processing (vs reactive handlers). As you mentioned elsewhere, the locking dispatcher has the potential to avoid explicit locking in user code. The combination of these facilities surely supports the view that the threading policy applicable to a particular use of asio is not necessarily the same threading policy applicable to other components used in the same app.
This clearly isn't an issue for asio alone, but does suggest that even when threading is used, asio should allow more fine-tuning than using a boost-wide or application-wide build option. Regards Darryl Green.
Cheers, Chris

Hi Jody, --- Jody Hagins <jody-boost-011304@atdesk.com> wrote:
That's a start, I guess (though I'm not sure what macros to check).
If I do as Peter suggested and check for the absence of BOOST_HAS_THREADS you probably don't need to do anything. <snip>
However, the control flow for the demuxer will be executed in a single thread, and should not be doing anything with synchronization primitives.
Actually the demuxer is partly intended to provide you with a thread-safe way of passing work across threads, i.e. demuxer.post() can be called from any thread to invoke a function in the single thread that is calling demuxer.run(). <snip>
Part of what bothers me is that boost is a modern C++ library, and as such has many tools to allow different practices, yet we still use IFDEFs and blindly insert synchronization primitives in places where they are not needed.
I'm going to talk about this more in another reply, but I believe that the current interfaces can be implemented to be thread-safe without using mutexes. Cheers, Chris

On Fri, 30 Dec 2005 08:29:33 +1100 (EST) Christopher Kohlhoff <chris@kohlhoff.com> wrote:
If I do as Peter suggested and check for the absence of BOOST_HAS_THREADS you probably don't need to do anything.
You'd have to do that in the demuxer and any other place as well. Maybe you should define a null mutex type for use in those cases. That way, any code that uses synchronization primitives will automatically do nothing.
However, the control flow for the demuxer will be executed in a single thread, and should not be doing anything with synchronization primitives.
Actually the demuxer is partly intended to provide you with a thread-safe way of passing work across threads, i.e. demuxer.post() can be called from any thread to invoke a function in the single thread that is calling demuxer.run().
Look at the implementation of epoll_reactor though. There are a few issues with it, especially when you have multiple threads in the run() function.
Part of what bothers me is that boost is a modern C++ library, and as such has many tools to allow different practices, yet we still use IFDEFs and blindly insert synchronization primitives in places where they are not needed.
I'm going to talk about this more in another reply, but I believe that the current interfaces can be implemented to be thread-safe without using mutexes.
OK, thanks! I look forward to it. BTW, I am trying to reimplement (using asio) a major component in wide use everyday in a bunch of networking applications. I'm running out of time, and I'm actually home sick right now, so I may not finish. Anyway, I thought it would give me some good ground on which to base a review. I will possibly have issues that may not pertain to the review itself. Can I send them to your private email address, or should I keep them here? Again, thanks for the hard work on asio.

Hi Jody, --- Jody Hagins <jody-boost-011304@atdesk.com> wrote:
You'd have to do that in the demuxer and any other place as well. Maybe you should define a null mutex type for use in those cases. That way, any code that uses synchronization primitives will automatically do nothing.
Yep, already done exactly that! <snip>
I will possibly have issues that may not pertain to the review itself. Can I send them to your private email address, or should I keep them here?
To add to the available options: you can also use the asio mailing list hosted on sourceforge. I'm happy to leave the choice up to you :) Cheers, Chris

Looking at asio::demuxer... I see that run() can be called from multiple threads. To see how that is done (there are many ways), I took a quick peek at the source. After the last email, I was not surprised to see an explicit mutex. Pertaining to my last post, this should probably be a policy so that a null mutex can be used (or some other mechanism for saying that this demuxer is single threaded).

I took a brief look at the epoll implementation. I don't see EPOLLET anywhere, so you are doing level-triggered. Curious why not edge-triggered? I imagine the reason is to prevent the implementation from having to either keep track of the FD still being ready (until EAGAIN) or keeping its own internal buffer to hold the "overflow", but I'm still curious. It seems a framework like this is good for ET since it can take care of the added complexities.

epoll_ctl() will return EEXIST if the fd is already registered, though it is possible to have it succeed if called from separate threads. It shouldn't matter, though. However, of some concern is the fact that you call epoll_wait() from multiple threads using the same epoll fd. Assume N different threads are blocked on epoll_wait(). Data arrives. All N threads will be notified of the data available. All N threads will then try to read from the sockets. Bad performance issues. Worse, if a socket is not non-blocking, all the threads hang reading the same socket.

I'm also curious as to the manner in which the FDs are inserted/removed from the epoll mechanism. This may go to the overall design. I see the benefit in saying "read X bytes and let me know when they arrive." However, I would think a more common case, for long TCP/IP sessions, would be "keep reading until I tell you to stop, and let me know when each chunk of data arrives." I do not see an interface for the latter, though I'm sure it's there... Also, it seems a waste to call epoll_ctl() after each I/O operation, regardless of whether it will change the events.
Before I severely overstep, can you point me to a good example of async_read operations? I see this as the primary use case, yet it does not appear in any of the tutorials. async_read() does not provide the buffer to the handler. It seems the only way to use it is with a function object that contains the buffer. Is that a correct understanding, or do I just need to go back to sleep? It would be nice if the reference web pages had a link to the source. I'd like to confirm what I see in the docs against what the source says. Maybe even another set of pages annotated with the implementation source code.

Hi Jody, --- Jody Hagins <jody-boost-011304@atdesk.com> wrote: <snip>
I took a brief look at the epoll implementation. I don't see EPOLLET anywhere, so you are doing Level Triggered. Curious why not Edge Triggered? I imagine the reason is to prevent the implementation from having to either keep track of the FD still being ready (until EAGAIN) or keeping its own internal buffer to hold the "overflow" but I'm still curious. It seems a framework like this is good for ET since it can take care of the added complexities.
When I first added epoll support it was to meet the requirements of handling tens of thousands of connections. I used the level-triggered interface because it mapped easily from the existing select_reactor implementation. However, I have already been thinking about converting to use edge-triggered epoll to reduce the number of epoll_ctl calls. Some changes I have made since the review version are steps in this direction. <snip>
However, of some concern is the fact that you call epoll_wait() from multiple threads using the same epoll fd.
Actually epoll_wait is only called from one thread at a time. This coordination is managed by the task_demuxer_service. <snip>
Also, it seems a waste to call epoll_ctl() after each I/O operation, regardless of whether it will change the events.
Yep, using edge-triggered epoll should take care of this.
Before I severely overstep, can you point me to a good example of async_read operations? I see this as a the primary use case, yet it does not appear in any of the tutorials.
Yeah, good point! I should add tutorials that demonstrate more complex interactions. Maybe have a look at the chat or serialization examples, since the message format used in these programs is a fixed length header followed by a body.
async_read() does not provide the buffer to the handler. It seems the only way to use it is with a function object that contains the buffer. Is that a correct understanding, or do I just need to go back to sleep?
Binding it into the function object is one way. But often this binding is indirect, in the sense that you bind a this pointer and the buffer is a data member of the class. Cheers, Chris

When I first added epoll support it was to meet the requirements of handling tens of thousands of connections. I used the level-triggered interface because it mapped easily from the existing select_reactor implementation.
However, I have already been thinking about converting to use edge-triggered epoll to reduce the number of epoll_ctl calls. Some changes I have made since the review version are steps in this direction.
Ok good. I'm glad somebody pointed this out, because it was one of the three concerns I was going to mention in my review. Even if I can't finish my review in the next day, I feel that they've all been addressed elsewhere. Looking at Jody's numbers on select() vs epoll(), I realize this is probably the reason why epoll looked bad when there were many "always ready" FDs. There is no reason to wait on an FD until EWOULDBLOCK is returned from read() or write(). Just associate the readiness state with the socket.

On Fri, 30 Dec 2005 08:54:01 +1100 (EST) Christopher Kohlhoff <chris@kohlhoff.com> wrote:
However, of some concern is the fact that you call epoll_wait() from multiple threads using the same epoll fd.
Actually epoll_wait is only called from one thread at a time. This coordination is managed by the task_demuxer_service.
So, that class calls the impl run() method, and it makes sure it is only entered once? What if a handler calls demuxer.run()?
Maybe have a look at the chat or serialization examples, since the message format used in these programs is a fixed length header followed by a body.
OK. Thanks. You have an interface to read N bytes. What about an interface to read until a message terminator is found?
async_read() does not provide the buffer to the handler. It seems the only way to use it is with a function object that contains the buffer. Is that a correct understanding, or do I just need to go back to sleep?
Binding it into the function object is one way. But often this binding is indirect, in the sense that you bind a this pointer and the buffer is a data member of the class.
But what if I want to use a free function as my handler? I can't, because the data is not provided. Also, I have now coupled the start of the read with its handling. What about my earlier question of "keep reading and notifying me until EOF"? That's not how I worded it, but let's say I have an application and I want the same handler called for all I/O until the socket is closed. Right now, I *think* I have to keep creating async objects and calls each time I process some data. This seems a bit wasteful. I'd like to "reuse" the same handler without binding again, or I'd like to just tell the framework: keep calling this handler for every piece of data.
participants (6)
- christopher baus
- Christopher Kohlhoff
- Darryl Green
- Jeff Garland
- Jody Hagins
- Peter Dimov