
--- Caleb Epstein <caleb.epstein@gmail.com> wrote:
On Linux, the differences between async and sync results are much more striking.
I have spent some time investigating this today, in the course of which I implemented optimisations to eliminate memory allocations and reduce the number of system calls in the asynchronous case. Even with the optimisations, on Linux the test still showed approximately the same results as those reported by Caleb.

However, changing the test to have a network between the sender and receiver shows a marked improvement in async's relative and absolute performance. In this case, the performance of sync and async is virtually the same.

My conclusion is that the single-host test exhibits pathological behaviour on Linux (and possibly other OSes). The problem arises due to UDP being an unreliable protocol.

Let's consider the behaviour of the async test. We have:

- One thread performing synchronous sends in a tight loop.
- One thread performing asynchronous receives via the demuxer.

Typically a UDP send will not block, so the synchronous loop performs sends until its timeslice finishes. This will rapidly fill the buffer on the receiving socket, and once that buffer is full the additional datagrams are discarded.

The receiver will continue to receive whatever datagrams are available without giving up its timeslice, but once those are gone it will block on select/epoll/etc. The net result is that it takes the receiver more timeslices, and therefore more time, to receive its quota of packets.

The synchronous test, on the other hand, appears to be getting flow control for free from Linux. That is, a thread blocking on a synchronous receive seems to be woken up as soon as data is available, so the socket's buffer never fills.

This is borne out by introducing simple flow control to the async test. I added a short sleep to the synchronous send loop like so:

  if (m % 128 == 0)
  {
    timeval tv;
    tv.tv_sec = 0;
    tv.tv_usec = 1000;
    select(0, 0, 0, 0, &tv);
  }

and the performance of the async test was boosted to approximately 2/3 of the sync test.
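
The buffer-filling effect described above can be reproduced in isolation. The following is only a sketch using plain POSIX sockets rather than asio; the names burst_result and send_burst_then_drain are made up for illustration and are not part of the test program. It sends a burst of datagrams to a loopback socket that nobody is reading, then counts how many are still queued when the receiver finally gets to run. The actual numbers will depend on the OS and its default receive buffer size.

```cpp
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical result type for this illustration.
struct burst_result
{
    std::size_t sent;      // datagrams accepted by sendto()
    std::size_t received;  // datagrams still queued when the receiver ran
};

// Send `count` datagrams of `size` bytes to an unread loopback UDP socket,
// then drain that socket without blocking. Returns {0, 0} if setup fails.
inline burst_result send_burst_then_drain(std::size_t count, std::size_t size)
{
    burst_result r = {0, 0};
    char buf[2048];
    assert(size <= sizeof(buf));
    std::memset(buf, 0, sizeof(buf));

    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);

    sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;  // let the OS pick a free port
    socklen_t len = sizeof(addr);

    bool ok = rx >= 0 && tx >= 0
        && bind(rx, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) == 0
        && getsockname(rx, reinterpret_cast<sockaddr*>(&addr), &len) == 0;

    if (ok)
    {
        // The sender's "timeslice": nothing reads from rx while this runs,
        // so datagrams beyond the receive buffer's capacity are discarded.
        for (std::size_t i = 0; i < count; ++i)
        {
            ssize_t n = sendto(tx, buf, size, 0,
                reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
            if (n == static_cast<ssize_t>(size))
                ++r.sent;
        }

        // The receiver's turn: take whatever survived, never blocking.
        while (recv(rx, buf, sizeof(buf), MSG_DONTWAIT) >= 0)
            ++r.received;
    }

    if (rx >= 0) close(rx);
    if (tx >= 0) close(tx);
    return r;
}
```

Calling send_burst_then_drain(2000, 1024) and comparing the two counters gives a rough idea of how many datagrams a tight send loop can lose before the receiver is scheduled.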

A more realistic test involves putting the sender and receiver on different hosts. I did this with the following setup:

- Dedicated 100Mbps ethernet connection
- Sender: Windows XP SP2, 1.7GHz Pentium M, 512MB RAM
- Receiver: Linux 2.6.8 kernel, 900MHz Pentium 3, 256MB RAM

Running the test with packets of 256, 512 and 1024 bytes showed identical performance for the async and sync cases.

I'm not saying that async operations will always perform as well as the equivalent sync operations. A one-socket test like this naturally favours synchronous operations, because an asynchronous implementation involves additional demultiplexing costs. However in a use case involving multiple sockets, these costs are amortised.

Cheers,
Chris

--- Caleb Epstein <caleb.epstein@gmail.com> wrote:
On 12/20/05, Rene Rivera <grafik.list@redshift-software.com> wrote:
I ran the same 100,000*1K*6*2*3 tests with both debug and release compiled code. As can be seen from the attached output in the best case, of release code, there is a 5.6% "overhead" from the async to sync cases. For the debug code the difference is a more dramatic 25.2%.
On Linux, the differences between async and sync results are much more striking. Here are the results from Rene's program compiled with gcc 4.0.2 -O2 on Linux 2.6 (epoll). I had to make a number of small changes to get it to compile, and the SYNC test hangs at the end:

--- ASYNC...
### TIME: total = 4.62879; iterations = 100000; iteration = 4.62879e-05; iterations/second = 21603.9
### TIME: total = 5.37136; iterations = 100000; iteration = 5.37136e-05; iterations/second = 18617.3
### TIME: total = 5.03588; iterations = 100000; iteration = 5.03588e-05; iterations/second = 19857.5
### TIME: total = 5.09588; iterations = 100000; iteration = 5.09588e-05; iterations/second = 19623.7
### TIME: total = 4.60645; iterations = 100000; iteration = 4.60645e-05; iterations/second = 21708.7
### TIME: total = 4.55167; iterations = 100000; iteration = 4.55167e-05; iterations/second = 21970
-- ...ASYNC: average iterations/second = 19951.8
--- SYNC...
### TIME: total = 1.38579; iterations = 100000; iteration = 1.38579e-05; iterations/second = 72161.2
### TIME: total = 1.3561; iterations = 100000; iteration = 1.3561e-05; iterations/second = 73741
### TIME: total = 1.34804; iterations = 100000; iteration = 1.34804e-05; iterations/second = 74181.9
### TIME: total = 1.35522; iterations = 100000; iteration = 1.35522e-05; iterations/second = 73788.5
### TIME: total = 1.36956; iterations = 100000; iteration = 1.36956e-05; iterations/second = 73016.4
### TIME: total = 22.2436; iterations = 100000; iteration = 0.000222436; iterations/second = 4495.68
-- ...SYNC: average iterations/second = 73682

I had to interrupt the program by attaching a debugger to get it to run to completion (explaining the low result for the last SYNC loop). One thread seems to get stuck in a "recv" call (sync_server::run) that does not get interrupted by main's call to s0.stop(). Not sure if this is a bug in the test program or in asio.
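
A thread blocked in recv() generally stays blocked until a datagram arrives, an error occurs or a signal interrupts the call, so setting a flag from another thread is not by itself enough to wake it. One common workaround, shown here as a sketch with plain POSIX sockets, is to give the socket a receive timeout so that the receive loop can re-check a stop flag at regular intervals. This is an assumption about how such a hang could be avoided, not a description of what the test program or asio actually does; set_recv_timeout and recv_or_timeout are hypothetical names.

```cpp
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <cstddef>

// Give blocking receives on `fd` a timeout of `millis` milliseconds.
// Returns true if the socket option was set.
inline bool set_recv_timeout(int fd, long millis)
{
    assert(millis > 0);  // a zero timeval would mean "no timeout"
    timeval tv;
    tv.tv_sec = millis / 1000;
    tv.tv_usec = (millis % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == 0;
}

// Returns the number of bytes received, -1 if the timeout expired with no
// data, or -2 for any other error (a simplification for this sketch).
inline long recv_or_timeout(int fd, char* buf, std::size_t len)
{
    ssize_t n = recv(fd, buf, len, 0);
    if (n >= 0)
        return static_cast<long>(n);
    return (errno == EAGAIN || errno == EWOULDBLOCK) ? -1 : -2;
}
```

A receive loop built on this would call recv_or_timeout repeatedly and leave the loop when it returns -1 and the stop flag has been set.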

--
Caleb Epstein
caleb dot epstein at gmail dot com