Re: [boost] [review] Asio formal review.

28 Dec 2005

--- Caleb Epstein <caleb.epstein@gmail.com> wrote:
...
On Linux, the differences between async and sync results are
much more striking.

I have spent some time investigating this today, in the course
of which I implemented optimisations to eliminate memory
allocations and reduce the number of system calls in the
asynchronous case. Even with the optimisations, on Linux the
test still showed approximately the same results as those
reported by Caleb.

However, changing the test to have a network between the sender
and receiver shows a marked improvement in async's relative and
absolute performance. In this case, the performance of sync and
async is virtually the same.

My conclusion is that the single-host test exhibits pathological
behaviour on Linux (and possibly other OSes). The problem arises
because UDP is an unreliable protocol.

Let's consider the behaviour of the async test. We have:

- One thread performing synchronous sends in a tight loop.

- One thread performing asynchronous receives via the demuxer.

Typically a UDP send will not block, so the synchronous loop
performs sends until its timeslice finishes. This will rapidly
fill the buffer on the receiving socket, and once that buffer is
full the additional datagrams are discarded.

The receiver will continue to receive whatever datagrams are
available without giving up its timeslice, but once those are
gone it will block on select/epoll/etc. The net result is that
it takes the receiver more timeslices, and therefore more time,
to receive its quota of packets.
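
As a rough illustration of the argument above (a toy model, not the
code used in the test), the effect of a bounded receive buffer can be
counted directly. The names simulate, quota, burst and capacity are
hypothetical, and the numbers used with them are made up rather than
measured. Each round stands for one sender timeslice followed by the
receiver draining whatever fitted into the buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Totals for one simulated run.
struct RunResult {
    std::size_t sent;     // datagrams the sender had to emit
    std::size_t dropped;  // datagrams discarded because the buffer was full
};

// Toy model: in every round the sender emits `burst` datagrams during
// its timeslice. The socket buffer accepts at most `capacity` of them
// and the remainder is discarded. The receiver then drains the buffer
// and blocks until the next round. Runs until `quota` datagrams have
// been received. All parameters are hypothetical.
RunResult simulate(std::size_t quota, std::size_t burst, std::size_t capacity)
{
    assert(burst > 0 && capacity > 0);
    RunResult result{0, 0};
    std::size_t received = 0;
    while (received < quota) {
        const std::size_t accepted = std::min(burst, capacity);
        result.sent += burst;
        result.dropped += burst - accepted;
        received += accepted;
    }
    return result;
}
```

With a quota of 100000, a burst of 1000 and a capacity of 100, this
model has the sender emit 1000000 datagrams, of which 900000 are
dropped. Limiting the burst to the capacity (100) delivers the same
quota with 100000 sends and no drops, which is the situation that flow
control is meant to approximate.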

The synchronous test, on the other hand, appears to be getting
flow control for free from Linux. That is, a thread blocking on
a synchronous receive seems to be woken up as soon as data is
available, so the socket's buffer never fills.

This is borne out by introducing simple flow control to the
async test. I added a short sleep to the synchronous send loop
like so:

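    // Crude flow control: once every 128 iterations of the send loop,
    // pause briefly so the receiver can drain its socket buffer.
    if (m % 128 == 0)
    {
      timeval tv;
      tv.tv_sec = 0;
      tv.tv_usec = 1000;        // 1000 microseconds = 1 millisecond
      select(0, 0, 0, 0, &tv);  // no descriptors: just sleeps for tv
    }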

and the performance of the async test was boosted to
approximately 2/3 of that of the sync test.
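
For experimenting with other batch sizes and pause lengths, the same
idea can be wrapped in a small helper. This is only a sketch: the name
send_throttled and its parameters are hypothetical, the actual send is
left to a caller-supplied function, and pausing before the send rather
than after it is an assumption. It uses std::this_thread::sleep_for in
place of the select() call.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>

// Calls send_one(m) for m = 0 .. count-1. Whenever m is a multiple of
// `batch` (including m == 0, as in the `m % 128 == 0` test), the loop
// first sleeps for `pause`. Returns the number of pauses taken.
std::size_t send_throttled(std::size_t count, std::size_t batch,
                           std::chrono::microseconds pause,
                           const std::function<void(std::size_t)>& send_one)
{
    assert(batch > 0);
    std::size_t pauses = 0;
    for (std::size_t m = 0; m < count; ++m) {
        if (m % batch == 0) {
            std::this_thread::sleep_for(pause);
            ++pauses;
        }
        send_one(m);
    }
    return pauses;
}
```

Calling send_throttled(1000, 128, std::chrono::microseconds(1000), f)
invokes f 1000 times and pauses 8 times, at m = 0, 128, ..., 896.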

A more realistic test involves putting the sender and receiver
on different hosts. I did this with the following setup:

- Dedicated 100Mbps ethernet connection
- Sender: Windows XP SP2, 1.7GHz Pentium M, 512MB RAM
- Receiver: Linux 2.6.8 kernel, 900MHz Pentium 3, 256MB RAM

Running the test with packets of 256, 512 and 1024 bytes showed
identical performance for the async and sync cases.

I'm not saying that async operations will always perform as well
as the equivalent sync operations. A one-socket test like this
naturally favours synchronous operations, because an
asynchronous implementation involves additional demultiplexing
costs. However, in a use case involving multiple sockets, these
costs are amortised.
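
To make the amortisation point concrete, here is a minimal sketch in
which a single demultiplexing call reports readiness for several
sockets at once. It is not asio code: it calls poll() directly, and it
uses Unix-domain datagram socket pairs as a stand-in for UDP sockets so
that it runs without any network setup. The function name
ready_after_one_poll is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>
#include <vector>

// Creates `n` datagram socket pairs, writes one datagram into each
// pair, then issues ONE poll() call covering all the reading ends.
// Returns the number of sockets poll() reports as ready, or -1 if the
// setup failed.
int ready_after_one_poll(std::size_t n)
{
    std::vector<int> open_fds;
    std::vector<pollfd> fds;
    bool ok = true;
    for (std::size_t i = 0; i < n && ok; ++i) {
        int sv[2];
        if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0) {
            ok = false;
            break;
        }
        open_fds.push_back(sv[0]);
        open_fds.push_back(sv[1]);
        const char byte = 'x';
        if (send(sv[0], &byte, 1, 0) != 1)
            ok = false;
        pollfd p;
        p.fd = sv[1];        // the reading end of this pair
        p.events = POLLIN;   // interested in "data available"
        p.revents = 0;
        fds.push_back(p);
    }
    int ready = -1;
    if (ok)
        ready = poll(fds.data(), static_cast<nfds_t>(fds.size()), 1000);
    for (int fd : open_fds)
        close(fd);
    return ready;
}
```

The cost of the one poll() call is shared by every socket it reports,
whereas in a one-socket test each receive pays for a demultiplexing
call of its own.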

Cheers,
Chris

--- Caleb Epstein <caleb.epstein@gmail.com> wrote:
...
On 12/20/05, Rene Rivera <grafik.list@redshift-software.com> wrote:
...
I ran the same 100,000*1K*6*2*3 tests with both debug and release
compiled code. As can be seen from the attached output in the best case,
...
of release code, there is a 5.6% "overhead" from the async to sync
cases. For the debug code the difference is a more dramatic 25.2%.

On Linux, the differences between async and sync results are much more
striking.  Here are the results from Rene's program compiled with gcc
4.0.2 -O2 on Linux 2.6 (epoll).  I had to make a number of small changes
to get it to compile, and the SYNC test hangs at the end.

--- ASYNC...
### TIME: total = 4.62879; iterations = 100000; iteration = 4.62879e-05; iterations/second = 21603.9
### TIME: total = 5.37136; iterations = 100000; iteration = 5.37136e-05; iterations/second = 18617.3
### TIME: total = 5.03588; iterations = 100000; iteration = 5.03588e-05; iterations/second = 19857.5
### TIME: total = 5.09588; iterations = 100000; iteration = 5.09588e-05; iterations/second = 19623.7
### TIME: total = 4.60645; iterations = 100000; iteration = 4.60645e-05; iterations/second = 21708.7
### TIME: total = 4.55167; iterations = 100000; iteration = 4.55167e-05; iterations/second = 21970
--  ...ASYNC: average iterations/second = 19951.8
--- SYNC...
### TIME: total = 1.38579; iterations = 100000; iteration = 1.38579e-05; iterations/second = 72161.2
### TIME: total = 1.3561; iterations = 100000; iteration = 1.3561e-05; iterations/second = 73741
### TIME: total = 1.34804; iterations = 100000; iteration = 1.34804e-05; iterations/second = 74181.9
### TIME: total = 1.35522; iterations = 100000; iteration = 1.35522e-05; iterations/second = 73788.5
### TIME: total = 1.36956; iterations = 100000; iteration = 1.36956e-05; iterations/second = 73016.4
### TIME: total = 22.2436; iterations = 100000; iteration = 0.000222436; iterations/second = 4495.68
--  ...SYNC: average iterations/second = 73682

I had to interrupt the program by attaching a debugger to get it to run
to completion (explaining the low result for the last SYNC loop).  One
thread seems to get stuck in a "recv" call (sync_server::run) that does
not get interrupted by main's call to s0.stop().  Not sure if this is a
bug in the test program or in asio.

--
Caleb Epstein
caleb dot epstein at gmail dot com


Christopher Kohlhoff