On Mon, Jun 1, 2009 at 1:33 AM, Roman Shmelev wrote:
Async file reading can be done on Windows. What I can offer is a link to a Russian blog post: http://evgeny-lazin.blogspot.com/2008/12/boostasio.html
Everything that you described could be achieved with asio. You can configure the number of worker threads by calling io_service::run() from each of them. I guess at this point you _will_ need to synchronize access to the "controller" through a mutex or an asio::strand, and it will not significantly lower performance.
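
Just so I'm sure I follow the strand suggestion: with a single io_service whose run() is called from several threads, I picture something roughly like the sketch below (all the names are placeholders of mine, the "read" is faked with a post(), and I haven't tried to compile this exact snippet):

    #include <boost/asio.hpp>
    #include <boost/bind.hpp>
    #include <boost/thread.hpp>
    #include <cstddef>
    #include <iostream>

    boost::asio::io_service io_service;                        // one io_service, run() by several threads
    boost::asio::io_service::strand controller_strand(io_service);

    std::size_t bytes_transferred = 0;                         // "controller" state, only touched via the strand

    // Every handler that touches the controller state goes through the strand,
    // so these never run concurrently even though run() is called from many threads.
    void on_read_finished(std::size_t bytes_read)
    {
        bytes_transferred += bytes_read;                       // safe: serialized by controller_strand
        std::cout << "total so far: " << bytes_transferred << "\n";
        if (bytes_transferred >= 8192)                         // demo: stop after two fake reads
            io_service.stop();
    }

    void simulate_read_completion()
    {
        // A real async read would take controller_strand.wrap(handler) as its
        // completion handler; here we just post through the strand directly.
        controller_strand.post(boost::bind(&on_read_finished, 4096));
    }

    int main()
    {
        boost::asio::io_service::work work(io_service);        // keep run() from returning while idle

        boost::thread_group pool;
        for (int i = 0; i < 4; ++i)
            pool.create_thread(boost::bind(&boost::asio::io_service::run, &io_service));

        io_service.post(&simulate_read_completion);
        io_service.post(&simulate_read_completion);

        pool.join_all();                                       // threads exit once stop() is called
        return 0;
    }

For the copy loop I have in mind, though, I'd rather keep all of the bookkeeping on one dedicated controller thread, which is what the rest of this mail describes.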
So let me see if I understand the basic model correctly. In Windows I only have to "think" about one thread: the main controller thread that runs the event loop. The event loop looks something like this (pseudocode):

    int bytes_transferred = 0;
    uint64 next_read_offset = 0;
    uint64 next_write_offset = 0;
    int outstanding_reads = 0;
    int outstanding_writes = 0;

    /* Fill up the threads with read work */
    for (int i = 0; i < max_worker_threads; ++i) {
        async_request(READ_REQUEST, next_read_offset, input_handle, 4096);
        next_read_offset += 4096;
        ++outstanding_reads;
    }

    while (bytes_transferred < bytes_total) {
        request what_happened;
        block_until_something_happens(&what_happened);

        request next;
        switch (what_happened.code) {
        case READ_REQUEST:
            --outstanding_reads;
            ++outstanding_writes;
            next.type   = WRITE_REQUEST;
            next.offset = next_write_offset;
            next.handle = output_handle;
            next.bytes  = what_happened.bytes; /* write the same number of bytes that were just read */
            next_write_offset += next.bytes;
            async_request(&next);
            break;

        case WRITE_REQUEST:
            --outstanding_writes;
            bytes_transferred += what_happened.bytes;
            if (bytes_transferred < bytes_total) {
                ++outstanding_reads;
                next.type   = READ_REQUEST;
                next.offset = next_read_offset;
                next.handle = input_handle;
                next.bytes  = 4096; /* always read in 4k chunks */
                next_read_offset += 4096;
                async_request(&next);
            }
            break;
        }
    }

This way nothing ever has to be synchronized, because there is only one thread initiating all the reads and writes; what happens on the other threads is completely controlled by the operating system, and even those threads are created by the operating system.

So to achieve something similar with Boost.Asio, my understanding is that:

a) I can use a standard io_service (I don't need to provide a custom implementation) for the controller thread, plus one additional io_service instance shared by all the worker threads combined. So a total of two io_service instances, no matter how many threads I want.

b) I create the worker threads manually (so if I want to be able to have 5 outstanding I/O operations spread among reads and writes, I create 5 boost::thread objects), and at the very beginning each of them calls run() on the single io_service instance used for the work.

c) In the read_handler() and write_handler() methods that get called on the individual worker threads, I simply call controller_io_service.post(ReadFinished) or controller_io_service.post(WriteFinished).

Would something like this work? (I could also just try this and _see_ if it works, but I've already been trying it, and for some reason the pieces aren't all fitting together in my head.) At what point in this process would I call run() on the master controller io_service?
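
In case it helps pin down what I mean by (a)-(c), here is the rough skeleton I have in mind. The names (controller_io_service, worker_io_service, read_finished, ...) are mine, the real async file read is replaced by a stand-in, and I haven't verified that this exact snippet compiles:

    #include <boost/asio.hpp>
    #include <boost/bind.hpp>
    #include <boost/thread.hpp>
    #include <cstddef>
    #include <iostream>

    boost::asio::io_service controller_io_service;  // run only by the main thread
    boost::asio::io_service worker_io_service;      // run by every worker thread

    // Runs on the controller thread, because it arrives via controller_io_service.post().
    // This is where the event-loop logic (issue the next read/write, update counters) would live.
    void read_finished(std::size_t bytes_read)
    {
        std::cout << "controller saw a finished read of " << bytes_read << " bytes\n";
        controller_io_service.stop();               // demo only: one event, then shut down
    }

    // Runs on whichever worker thread the operation completed on; it does nothing
    // except hand the result back to the controller.
    void read_handler(const boost::system::error_code& ec, std::size_t bytes_read)
    {
        if (!ec)
            controller_io_service.post(boost::bind(&read_finished, bytes_read));
    }

    // Stand-in for a real async read, since portable async file I/O is the open question.
    void fake_async_read()
    {
        read_handler(boost::system::error_code(), 4096);
    }

    int main()
    {
        // (a) two io_services total; the work objects keep run() from returning while idle.
        boost::asio::io_service::work controller_work(controller_io_service);
        boost::asio::io_service::work worker_work(worker_io_service);

        // (b) create the worker threads by hand; each just calls run() on the worker io_service.
        boost::thread_group workers;
        for (int i = 0; i < 5; ++i)
            workers.create_thread(
                boost::bind(&boost::asio::io_service::run, &worker_io_service));

        // Kick off the initial "reads" on the worker pool.
        worker_io_service.post(&fake_async_read);

        // (c) and the run() question: the controller's run() is the last thing the main
        // thread calls; it blocks here and dispatches whatever the worker handlers post() back.
        controller_io_service.run();

        worker_io_service.stop();
        workers.join_all();
        return 0;
    }

Writing it out this way, my guess is that controller_io_service.run() is simply the last thing the main thread calls, and that an io_service::work object (or the initially posted reads) is what keeps it from returning too early; but that is exactly the part I would like confirmed.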