Re: [boost] [AFIO] Callback API
Moving the discussion from boost-users to boost-dev, and thus the full context is quoted.

On 12/21/2013 01:53 AM, Niall Douglas wrote:
On 20 Dec 2013 at 19:16, Bjorn Reese wrote:
Can you suggest something? I honestly can't think of anything simpler which also provides strong write ordering guarantees.
I have not given this much thought so consider the following a brainstorm.
We really ought to move this off boost-users ... but we'll see how it goes.
I am thinking about an API that uses handles that looks more like Asio sockets. Write ordering can be handled, not by batching operations together, but rather calling the next write operation from the callback of the previous operation (Asio-style.) This will not always yield good performance, but oftentimes that is less relevant. If you need performance, then the "advanced" dispatcher API is available.
So there could be a file handle (and directory handle) class for file (directory) manipulation calls, which hides all the details of the dispatcher etc.
class file_handle {
public:
    void read(buffer, read_callback);
    void write(buffer, write_callback);
    // and so on
};

class directory_handle {
public:
    void create(name, create_callback); // uses file(single) or dir(single)
    void remove(name, remove_callback); // uses rmdir(single)
    void watch(name, watch_callback);   // directory monitoring
    // and so on
};
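The chaining idea described above (issue the next write from the completion callback of the previous one) might look roughly like the following. This is only a sketch under loud assumptions: `file_handle` here is a hypothetical stand-in that appends to an in-memory string and invokes its callback inline, and `write_in_order` and the callback signature are invented for illustration. None of it is AFIO's or Asio's actual API.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <system_error>

// Hypothetical stand-in for the proposed file_handle. It appends to an
// in-memory string instead of calling the kernel, and it invokes the
// callback inline; a real implementation would complete asynchronously.
class file_handle {
public:
    using write_callback = std::function<void(std::error_code, std::size_t)>;

    void write(const std::string& buffer, write_callback done) {
        contents_ += buffer;
        done(std::error_code{}, buffer.size());
    }

    const std::string& contents() const { return contents_; }

private:
    std::string contents_;
};

// Write ordering by chaining: the second write is only issued from the
// completion callback of the first, so it cannot overtake it.
// Capturing by reference is safe here only because this mock completes
// inline, before write_in_order returns.
inline std::size_t write_in_order(file_handle& file,
                                  const std::string& first,
                                  const std::string& second) {
    std::size_t total = 0;
    file.write(first, [&](std::error_code ec, std::size_t n) {
        if (ec) return;  // stop the chain on error
        total += n;
        file.write(second, [&](std::error_code ec2, std::size_t m) {
            if (!ec2) total += m;
        });
    });
    return total;
}
```

With a genuinely asynchronous handle the buffers and any shared state would have to outlive the call, typically by owning them in the handler, which is part of the verbosity being debated in this thread.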
I'm struggling to see the merit in such an approach: it would be incredibly verbose and complex to write even simple solutions, because i/o on files is not like sockets. In particular, you almost never need to strongly order i/o across a sequence of multiple sockets, whereas that is a very common case with files, e.g. during ACID.
First, it follows a pattern that is familiar to Asio users. So rather than you being the only one who can write applications on top of Afio, you get a broader community who are already familiar with the callback approach. Second, you can more easily combine it with Asio socket code to create compound services that mix socket and file access. For instance, a peer-to-peer file distribution mechanism.
AFIO tries to help users to not write callbacks except when necessary, but if you do want a user defined callback you simply chain a call() or completion() operation to the item whose completion you want called back upon.
The idea is, you see, that you subclass the async_io_dispatcher class with additional completion handlers, and use those as building blocks for further subclasses of async_io_dispatcher. That hopefully gets people to break up their callbacks into reusable completion handlers, and saves people writing and debugging code.
Before you say "this should be in the documentation", yes it should and will be after Paul gets his directory monitoring implementation working. I'm thinking that will form the fourth section in the beginner's tutorial - how to modularise and make reusable the normally bespoke glue code which makes up traditional callbacks. It might actually be worth adding a section 3b which shows how the naïve approach in 3a is a very stupid idea :)
As much as this sort of operation dependency graph based design works well for its niche, I agree there is a swathe of code which ends up looking like the find in files implementation, and hopefully fibers will make that look much more sane.
I am not arguing against the current API. Instead, I am arguing for the addition of a more familiar Asio-style callback API. I have no trouble with a fiber-enhanced API (similar to Asio with coroutines.)
On 28 Dec 2013 at 12:12, Bjorn Reese wrote:
I'm struggling to see the merit in such an approach: it would be incredibly verbose and complex to write even simple solutions, because i/o on files is not like sockets. In particular, you almost never need to strongly order i/o across a sequence of multiple sockets, whereas that is a very common case with files, e.g. during ACID.
First, it follows a pattern that is familiar to Asio users. So rather than you being the only one who can write applications on top of Afio,
:) The current API design really is very easy to program against once you get used to thinking about i/o in that way. I find it very "natural" to write, though, as with most C++, writing many more lines of source usually equals better quality code. I do wish C++ weren't like that, but there it is.
you get a broader community who are already familiar with the callback approach.
Understood. And I did try, during AFIO's design review, to replicate ASIO's API identically, purely for the sake of familiarity to the user base (and besides, libuv uses ASIO-style callbacks), until I realised that wasn't wise.
Second, you can more easily combine it with Asio socket code to create compound services that mix socket and file access. For instance, a peer-to-peer file distribution mechanism.
Can I clarify something so? Where you stated:
I am thinking about an API that uses handles that looks more like Asio sockets. Write ordering can be handled, not by batching operations together, but rather calling the next write operation from the callback of the previous operation (Asio-style.)
By "calling the next write operation", I'm thinking that you're thinking that a kernel API ought to be issued right there and then, yes, just like ASIO does?
As much as this sort of operation dependency graph based design works well for its niche, I agree there is a swathe of code which ends up looking like the find in files implementation, and hopefully fibers will make that look much more sane.
I am not arguing against the current API. Instead, I am arguing for the addition of a more familiar Asio-style callback API. I have no trouble with a fiber-enhanced API (similar to Asio with coroutines.)
Oh sure. But we'll test your assumptions about how file i/o works in practice first. If they're the same as mine, I think I'll come to understand exactly what you're looking for, and then I can implement it.

Thanks for the feedback Bjorn.

Niall
--
Currently unemployed and looking for work.
Work Portfolio: http://careers.stackoverflow.com/nialldouglas/
On 12/28/2013 08:56 PM, Niall Douglas wrote:
By "calling the next write operation", I'm thinking that you're thinking that a kernel API ought to be issued right there and then, yes, just like ASIO does?
Yes.
understand exactly what you're looking for, and then I can implement it.
I have started working on a prototype to see how feasible it is.
On 1 Jan 2014 at 12:31, Bjorn Reese wrote:
By "calling the next write operation", I'm thinking that you're thinking that a kernel API ought to be issued right there and then, yes, just like ASIO does?
Yes.
understand exactly what you're looking for, and then I can implement it.
I have started working on a prototype to see how feasible it is.
Be aware I didn't reject the naïve ASIO-style API design due to the API design itself. Rather, I rejected it because file i/o is always blocking [1], and issuing kernel API calls immediately resulted in pathological performance. In short, socket i/o API calls are nice and deterministic with bounded worst case execution times, whereas file i/o is nasty and unpredictable and has NP worst case execution times.

[1]: This is a very tl;dr discussion worthy of a 90 minute presentation, but in short yes this is true on all mainstream operating systems. Linux *can* turn on non-blocking for file handles, but the feature has "dragons live here" all over it. On Windows an overlapped file i/o operation can and does go synchronous on many an occasion - I've seen an overlapped WriteFile() take 1.6 seconds to return "operation in progress".

Niall
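One common way to keep a potentially blocking file call off the caller's thread, while still running operations in submission order, is to queue them for a single worker thread. The sketch below is a generic illustration of that idea only; `ordered_io_queue` and `submit` are hypothetical names, and this is not a description of how AFIO's dispatcher is implemented.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical helper, not part of AFIO: runs queued operations one at a
// time on a worker thread, in the order they were submitted.
class ordered_io_queue {
public:
    ordered_io_queue() : worker_([this] { run(); }) {}

    ~ordered_io_queue() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        ready_.notify_one();
        worker_.join();  // the worker drains any remaining operations first
    }

    // Returns immediately; the future becomes ready once op has run.
    std::future<void> submit(std::function<void()> op) {
        std::packaged_task<void()> task(std::move(op));
        std::future<void> done = task.get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.push_back(std::move(task));
        }
        ready_.notify_one();
        return done;
    }

private:
    void run() {
        for (;;) {
            std::packaged_task<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopping_ || !pending_.empty(); });
                if (pending_.empty()) return;  // stopping and nothing left to run
                task = std::move(pending_.front());
                pending_.pop_front();
            }
            task();  // a potentially blocking call runs here, off the caller's thread
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<std::packaged_task<void()>> pending_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the other members exist before it starts
};
```

A single worker serialises everything, so this trades throughput for simplicity; it shows only why deferring the kernel call decouples the caller from an unpredictable completion time.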
[1]: This is a very tl;dr discussion worthy of a 90 minute presentation, but in short yes this is true on all mainstream operating systems. Linux *can* turn on non-blocking for file handles, but the feature has "dragons live here" all over it. On Windows an overlapped file i/o operation can and does go synchronous on many an occasion - I've seen an overlapped WriteFile() take 1.6 seconds to return "operation in progress".
To add to the discussion with an example (at the risk of repeating what may have been better said): to understand how nightmarish file i/o can be, you just need to think about network file storage.

Imagine that you want to write to a NAS whose disks are currently sleeping. It may (or may not, depending on the write request and the configuration of the NAS) require the disk to spin up just to acknowledge the request, because it's not the same error to have a non-working device, a privilege error or a full disk (and these are just examples I can think of).

On top of that, you can cancel asynchronous requests. But what to do once the data is written to the disk? Roll it back? Not all file systems are transactional. As you can see the job of the OS isn't the same.

One of the great difficulties with asynchronous file i/o is that you want asynchronicity but you also want your data to somehow be correctly written to the disk. On the other hand when doing asynchronous network i/o you "just send or receive data". What you do with this data is up to you (from the OS point of view). The contract isn't the same.

My point of view is that sharing a property (in this case asynchronicity) with an existing library does not imply sharing the same philosophy.

Kind regards.
--Edouard

Generally, I have little to add to your excellent points, but I will mention what AFIO does about the concerns you raised:

On 3 Jan 2014 at 11:41, Alligand Edouard wrote:
On top of that, you can cancel asynchronous requests. But what to do once the data is written to the disk? Roll it back? Not all file systems are transactional. As you can see the job of the OS isn't the same.
AFIO currently has no ability to cancel scheduled i/o. If someone can suggest how to do this portably, I am all ears.
One of the great difficulties with asynchronous file i/o is that you want asynchronicity but you also want your data to somehow be correctly written to the disk. On the other hand when doing asynchronous network i/o you "just send or receive data". What you do with this data is up to you (from the OS point of view). The contract isn't the same.
Most filing system implementations provide strong guarantees about write reordering e.g. writes will never be reordered by more than thirty seconds. AFIO exposes that very useful property to the author.
My point of view is that sharing a property (in this case asynchronicity) with an existing library does not imply sharing the same philosophy.
There are lots of asynchronous implementations of platform facilities. WinRT is a very good example of implementing much of a platform API asynchronously. I know Microsoft are planning even more asynchronicity to come too; there was a recent news item about experimental new platform layers coming out of MSR.

Niall
participants (3)
- Alligand Edouard
- Bjorn Reese
- Niall Douglas