
Hi all,

This is my review of Boost.AFIO. I still have more comments on the library, but this has already taken too long and I'm well past the review deadline, so I'm sending what I have. I hope to post the second part in the next few days.

1. Should Boost.AFIO be accepted into Boost? Please state all conditions for acceptance explicitly.

No, not at this time. While the library is potentially quite useful, possibly to a larger niche than the author expects, I have many issues with both the documentation and the API itself which would need to be resolved before acceptance. To be clear, this is not a conditional acceptance: I believe a full review is warranted to verify that the issues have been addressed. The author is obviously quite experienced in this area (or at least has done a lot of research), so I strongly encourage him to resubmit the library in the future, as it has the potential to be a great contribution to Boost.

2. What is your evaluation of the design?

My preference, though I wouldn't require it for acceptance, would be to use the ASIO async protocol, which leaves the choice of completion notification strategy to the user. If the library wants all async functions to return futures, it should follow the standard future idioms. First of all, having futures which claim 'standard conformance' and then carry an extra hidden error payload is unacceptable. Also, functions should take handlers as parameters, not futures; the user should be responsible for explicitly chaining futures with 'then'.

The library should follow the standard semantics for handle types (i.e. single-ownership, move-only wrappers around system handles), and similarly for the future wrappers of such handles (i.e. future<handle>, not shared_future<handle>). Other reviewers have discussed these issues better than I could, so I won't expand further.

The dispatcher should probably be called filesystem, because that's what it is. The overridable per-thread global dispatcher is evil. Functions should take a dispatcher explicitly (or through a handle_ptr), or use a static global dispatcher that represents the global filesystem.

Does the library need to distinguish between files and symlinks at open time by calling differently named functions? I'll assume that's the case, but I would like to know the rationale. The file, directory and symlink functions should be verbs, called open_file, open_directory and open_symlink, and they should return differently typed handles (i.e. file_handle, dir_handle, sym_handle). As an aside, the handle types should have value semantics and be movable. To help generic code, rmdir, rmfile and rmsymlink should be unified under the same function name (rm, or better, delete), with the parameter type selecting the correct overload. Functions that return handles must do so explicitly (i.e. no implicit handle payload). A rough sketch of the kind of interface I have in mind is below.

The library should *never* cause the program to crash because of external causes (files and directories disappearing or being truncated, external locks taken, a lock forcibly removed, up to and including disk failure). It is OK to abort on a failed internal consistency check, although it would be nice if the check+abort were encapsulated in an overridable macro (BOOST_AFIO_ASSERT maybe?). It is fine, and even encouraged, for the check to be present in release builds as well. The library *must not* assume that every other program interacting with the filesystem is using AFIO.
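To make the naming and typing suggestion above concrete, here is the rough shape of the interface I have in mind. This is a sketch only: every name below is mine, none of it is AFIO's actual API, and since 'delete' is a keyword I spell the unified removal function 'remove'.

    #include <future>
    #include <string>

    // Move-only value types wrapping the corresponding OS objects.
    class file_handle;
    class dir_handle;
    class symlink_handle;

    class filesystem; // what I would call the dispatcher

    // Verb-named open functions returning distinctly typed handles.
    std::future<file_handle>    open_file(filesystem&, const std::string& path);
    std::future<dir_handle>     open_directory(filesystem&, const std::string& path);
    std::future<symlink_handle> open_symlink(filesystem&, const std::string& path);

    // A single removal name for generic code; overload resolution picks the
    // right operation for the handle type.
    std::future<void> remove(filesystem&, file_handle&&);
    std::future<void> remove(filesystem&, dir_handle&&);
    std::future<void> remove(filesystem&, symlink_handle&&);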
It is fine if the library were to give some additional guarantees (for example advisory locking becoming non-advisory), but these guarantees should be properly documented. For example, writes in append mode should be guaranteed atomic no matter what the program doing the writes is doing. In practice, this means that AFIO should document in detail the implementation of each operation for each backend, so that a knowledgeable user can make the necessary inferences. This already seems to be mostly the case: the documentation has a huge number of notes on how the various filesystems and OSs differ and what the different guarantees are. AFIO should also provide a standardised set of flags describing the various capabilities and a query interface, so that the inevitable different code paths do not need to depend on filesystem names or, worse, #ifdefs (a sketch of the kind of query I mean is further below). A good starting point for deciding which flags to expose is the various filesystem tables scattered through the docs.

3. What is your evaluation of the implementation?

I haven't had a chance to do as in-depth an evaluation of the implementation. Additionally, files with thousands of lines do not make it easy to read the code. The git repo contains both v1 and v2 subdirectories, and I'm not sure which one is actually under review (the docs say 1.5; I'm not sure which directory that corresponds to). A few comments:

- Overuse of shared_ptr is, in my experience, a code smell, as the unclear ownership semantics lead to bugs down the line.
- The amount of #ifdefs is staggering. Most of them are probably necessary, but the number of options the library can be configured with certainly doesn't help.
- The library depends on external non-Boost libraries which, it seems (I can't be sure because of the layer of versioning macros), inject names outside the boost::afio namespace; this shouldn't happen.
- With two spinlock acquisitions per move, the lightweight future/monad seems anything but lightweight. The library should really use boost::future instead of rolling its own, and the author should attempt to fold any improvements directly into Boost.Thread. Similarly for any necessary improvements to Boost.Asio (for the WoW issue) and Boost.Filesystem (for boost::path).
- The abstraction overhead seems excessive, especially for a library that is supposedly IO bound. The author should at least try to match the performance of the naive OpenMP implementation. By the way, non-volatile RAM based storage will become mainstream within the year, so the author might need to revisit the relative cost of the various operations very soon.

Random suggestions (feel free to ignore these):

- look into non-temporal stores to help with cache pollution;
- have a libnfs backend + NFS over loopback for truly asynchronous filesystem operations (performance is probably going to be atrocious);
- experiment with the "O_NONBLOCK disk read" patches for the Linux kernel.

4. What is your evaluation of the documentation?

As noted by others, the documentation does not seem to be of Boost standard quality. At times it reads more like a stream of consciousness; often irrelevant or less important information is presented before more important material. Disqus comments do not belong in the documentation. With JS disabled, some pictures are not shown, at least in the single-page docs. I'm less bothered than others about the 'try AFIO online' button; in fact it would be nice to have such a button after each example (I'm not suggesting that the author should provide it, though).
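Returning to the capability-flags suggestion in my answer to question 2, this is the kind of query I have in mind. Again a sketch only: the flag names and the query function are made up by me, not taken from AFIO.

    #include <cstdint>

    // A bitmask of backend capabilities that client code can test instead of
    // switching on filesystem names or hiding the differences behind #ifdefs.
    enum class fs_caps : std::uint64_t {
        none              = 0,
        atomic_append     = 1u << 0, // appends are atomic w.r.t. other writers
        atomic_rename     = 1u << 1, // rename over an existing entry is atomic
        sparse_files      = 1u << 2,
        hole_punching     = 1u << 3,
        mandatory_locking = 1u << 4
    };

    constexpr fs_caps operator|(fs_caps a, fs_caps b)
    {
        return static_cast<fs_caps>(static_cast<std::uint64_t>(a) |
                                    static_cast<std::uint64_t>(b));
    }

    constexpr bool has(fs_caps set, fs_caps flag)
    {
        return (static_cast<std::uint64_t>(set) &
                static_cast<std::uint64_t>(flag)) != 0;
    }

    // Hypothetical query, e.g. fs_caps capabilities(const handle&), so that
    // client code can write:
    //
    //   if (has(capabilities(h), fs_caps::atomic_append))
    //       ... take the fast path ...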
Any reference to not-implemented features should be removed or, at best, moved to a separate section at the end of the documentation. A common pattern is:

  "AFIO implements
   * A
   * B (not implemented)
   * C (not implemented)
   * D (not implemented)"

which is, honestly, a bit ridiculous. Similarly, scattering references to previous versions of the library all over the documentation is confusing; these should go to a separate history page and/or changelog.

The misuse of terms of art ('monad', 'precondition') is a bit worrying. At every occurrence of the term 'filing system' my brain trips: for a while I thought it meant something subtly different from the more common 'filesystem' or 'file system'. This could be just me, so feel free to ignore it unless others complain.

The documentation often references C++1z coroutines, which as far as I understand are far from being standardisation ready. It would be better if examples using Boost.Coroutine, Boost.Context or even asio coroutines were provided.

** Intro

The first paragraph is a nice introduction, but it should mention prominently that AFIO helps prevent file system races. Move the sub-points of bullet 8 to top level; instead, de-emphasise and condense the performance claims into a single bullet. Before claiming performance, describe what the library can do. Reduce any reference to APIBind to a single line (something like 'AFIO can be used as a standalone library thanks to APIBind', details elsewhere; also do not claim that APIBind is a Boost library). Remove any references to backends from this section, and totally remove references to unimplemented backends. Supported compilers and platforms, Jenkins and unit tests etc. do not belong here; neither do links to dispatch benchmarks. You might want to mention that it is a C++11 library written as part of the GSoC. A reference to ASIO might or might not belong here (no ASIO class or concept seems to be part of the interface).

Remove the Batch section from the simple example. Also, the rest should stand by itself, without the comments. About the example itself: I find it surprising that the results are named with verbs and the operations with nouns; a better naming convention would be nice. Also, in general please prefer

    type result = op(params);

to

    type result(op(params));

at least when op is an important operation. The initialisation syntax together with the noun-based naming convention makes the examples harder to parse.

** Cheat Sheet

The first few paragraphs are interesting, but should go after the table. The table as it is is not useful, other than as an alternate index. I would expect something like: function name, parameters, result, very short description. I have no idea what the other columns of the table are trying to convey. The related functionality at the end does not belong here.

** Design Introduction and Rationale

In this page I would expect to find an analysis of a generic AFIO async operation, why a certain interface was chosen, how futures help compose operations, and how an operation is implemented internally. For every choice, a discussion of the design trade-offs should be presented. Unfortunately, after an interesting "why did I write this library", the chapter loses itself. The version history does not belong here unless it is used to justify design choices, but other than saying that previous versions were slower, no other information is conveyed.
In particular, at this point of the documentation I would expect a thorough explanation of the concept of a filesystem race and the minimum guarantees given by the library with respect to such races. Additionally, as the library comes with its own future implementation, a detailed explanation of why boost::future is not sufficient should be given.

As an aside, the author claims that the library is completely lock free, and that it would be wait free if not for the use of spinlocks. This does not make any sense, as the use of spinlocks also makes the library *not* lock free, unless the author is using a non-standard meaning for these terms (again, see the previous comment about the misuse of terms of art).

** Write Ordering Constraint

The explanation of how sync behaves like a barrier and the description of scheduling ordering and dependencies is very useful, and it should be a great starting point for a formal treatment of race freedom. The description of TripleGit is completely unnecessary and should be removed.

** Compilation

This section is very confusing. Let's say I want the default (whatever it is) and do not define anything. I get this:

  "After defining your build macros of choice, making the latest AFIO available to your code is as simple as:

      #include "boost.afio/include/boost/afio.hpp"     // latest version
      #include "boost.afio/include/boost/v1/afio.hpp"  // specific version

  if being used standalone, or

      #include "boost/afio.hpp"         // latest version
      #include "boost/afio/v1/afio.hpp" // specific version

  if being used as part of Boost."

I have no idea what any of the above means.

** Boost AFIO Installation into Boost

Why is this information not part of the previous chapter? Any reference to non-Boost AFIO should be dropped (except possibly a link at the end to an external non-Boost distribution). I understand that the external build instructions are useful for reviewers, but they shouldn't have been part of the documentation under review.

** Supported compilers

Not much to say here, other than that, assuming AFIO is accepted, the table should be static and reflect the specific released version instead of trunk.

** Example 1

All references to the macro-based versioning should be dropped. I have no idea why the example datastore has to return a monad<shared_ptr<iostream>>. The shared pointer is gratuitous, and for the "simplest interface" errors can be communicated via exceptions. As described, I would expect an interface like this for the simplest store:

    optional<std::string> read(std::string key);
    void write(std::string key, std::string content);

or something like this for something more complex:

    some_special_container_range read(std::string key);
    template<class Range> void write(std::string& key, const Range& range);

Possibly the author has in mind something different from a key-value system.

** Example 2

Because the interface is non-ideal, it is hard to evaluate the next examples. Freedom from directory renames is nice, but I wouldn't rank it very high in the list of requirements for an object store. A dispatcher is introduced without an explanation of what it is; the cheat sheet mentioned it, but it should be described in more detail here. The VM shenanigans required to make something like utils::file_buffer_allocator<> work can have serious and hard-to-pinpoint performance and scalability consequences, and shouldn't be presented uncritically as a performance win; the trade-offs should be discussed. Also, read mappings can defeat the point of async IO, as the application threads will be subject to blocking page faults.
Neither of the two previous comments is meant to suggest that mmap is not useful, but it shouldn't be treated as a transparent optimisation, as it can have application-wide consequences well outside the IO code. As far as I understand, Boost.AFIO leaves the choice to the client code and never mmaps behind its back; still, the examples should present best practices.

The example is too long and has too many distractions. The demonstration that a streambuf can be implemented on top of AFIO is quite interesting, but should be relegated to a separate example.

** Example 3

BOOST_MONAD_PROPAGATE, in addition to being in the wrong namespace, is not documented anywhere (at least a search in the one-page doc does not lead anywhere) and seems completely unnecessary; I guess it expands to either a return or a throw. In 'The problems with this naive design', the table does not have units of measure. The information is interesting, although without any comparison of the durability, concurrency and atomicity guarantees of the various solutions it is hard to infer anything (not that I expect much from a NoSQL db...).

** Example 4

'Linux is different from other operating systems in that you must explicitly request metadata updates to be sent to physical storage.' I'm not an expert, but I thought this was true only for ext3/4 (and only with some, albeit default, mount options) and generally considered broken even by Linus himself. If that's the case, it might be important to point it out so users can choose other filesystems.

The atomic key replacement example is nice. Should the library provide pre-made recipes for these kinds of uses?

In the FS table, the first few column titles should be reworded so that all checkmarks mean that a positive feature is available and 'x'es mean a lack of it.

'This is a good example of where Intel's Hyperthreading can really penalise overall performance sometimes: filing systems tend to parallelise linearly to real CPU core count only.' Why is that the case? I would expect the exact opposite, in fact. Are you sure the performance bottleneck is not somewhere else (TLB invalidation, cache misses, badly sized hash lock pools)?

** Example 5

The extent-based and append behaviours are great. 'Any writes we make within a 4Kb boundary is probably safe': 'probably safe' is not something I want to see when implementing a DB. The example is probably way too complex for a tutorial; also the fact that it is incomplete makes it hard to evaluate.

** Hello World

Please remove the long comments from the examples. Either make them part of the surrounding explanation or footnotes.

** A less toy example

The documentation sometimes claims that AFIO is faster than dumb stream-based IO and sometimes asserts that it is slower. Please pick one and stick to it. Or better, do not make any performance claims outside a benchmark page; performance should not be the biggest claim for AFIO anyway.

The example uses stl_future, which is not defined anywhere in the documentation as far as I can see.

    when_all_p(ihs).get();

should probably be

    when_all_p(ihs_req).get();

In general, unless I've completely misunderstood the example, all the .get() calls impose synchronisation, which means that parallel_concatenate is not as asynchronous as possible. Also, it should really schedule the handle closes at the end and return a future<void>.

    for(auto &i : ihs)
        lasts.push_back(dispatcher->depends(ohresize, i));

The meaning of dispatcher->depends in this code is not obvious at all.
From mailing-list discussions I assume that depends(future a, future b) is a.then([b] { return b; }), but what the role of the dispatcher is here is opaque to me. Even more surprisingly, a few lines below, depends is called as a free function. Also, a few lines above, this line is present:

    when_all_p(ohresize).get();

which should impose an explicit synchronisation, making the depends superfluous.

This is the implementation I would like to see:

    using br = boost::range;

    auto async_size = [](auto&& name) {
        return async_info(name).then([](auto&& info) { return info.size(); });
    };
    auto open   = [](auto&& name)   { return async_open(name); };
    auto get    = [](auto&& future) { return future.get(); };
    auto always = [](auto&& x) { return [x = std::move(x)] { return x; }; };
    auto unwrap = [](auto&& fr) {
        std::vector<typename std::decay_t<decltype(fr)>::value_type> ret;
        copy_to(ret, mapped(fr, get));
        return ret;
    };

    template<class PathRange>
    std::future<std::size_t> async_concat(boost::filesystem::path destname,
                                          PathRange&& srcnames)
    {
        auto sizes = when_all(br::mapped(srcnames, async_size))
                         .then(unwrap)
                         .share();
        auto total_size = sizes.then([](auto&& range) {
            return br::accumulate(range.get(), 0);
        });
        auto srcs_range = when_all(br::mapped(srcnames, open));
        auto dest = when_all(async_open(destname), total_size)
            .then([](auto fdest, auto size) {
                auto dest = fdest.get();
                return async_truncate(dest, size).then(always(dest));
            });
        return when_all(srcs_range, dest, sizes)
            .then([](auto&& fsrcs, auto&& fdest, auto&& fsizes) {
                auto&& sizes = fsizes.get();
                std::vector<handle> srcs = unwrap(fsrcs);
                std::vector<std::future<void>> writes;
                struct ctx {
                    off_t offset = 0;
                    ssize_t len = 0;
                    std::vector<char> buffer = std::vector<char>(1024 * 4);
                };
                std::vector<ctx> ctxs;
                ctxs.reserve(srcs.size());
                std::size_t off = 0;
                auto dest = fdest.get();
                for (std::size_t i = 0; i != srcs.size(); ++i) {
                    ctxs.push_back(ctx{ (off_t)off, (ssize_t)sizes[i] });
                    off += sizes[i];
                    auto& c = ctxs.back();
                    auto read = [&c, &s = srcs[i]] {
                        if (c.len < (ssize_t)c.buffer.size())
                            c.buffer.resize(c.len);
                        c.len -= c.buffer.size();
                        return async_read(s, c.buffer);
                    };
                    auto write = [&c, &d = dest] {
                        auto off = c.offset;
                        c.offset += c.buffer.size();
                        return async_write(d, off, c.buffer);
                    };
                    auto read_write = [&c, read, write](auto self) {
                        if (c.len)
                            return read().then(write).then(self);
                        return make_ready_future();
                    };
                    writes.push_back(read_write(read_write));
                }
                return when_all(writes).then(
                    [ctxs = std::move(ctxs), dest = std::move(dest),
                     srcs = std::move(srcs)] {
                        return when_all(mapped(srcs, [](auto&& s) {
                                   return async_close(s);
                               }))
                            .then([d = async_close(dest)]() mutable {
                                return d;
                            });
                    });
            });
    }

Note that everything is perfectly parallel and the dependencies are minimal. Most operations use unique-ownership semantics, except for 'sizes', where I didn't bother to do better. The lifetimes of the buffers and handles in the reads/writes are admittedly subtle, but not too much so. Instead of creating N futures per file, I'm creating just a single one at a time.

** (all examples)

Overall the examples are nice, and the progression of additional features makes sense. The code should be cleaned up to remove superfluous parts. The comments should be removed and the code discussed in the prose (possibly referencing line numbers). I suggest that the strawman interface using iostream is removed; the lookups should simply return optional<string> initially, then future<string>. Of course future<string> (or future<vector<char>>) does not work well for large buffers.
You could return something like future<chunk>, where chunk has the following interface:

    struct chunk {
        const char * begin();
        const char * end();
        std::size_t size();

        // an empty chunk means we are done
        future<chunk> next(); // invalidates this
    };

but the reality is that futures are not great for lazy streams. You could look into reactive streams, but that's another can of worms. For pure reads/writes I suggest that an asio-like interface using callbacks would be optimal and readily familiar to Boost users (a sketch is in the P.S. at the end of this mail).

5. What is your evaluation of the potential usefulness of the library?

I had never had a chance to look at the library in the past, and my only impressions of it came from the many references and claims made by the author on the Boost mailing list. In particular I was baffled by the author's claim that the ASIO callback interface is not adequate for file IO: I could not understand how async reads/writes from a file could be so significantly different from socket IO. It turns out that the actual IO part is just a small part of the library, and the async part even less. The bulk of the library is about writing robust filesystem manipulations. In my mind, where Boost.Filesystem is designed as a simple interface to file system operations for script-like programs, AFIO attempts to be the go-to interface for robust applications.

6. Did you try to use the library? With what compiler? Did you have any problems?

I didn't try to compile or use the library.

7. How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?

An in-depth study of the documentation. I didn't keep track of the hours; probably 10 or more.

8. Are you knowledgeable about the problem domain?

Asynchronous file IO? No. Race-free file IO? Neither, but I have had to deal with these kinds of issues in the past; something like AFIO might have been useful.
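P.S. To make the asio-like suggestion above concrete, what I mean is the usual Boost.Asio completion-handler shape, roughly as below. This is illustrative only: 'afio_file', 'async_read_at' and 'async_write_at' here are placeholder names, not AFIO's actual API.

    #include <boost/system/error_code.hpp>
    #include <cstddef>
    #include <cstdint>

    class afio_file; // placeholder for whatever handle type the library exposes

    // The caller supplies the completion handler and therefore chooses the
    // notification strategy (plain callback, future, coroutine, ...).
    // Handler signature: void(const boost::system::error_code&, std::size_t).
    template <class MutableBufferSequence, class ReadHandler>
    void async_read_at(afio_file& file,
                       std::uint64_t offset,
                       const MutableBufferSequence& buffers,
                       ReadHandler&& handler);

    template <class ConstBufferSequence, class WriteHandler>
    void async_write_at(afio_file& file,
                        std::uint64_t offset,
                        const ConstBufferSequence& buffers,
                        WriteHandler&& handler);

This mirrors the shape of asio's existing async_read_at/async_write_at free functions for random-access devices, which is why I think it would feel immediately familiar to Boost users.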

On 1 Sep 2015 at 2:35, Giovanni Piero Deretta wrote:
> This is my review of Boost.AFIO. I still have more comments on the library, but this has already taken too long and I'm well past the review deadline, so I'm sending what I have. I hope to post the second part in the next few days.
*Wow* what a detailed review! Many, many thanks.

I hope you don't mind me logging your entire review as a single issue at https://github.com/BoostGSoC13/boost.afio/issues/115. There were so many points it would take me all night to break it up into individual items.

A huge thank you.

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/