On 29/03/2017 17:32, Lee Clagett via Boost wrote:
> Read this [paper on crash-consistent applications][0]. Table 1 on
> page 5 should be of particular interest. I _think_ the bucket portion
> of NuDB's log has no size constraint, so its algorithm is either going
> to be "single sector append", "single block append", or "multi-block
> append/writes" depending on the total size of the buckets. The
> algorithm is always problematic when metadata journaling is disabled.
> Your assumptions of fsync have not been violated to achieve those
> inconsistencies.

I particularly like the sentence: "However, not issuing such an fsync()
is perhaps more safe in modern file systems than out-of-order
persistence of directory operations. We believe the developers’
interest in fixing this problem arises from the Linux documentation
explicitly recommending an fsync() after creating a file."

I agree with them. fsync() gives false assurance. Better not to use it,
and certainly never to rely on it.
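For anyone unfamiliar, the fsync()-after-create dance which the Linux
documentation recommends, and which the paper is discussing, looks
roughly like the following. This is a POSIX-only sketch with error
handling elided; the helper name create_durably is mine, not from any
library:

#include <fcntl.h>
#include <unistd.h>
#include <cstddef>

// Persist both a new file's contents and its directory entry.
void create_durably(const char *dir, const char *path,
                    const void *buf, std::size_t len)
{
  int fd = ::open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
  ::write(fd, buf, len);
  ::fsync(fd);  // flushes the file's data and inode metadata
  ::close(fd);

  // The name -> inode mapping lives in the directory, which is separate
  // metadata: without this second fsync() the file's data can survive a
  // crash while its directory entry does not.
  int dfd = ::open(dir, O_RDONLY | O_DIRECTORY);
  ::fsync(dfd);
  ::close(dfd);
}

And even when you do all of that correctly, the ordering guarantees you
actually get vary by file system and mount options, which is exactly
why I say never to rely on it.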
One of my biggest issues with NuDB is the log file. Specifically, it's
worse than useless: it actively interferes with database integrity.

If you implemented NuDB as a simple data file plus a memory-mapped key
file, and always atomically appended transactions to the data file when
inserting items, then after a power loss you could check whether the
key file mentions extents not possible given the size of the data file.
You could then rebuild the key file simply by replaying through the
data file, being careful to ignore any truncated final append. That
would be a reasonable power loss recovery algorithm: a little slow for
large databases, but safe, reliable and predictable, and it would only
run on a badly closed database.

You could also turn off fsync entirely and let the atomic appends land
on storage in an order probably close to the append order. That ought
to be quicker than NuDB by a fair bit: far fewer i/o ops and a simpler
design.
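To make that recovery replay concrete, here is a minimal sketch (C++ 17
for std::filesystem). The length-prefixed record format, the std::map
standing in for the memory-mapped key file, and every name in it are my
own illustrative assumptions, not NuDB's actual on-disk layout. A real
implementation would only run this after detecting a badly closed
database, e.g. a key file referring to extents past the end of the data
file:

#include <cstdint>
#include <filesystem>
#include <fstream>
#include <iostream>
#include <map>
#include <string>

struct record_header
{
  std::uint32_t key_size;    // illustrative: length-prefixed appends
  std::uint32_t value_size;
};

// Rebuild key -> data file offset by replaying every append in the
// data file. A final append truncated by power loss fails the bounds
// checks below and is ignored, per the recovery rule described above.
std::map<std::string, std::uint64_t> rebuild_key_map(const std::string &path)
{
  const std::uint64_t file_size = std::filesystem::file_size(path);
  std::ifstream data(path, std::ios::binary);
  std::map<std::string, std::uint64_t> keys;
  std::uint64_t offset = 0;
  while(file_size - offset >= sizeof(record_header))
  {
    record_header h{};
    data.read(reinterpret_cast<char *>(&h), sizeof(h));
    const std::uint64_t body = std::uint64_t(h.key_size) + h.value_size;
    if(file_size - offset - sizeof(record_header) < body)
      break;  // truncated final append: stop here and ignore it
    std::string key(h.key_size, '\0');
    data.read(&key[0], h.key_size);
    data.seekg(h.value_size, std::ios::cur);  // skip the value bytes
    keys[key] = offset;                       // later appends win
    offset += sizeof(record_header) + body;
  }
  return keys;
}

int main(int argc, char *argv[])
{
  if(argc < 2)
  {
    std::cerr << "usage: rebuild <datafile>\n";
    return 1;
  }
  std::cout << "recovered " << rebuild_key_map(argv[1]).size() << " keys\n";
}

Because every append either lands in full before the crash point or is
cut off at the end of the file, the data file acts as its own journal
and no separate log is needed.

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/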