
On Wed, Jan 27, 2010 at 5:15 PM, <strasser@uni-bremen.de> wrote:
Quoting Bob Walters <bob.s.walters@gmail.com>:
Yes, I can deal with it in that way. You mention that you have no checkpoint, but I thought you would need occasional msync() of the backing store, in order to eliminate the need for some of the logs,
is that what you mean by checkpoints? I assumed you meant exporting the entire state from time to time, so checkpoint + log = current state.
Well, not the whole state, but rather just the changes since the last checkpoint. In effect, it is the equivalent of writing to the log, but doing a lazy msync() of the memory mapped region only once every N seconds, so that if there is good spatial locality to the updates being done by the user, there is a chance of reducing the I/O load, and also of using more sequential I/O. My checkpoint is (unfortunately) more a matter of explicitly writing out the changes, rather than just msync(), but the concept is the same, and so fits with the algorithm discussed below.
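To make the locality argument concrete, here is a rough sketch of how a lazy checkpoint might coalesce updates. It is only an illustration under my own assumptions: DirtyPageTracker, record_write() and take_checkpoint() are hypothetical names, not part of either library, and the actual write-out (msync() or explicit writes) is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper: remembers which pages of the mapped region were
// modified since the last checkpoint. Not part of any existing library.
class DirtyPageTracker {
public:
    explicit DirtyPageTracker(std::size_t page_size) : page_size_(page_size) {
        assert(page_size > 0);
    }

    // Record a write of `len` bytes starting at byte `offset`.
    void record_write(std::size_t offset, std::size_t len) {
        if (len == 0) return;
        const std::size_t first = offset / page_size_;
        const std::size_t last = (offset + len - 1) / page_size_;
        for (std::size_t p = first; p <= last; ++p) dirty_.insert(p);
    }

    // Return the dirty pages in ascending order and start a new interval.
    // Repeated writes to one page collapse into a single entry, and the
    // ascending order lets the caller write them out mostly sequentially.
    std::vector<std::size_t> take_checkpoint() {
        std::vector<std::size_t> pages(dirty_.begin(), dirty_.end());
        dirty_.clear();
        return pages;
    }

private:
    std::size_t page_size_;
    std::set<std::size_t> dirty_;
};
```

With good locality, many small updates between two checkpoints end up as a handful of page writes, whereas the log alone would have one record per update.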
anyway, this is currently managed by the storage log itself, though this is not set in stone: in addition to a commit message there is a "success" message written to the log, i.e. "this transaction has reached disk". the log only removes old logs when there's been a success message for each transaction in it. so the RM can delay the success messages (and therefore the sync) as long as it wants, it is only prompted by storage_log::overflow()==true to post its success messages. so I guess it won't be too much coordination if we end up using this log.
It sounds like it. I can always have a thread which does periodic checkpointing, then interacts with the log when prompted by overflow to indicate the transactions which have been written to disk.
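The commit/success bookkeeping discussed above could be modelled roughly as follows. This is a sketch under assumptions, not the actual storage_log interface: SketchLog, roll(), success() and segment_count() are made-up names, and only overflow() mirrors a name mentioned in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <set>

// Hypothetical model of a segmented log. Each segment remembers the
// transactions committed in it that have no "success" message yet.
class SketchLog {
public:
    explicit SketchLog(std::size_t max_segments) : max_segments_(max_segments) {
        assert(max_segments > 0);
        segments_.emplace_back();  // start with one open segment
    }

    // A commit message is written to the newest segment.
    void commit(int txid) { segments_.back().insert(txid); }

    // Start a new segment, e.g. when the current one is full.
    void roll() { segments_.emplace_back(); }

    // True when the RM should post its pending success messages.
    bool overflow() const { return segments_.size() > max_segments_; }

    // "This transaction has reached disk". Old segments are dropped only
    // once every transaction in them has a success message.
    void success(int txid) {
        for (auto& seg : segments_) seg.erase(txid);
        while (segments_.size() > 1 && segments_.front().empty())
            segments_.pop_front();
    }

    std::size_t segment_count() const { return segments_.size(); }

private:
    std::size_t max_segments_;
    std::deque<std::set<int>> segments_;
};
```

A checkpoint thread would then poll overflow(), sync its backing store, and call success() for every transaction covered by that sync.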
True. It would be great to ensure that when the different boost transaction-capable libraries are used together, the log can be shared.
do you also include the TM log in this?
No. I'm assuming here that one/both of us eventually gets an RM created which combines our two RMs under a common log, as you had mentioned previously. As a result, the TM would recognize only one RM, and thus could avoid any need for a log of its own, and just do one-phase commit calls (pass-through). IIUC it also wouldn't need a log in the case of 1 persistent RM and 1 non-persistent RM. So that means all 3 libraries under discussion could be used together without the overhead of a distributed TM having its own log and sync points.
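Expressed as code, the rule I have in mind is roughly the one below. It is a sketch of my understanding only: RmInfo and tm_needs_log() are hypothetical names, and the rule itself (a TM log is needed only when more than one persistent RM takes part) is the assumption stated above, not something taken from an existing TM implementation.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical description of a resource manager enlisted in a transaction.
struct RmInfo {
    bool persistent;  // true if the RM writes durable state
};

// Assumed rule: with at most one persistent RM, the TM can pass the commit
// through and rely on that RM's log; with two or more persistent RMs it
// needs a log of its own to record the distributed commit decision.
inline bool tm_needs_log(const std::vector<RmInfo>& rms) {
    const auto persistent =
        std::count_if(rms.begin(), rms.end(),
                      [](const RmInfo& rm) { return rm.persistent; });
    return persistent > 1;
}
```

So a combined boost RM, alone or alongside a non-persistent RM, gives false here, while adding an RDBMS RM gives true.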
e.g. when there is a RDBMS, and a logging boost library, the boost RM maintains a log and the TM does for the distributed transactions with the RDBMS RM.
so the RM and the TM could also use a shared log, but I tend to think this is not worth the effort, as this would span the interface between RM and TM.
I think it's been done both ways, i.e. the TM has its own log resource, or the TM shares a log with one of the RMs it is managing (e.g. Oracle RDBMS). However, any sharing probably isn't worth much, because the TM would still need its own sync()s as it orchestrates the different RMs, even if it shares a log with one of them.