
Quoting Bob Walters <bob.s.walters@gmail.com>:
So what the RMs need to do is split the commit into 2 phases, with a prepare message written to the log at the end of the prepare phase, and change the recovery process a bit.
Typically, this results in a lot of sync()s, but I'm not sure how to avoid that. e.g. prepare() has to sync because the resource manager is not allowed to forget the transaction due to machine death. commit() presumably must sync again. The TM does its own logging (or perhaps shares a resource manager) to keep track of transaction progress. Once a two-phase commit protocol comes into play, things slow down.
One result of this is that we should be sure that the TM is optimized for one-phase commits when only one RM is in play.
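A rough way to see the cost is to count syncs. This is only a sketch under the assumptions stated in the text (one sync per prepare(), one per commit(), and one for the TM's own log record); `syncs_per_transaction` is a hypothetical helper, not part of any of the libraries discussed.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: each prepare() and each commit() costs one sync, and the TM
// syncs its own log once between the two phases.
std::size_t syncs_per_transaction(std::size_t rm_count) {
    if (rm_count == 0) return 0;   // nothing to make durable
    if (rm_count == 1) return 1;   // one-phase commit: only the RM's commit() syncs
    return 2 * rm_count + 1;       // n prepares + 1 TM log record + n commits
}
```

Under these assumptions a transaction across 2 RMs costs 5 syncs, while the single-RM case stays at 1, which is why the one-phase path matters.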
I agree. It's probably 5 syncs for a transaction across 2 RMs. The TM should also take into consideration whether resources are persistent or not. Boost.STM, for example, is transactional memory only and does not maintain a log on disk, so when it is used together with one other persistent resource, the TM can perform a two-phase commit on the transient resource and a one-phase commit on the persistent resource:

    transient_resource->prepare();
    persistent_resource->commit(); // not prepared
    transient_resource->commit();

The same is probably the case when your library is used in a non-file region?

I've been thinking about introducing an RM category for that:

    typedef one_phase_tag category;  // only supports one-phase commit
or
    typedef persistent_tag category; // supports two-phase, persistent
or
    typedef transient_tag category;  // supports two-phase, non-persistent

Only when 2 or more persistent_tag RMs are used in a global transaction does the TM have to prepare both and write to its log to have a unique commit point; otherwise the commit of the only persistent_tag RM is the commit point. (Any ideas for a better name for "one_phase_tag"?)
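To make the commit-point rule concrete, here is a minimal sketch of how a TM might choose its call order. Everything in it is hypothetical: `resource_manager` is a stand-in that only carries a name and a persistence flag, `commit_all` returns a trace of the calls it would make instead of calling real RMs, and one-phase-only RMs are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an RM: just a name and a persistence flag.
struct resource_manager {
    std::string name;
    bool persistent;  // true ~ persistent_tag, false ~ transient_tag
};

// Returns the sequence of calls the TM would make for one global transaction.
std::vector<std::string> commit_all(const std::vector<resource_manager>& rms) {
    std::vector<std::string> trace;
    std::size_t persistent_count = 0;
    for (const auto& rm : rms)
        if (rm.persistent) ++persistent_count;

    if (persistent_count >= 2) {
        // Full two-phase commit: the TM's own log record is the commit point.
        for (const auto& rm : rms) trace.push_back(rm.name + ".prepare");
        trace.push_back("tm.log_commit");
        for (const auto& rm : rms) trace.push_back(rm.name + ".commit");
        return trace;
    }

    // At most one persistent RM: prepare only the transient ones, then the
    // commit of the persistent RM (not prepared) is the commit point.
    for (const auto& rm : rms)
        if (!rm.persistent) trace.push_back(rm.name + ".prepare");
    for (const auto& rm : rms)
        if (rm.persistent) trace.push_back(rm.name + ".commit");
    for (const auto& rm : rms)
        if (!rm.persistent) trace.push_back(rm.name + ".commit");
    return trace;
}
```

With one transient and one persistent RM this yields prepare on the transient RM, commit on the persistent RM, then commit on the transient RM, and the TM writes no log record of its own.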
I wonder if either of you have also considered the notion of sharing a transactional disk infrastructure (e.g. write-ahead log and checkpointing mechanism) so that it would actually be possible for the 3 different libraries to share a transactional context and result in a single write-ahead log record per transaction. i.e. One log overall, instead of one log per RM.
Do you see a reason why that would need support from the TM? When we talked about this off-list I always assumed it would be implemented at the RM level. So when RM 1 offers service A and RM 2 offers service B, and they should share a log, an RM 3 is implemented that maintains the shared log and offers services A and B, forwarding service calls to RM 1 and 2. As far as the TM is concerned, there is only one RM, so it performs a one-phase commit.
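A minimal sketch of that composition, with all names hypothetical (`rm1`, `rm2`, `rm3`, `shared_log`, and the `take_log_data` hook are illustrations, not existing interfaces): RM 3 forwards service calls and, on commit, writes one combined record with a single sync.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical shared log; sync() is modelled as a counter.
struct shared_log {
    std::vector<std::string> records;
    int syncs = 0;
    void append_and_sync(const std::string& record) {
        records.push_back(record);
        ++syncs;
    }
};

// RM 1 offers service A and buffers what it would otherwise log itself.
struct rm1 {
    std::string pending;
    void service_a(const std::string& v) { pending += "A:" + v + ";"; }
    std::string take_log_data() { std::string out; out.swap(pending); return out; }
};

// RM 2 offers service B in the same way.
struct rm2 {
    std::string pending;
    void service_b(const std::string& v) { pending += "B:" + v + ";"; }
    std::string take_log_data() { std::string out; out.swap(pending); return out; }
};

// RM 3 owns the shared log and is the only RM the TM sees.
struct rm3 {
    rm1 a;
    rm2 b;
    shared_log log;
    void service_a(const std::string& v) { a.service_a(v); }  // forwarded
    void service_b(const std::string& v) { b.service_b(v); }  // forwarded
    // One-phase commit: one record and one sync for both services.
    void commit() { log.append_and_sync(a.take_log_data() + b.take_log_data()); }
};
```

The TM needs no special support here: it calls `commit()` on a single RM, and the shared logging stays an internal detail of RM 3.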
Specifically: prepare() might have each RM create a buffer containing the data it needs in order to recover that RM's changes for the transaction. The sum of this data across all RMs is written to the write-ahead log as a single record for the transaction. As a result, there is no need for a two-phase commit. Instead, during recovery processing, each RM is told to recover based on its particular portion of the log record contents. Checkpoints would likewise need to be coordinated in some fashion like this.
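The scheme described above could look roughly like this. It is a sketch only: `log_record`, `recoverable_rm`, `build_record`, and `recover_all` are hypothetical names, the record is held in memory rather than written to disk, and checkpointing is not modelled.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// One write-ahead log record per transaction: RM name -> that RM's portion.
using log_record = std::map<std::string, std::string>;

struct recoverable_rm {
    std::string name;
    std::string pending;                 // changes made in the current transaction
    std::vector<std::string> recovered;  // portions replayed during recovery

    // prepare(): hand back the data needed to redo this RM's changes.
    std::string prepare() {
        std::string out;
        out.swap(pending);
        return out;
    }
    void recover(const std::string& portion) { recovered.push_back(portion); }
};

// Collect every RM's buffer into a single record for the transaction.
log_record build_record(std::vector<recoverable_rm>& rms) {
    log_record rec;
    for (auto& rm : rms) rec[rm.name] = rm.prepare();
    return rec;
}

// Replay the log: each RM sees only its own portion of each record.
void recover_all(const std::vector<log_record>& log,
                 std::vector<recoverable_rm>& rms) {
    for (const auto& rec : log) {
        for (auto& rm : rms) {
            auto it = rec.find(rm.name);
            if (it != rec.end()) rm.recover(it->second);
        }
    }
}
```

Because the record is written once for all RMs, the transaction is either entirely in the log or entirely absent, which is what removes the need for a separate prepare phase.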
I don't use checkpoints, so I guess there would be no coordination needed, as long as your RM has access to the shared log?
The reason I mention this is because I think that the write-ahead logging and checkpointing being done by stdb and Boost.Persistence do sound comparable at this point, so I'm starting to think that looking at the problem from the vantage point of shared logging might prove optimal.
Is this alternative worth looking at?
I think it's definitely worth looking at as an optimization, but IMHO not as an alternative to the distributed transaction approach, as you cannot make all resources share a log, e.g. an RDBMS resource.