CS262B Reading Summary

Transaction Management in the R* Distributed Datbase Management System

C. Mohan et al.

Summary by Feng Zhou
2/25/2004

Strong points of the paper are:

  1. Hierarchical 2P is a simple but useful extension to the original 2PL protocol.  Basically nonleaf, nonroot processes act as both coordinators (for their child processes) and subordinates (for their parent processes). The protocol proceeds in a hierarchical way recursively. The intermediate processes wait for all their children to finish preparation before it moves to "prepared" state and reports to its coordinator, etc.
  2. The PA protocol exploits the fact that in normal 2PL, aborts are assume is the coordinator knows nothing about a transaction. Therefore the coordinator can actually forgets about the transaction as soon as it determines the end result of a transaction is "abort". Also no log-force or ACK is required for aborts because of this fact, on both the coordinator and the subordinates.  Another optimization is for partially read-only transactions, read-only processes only needs to participate in the first phase of the transaction.
  3. The PC protocol optimizes for the common case of transaction succeeding, which is reasonable. Instead of assuming aborts as in 2PL and PA, PC assumes transactions to be successful.  In order to make the protocol correct, a "collect" record is forced to the log before the prepare messages are sent by the coordinator.  It marks the start of the prepare phase and records all participators.  With this modification, the coordinator has the ability to explicitate abort all subordinates during its own recovery process in case it crashes before committing a transaction. Therefore it doesn't need to make sure all commits are done before forgetting about a transaction. Thus the subordinates don't need to force the commit record nor send ACK for the commit operation.
One major flaw.

The reasons listed for choosing a local victim in case of global deadlock is not convincing.  The authors mention that extra latency could be introduced if a remote victim is chosen, which could be bad.  But why this offsets the cost of aborting a more expensive transaction is not clear.