distributed-transactions xa 2phase-commit

Two-phase commit: merging the execution and prepare phases


As far as I understand, before two-phase commit even runs, a round trip is needed to send the transaction's operations to each site. Each site executes its part of the transaction, and once the coordinator has a response from all sites it runs two-phase commit, starting with the prepare phase.

Why is it necessary to keep the prepare phase separate from the execution that precedes two-phase commit? Is there a reason not to merge execution and the prepare phase, thus saving the cost of one round trip?

This is a followup to my previous question.


Solution

  • There are several good reasons for doing it like this:

    1. Operations may require data from other sites. How would you implement a swap(a,b) operation between data items at different sites if you merge execution and prepare phases?
    2. The coordinator is likely to become a performance bottleneck. Merging the execution and prepare phases would involve it in relaying application data, overloading it further.
    3. Transparency and encapsulation. Note that code between begin/commit in the reply to your previous question (i.e., business logic) is not concerned with distributed transactions at all. It doesn't need to know which sites, or even how many, will be involved! It invokes arbitrary procedures that may be (or not...) remote calls to unknown sites. To merge execution and prepare you'd have to explicitly package your application logic in independent callbacks for each participant.
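
To make points 1 and 3 concrete, here is a minimal in-memory sketch of the swap(a,b) case. The `Site` class, its methods, and `two_phase_commit` are hypothetical names invented for illustration, not any real transaction manager's API. The point it tries to show is that each write depends on a value read from the other site, so no site can be handed its complete work together with the prepare request.

```python
class Site:
    """Hypothetical participant holding a single data item."""

    def __init__(self, name, value):
        self.name = name
        self.committed = value
        self.pending = None  # tentative write made during execution

    def read(self):
        return self.committed

    def write(self, value):
        self.pending = value  # not visible until commit

    def prepare(self):
        # Vote yes only if there is tentative work to commit.
        return self.pending is not None

    def commit(self):
        self.committed, self.pending = self.pending, None

    def abort(self):
        self.pending = None


def swap(site_a, site_b):
    # Execution phase: plain business logic, unaware of 2PC.
    # Each write needs a value that lives at the *other* site.
    a = site_a.read()
    b = site_b.read()
    site_a.write(b)
    site_b.write(a)


def two_phase_commit(sites):
    # Prepare phase: collect votes; commit only if every site votes yes.
    if all(site.prepare() for site in sites):
        for site in sites:
            site.commit()
        return True
    for site in sites:
        site.abort()
    return False


site_a = Site("A", 1)
site_b = Site("B", 2)
swap(site_a, site_b)
ok = two_phase_commit([site_a, site_b])
print(ok, site_a.committed, site_b.committed)  # True 2 1
```

Note that `swap` never mentions the coordinator or the number of participants; only `two_phase_commit` does. Merging the phases would force `swap` to be split into per-site callbacks that somehow already know the other site's value.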

    Moreover, when thinking about performance with persistence involved, the round trips to worry about are those that force a log flush. Invocations during the execution phase don't need to flush the log, so they should be much cheaper than the prepare and commit phases, which do.
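
The cost argument can be illustrated with a toy participant that counts log flushes. This is only a sketch under simplifying assumptions: the `Participant` class is hypothetical, the "log" is a Python list, and a flush stands in for a synchronous write to stable storage. It follows the claim above that prepare and commit force the log while execution does not; real protocol variants may differ in which records they force.

```python
class Participant:
    """Hypothetical site with a write-ahead log buffered in memory."""

    def __init__(self):
        self.buffer = []   # log records not yet on stable storage
        self.durable = []  # log records that have been flushed
        self.flushes = 0   # number of (expensive) forced writes

    def _flush(self):
        self.durable.extend(self.buffer)
        self.buffer.clear()
        self.flushes += 1

    def execute(self, op):
        # Execution phase: just append to the in-memory log, no flush.
        self.buffer.append(op)

    def prepare(self):
        # The yes vote must survive a crash, so the log is forced.
        self.buffer.append("PREPARED")
        self._flush()
        return True

    def commit(self):
        # The outcome is forced to the log as well.
        self.buffer.append("COMMITTED")
        self._flush()


p = Participant()
for op in ["read x", "write x", "write y"]:
    p.execute(op)
flushes_after_execution = p.flushes

p.prepare()
p.commit()
print(flushes_after_execution, p.flushes)  # 0 2
```

However many operations the execution phase performs, the flush count here stays at two per participant, which is why the extra execution round trips are comparatively cheap.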