jta xa 2phase-commit

2 Phase Commit Global Transaction Status after 2nd Phase failure


My question is this:

Say I have a Transaction Manager and 2 Resource Managers.

  1. TM tells RMs to prepare.
  2. RMs acknowledge they are prepared/vote yes.
  3. TM tells RMs to commit.
  4. RM 1 commits and acknowledges commit.
  5. RM 2 never gets the commit message because of network failure.
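The five steps above can be modelled with a minimal in-memory sketch. Every name here (`RmState`, `ResourceManager`, `TwoPhaseCommitScenario`, `reachableForCommit`) is hypothetical and made up for illustration; this is not the JTA/XA API, and the lost commit message is simulated with a simple flag rather than a real network failure.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of the scenario; not a real JTA/XA API.
enum RmState { ACTIVE, PREPARED, COMMITTED }

class ResourceManager {
    final String name;
    final boolean reachableForCommit; // false models the network failure in step 5
    RmState state = RmState.ACTIVE;

    ResourceManager(String name, boolean reachableForCommit) {
        this.name = name;
        this.reachableForCommit = reachableForCommit;
    }

    // Phase 1: vote yes and record that this RM is prepared.
    boolean prepare() {
        state = RmState.PREPARED;
        return true;
    }

    // Phase 2: returns true as the acknowledgement, false if the message was lost.
    boolean commit() {
        if (!reachableForCommit) {
            return false; // message never arrives; the RM stays PREPARED
        }
        state = RmState.COMMITTED;
        return true;
    }
}

class TwoPhaseCommitScenario {
    // Runs both phases and returns the RMs that never acknowledged the commit.
    static List<ResourceManager> run(List<ResourceManager> rms) {
        for (ResourceManager rm : rms) {
            if (!rm.prepare()) {
                throw new IllegalStateException(rm.name + " voted no");
            }
        }
        List<ResourceManager> unacknowledged = new ArrayList<>();
        for (ResourceManager rm : rms) {
            if (!rm.commit()) {
                unacknowledged.add(rm);
            }
        }
        return unacknowledged;
    }

    public static void main(String[] args) {
        ResourceManager rm1 = new ResourceManager("RM 1", true);
        ResourceManager rm2 = new ResourceManager("RM 2", false);
        List<ResourceManager> pending = run(List.of(rm1, rm2));
        System.out.println(rm1.name + ": " + rm1.state);
        System.out.println(rm2.name + ": " + rm2.state);
        System.out.println("Unacknowledged: " + pending.size());
    }
}
```

After the run, RM 1 has committed while RM 2 is still prepared, which is the state the rest of the question is about.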

In this scenario, I know that RM 2 sits in a waiting state until its database session times out, at which point the transaction is put into an in-doubt state.

If the TM does not reconnect with the RM before the AbandonTimeout is exceeded, then the transaction is abandoned.

My question is, what happens to the global transaction while the TM continues to attempt recovery of the RM?

Does the TM send back an exception to the application when it starts trying the recovery?

Does the TM send back success even though one of the RMs never sent an acknowledgement?

The AbandonTimeout defaults to 24 hours. Does the TM hold the transaction for 24 hours and then, once the timeout is reached, send back an exception?

The linked description of 2 phase commit gives the end of phase two as:

  1. The coordinator sends a commit message to all the cohorts.
  2. Each cohort completes the operation, and releases all the locks and resources held during the transaction.
  3. Each cohort sends an acknowledgment to the coordinator.
  4. The coordinator completes the transaction when all acknowledgments have been received.

So what happens to the global transaction if the acknowledgement of the commit is never received?

I cannot find anything about how a global transaction is resolved during a recovery operation. Any help would be appreciated.

Thanks, Matt


Solution

  • The transaction is returned as committed only when all the participants return OK. If the TM cannot reconnect, the transaction stays in doubt on that RM, potentially locking database pages (this generally requires manual cleanup).

    Depending on the timeout settings, the client application can receive errors. Some database systems, such as Oracle, let you simulate different error conditions, as described here: http://docs.oracle.com/cd/B28359_01/server.111/b28310/ds_txnman009.htm#ADMIN12285
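To make the retry-until-abandoned behaviour concrete, here is a rough sketch of a recovery loop bounded by an abandon timeout. The names (`RecoveryOutcome`, `CommitRecovery`, `recover`, `retryInterval`) are assumptions invented for this example and do not correspond to any real transaction manager's API. Whether a real TM reports to the application before, during, or after such a loop depends on the product and its settings; the sketch does not settle that.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical names throughout; not the API of any real transaction manager.
enum RecoveryOutcome { COMMITTED, ABANDONED }

class CommitRecovery {
    // Retries the commit for one unacknowledged participant until it
    // acknowledges or the abandon timeout elapses.
    // tryCommit returns true when the participant acknowledges the commit.
    static RecoveryOutcome recover(BooleanSupplier tryCommit,
                                   Duration abandonTimeout,
                                   Duration retryInterval) throws InterruptedException {
        long deadline = System.nanoTime() + abandonTimeout.toNanos();
        while (true) {
            if (tryCommit.getAsBoolean()) {
                return RecoveryOutcome.COMMITTED; // all acknowledgements now received
            }
            if (System.nanoTime() - deadline >= 0) {
                return RecoveryOutcome.ABANDONED; // give up; manual cleanup is likely needed
            }
            Thread.sleep(retryInterval.toMillis());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int[] attempts = {0};
        // Simulated participant that becomes reachable on the third attempt.
        RecoveryOutcome outcome = recover(
                () -> ++attempts[0] >= 3,
                Duration.ofSeconds(5),
                Duration.ofMillis(10));
        System.out.println(outcome + " after " + attempts[0] + " attempts");
    }
}
```

With a 24-hour timeout the same loop would simply run for much longer, so in practice the interesting question is what the application is told while it runs, which is where the timeout settings mentioned above come in.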