oracle-database 2phase-commit

Oracle ALTER SESSION ADVISE COMMIT?


My app needs to recover automatically from failures. I test it as follows:

  1. Start app
  2. In the middle of processing, kill the application server host (shutdown -r -f)
  3. On host reboot, application server restarts (as a Windows service)
  4. Application restarts
  5. Application tries to process, but is blocked by an incomplete 2-phase commit transaction in the Oracle DB from the previous session.
  6. Somewhere between 10 and 30 minutes later the DB resolves the prior txn and processing continues OK.

I need it to continue processing faster than this. My DBA advises that I should prefix my statement with

ALTER SESSION ADVISE COMMIT;

But he can't give me guarantees or details about the potential for data loss doing this.

Luckily the statement in question is simply updating a datetime value to SYSDATE every second or so, so if there was some data corruption it would last < 1 second before it was overwritten.

But, to my question. What exactly does the statement above do? How does Oracle resolve data synchronisation issues when it is used?


Solution

  • Can you clarify the role of the 'local' and 'remote' databases in your scenario?

    Generally a multi-db transaction does the following:

    1. Starts the transaction
    2. Makes a change on one database
    3. Makes a change on the other database
    4. Gets the other database to 'promise to commit'
    5. Commits locally
    6. Gets the remote db to commit

    In-doubt transactions happen if step 4 is completed and then something fails. The general practice is to get the remote database back up and confirm whether it can commit. If so, step 5 goes ahead. If the remote component of the transaction can't be committed, the local component is rolled back.
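
The steps above can be sketched as a toy simulation. This is only an illustration of the protocol as described here, not of Oracle's internals: the `Participant` class, the `fail_after_prepare` flag, and the `resolve` helper are all hypothetical names invented for the example.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PREPARED = "prepared"        # has promised to commit; outcome not yet known
    COMMITTED = "committed"
    ROLLED_BACK = "rolled back"


class Participant:
    """Toy stand-in for one database taking part in the transaction."""

    def __init__(self, name):
        self.name = name
        self.state = State.ACTIVE

    def prepare(self):
        # Step 4: promise to commit. After this the participant can no
        # longer decide the outcome on its own.
        self.state = State.PREPARED

    def commit(self):
        self.state = State.COMMITTED

    def rollback(self):
        self.state = State.ROLLED_BACK

    def in_doubt(self):
        return self.state is State.PREPARED


def two_phase_commit(local, remote, fail_after_prepare=False):
    """Run steps 4 to 6, optionally simulating a failure right after step 4."""
    remote.prepare()             # step 4
    if fail_after_prepare:
        return                   # something failed: remote is left PREPARED
    local.commit()               # step 5
    remote.commit()              # step 6


def resolve(local, remote, remote_can_commit):
    """Recovery once the remote database is reachable again."""
    if remote_can_commit:
        local.commit()
        remote.commit()
    else:
        remote.rollback()
        local.rollback()


local, remote = Participant("local"), Participant("remote")
two_phase_commit(local, remote, fail_after_prepare=True)
print(remote.name, "in doubt:", remote.in_doubt())
resolve(local, remote, remote_can_commit=True)
print(local.state.value, remote.state.value)
```

The point of the sketch is that the window between step 4 and step 6 is the only place an in-doubt state can arise; a failure before step 4 simply leaves an ordinary uncommitted transaction.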

    Your description seems to refer to an app server failure, which is a different kettle of fish. In your case, I think the scenario is as follows:

    1. App server takes a connection and starts a transaction
    2. App server dies without committing
    3. App server restarts and makes a new database connection
    4. App server starts a new transaction on the new connection
    5. New transaction gets 'stuck' waiting for a lock held by the old connection/transaction
    6. After 20 minutes, the dead connection is terminated and its transaction rolled back
    7. New transaction then continues

    In which case the solution is to kill off the old connection sooner, either with a shorter timeout (e.g. SQLNET.EXPIRE_TIME in the server's sqlnet.ora, which sets the dead connection detection probe interval in minutes) or with a manual ALTER SYSTEM KILL SESSION.
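
A rough sketch of the manual route follows, assuming a single-instance database and a DB-API style cursor (for example from an Oracle driver) opened by a user allowed to query `v$session` and run `ALTER SYSTEM`. The helper names `kill_session_sql` and `kill_blockers` are hypothetical. It kills every session that is currently blocking another one, which is blunt; in practice you would probably narrow the query to your application's own user or program.

```python
# Sessions that at least one other session is currently waiting on.
FIND_BLOCKERS_SQL = (
    "SELECT sid, serial# FROM v$session "
    "WHERE sid IN (SELECT blocking_session FROM v$session "
    "WHERE blocking_session IS NOT NULL)"
)


def kill_session_sql(sid, serial):
    """Build the ALTER SYSTEM KILL SESSION statement for one session."""
    # ALTER SYSTEM does not accept bind variables, so both values are
    # coerced to int before being placed into the statement text.
    return f"ALTER SYSTEM KILL SESSION '{int(sid)},{int(serial)}' IMMEDIATE"


def kill_blockers(cursor):
    """Kill every blocking session and return the (sid, serial#) pairs found.

    `cursor` is assumed to be any DB-API cursor connected to the database.
    """
    cursor.execute(FIND_BLOCKERS_SQL)
    blockers = list(cursor.fetchall())
    for sid, serial in blockers:
        cursor.execute(kill_session_sql(sid, serial))
    return blockers
```

Running something like `kill_blockers(connection.cursor())` at application start-up would release the lock held by the dead session without waiting for the timeout, at the cost of needing elevated privileges for the application account.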