algorithm transactions acid aries

What happens when a DBMS crashes during the recovery phase of the ARIES algorithm?


From what I understand of the ARIES algorithm, supporting ACID transactions requires WAL (Write-Ahead Logging): all writes are logged.

This is said to give the database the ability to roll back changes made by a transaction that was still uncommitted at the time of the crash.

For each write, we log information about the actual write (how to REDO it, how to UNDO it).
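
To make that concrete, here is a minimal sketch of what such a log record might hold. The `LogRecord` class, its field names, and the `log_write` helper are my own illustration and are not taken from any real DBMS:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class LogRecord:
    lsn: int                 # log sequence number, strictly increasing
    txn_id: int              # transaction that made the change
    page_id: str             # page the change applies to
    before: Optional[str]    # old value: what UNDO restores
    after: Optional[str]     # new value: what REDO reapplies
    prev_lsn: Optional[int]  # previous record of the same transaction

log = []  # stands in for the log file (hypothetical, in-memory only)

def log_write(txn_id, page_id, before, after):
    # Chain the record to the transaction's previous one so that UNDO
    # can later walk the transaction backwards.
    prev = next((r.lsn for r in reversed(log) if r.txn_id == txn_id), None)
    rec = LogRecord(len(log) + 1, txn_id, page_id, before, after, prev)
    log.append(rec)  # with WAL, this must be durable before the page is written
    return rec

r1 = log_write(1, "A", "0", "10")
r2 = log_write(1, "B", "0", "20")
```

Each record carries both images of the data, so the same entry can serve REDO (apply `after`) and UNDO (apply `before`).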

During the recovery phase, we analyze the log to perform REDO operations:

And then, to perform an UNDO, a new log entry is written (because it is a write after all), then the change is applied to the database during the checkpoint.

During a checkpoint, I guess we just perform the REDO for all committed entries.

I haven't found any information on what happens if:

In those cases, some changes have been applied to the database and are not reflected by the log, leaving the database in an inconsistent state.

NB: Here are some of the links I used to learn more about ACID transactions and the ARIES algorithm:

I am currently reading the source code of SQLite in order to understand how the whole thing is implemented.

Thanks in advance for any clarifications on this topic.


Solution

  • During REDO the log is read and the data is modified if needed. If a crash occurs during REDO, the next recovery simply runs REDO again. Some of the log records that modified data during the first recovery attempt will be no-ops because the modification was already saved; recovery detects this by comparing the record's LSN with the LSN stored on the page. Others were not saved and will be 'redone' again.

    Checkpoint is still a transacted operation. In-memory data is saved, and only at the very end is a checkpoint record written to the log. If a crash occurs during a checkpoint, it can only occur before the checkpoint record was written. After the crash, recovery runs again and starts with REDO (since there is no record of a new checkpoint). The point above applies: REDO can be run repeatedly. Some log records will be no-ops because the data changes were already saved, and some will be 'redone' again.

    UNDO works by generating and writing a compensating log record. If a crash occurs during UNDO, then on the next recovery there is one more record to analyze and REDO (the compensating record). This post-UNDO-crash recovery will then run its UNDO phase starting after the last compensating record that was successfully saved. That is, suppose the original log contains two operations in an uncommitted transaction, say OP1 and OP2. UNDO walks the log backward, so it writes the compensating UNDO-OP2 and then crashes. The next recovery will resume UNDO with OP1, since for OP2 there is already a compensating record in the log (the UNDO-OP2).

    There is never any window of inconsistency in a correctly implemented ARIES.
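
The points above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not how any real engine is written: the `Rec` class, the `recover` function, and the `crash_after_clrs` switch are hypothetical names, pages hold a single integer, and the log is an in-memory list treated as durable. It undoes newest-first, as ARIES does, and crashes once right after logging the first compensating record.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Rec:
    lsn: int
    txn: int
    kind: str                        # "update", "clr" or "commit"
    page: Optional[str] = None
    before: Optional[int] = None     # value UNDO restores
    after: Optional[int] = None      # value REDO reapplies
    prev: Optional[int] = None       # previous update of the same transaction
    undo_next: Optional[int] = None  # CLR only: next LSN still to be undone

class Crash(Exception):
    """Simulated power loss in the middle of recovery."""

def recover(log, disk, crash_after_clrs=None):
    # disk maps page -> (value, page_lsn); a missing page reads as (0, 0).
    # Analysis (simplified): for each uncommitted transaction, find where
    # UNDO must resume. A CLR moves that point past the record it compensates.
    committed = {r.txn for r in log if r.kind == "commit"}
    resume = {}
    for r in log:
        if r.txn in committed:
            continue
        if r.kind == "update":
            resume[r.txn] = r.lsn
        elif r.kind == "clr":
            resume[r.txn] = r.undo_next
    # REDO: repeat history, skipping changes the page already contains.
    for r in list(log):
        if r.kind in ("update", "clr"):
            _, page_lsn = disk.get(r.page, (0, 0))
            if page_lsn < r.lsn:
                disk[r.page] = (r.after, r.lsn)
    # UNDO: log a compensating record first, then change the page.
    by_lsn = {r.lsn: r for r in log}
    written = 0
    for txn, lsn in resume.items():
        while lsn is not None:
            target = by_lsn[lsn]
            clr = Rec(len(log) + 1, txn, "clr", target.page,
                      after=target.before, undo_next=target.prev)
            log.append(clr)
            written += 1
            if written == crash_after_clrs:
                raise Crash()  # the CLR is logged, the page write is lost
            disk[target.page] = (target.before, clr.lsn)
            lsn = target.prev

# One uncommitted transaction with OP1 (page A) and OP2 (page B).
log = [Rec(1, 1, "update", "A", before=0, after=10),
       Rec(2, 1, "update", "B", before=0, after=20, prev=1)]
disk = {"A": (10, 1)}  # A reached disk before the original crash, B did not

try:
    recover(log, disk, crash_after_clrs=1)  # dies after logging the CLR for OP2
except Crash:
    pass
recover(log, disk)  # second attempt: REDO replays the CLR, UNDO resumes at OP1
recover(log, disk)  # third attempt: nothing left to do
print(disk)
print([(r.lsn, r.kind, r.page) for r in log])
```

After the second run both pages are back to 0 and the log holds exactly two compensating records. The third run changes nothing, because every record's LSN is already covered by the page LSNs and the last compensating record says there is nothing further to undo.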