postgresql backup wal pgbackrest

Difference between incremental backup and WAL archiving with pgBackRest


As far as I understood

So, assuming my WAL archiving is set up correctly:

  1. Why would I need incremental backups?
  2. Shouldn't the cost of incremental backups be almost zero?

Most of the documentation I found focuses on the high-level implementation (e.g. how to set up WAL archiving or incremental backups) rather than the internals (what happens when I trigger an incremental backup).

My question can probably be answered with a link to some documentation, but my Google-fu has failed me so far.


Solution

  • Backups are not copies of the WAL files, they're copies of the cluster's whole data directory. As it says in the docs, an incremental backup contains:

    those database cluster files that have changed since the last backup (which can be another incremental backup, a differential backup, or a full backup)

    WALs alone aren't enough to restore a database; they only record changes to the cluster files, so they require a backup as a starting point.

    The need for periodic backups (incremental or otherwise) is primarily about recovery time. Technically, you could just hold on to your original full backup plus years' worth of WAL files, but replaying them all in the event of a failure could take hours or days, and you likely can't tolerate that kind of downtime.

    A new backup also means that you can safely discard any older WALs (assuming you don't still need them for point-in-time recovery), meaning less data to store, and less data whose integrity you're relying on in order to recover.

    If you want to know more about what pgBackRest is actually doing under the hood, it's all covered pretty thoroughly in the Postgres docs.
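
To make the distinction concrete, here is a toy model in Python. It is only a sketch and not how pgBackRest is implemented: the `Cluster` class, the three functions, and the file names are all made up for illustration, files are compared by content, and deleted files are ignored. It shows that a backup is a copy of data files, that the WAL is a log of changes replayed on top of a backup, and that a newer backup shrinks the amount of WAL that has to be replayed.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for a data directory: file name -> contents."""
    files: dict = field(default_factory=dict)
    wal: list = field(default_factory=list)  # ordered (file, new_contents) records

    def write(self, name, contents):
        # Every change is appended to the WAL and applied to the data file.
        self.wal.append((name, contents))
        self.files[name] = contents


def full_backup(cluster):
    # Copy every file and remember the WAL position the copy corresponds to.
    return {"files": dict(cluster.files), "wal_pos": len(cluster.wal)}


def incremental_backup(cluster, previous_files):
    # Copy only the files that differ from the previous backup (simplified:
    # this toy compares contents and does not track deletions).
    changed = {n: c for n, c in cluster.files.items() if previous_files.get(n) != c}
    return {"files": changed, "wal_pos": len(cluster.wal)}


def restore(backups, wal):
    # Layer the backup chain (full first, then incrementals), then replay
    # only the WAL records written after the last backup in the chain.
    files = {}
    for backup in backups:
        files.update(backup["files"])
    to_replay = wal[backups[-1]["wal_pos"]:]
    for name, contents in to_replay:
        files[name] = contents
    return files, len(to_replay)


cluster = Cluster()
for i in range(3):
    cluster.write(f"table_{i}", "v0")
full = full_backup(cluster)

# 1000 updates to a single file produce 1000 WAL records ...
for n in range(1, 1001):
    cluster.write("table_0", f"v{n}")
# ... but the incremental backup only stores the one file that changed.
incr = incremental_backup(cluster, full["files"])

cluster.write("table_1", "v1")  # one more change after the incremental

files_a, replayed_a = restore([full], cluster.wal)
files_b, replayed_b = restore([full, incr], cluster.wal)

print("files stored in incremental:", len(incr["files"]))
print("WAL records replayed from full only:", replayed_a)
print("WAL records replayed from full + incremental:", replayed_b)
print("same end state:", files_a == files_b == cluster.files)
```

Both restores end in the same state, but the one that includes the incremental backup replays far fewer WAL records, and in this model every WAL record before the incremental's position is no longer needed to recover to the latest state.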