postgresql database-backups last-modified

How to list all tables with data changes in the last 24 hours?


We had an ugly problem: by mistake, a balancer redirected some requests to a test instance whose data is very similar to production. Now I know that the test Postgres contains recorded data that belongs in production.

Is there a way to list all the tables with data changes in the last 24 hours in Postgres?

Postgres version is 9.3 and I have around 250 tables.


Solution

  • First consider my comment.

    Postgres up to and including 9.4 does not by itself record timestamps when rows were inserted or updated.

    There are some system columns in the row headers that can help in the forensic work. The physical order of rows (ctid) can be an indicator if nothing else has happened to the table since. In simple cases new rows are appended to the physical end of a table when inserted, so the ctid indicates what was inserted last - until anything changes in the table. Postgres is free to rearrange the physical order of rows at will, for instance with VACUUM. Any UPDATE also writes a new row version, which can change the physical position. The new version does not have to be at the end of the table. Postgres tries to keep the new row version on the same data page if possible (HOT update) ...

    That said, here is a simple query to get the physically last rows for a given table:

    SELECT ctid, *
    FROM   tbl
    ORDER  BY ctid DESC
    LIMIT  10;
    

    Careful with table inheritance or partitioning: multiple physical tables can then be involved, and ctid is not unique within that scope.

    The system column xmin, the ID of the inserting transaction, can be useful as well.
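
    As a rough illustration, the system columns can be read side by side. This is a minimal sketch, assuming a table named tbl as in the query above. age(xmin) reports roughly how many transaction IDs have been assigned since the row version was written, so the smallest values mark the most recent writes. Frozen rows report a very large age, and an UPDATE also stamps xmin on the new row version, so the result shows recent writes in general, not only inserts.

    ```sql
    -- Most recently written row versions first (smallest xid age).
    -- tableoid::regclass names the physical table each row lives in,
    -- which matters with inheritance or partitioning.
    SELECT tableoid::regclass AS physical_table
         , xmin
         , age(xmin)          AS xid_age
         , ctid
         , *
    FROM   tbl
    ORDER  BY age(xmin)
    LIMIT  10;
    ```

    The xid age is only a relative ordering. Without commit timestamps it cannot be translated into wall-clock time directly, but comparing it with the age of a row whose insert time is known can narrow things down.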

    If you happen to have a backup for the test DB from right before the incident, that would be helpful. Restore the old state to a separate schema of the test DB and compare tables ...
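
    Such a comparison could look like the following sketch. The schema name old_state is hypothetical and stands for wherever the backup was restored; tbl is a placeholder for each table to check. It assumes both tables have identical column lists and that every column type supports equality comparison (json columns in 9.3, for instance, do not, and would need a cast to text).

    ```sql
    -- Rows present now that were not in the restored backup
    -- (inserted or changed since the backup was taken).
    SELECT * FROM public.tbl
    EXCEPT
    SELECT * FROM old_state.tbl;

    -- The reverse direction: rows from the backup that are gone or changed now.
    SELECT * FROM old_state.tbl
    EXCEPT
    SELECT * FROM public.tbl;
    ```

    A row that was updated shows up in both results, once with its new and once with its old values.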

    Typically, I add one or two timestamptz columns to important tables for when the row was inserted, and / or when it was updated the last time. That would be tremendously useful for you right now ...
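
    A minimal sketch of that setup, for future use. The column names created_at and updated_at and the function name trg_set_updated_at are illustrative choices, not anything prescribed; tbl is a placeholder. Adding the column first and the default afterwards leaves existing rows at NULL ("unknown") instead of stamping them all with the time of the ALTER TABLE.

    ```sql
    -- Insert timestamp: existing rows stay NULL, new rows get now().
    ALTER TABLE tbl ADD COLUMN created_at timestamptz;
    ALTER TABLE tbl ALTER COLUMN created_at SET DEFAULT now();

    -- Update timestamp, maintained by a trigger.
    ALTER TABLE tbl ADD COLUMN updated_at timestamptz;

    CREATE FUNCTION trg_set_updated_at()
      RETURNS trigger
      LANGUAGE plpgsql AS
    $$
    BEGIN
       NEW.updated_at := now();  -- start time of the current transaction
       RETURN NEW;
    END
    $$;

    CREATE TRIGGER set_updated_at
    BEFORE UPDATE ON tbl
    FOR EACH ROW EXECUTE PROCEDURE trg_set_updated_at();
    ```

    With these columns in place, finding recent changes becomes a plain filter such as WHERE updated_at > now() - interval '24 hours'.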

    What would also be great for you: the "temporal" features introduced in the SQL standard with SQL:2011. But that's not implemented in Postgres, yet.
    There's a page in the Postgres Wiki.
    There is also an unofficial extension on PGXN. I have not tested it and can't say how far along it is.

    Postgres 9.5 introduces a feature to record commit timestamps (like @Craig commented). It needs to be enabled manually before it starts recording. The manual:

    track_commit_timestamp (bool)

    Record commit time of transactions. This parameter can only be set in postgresql.conf file or on the server command line. The default value is off.

    And some functions to work with it.
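
    One of those functions is pg_xact_commit_timestamp(xid). The sketch below assumes Postgres 9.5 or later (so it does not help with the 9.3 instance in the question), a placeholder table tbl, and that track_commit_timestamp was already on when the transactions in question committed. For transactions committed before the setting was enabled, the function returns NULL, so those rows drop out of the result.

    ```sql
    -- Row versions written by transactions that committed in the last 24 hours.
    SELECT pg_xact_commit_timestamp(xmin) AS committed_at, *
    FROM   tbl
    WHERE  pg_xact_commit_timestamp(xmin) > now() - interval '24 hours'
    ORDER  BY committed_at DESC;
    ```

    Deleted rows are not visible this way, since their row versions no longer appear in a normal SELECT.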