postgresqldatabase-replicationlogical-replication

PostgreSQL logical replication - ignore pre-existing data


Imagine dropping a subscription and recreating it from scratch. Is it possible to ignore existing data during the first synchronization?

Creating a subscription with (copy_data=false) is not an option because I do want to copy data, I just don't want to copy already existing data.


Example: There is a users table and a corresponding publication on the master. This table has 1 million rows and every minute a new row is added. Then we drop the subscription for a day.

If we recreate the subscription with (copy_data=true), replication will not start due to a conflict with already existing data. If we specify (copy_data=false), 1440 new rows will be missing. How can we synchronize the publisher and the subscriber properly?


Solution

  • You cannot do that, because PostgreSQL has no way of telling when the data were added.

    You'd have to reconcile the tables by hand (or INSERT ... ON CONFLICT DO NOTHING).