I have two databases. One (DB A) is continiously fed with data from a measurement device and contains level 0 (raw) data. The other database (DB B) is an 'extended' database which contains the same data as DB A but also additional tables that contain processed data (serves also as a raw data backup). I now want to syncronize all new data from DB A to DB B without effecting the additional tables on DB B. I understand that dump and restore would delete the addtional files on DB B. How can I avoid this?
You can set things up so that it's happening continuously using Logical Replication: B would be following all events on A and mirroring all changes.
If that's a one-off task and you're only looking to copy the data over safely, pg_dump
and psql
can handle that without wiping anything. Assuming the table name and schema match on A and B and there's a primary key column:
pg_dump -h your_source_db_host -p your_source_db_port -U your_source_db_user \
--inserts --data-only --on-conflict-do-nothing \
--schema='your_source_table_schema' \
--table='your_source_table_schema.your_source_table_name' your_source_db_name \
|psql -h your_target_db_host -p your_target_db_port -U your_target_db_user \
-v ON_ERROR_STOP=1 your_target_db_name &>> your_source_table.clone.log &
It's only copying the data and attempting to insert it on B if it wasn't copied already, based on the primary key. Note that if you have the space, it might be way faster to use default copy
mode aimed at a new table on target, copying the full thing onto the target and resolve the new vs existing locally. Compared to copy
mode, insert
mode is pretty slow.
You can also link the source table to be visible on the target using a foreign data wrapper, postgres_fdw
. With that configured:
insert into local_table_on_B
select from table_on_A_visible_on_B AS a
--where id not in (select id from local_table_on_B)
--where not exists (select from local_table_on_B AS b where b.id=a.id)
--where created_at>(select max(created_at) from local_table_on_B)
on conflict do nothing;