postgresqlpg-restore

pg_restore -F d -j {n} internal steps and estimation of time


I'm running a looooong pg_restore process of a database with 70 tables and 800Gb. The process is taking 5 days now. I'm monitoring some aspects of the process to evaluate how long will it take but I've some things missing and this is why I'm asking.

I run pg_dump with parameters -F d -j 10 the dump took about 12 hours. I noticed each one of the 10 threads took responsibility of a single table from start to end. After ending of processing a single table, the same process (pid) started with another table not taken by another process.

Running pg_restore is taking much longer (5 days and still working). The main reason is that I'm restoring to a NAS external drive mounted using nfs and that drive is very slow compared to a local hard drive. This is NOT a problem, I'll migrate the information back from the NAS to the original hard drive once I format the hard drive again and install the new operating system.

I'm doing two things to monitor progress:

At this time, all processes are just doing ALTER TABLE and the overall used space almost matches the initial space, but the process does not ends (and it is taking 5 days now).

So I'm asking if someone has more experience and can tell me what pg_restore is doing with the ALTER_TABLE command and if there is any other mechanism to estimate how long will it take.

Thanks! Ignacio


Solution

  • The ALTER TABLE statements at the end of a pg_restore create primary and unique keys as well as foreign key constraints. They could also be attaching partitions, but that is normally very fast.

    Look into pg_stat_progress_create_index if you have a recent enough PostgreSQL version (you didn't say), then you can monitor the progress of primary and unique key indexes being created.