postgresqlamazon-rdsaws-dms

Postgres add PRIMARY KEY to partitioned very active table without lock


I have a Postgres 11 database that has a few partitioned tables missing primary keys.

We need to upgrade to newer versions of Postgres, and unfortunately in order to use AWS (DMS) Database Migration Service to do this, it requires primary keys on every table.

One of these tables has around 1,000-5,000 inserts every second, with around 10 million rows at any given time, and downtime really is not an option for us.

Is there some way that we could add primary keys to these partitioned tables, with the least amount of downtime, including locks or taking the actual database down?

High level thought would be to create the column on the table, backfill with UUID primary keys, then run command to set as primary key. From my understanding though, this will result in a lock on the table.

Are there any other options for adding the primary key with least amount of downtime? Or any other suggestions?

The thought behind using UUID is to avoid collision or any kind of waits, etc, from using sequential value. This will be larger by using UUID, so i'm open to any other suggestions as well. Thanks!

This is to try and help resolve the issue of slow queries after removing 500m rows from another question I posted here: Postgres very slow queries after deleting 500 million rows


Solution

  • So the way we were able to achieve this with very minimal downtime, was by doing this inside a transaction, creating a temporary table, and then dropping original and renaming tmp to the original:

    BEGIN;
    CREATE TABLE mytable_tmp (LIKE mytable) PARTITION BY LIST (locationid);
    ALTER TABLE mytable_tmp ADD COLUMN id BIGSERIAL NOT NULL;
    ALTER TABLE mytable_tmp ADD PRIMARY KEY(id, locationid);
    CREATE TABLE mytable_tmp_office PARTITION OF mytable_tmp FOR VALUES IN ('office');
    
    INSERT INTO mytable_tmp SELECT * FROM mytable;
    
    DROP TABLE mytable;
    
    ALTER TABLE mytable_tmp RENAME TO mytable;
    ALTER TABLE mytable_tmp_office RENAME TO mytable_office;
    COMMIT;
    

    This does NOT copy over the indexes which we manually added afterwards (as this added about 120 seconds extra). If you want to include indexes use this instead to create the temp table:

    CREATE TABLE mytable_tmp (LIKE mytable INCLUDING ALL) PARTITION BY LIST (locationid);
    

    Since the INSERT command took about 120 seconds to run, there was about that much time of inserts that were lost (since they did not exist when the command started), but if that is a concern to you, you could instead do an ALTER command to rename the original table to something like mytable_old and then copy over entries based on a timestamp that were inserted after your copy.