postgresqlduplicatessql-delete

Delete all records that violate new unqiue constraint


I have a table that has the following fields

----------------------------------
|  id  |  user_id   |   doc_id   |
----------------------------------

I want to create a new unique constraint to make sure that there are no repeat user_id and doc_id records. Aka a user can only be linked to a doc one time. That is simple enough.

ALTER TABLE mytable
    ADD CONSTRAINT uniquectm_const UNIQUE (user_id, doc_id);

The issue is I have records that currently violate that constraint. I was wondering if there is an easy way to query for those records or to tell postgres just delete anything that violates the constraint.


Solution

  • Identifying records that violate your new key:

    SELECT *
    FROM
       (
          SELECT id, user_id, doc_id
             , COUNT(*) OVER (PARTITION BY user_id, doc_id) as unique_check
          FROM mytable
       )
    WHERE unique_check > 1;
    

    Then you can figure out from those duplicates, which should be deleted and perform the delete.

    To my knowledge there is no other way to perform this since any automated "Delete any duplicates" command would leave the database engine to decide which of the two-or-more duplicate records to get rid of.

    If the entire record is a duplicate (all columns match) then you could just create a new table with your new unique constraint and do a INSERT INTO newtable SELECT DISTINCT * FROM oldtable but I'm betting that isn't the case.