I have a table (table_a) with columns: id, column_b, column_c, column_d, and archived. The id column is the primary key. The combination of column_b and column_c should be unique when archived = 0. If archived = 1, there are no restrictions, and duplicate records are allowed. This table is being inserted into and updated by multiple instances in our microservice architecture.
Before inserting a record, the code checks whether a record already exists for the given column_b and column_c values; if it does, it updates the record instead. However, sometimes while one instance is inserting a record, another instance tries to insert a record with the same column_b and column_c values, resulting in duplicate records.
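The check-then-insert pattern looks roughly like this (a simplified sketch with illustrative values; the two statements run separately, which is the race window):

-- Instance A and instance B can both run this check at the same time,
-- both see no row, and both proceed to the INSERT below.
SELECT id
FROM table_a
WHERE column_b = 2
  AND column_c = 2
  AND archived = 0;

-- Without a unique constraint, both inserts succeed -> duplicate records.
INSERT INTO table_a (column_b, column_c, column_d)
VALUES (2, 2, 2);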
To prevent this, I added the following unique partial index:
CREATE UNIQUE INDEX unique_column_b_column_c ON table_a (column_b, column_c) WHERE archived = 0;
I also modified the existing insert query. This change works fine in the Development environment, but I need to identify any potential issues that might arise from using database constraints and unique partial indexes in PostgreSQL. Additionally, I am unsure about the performance impact.
Old query:
INSERT INTO table_a (
column_b, column_c, column_d
)
VALUES (2, 2, 2)
RETURNING column_b;
New query:
INSERT INTO table_a (
column_b, column_c, column_d
)
VALUES (2, 2, 2)
ON CONFLICT (column_b, column_c) WHERE archived = 0 DO UPDATE SET
column_b = EXCLUDED.column_b,
column_c = EXCLUDED.column_c,
column_d = EXCLUDED.column_d
RETURNING column_b;
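As a quick sanity check of the intended behavior (values are illustrative):

-- Running the upsert twice leaves a single active row; the second run
-- takes the DO UPDATE path instead of inserting a duplicate.
INSERT INTO table_a (column_b, column_c, column_d)
VALUES (2, 2, 2)
ON CONFLICT (column_b, column_c) WHERE archived = 0 DO UPDATE
SET column_d = EXCLUDED.column_d
RETURNING id;

-- Archived rows have no entries in the partial index,
-- so duplicates among them remain allowed:
UPDATE table_a SET archived = 1
WHERE column_b = 2 AND column_c = 2;

INSERT INTO table_a (column_b, column_c, column_d)
VALUES (2, 2, 2);  -- succeeds even though an archived duplicate exists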
Your solution with the partial unique index and INSERT ... ON CONFLICT is correct. Repeating the index predicate (WHERE archived = 0) in the conflict target is what lets PostgreSQL infer the partial index as the arbiter. The additional index is not free (it will make inserting rows with archived = 0 more expensive), but that's the price you have to pay for consistency.
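If you want to quantify the cost, you can watch the index size and the insert plan; a rough sketch:

-- Only rows with archived = 0 get index entries, so the partial
-- index stays smaller than a full index on the same columns would be:
SELECT pg_size_pretty(pg_relation_size('unique_column_b_column_c'));

-- EXPLAIN ANALYZE actually executes the statement, so wrap it
-- in a transaction and roll it back:
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
INSERT INTO table_a (column_b, column_c, column_d)
VALUES (2, 2, 2)
ON CONFLICT (column_b, column_c) WHERE archived = 0 DO UPDATE
SET column_d = EXCLUDED.column_d;
ROLLBACK;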