database postgresql spring-boot sql-insert

Prevent Violation of Unique Index with Multiple Concurrent Inserts

I'm using a Postgres 9.5 database with a front of Spring Boot.

On the database end I have an event_attendee table that users can insert attendees into. In the table there is a unique index on person_id,event_id.

Table

create table production.event_attendee (
event_attendee_id bigint,
event_id bigint not null,
person_id bigint not null,
);

create unique index idx_event_attendee on production.event_attendee(event_id,person_id);

OK, so when a user selects a person and an event from the front-end and clicks "Add" it all works fine. I have a not exists on the insert to make sure it doesn't violate the unique index as in:

INSERT STATEMENT

insert into production.event_attendee(event_id,person_id)
select ?
       , ?
where not exists (
                 select 1
                 from production.event_attendee
                 where event_id = ?
                     and person_id = ?
                 )

So far, so good.

Now, when a user somehow double clicks the "Add" button the database tries to commit both queries and it results in a database error of a unique index violation.

Here is what I don't understand: Even if the user somehow double clicks the button, shouldn't the database not attempt to insert the 2nd statement once the first one has run?

Maybe not all INSERT statements are queued up in a single-file line waiting for the prior INSERT to COMMIT before starting the next one, however it seems odd to me the consistency with which this constraint violation appears to be coming up.

That said, I will now disable the button after clicking it one time so the user is not able to double click in any case. However I just don't understand why Postgres is not being smart enough regardless to not attempt to double insert when I specifically include a where not exists clause in the INSERT statement.

I would appreciate any technical data on how this could be the case, so I can better architecturally design my approach in further systems to prevent these errors. It may be something I am missing on concurrency, etc.

Solution

Explanation

PostgreSQL internally uses a transaction ID called xid for each transaction that starts writing. When running such a transaction, in order not to block other, read-only, transactions, PostgreSQL always presents selects the state of the database in the last commited xid (stated as "The intermediate states between the steps are not visible to other concurrent transactions" in the intro to transactions), without locking at first:
the insert makes the table enter the ROW EXCLUSIVE lock mode, which still allows selects (ACCESS SHARE mode) but still on this old snapshot xid.

Thus if your database starts at xid 99, the first transaction 's insert will create xid 100, but the second transaction's select will still be based on the snapshot at xid 99 without the new entry, thus not exists will be true, so it'll enter its insert stage, and there only enter ROW EXCLUSIVE lock mode, which will wait for trans 100 to finish before starting the insert, which then will fail on the unique index.

Solution

You could make PostgreSQL enter a lock mode as soon as before the select, so it has to wait for the other transaction to finish before looking for the presence of the new row;
select for update is not sufficient, as it only locks rows, not the table (and as the targeted row is not in there yet, it can't be selected for update). Thus you'll have to explicitely lock table production.event_attendee in share mode; before your insert … select query:
then your second transaction will halt on the lock until the first one has finished,
and thus the select will see the new row.

begin;
…
lock table production.event_attendee in share mode;
insert into production.event_attendee(event_id,person_id)
select ?
       , ?
where not exists (
                 select 1
                 from production.event_attendee
                 where event_id = ?
                     and person_id = ?
                 );
commit;

Reproducing

Here a small shell script to test the different modes on a test database:

#!/bin/sh

one()
{
    local v="$1" selmode lockmode
    case "$2" in
        "for update") selmode="$2" ;;
        "") true ;;
        *) lockmode="lock table test1 in $2 mode;" ;;
    esac
    {
        echo "begin;"
        echo "$lockmode"
        echo "insert into test1 (id, a) select 1, '$v' where not exists (select 1 from test1 where id = 1 $mode);"

        sleep 1

        echo "commit;"
    } | psql test
}

two()
{
    {
        echo "drop table if exists test1;"
        echo "create table test1 (id bigserial primary key, a text, t timestamp default clock_timestamp());"
    } | psql test

    one a "$@" & sleep .3 ; one b "$@"

    {
        echo "select * from test1 order by t;"
        echo "drop table test1;"
    } | psql test
}

two
two "for update"
two "share"