Why MariaDB Galera cluster thrown a deadlock instead of waiting?

I'm trying to move my application from a single server database to a Galera cluster, this application uses some FOR UPDATE locks.

Consider this case:

create table rooms ( id int not null, owner int null, PRIMARY KEY (`id`));
insert into rooms values (1, null);

I want to be sure to not overwrite room owner in case of concurrent queries. I will set the owner ID in the room only if the current owner is NULL.

1st TEST - SINGLE DATABASE SERVER

TRX1 - SERVER A

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
SELECT * FROM rooms WHERE id=1 FOR UPDATE;

TRX2 - SERVER A

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
SELECT * FROM rooms WHERE id=1 FOR UPDATE;

-> here the TRX2 is on hold waiting TRX1.

TRX1 - SERVER A

UPDATE rooms SET owner=1 WHERE id=1;
COMMIT;

-> here the TRX2 is released and the room is retrieved with the updated owner from TRX1. This is the current correct behaviour for a correct concurrency handling.

2st TEST - GALERA CLUSTER WITH 2 DATABASE SERVERS

TRX1 - SERVER A

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
SELECT * FROM rooms WHERE id=1 FOR UPDATE;

TRX2 - SERVER B

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
SELECT * FROM rooms WHERE id=1 FOR UPDATE;

-> here the TRX2 is NOT put on hold!

TRX1 - SERVER A

UPDATE rooms SET owner=1 WHERE id=1;
COMMIT;

TRX2 - SERVER B

UPDATE rooms SET owner=2 WHERE id=1;
COMMIT;

-> here the TRX2 fails with DEADLOCK error.

Is there any solution to let the second transaction waits the commit of the first trx even if it is running on a second clusetered server?

Solution

Galera is optimistic locking. This pattern assumes there are going to be no conflicts during the commit and if so, the deadlock is throws on the COMMIT SQL. The assumption of optimistic means that there is no pre-lock mechanism communicated over the cluster and the only communication happens on commit.

This also means there is no lock wait, or an early deadlock, if another Galera node as a transaction in progress for the same rows that are being updated locally.

This means there is, as observed, a difference in behaviour between local transaction and transactions across multiple nodes in a Galera cluster.

As the entire pattern is optimistic, while you may be able to avoid deadlocks in a local scenario by using a consistent order of acquiring locks, the same is not possible in a Galera cluster. Minimization techniques like having correct indexes that match the update query are effective in Galera as they are locally, so in the question asked, ensuring that id is a primary key or the first element of a secondary index.

Deadlock handling code must be done in a Galera cluster to handle conflicts. There are deadlocks as a exception because at the SQL cannot determine the logical course of action. Its up to the business logic to decide what happened in these cases. As this is a reservation system on rooms, the logical (as determined by me, a non-business case aware observer), is to give the error to the user they where too slow to book that room and present other options.

To avoid the deadlock you would need to make your application perform room booking on a predictable Galera node to get the same effects of a lock wait.

Note that select for update .. skip locked, a useful pattern in reservations that don't strictly depend on which row, also only applies locally.