mysqltransactionsinnodbrepeatable-read

Is it possible to do a Phantom read to a row someone just updated?


From the MySQL glossary:

phantom: A row that appears in the result set of a query, but not in the result set of an earlier query. For example, if a query is run twice within a transaction, and in the meantime, another transaction commits after inserting a new row or updating a row so that it matches the WHERE clause of the query.

Is the bolded part correct? If I have

CREATE TABLE  t1 (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `c1` varchar(45) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

and the isolation level is REPEATABLE READ and I do

mysql> start transaction;
mysql> SELECT * FROM t1 WHERE c1 < 10;
+----+------+
| id | c1   |
+----+------+
|  1 | 4    |
+----+------+
mysql> SELECT * FROM t1 WHERE c1 < 10;
+----+------+
| id | c1   |
+----+------+
|  1 | 4    |
+----+------+

I could sometimes get different result from the later query even if no one does any INSERTs but only UPDATEs? My MySQL version is 5.7.

SQL standard indicates phantom reads are related only to concurrent INSERTs although the word generate is a bit confusing. From the ISO/IEC 9075:1992, Database Language SQL- July 30, 1992 (Second Informal Review Draft):

P3 ("Phantom"): SQL-transaction T1 reads the set of rows N that satisfy some search condition. SQL-transaction T2 then executes SQL-statements that generate one or more rows that satisfy the search condition used by SQL-transaction T1. If SQL-transaction T1 then repeats the initial read with the same search condition, it obtains a different collection of rows.


Solution

  • InnoDB REPEATABLE-READ transaction isolation level prevents phantom rows, but only if your SELECT query is a non-locking query.

    Thus you can SELECT with the same query conditions multiple times during a transaction, and you're guaranteed to get the same result time after time, even while other sessions are inserting, updating, or deleting rows in ways that would affect your result set. As soon as you begin a new transaction, your query will see the changes to the rows that have been going on in the meantime.

    But InnoDB has an odd case: if you run a locking read query like one of the following:

    SELECT * FROM t1 WHERE c1 < 10 FOR UPDATE
    
    SELECT * FROM t1 WHERE c1 < 10 LOCK IN SHARE MODE
    
    SELECT * FROM t1 WHERE c1 < 10 FOR SHARE -- MySQL 8.0 syntax
    

    Then a SELECT will "see" the results of concurrent changes to data, as if your transaction had been started as a READ-COMMITTED transaction.

    You can even switch back and forth between locking read queries and non-locking read queries within the same REPEATABLE-READ transaction, and you'll see different result sets for each. So be aware of this if you use locking SELECT statements.

    I think the word "generate" in the excerpt you showed is meant to apply to either INSERT or UPDATE. They needed a term to apply for both cases, because I guess they didn't feel like writing a more clear phrase like "insert or update."