My MySQL version is 5.7 and I created a test table with 1,332,660 records:
CREATE TABLE `test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`data_name` varchar(500) DEFAULT NULL,
`data_time` varchar(100) DEFAULT NULL,
`data_value` decimal(50,8) DEFAULT NULL,
`data_code` varchar(100) DEFAULT NULL,
`create_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `idx_name_time_value` (`data_name`,`data_time`,`data_value`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
there is an index bydata_name
,data_time
,data_value
.
I have three SQL and there results :
explain select * from test where data_name ='abc' and data_time = '2022-06-15 00:00:00' and data_value=75.1
explain select * from test where data_time = '2022-06-15 00:00:00' and data_value=75.1
explain select data_name from test where data_time = '2022-06-15 00:00:00' and data_value=75.1
According to the leftmost prefix matching
the second sql does not use index, but the third one actually used an index.
My question is why can the thired sql use index, and why does a full table scan occur even when indexes are used (The seconde one's Explain result rows
is equal to the thired one's rows
).
Can someone help me ? Thans a lot!!!!!!
Query 3 -- Why it uses the index but is slow...
select data_name
from test
where data_time = '2022-06-15 00:00:00'
and data_value=75.1
needs 3 columns, and all are in a single index. So using that index is probably faster than scanning the table. This is called a "covering index". It is indicated in EXPLAIN
with Using index
. "Leftmost" is not relevant; "covering" is.
Still, it will have to scan the entire index, as indicated by "Rows" being approximately the number of rows in the table.
Your datetime string looks like MySQL's datetime; it would be better to declare the column DATETIME
instead of VARCHAR
.
Query 1 is likely to be fast since you are testing all 3 columns of the index with =
.
Query 2 is likely to be very slow since the columns in the WHERE
are not "leftmost".
select *
from test
where data_time = '2022-06-15 00:00:00'
and data_value=75.1
would benefit either of these:
INDEX(data_time, data_value)
INDEX(data_value, data_time)
Query 4:
where data_name ='abc'
and data_time = '2022-06-15 00:00:00'
and data_value >= 75
and data_value < 76
This will be fast because "leftmost" lets the index be used.
Query 5:
where data_name ='abc'
and data_time >= '2022-06-15'
and data_value = 75.1
This will be only somewhat fast because "leftmost" will stop with the data_time
. That is, it stopped after the first "range" test. Adding this would make query 5 fast and fully use the index:
INDEX(data_name, data_value, -- tested with "="
data_time) -- tested with "range"
Note that the order of columns in the index important; the order of tests in WHERE
is not.