Supposing I had a product database of some 50,000 products supplying data to a back end system and a website, some are live, some are archived and some are “switched off” as far as the website is concerned (available only in back end admin for some reason).
A query for the website might look something like this:
SELECT name, category, price FROM products WHERE category=‘1234’
(Obviously extremely simplified)
Now as mentioned above I only want those which are not archived and which are toggled to display on the website.
SELECT name, category, price FROM products WHERE category=‘1234’ AND display=true AND archived=false
This obviously will work.
I am deliberately not mentioning indexes. I know that in the above examples whether my ‘category’ column is indexed or not will make a big difference to query speed, but this is not my question.
Supposing I know that about half of the 50,000 products in this database are old news, archived items, my question is:
Is:
SELECT name, category, price FROM products WHERE archived=false AND category=‘1234’ AND display=true
A faster query than the query I wrote earlier?
My thinking being that if MySQL eliminated 25,000 products from the query immediately on ‘archived=false’ before even considering which category these products have, it might be faster (assuming an index on ‘archived’ of course)
Hence my title “Is MySQL Where Clause linear” - does it eliminate rows sequentially in order in the criteria of the WHERE clause as written?
As other answers have said, you should create indexes to optimize, not rely on the order of terms in your WHERE clause. MySQL's optimizer knows how to reorder the terms to match the order of columns in an index. In other words, MySQL knows that AND
is commutative.
But to answer your original question more directly: MySQL also knows how to shortcut boolean expressions.
Here's a demonstration: I filled a table with 512 rows, and set just a few rows to have display=true
:
mysql> select count(*) from mytable;
+----------+
| count(*) |
+----------+
| 512 |
+----------+
1 row in set (0.01 sec)
mysql> select count(*) from mytable where display = true;
+----------+
| count(*) |
+----------+
| 3 |
+----------+
1 row in set (0.03 sec)
There's no index on the display
column for this test. So the query will do a table-scan, examining every row.
Now I query with a boolean expression using the sleep()
function. If MySQL does not do short-cuts, it will evaluate the sleep()
for every row, and take 512 seconds. If it does short-cuts, it will evaluate sleep()
only for rows where the first term is true.
mysql> select count(*) from mytable where display = true and sleep(1);
+----------+
| count(*) |
+----------+
| 0 |
+----------+
1 row in set (3.01 sec)
Interesting - even if we reverse the order of terms, MySQL still short-cuts. Apparently, it knows to evaluate against row data before evaluating other expressions.
mysql> select count(*) from mytable where sleep(1) and display=true;
+----------+
| count(*) |
+----------+
| 0 |
+----------+
1 row in set (3.01 sec)
Without a term for the display=true, it just waits. I'm not going to let it run the full 512 seconds, but running SHOW PROCESSLIST shows that it's going to just keep running:
+----+-----------------+-----------+------+---------+--------+------------------------+---------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+-----------------+-----------+------+---------+--------+------------------------+---------------------------------------------+
| 9 | root | localhost | test | Query | 82 | User sleep | select count(*) from mytable where sleep(1) |
| 11 | root | localhost | NULL | Query | 0 | starting | show processlist |
+----+-----------------+-----------+------+---------+--------+------------------------+---------------------------------------------+