sqlmariadbquery-optimization

Select query appearing to not use the most appropriate index


I have a table with approximately 80 columns. Several of these are indexed. These include age, shoesize and height - all are integers.

I then have a query that I am trying to optimise:

select * from people where age > 18 and shoesize > 10 and height > 150

To try and speed the query up I have created a new combined index of the three columns in question - age, shoesize and height. I have called this index combinedIndex

If I then run:

explain select * from people where age > 18 and shoesize > 10 and height > 150

Then I see this output:

id = 1
select_type = SIMPLE
table = people
type = ALL
possible_keys = age, shoesize, height, combinedIndex
key = NULL
key_len = NULL
ref = NULL
rows = 8947017
extra = using where

So it appears that no index is being used when running my query.

I would have expected my combinedIndex to be used for sure.

I am using MariaDB 10.6.19

Where have I gone wrong? Thank you


Solution

  • A multicolumn BTREE index (here in MariaDB / MySQL and in other DBMSs using BTREE) has limited applicability for handling inequality predicates.

    Your multicolumn index is (age, shoesize, height). It handles this query, with equality predicates, just fine.

    SELECT *
      FROM people
     WHERE age = 18 AND shoesize = 10 AND height = 150
    

    And, it handles this one just fine, with the first two index columns being tested for equality and only the last one containing the inequality.

    SELECT *
      FROM people
     WHERE age = 18 AND shoesize = 10 AND height > 150
    

    That's because the query planner can random-access the index to the first entry meeting your criteria, then scan the index sequentially to get the rest of the entries it needs, and stop scanning when it has everything it needs.

    But your query contains three inequalities.

    SELECT *
      FROM people
     WHERE age > 18 AND shoesize > 10 AND height > 150
    

    For this to work, it can random access your index to the first entry with age > 18, but it then must examine everything in the rest of the index taking some entries and discarding others to handle the other two inequalities.

    MySQL's query planner no doubt compared the cost of doing this to the cost of just romping through your table examining rows and taking some. It decided romping through your table was cheaper, so that's what it did.

    Notice that to satisfy SELECT * it has to look at the actual rows of the table anyway.

    You might get the planner to use your index with this query because your index covers the table. The query planner can satisfy this query directly from the index without looking at the table.

    SELECT age, shoesize, height
      FROM people
     WHERE age > 18 AND shoesize > 10 AND height > 150