
Compound index with three keys, what happens if I query skipping the middle one?


With PostgreSQL, I want to use a compound index on three columns A, B, C. B is the created_at datetime, and occasionally I might query without B.

What happens if I create a compound index on (A, B, C) but then query with conditions on A and C, but not B? (That is, I want rows matching A and C over all time, not just a specific time range.)

Is Postgres smart enough to still use the (A, B, C) compound index but just skip B?


Solution

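  • Postgres can use non-leading columns in a multicolumn B-tree index, but in a far less efficient mode. With conditions on A and C only, the condition on A bounds the portion of the index that is scanned, while the condition on C is merely checked against each index entry in that portion; it does not narrow the scan.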

    If the first column is very selective (only a few rows per A), you will hardly notice a difference in performance, since either access method is cheap. The performance hit grows with the number of rows per A.

    For the case you describe, I suggest creating the index on (A, C, B) or (C, A, B). Just make sure B comes last. Alternatively, drop B from the index altogether, or create another index for use cases with B. This way you get the best performance for queries on (A, B, C) and on (A, C) alike.

    Unlike the sequence of columns in the index, the sequence of predicates in the query does not matter.

    We have discussed this in great detail on dba.SE.

    Note that for the case at hand it does not matter whether the index leads with (A, C) or (C, A).

    There are also some other considerations, but your question does not have all the relevant details.
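
As a minimal sketch of the suggested column order: the table name tbl, the index name, the lowercase column names a, b, c (standing in for A, B, C), the data types, and the literal values are all assumptions made for illustration, not details from the question.

```sql
-- Hypothetical table; a, b, c mirror the question's A, B, C.
CREATE TABLE tbl (
    a int         NOT NULL,
    b timestamptz NOT NULL,  -- stands in for created_at
    c int         NOT NULL
);

-- Suggested order: the two equality columns first, the timestamp last.
CREATE INDEX tbl_a_c_b_idx ON tbl (a, c, b);

-- Query without b: a and c form the leading prefix of the index.
EXPLAIN
SELECT * FROM tbl
WHERE  a = 1
AND    c = 2;

-- Query with a time range on b: all three index columns can be used.
EXPLAIN
SELECT * FROM tbl
WHERE  a = 1
AND    c = 2
AND    b >= '2024-01-01'
AND    b <  '2024-02-01';
```

Which plan the planner actually picks depends on table size and statistics, so it is worth running EXPLAIN (or EXPLAIN ANALYZE) against realistic data rather than an empty test table.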