I have two index :
I want to get all players belonging to an active team.
explain analyze
select p.*
from player p
inner join team t on t.id = p.team_id
where t.active is true;
This query do not use my index, it does a sequential scan on player table (10 000 rows).
explain analyze
select p.*
from player p
where p.team_id in (select id from team t where t.active is true);
Same result as the first query, it does a sequential scan on player table (10 000 rows) without using player_team_id_idx
index.
select t.id from team t where t.active is true;
There is only one active team, the result is 1
explain analyze
select p.*
from player p
where p.team_id in (1);
This query uses player_team_id_idx index which results in much faster execution time.
How can I force PostgreSQL to use player_team_id_idx for Query 1 and Query 2 ? (I'm using PostgreSQL 16.9)
Make sure to analyze
(and possibly vacuum
too), or let the autovacuum daemon do it:
ANALYZE
collects statistics about the contents of tables in the database, and stores the results in thepg_statistic
system catalog. Subsequently, the query planner uses these statistics to help determine the most efficient execution plans for queries.
Your indexes can be optimised but they are already useful and in my test, vacuum analyze
immediately fixes the problem, equipping the planner with the stats it needs to consider the index scans: demo at db<>fiddle
explain analyze
select p.*
from player p
inner join team t on t.id = p.team_id
where t.active is true;
QUERY PLAN |
---|
Hash Join (cost=512.00..2347.04 rows=40000 width=532) (actual time=2.466..19.350 rows=95 loops=1) |
Hash Cond: (p.team_id = t.id) |
-> Seq Scan on player p (cost=0.00..1625.00 rows=80000 width=532) (actual time=0.012..9.056 rows=80000 loops=1) |
-> Hash (cost=387.00..387.00 rows=10000 width=8) (actual time=2.050..2.052 rows=24 loops=1) |
Buckets: 16384 Batches: 1 Memory Usage: 129kB |
-> Seq Scan on team t (cost=0.00..387.00 rows=10000 width=8) (actual time=0.115..2.042 rows=24 loops=1) |
Filter: (active IS TRUE) |
Rows Removed by Filter: 19976 |
Planning Time: 0.598 ms |
Execution Time: 19.389 ms |
vacuum analyze team;
vacuum analyze player;
explain analyze
select p.*
from player p
inner join team t on t.id = p.team_id
where t.active is true;
QUERY PLAN |
---|
Nested Loop (cost=0.58..470.89 rows=96 width=49) (actual time=0.018..0.194 rows=95 loops=1) |
-> Index Scan using team_active_index on team t (cost=0.29..9.23 rows=24 width=8) (actual time=0.009..0.031 rows=24 loops=1) |
Index Cond: (active = true) |
-> Index Scan using player_team_id_index on player p (cost=0.29..19.20 rows=4 width=49) (actual time=0.003..0.006 rows=4 loops=24) |
Index Cond: (team_id = t.id) |
Planning Time: 0.446 ms |
Execution Time: 0.220 ms |
Consider a compound, multiple-key-column index create index on team(active,team_id);
, or a partial index create index on team(id)where(active);
and move t.active
from where
to join..on
condition list:
drop index team_active_index;
create index team_active_index on team(active,id);
vacuum analyze team;
explain analyze
select p.*
from player p
inner join team t on t.id = p.team_id and t.active;
QUERY PLAN |
---|
Nested Loop (cost=0.58..466.37 rows=96 width=49) (actual time=0.019..0.176 rows=95 loops=1) |
-> Index Only Scan using team_active_index on team t (cost=0.29..4.71 rows=24 width=8) (actual time=0.009..0.013 rows=24 loops=1) |
Index Cond: (active = true) |
Heap Fetches: 0 |
-> Index Scan using player_team_id_index on player p (cost=0.29..19.20 rows=4 width=49) (actual time=0.003..0.006 rows=4 loops=24) |
Index Cond: (team_id = t.id) |
Planning Time: 0.260 ms |
Execution Time: 0.216 ms |
Note the partial index doesn't need to use (active is TRUE)
as a filter or an index scan condition, because the db knows from the predicate all tuples in this index meet that condition. This index is also smaller.
drop index team_active_index;
create unique index team_active_index on team(id)where(active);
vacuum analyze team;
explain analyze
select p.*
from player p
inner join team t on t.id = p.team_id and t.active;
QUERY PLAN |
---|
Nested Loop (cost=0.43..470.16 rows=96 width=49) (actual time=0.014..0.161 rows=95 loops=1) |
-> Index Only Scan using team_active_index on team t (cost=0.14..8.50 rows=24 width=8) (actual time=0.005..0.008 rows=24 loops=1) |
Heap Fetches: 0 |
-> Index Scan using player_team_id_index on player p (cost=0.29..19.20 rows=4 width=49) (actual time=0.003..0.005 rows=4 loops=24) |
Index Cond: (team_id = t.id) |
Planning Time: 0.260 ms |
Execution Time: 0.184 ms |