I have a query that filters rows by comparing their insertion timestamps against a point five minutes in the past.
This field never gets updated; you can think of it as immutable if that helps.
CREATE TABLE events (
id serial PRIMARY KEY,
inserted_at timestamp without time zone DEFAULT now() NOT NULL
);
SELECT *
FROM events e
WHERE e.inserted_at >= (now() - '5 minutes'::interval);
And the EXPLAIN ANALYZE VERBOSE output:
Seq Scan on public.events e (cost=0.00..459.00 rows=57 width=12) (actual time=0.738..33.127 rows=56 loops=1)
Output: id, inserted_at
Filter: (e.inserted_at >= (now() - '5 minutes'::interval))
Rows Removed by Filter: 19944
Planning time: 0.156 ms
Execution time: 33.180 ms
It seems PostgreSQL performs a sequential scan over the table to evaluate the filter, which is where the cost goes.
Is there a way to create a B-tree partial index, or anything else I can do, to optimize this query?
A partial index on the last 5 minutes will need to be rebuilt periodically. You can build it concurrently (since your relation is under heavy use) from cron, dropping the old indexes as you go. Such an approach would give you faster selects on the most recently inserted data, of course, but keep in mind that at least every 5 minutes you have to rescan the table to build the short partial index.
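For illustration, a rough sketch of one such rebuild cycle (the index names and the timestamp literal are made up; now() is not IMMUTABLE and so cannot appear in an index predicate, which is exactly why the cutoff has to be baked in as a constant and the index rebuilt from cron):

-- Sketch only: index names and the cutoff literal are hypothetical.
-- The predicate must be a constant, hence the periodic rebuild.
CREATE INDEX CONCURRENTLY events_inserted_at_recent_new
    ON events (inserted_at)
    WHERE inserted_at >= '2017-01-01 12:00:00';

-- once the new index is ready, drop the previous generation
DROP INDEX CONCURRENTLY IF EXISTS events_inserted_at_recent_old;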
The workaround is arithmetic: you can split the index build into stages (as a function):
SELECT now() - inserted_at >= '5 minutes'::interval   -- true if this row is already older than 5 minutes
FROM events
WHERE id > (currval('events_id_seq') - 5 * (1000000 / 30));  -- last id minus the estimated 5-minute insert count
That is, start from an id that sits below the last sequence value by roughly the number of rows inserted in the last 5 minutes.
If the result is true, build the index with a dynamic query using the same arithmetic; if not, enlarge the step and try again, as sketched below.
This way you only scan the primary key to build the index on the timestamp, which is much cheaper.
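A rough sketch of that staged build as a DO block. The index name is made up, the 5 * (1000000 / 30) figure is the same rough estimate of rows inserted per 5 minutes as above, and last_value from the sequence is used instead of currval(), since currval() only works after a nextval() in the same session:

DO $$
DECLARE
    step       bigint := 5 * (1000000 / 30);  -- assumed 5-minute insert count
    cutoff     bigint;
    old_enough boolean;
BEGIN
    SELECT last_value - step INTO cutoff FROM events_id_seq;
    LOOP
        -- is the oldest row above the cutoff already older than 5 minutes?
        SELECT now() - inserted_at >= '5 minutes'::interval
          INTO old_enough
          FROM events
         WHERE id > cutoff
         ORDER BY id
         LIMIT 1;
        EXIT WHEN old_enough OR cutoff <= 0;
        cutoff := cutoff - step;  -- estimate was too small: enlarge the step and retry
    END LOOP;
    -- plain CREATE INDEX here, because CONCURRENTLY cannot run inside a DO block
    EXECUTE format(
        'CREATE INDEX events_recent_idx ON events (inserted_at) WHERE id > %s',
        cutoff);
END $$;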
Another point: if you apply such calculations, you might not need a partial index at all.
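For example (a sketch reusing the same rough estimate, again with last_value in place of currval()), the existing primary-key index can drive the scan, with the timestamp check left only as a cheap re-filter on the handful of rows that survive the id cutoff:

SELECT *
FROM events e
WHERE e.id > (SELECT last_value FROM events_id_seq) - 5 * (1000000 / 30)  -- uses the PK index
  AND e.inserted_at >= (now() - '5 minutes'::interval);                   -- trims the overshoot from the estimate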