postgresqlindexingquerying

SQL query using index on timestamp with timezone column


PostgreSQl I have table like this:

create table if not exists some_table
               (
        some_id         varchar(40)              not null,      
        
        created_at      timestamp with time zone not null,
        
        constraint some_pkey
            primary key (some_id)

next step i created index on created_at field

create index concurrently if not exists some_table__created_at
        on statement (created_at);

i can't create partial index on created_at with condition because this condition includes mutable values

if i use query like this:

EXPLAIN ANALYSE
select t1.created_at
FROM some_table t1
where t1.created_at < '2023-06-19 10:17:20.830627+00:00';

index scan is working

Index Only Scan using statement__created_at on statement t1  (cost=0.42..5991.71 rows=210245 width=8) (actual time=0.016..27.510 rows=210277 loops=1)
  Index Cond: (created_at < '2023-06-19 10:17:20.830627+00'::timestamp with time zone)
  Heap Fetches: 0
Planning Time: 0.098 ms
Execution Time: 37.884 ms

But i want to get field some_id with using index scan

EXPLAIN ANALYSE
select t1.some_id
FROM statement t1
where t1.created_at < '2023-06-19 10:17:20.830627+00:00';

and here i get

Seq Scan on statement t1  (cost=0.00..6535.50 rows=210245 width=37) (actual time=0.007..33.904 rows=210277 loops=1)
  Filter: (created_at < '2023-06-19 10:17:20.830627+00'::timestamp with time zone)
  Rows Removed by Filter: 2723
Planning Time: 0.099 ms
Execution Time: 44.463 ms

I tried to use index on 2 columns(some_id and created). It didn't help me.

Maybe i should use not btree index or something else?


Solution

  • Why is there no index-only-scan when you select the some_id field? Because an index-only-scan is performed when all the data required can be fetched from the index and so the actual table has not to be visited at all. When you select the some_id field the actual table has to be visited anyways to fetch the values for that field.

    In that scenario it depends on the selectivity of your where clause. If the planner determines that your where clause has a bad selectivity, i.e. it will select most of the rows, then it makes sense to do a sequential table scan even if an index is available because there is no point in visiting the index first if afterwards almost all of the table has to be loaded anyways to get the data from the matching rows.

    To verify this behaviour make sure you have decent amount of rows in the table (a million maybe?) and use a timestamp in your where clause that matches only a small portion of those. You should see a query plan that uses the index instead of doing a sequential scan, although it will not be an index-only-scan.