postgresqlindexingquery-optimization

How to use index to search if trim is used in search value


My Postgres 17 database has a large table with 245 columns containing an indexed column ribakood:

CREATE TABLE firma2.toode
(
...
    ribakood character(20) COLLATE pg_catalog."default",
...
);

CREATE INDEX toode_ribakood_idx
    ON firma2.toode 
    (ribakood COLLATE pg_catalog."default" ASC NULLS LAST);

The query with trim():

explain analyze select toode,ostuhind, nimetus, pangateen
  from firma2.toode where ribakood=TRIM('TESTTOODE/H ')

does not use the index:

Gather  (cost=1000.00..575155.04 rows=4927 width=114) (actual time=101.341..2257.639 rows=1 loops=1)
  Workers Planned: 3
  Workers Launched: 0
  ->  Parallel Seq Scan on toode (cost=0.00..573662.34 rows=1589 width=114) (actual time=101.186..2257.436 rows=1 loops=1)
        Filter: ((ribakood)::text = 'TESTTOODE/H'::text)
        Rows Removed by Filter: 986481
Planning Time: 0.098 ms
Execution Time: 2257.653 ms

The same query without trim():

explain analyze select toode,ostuhind, nimetus, pangateen
  from firma2.toode where ribakood='TESTTOODE/H'

uses the index:

Index Scan using toode_ribakood_idx on toode  (cost=0.42..12.45 rows=2 width=114) (actual time=0.475..0.477 rows=1 loops=1)
  Index Cond: (ribakood = 'TESTTOODE/H'::bpchar)
Planning Time: 0.147 ms
Execution Time: 0.490 ms

Postgres casts the ribakood column to type text and cannot find the index. How to force the first query with trim() to use the index?

Using: PostgreSQL 17.0 on x86_64-windows, compiled by msvc-19.41.34120, 64-bit


Solution

  • Calling trim() on the target value is clearly a bit odd - at the very least ltrim() probably makes more sense. Since there is no ltrim() defined on the fixed-width character type obviously it will end up as text, so to use an index you just want to cast the result back.

    With the test-table I've got here, the planner has decided it lies a bitmap index scan, but the principle is still good.

    => EXPLAIN ANALYSE SELECT * FROM tt WHERE c = 'row 5                   ';
    ┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
    │                                                      QUERY PLAN                                                      │
    ├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
    │ Bitmap Heap Scan on tt  (cost=139.18..7153.38 rows=5000 width=88) (actual time=0.026..0.026 rows=1 loops=1)          │
    │   Recheck Cond: (c = 'row 5                   '::bpchar)                                                             │
    │   Heap Blocks: exact=1                                                                                               │
    │   ->  Bitmap Index Scan on tt_c_idx  (cost=0.00..137.93 rows=5000 width=0) (actual time=0.023..0.023 rows=1 loops=1) │
    │         Index Cond: (c = 'row 5                   '::bpchar)                                                         │
    │ Planning Time: 0.078 ms                                                                                              │
    │ Execution Time: 0.045 ms                                                                                             │
    └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
    
    => EXPLAIN ANALYSE SELECT * FROM tt WHERE c = ltrim('row 5  ')::character(20);
    ┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
    │                                                      QUERY PLAN                                                      │
    ├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
    │ Bitmap Heap Scan on tt  (cost=139.18..7153.38 rows=5000 width=88) (actual time=0.040..0.041 rows=1 loops=1)          │
    │   Recheck Cond: (c = 'row 5               '::character(20))                                                          │
    │   Heap Blocks: exact=1                                                                                               │
    │   ->  Bitmap Index Scan on tt_c_idx  (cost=0.00..137.93 rows=5000 width=0) (actual time=0.031..0.031 rows=1 loops=1) │
    │         Index Cond: (c = 'row 5               '::character(20))                                                      │
    │ Planning Time: 0.129 ms                                                                                              │
    │ Execution Time: 0.051 ms                                                                                             │
    └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
    

    For people who might come across this and aren't the original poster, just don't use the fixed-width character type at all ever. It is almost always the wrong choice.