postgresqlindexing

Postgres doesn't use my indexes if there's an OR condition between 2 indexed ILIKE conditions


We have the following indexes for our Client and Warehouse tables:

CREATE UNIQUE INDEX "idx_49921_PK__PERSONA" ON public.persona USING btree ("ID");

CREATE INDEX persona_full_name ON public.persona USING gin (((("NOMBREPERSONA" || ' '::text) || "APELLIDOPERSONA")) gin_trgm_ops, "EMAILPERSONA" gin_trgm_ops);

CREATE UNIQUE INDEX "idx_49740_PK_BODEGA" ON public.warehouse USING btree (id);

CREATE INDEX warehouse_name ON public.warehouse USING gin (name gin_trgm_ops);

And if I do the following query:

explain analyze select *
from
    package as "Package"
inner join persona as "Persona" on
    "Package"."customer_id" = "Persona"."ID"
left outer join "warehouse" as "Warehouse" on
    "Package"."IDBODEGA" = "Warehouse"."id"
where
    "Warehouse"."name" ILIKE '%test name%' or
    "Persona"."EMAILPERSONA" ILIKE '%test name%' or
    "Persona"."NOMBREPERSONA" || ' ' || "Persona"."APELLIDOPERSONA" ILIKE '%test name%'

It returns this:

Gather  (cost=19659.56..66121.41 rows=250 width=261) (actual time=1526.310..1680.903 rows=0 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Hash Left Join  (cost=18659.56..65096.41 rows=104 width=261) (actual time=1502.623..1502.714 rows=0 loops=3)
        Hash Cond: ("Package"."IDBODEGA" = "Warehouse".id)
        Filter: (("Warehouse".name ~~* '%test name%'::text) OR ("Persona"."EMAILPERSONA" ~~* '%test name%'::text) OR ((("Persona"."NOMBREPERSONA" || ' '::text) || "Persona"."APELLIDOPERSONA") ~~* '%test name%'::text))
        Rows Removed by Filter: 282885
        ->  Parallel Hash Join  (cost=18530.66..64038.97 rows=353608 width=196) (actual time=309.438..540.409 rows=282885 loops=3)
              Hash Cond: ("Package".customer_id = "Persona"."ID")
              ->  Parallel Seq Scan on package "Package"  (cost=0.00..26217.08 rows=353608 width=154) (actual time=0.284..103.231 rows=282886 loops=3)
              ->  Parallel Hash  (cost=12532.18..12532.18 rows=281718 width=46) (actual time=91.819..91.819 rows=225375 loops=3)
                    Buckets: 65536  Batches: 16  Memory Usage: 3968kB
                    ->  Parallel Seq Scan on persona "Persona"  (cost=0.00..12532.18 rows=281718 width=46) (actual time=0.103..35.280 rows=225375 loops=3)
        ->  Hash  (cost=79.51..79.51 rows=3951 width=37) (actual time=1.216..1.216 rows=3951 loops=3)
              Buckets: 4096  Batches: 1  Memory Usage: 299kB
              ->  Seq Scan on warehouse "Warehouse"  (cost=0.00..79.51 rows=3951 width=37) (actual time=0.213..0.700 rows=3951 loops=3)
Planning Time: 0.998 ms
Execution Time: 1681.071 ms

So it's not using either index.

But if I do this condition alone (while keeping the rest of the query):

where "Warehouse"."name" ILIKE '%test name%'

It returns this:

Gather  (cost=1064.45..28219.84 rows=34 width=261) (actual time=37.400..40.770 rows=0 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Nested Loop  (cost=64.45..27216.44 rows=14 width=261) (actual time=0.318..0.319 rows=0 loops=3)
        ->  Hash Join  (cost=64.03..27209.48 rows=14 width=191) (actual time=0.317..0.318 rows=0 loops=3)
              Hash Cond: ("Package"."IDBODEGA" = "Warehouse".id)
              ->  Parallel Seq Scan on package "Package"  (cost=0.00..26217.08 rows=353608 width=154) (actual time=0.196..0.196 rows=1 loops=3)
              ->  Hash  (cost=64.02..64.02 rows=1 width=37) (actual time=0.062..0.062 rows=0 loops=3)
                    Buckets: 1024  Batches: 1  Memory Usage: 8kB
                    ->  Bitmap Heap Scan on warehouse "Warehouse"  (cost=60.00..64.02 rows=1 width=37) (actual time=0.061..0.062 rows=0 loops=3)
                          Recheck Cond: (name ~~* '%test name%'::text)
                          ->  Bitmap Index Scan on warehouse_name  (cost=0.00..60.00 rows=1 width=0) (actual time=0.060..0.060 rows=0 loops=3)
                                Index Cond: (name ~~* '%test name%'::text)
        ->  Index Scan using "idx_49921_PK__PERSONA" on persona "Persona"  (cost=0.42..0.50 rows=1 width=37) (never executed)
              Index Cond: ("ID" = "Package".customer_id)
Planning Time: 0.589 ms
Execution Time: 40.821 ms

It uses the warehouse_name index, and the query takes much less time.

Similarly, if I just do:

where
    "Persona"."EMAILPERSONA" ILIKE '%test name%'
    OR "Persona"."NOMBREPERSONA" || ' ' || "Persona"."APELLIDOPERSONA" ILIKE '%test name%'

It returns:

Gather  (cost=1666.98..28849.55 rows=167 width=261) (actual time=38.049..41.531 rows=0 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Nested Loop Left Join  (cost=666.98..27832.85 rows=70 width=261) (actual time=1.846..1.848 rows=0 loops=3)
        ->  Hash Join  (cost=666.70..27812.01 rows=70 width=187) (actual time=1.845..1.847 rows=0 loops=3)
              Hash Cond: ("Package".customer_id = "Persona"."ID")
              ->  Parallel Seq Scan on package "Package"  (cost=0.00..26217.08 rows=353608 width=154) (actual time=0.171..0.171 rows=1 loops=3)
              ->  Hash  (cost=665.04..665.04 rows=133 width=37) (actual time=1.621..1.622 rows=0 loops=3)
                    Buckets: 1024  Batches: 1  Memory Usage: 8kB
                    ->  Bitmap Heap Scan on persona "Persona"  (cost=177.06..665.04 rows=133 width=37) (actual time=1.621..1.621 rows=0 loops=3)
                          Recheck Cond: (("EMAILPERSONA" ~~* '%test name%'::text) OR ((("NOMBREPERSONA" || ' '::text) || "APELLIDOPERSONA") ~~* '%test name%'::text))
                          ->  BitmapOr  (cost=177.06..177.06 rows=133 width=0) (actual time=1.619..1.620 rows=0 loops=3)
                                ->  Bitmap Index Scan on persona_full_name  (cost=0.00..88.49 rows=65 width=0) (actual time=0.981..0.981 rows=0 loops=3)
                                      Index Cond: ("EMAILPERSONA" ~~* '%test name%'::text)
                                ->  Bitmap Index Scan on persona_full_name  (cost=0.00..88.51 rows=68 width=0) (actual time=0.637..0.637 rows=0 loops=3)
                                      Index Cond: ((("NOMBREPERSONA" || ' '::text) || "APELLIDOPERSONA") ~~* '%test name%'::text)
        ->  Index Scan using "idx_49740_PK_BODEGA" on warehouse "Warehouse"  (cost=0.28..0.30 rows=1 width=37) (never executed)
              Index Cond: (id = "Package"."IDBODEGA")
Planning Time: 0.541 ms
Execution Time: 41.705 ms

So it also uses the index.

The problem is when I try to use both indexes at the same time.

Is there anything I can do here? The original query is much longer and takes almost 3 seconds, I made this query trying to isolate this issue.


Solution

  • There is no mechanism for doing a BitmapOr combining bitmaps across different tables.

    You can break it into two queries where each one only references one table in the ILIKE condition, and then combine them into one query with UNION or UNION ALL:
    demo at db<>fiddle

    explain analyze select *
    from
        package as "Package"
    join persona as "Persona" on
        "Package"."customer_id" = "Persona"."ID"
    left join "warehouse" as "Warehouse" on
        "Package"."IDBODEGA" = "Warehouse"."id"
    where
        "Warehouse"."name" ILIKE '%test name%'
    union distinct 
    select *
    from
        package as "Package"
    inner join persona as "Persona" on
        "Package"."customer_id" = "Persona"."ID"
    left outer join "warehouse" as "Warehouse" on
        "Package"."IDBODEGA" = "Warehouse"."id"
    where
        "Persona"."EMAILPERSONA" ILIKE '%test name%' or
        "Persona"."NOMBREPERSONA" || ' ' || "Persona"."APELLIDOPERSONA" ILIKE '%test name%'
    

    You would need to ensure that it handles duplicates in a manner you find acceptable.