postgresqlquery-planner

Match any value from a long list with a postgres GIN index


The schema:

# Table name: foos
#
#  id       :integer  not null, primary key
#  bar_ids  :integer  is an Array
#
# Indexes
#
#  index_foos_on_bar_ids    (bar_ids) USING gin

The query:

SELECT * FROM foos
WHERE (
  bar_ids && '{
    28151247,
    17295392,
    …300 more bar_ids…,
    29368568,
    45191356
  }')

This results in a very slow sequential scan. I tried rewriting as a series of ORs, but that just gave me a parallel sequential scan. Splitting it up into smaller queries works—it can do up to about 70 terms before switching to an inefficient plan–but that requires running many more queries. Is there a way I can write the query to get a more efficient plan?


Solution

  • The best solution I found was to temporarily disable sequential scans:

    SET LOCAL enable_seqscan = 'off';

    That should last just for the connection. You can check it with SHOW enable_seqscan;.

    That forced the planner to take advantage of the index_foos_on_bar_ids index and run a much faster query.

    Another alternative, which I wouldn't recommend as much because it's more complicated and less semantic, is splitting the one big query up into many smaller queries, all of them below the threshold for using a sequential scan, and using UNION to merge them back into one query.