I have a table of events
where each event has a title
and description
. Searches should search both columns:
title: Dick(ens) and Jane
description: Meet weekly to discuss classic books!
Given the above, I would like a query of book club
to match.
My index:
CREATE INDEX evsearch_idx on events using gist((title || ' ' || description) gist_trgm_ops)
My query:
SELECT * FROM events WHERE title || ' ' || description ILIKE '%book club%'
I think the problem is that I'm fundamentally misunderstanding how an index works and therefore it's attempting to match the entire string book club
to various three-letter combinations but I'm not sure how to fix this.
ILIKE
will always try to match the whole string. For example, to match ILIKE '%book club%'
, a string needs to include the whole string book club
.
If you want to use the similarity matching potential of pg_trm
, you need to use the matching operators for which it was designed. Taken from the doc:
%
returns true if its arguments have a similarity that is greater than the current similarity threshold set by pg_trgm.similarity_threshold.<%
returns true if the similarity between the trigram set in the first argument and a continuous extent of an ordered trigram set in the second argument is greater than the current word similarity threshold set by pg_trgm.word_similarity_threshold parameter.Side note #1: Make it case insensitive by adding lower(...)
in your index definition and in your queries.
Side note #2: title || ' ' || description
will return NULL
if title
or description
is NULL. If you want to guard against one of the 2 values being NULL, create the index on COALESCE(title, '') || ' ' || COALESCE(description, '')
Side note #3: pg_trgm can only do so much. For really complex queries, you might want to look into full text search