I have a Postgres database with around 1.5 million records. In my Ruby on Rails app, I need to search the statement_text
field (which can contain anywhere from 1 to hundreds of words).
My problem: I know I can use the pgSearch
gem to create scopes like search_all_words
or search_any_words
, but I'm uncertain what is the most efficient way to ensure only records with the exact match are returned in the result set.
That is, if I search "Pope Francis", I want it to find only those two words when they're consecutive and in the same order (as opposed to, say, "The pope is named Francis").
So far, I've just combined a GIN index with ILIKE
for exact match searches. But given that a GIN index essentially works by storing the exact position of a word in every record, shouldn't there be a more efficient (non-ILIKE
) way of ensuring that the search term is an exact match with the field?
Generally speaking, full-text requires word stemming based on language dictionary used, so with With Full-Text search you can use ts_rank()
function without stemming and with 'simple'
dictionary to determine the relevance of the phrase you are searching for.
WITH t(v) AS ( VALUES
('Test sentence with Pope Francis'),
('Test Francis sentence with Pope '),
('The pope is named Francis')
)
SELECT v,ts_rank(tsv,q) as rank
FROM t,
to_tsvector('simple',v) as tsv,
plainto_tsquery('simple','Pope Francis') AS q;
Result:
v | rank
----------------------------------+-----------
Test sentence with Pope Francis | 0.0991032
Test Francis sentence with Pope | 0.0973585
The pope is named Francis | 0.0973585
(3 rows)
Without full-text search, you can implement just faster ILIKE
pattern matching with pg_trgm
extension. Example is here.