sqlpostgresqlfull-text-search

PostgreSQL: Full Text Search - How to search partial words?


Following a question posted here about how I can increase the speed on one of my SQL Search methods, I was advised to update my table to make use of Full Text Search. This is what I have now done, using Gist indexes to make searching faster. On some of the "plain" queries I have noticed a marked increase which I am very happy about.

However, I am having difficulty in searching for partial words. For example I have several records that contain the word Squire (454) and I have several records that contain Squirrel (173). Now if I search for Squire it only returns the 454 records but I also want it to return the Squirrel records as well.

My query looks like this

SELECT title 
FROM movies 
WHERE vectors @@ to_tsoquery('squire');

I thought I could do to_tsquery('squire%') but that does not work.
How do I get it to search for partial matches ?

Also, in my database I have records that are movies and others that are just TV Shows. These are differentiated by the "" over the name, so like "Munsters" is a TV Show, whereas The Munsters is the film of the show. What I want to be able to do is search for just the TV Show AND just the movies. Any idea on how I can achieve this ?

Regards Anthoni


Solution

  • Even using LIKE you will not be able to get 'squirrel' from squire% because 'squirrel' has two 'r's. To get Squire and Squirrel you could run the following query:

    SELECT title FROM movies WHERE vectors @@ to_tsquery('squire|squirrel');
    

    To differentiate between movies and tv shows you should add a column to your database. However, there are many ways to skin this cat. You could use a sub-query to force postgres to first find the movies matching 'squire' and 'squirrel' and then search that subset to find titles that begin with a '"'. It is possible to create indexes for use in LIKE '"%...' searches.

    Without exploring other indexing possibilities you could also run these - mess around with them to find which is fastest:

    SELECT title 
    FROM (
       SELECT * 
       FROM movies 
       WHERE vectors @@ to_tsquery('squire|squirrel')
    ) t
    WHERE title ILIKE '"%';
    

    or

    SELECT title 
    FROM movies 
    WHERE vectors @@ to_tsquery('squire|squirrel') 
      AND title ILIKE '"%';