I'm using postgres' tsquery function to search in a field that might contain letters in multiple languages and numbers. it seems that in every case the search works up to a part of the searched phrase and stops working until you write the full phrase.
for example: searching for the name '15339' outputs the right row when the search term is '15339' but if it's '153' it won't.
searching for Al-Alamya, if the term is 'al-' it will work and return the row, but adding letters after that, for example, 'al-alam' won't return it until I finish writing the full name ('Al-Alamya').
my query:
SELECT *
FROM (SELECT DISTINCT ON ("consumer_api_spot"."id") "consumer_api_spot"."id",
"consumer_api_spot"."name",
FROM "consumer_api_spot"
INNER JOIN "consumer_api_account" ON ("consumer_api_spot"."account_id" = "consumer_api_account"."id")
INNER JOIN "users_user" ON ("consumer_api_account"."id" = "users_user"."account_id")
WHERE (
users_user.id = 53 AND consumer_api_spot.active
AND
"consumer_api_spot"."vectorized_name" @@ tsquery('153')
)
GROUP BY "consumer_api_spot"."id"
) AS "Q"
LIMIT 50 OFFSET 0
If you check the documentation, you'll find more information about what you can specify as a tsquery
. They support grouping, combining using boolean operations, and also prefixing which is probably something you want. An example from the docs:
Also, lexemes in a
tsquery
can be labeled with*
to specify prefix matching:SELECT 'super:*'::tsquery;
This query will match any word in a
tsvector
that begins with “super”.
So in your query you should modify the part of tsquery('153')
to tsquery('153:*')
.
Btw. I don't know exactly how you constructed your database schema, but you can add a tsvector
index for a column using a GIN index. I will assume that you generate the "consumer_api_spot"."vectorized_name"
column from a "consumer_api_spot"."name"
column. If that's the case you can create a tsvector
index for that column like this:
CREATE INDEX gin_name on consumer_api_spot using gin (to_tsvector('english',name))
And then you could change this query:
"consumer_api_spot"."vectorized_name" @@ tsquery('153')
into this:
to_tsvector('english', "consumer_api_spot"."name") @@ to_tsquery('english', '153:*')
and get a potential speed benefit, because the query would utilize an index.
Note about the 'english'
: You cannot omit the language, when creating the index, but it won't have an effect on queries in other languages, or queries with numbers. However, be careful, the language must be the same for creating the index and performing the query to enable PostgreSQL to use the index.