I have a list of articles and I want to do a search using the PostgreSQL SearchQuery and SearchRank functionality. Here is the pseudo code:
from django.contrib.postgres.search import SearchVector, SearchQuery, SearchRank
from .models import Article
vector = SearchVector('title', weight='A')
query = SearchQuery(value)
results = Article.objects.annotate(rank=SearchRank(vector, query, cover_density=True)).order_by('-rank')
for r in results:
    print(r.rank, r.title)
For example, if I search for only "computer", this is what gets printed:
1 The most powerful computer in the world
0 This man built a plane for his family in his garden
0 Dogs can sense earthquakes before they happen
As you can see, it all works as expected: the article that contains "computer" in the title gets a rank of 1.
But if I search for two words, "fast computer", every result shows a rank of 0:
0 Dogs can sense earthquakes before they happen
0 The most powerful computer in the world
0 This man built a plane for his family in his garden
According to the "SearchQuery" documentation:
If search_type is 'plain', which is the default, the terms are treated as separate keywords
So why is it still not matching the article with "computer" in the title?
I also tried without using "cover_density=True" and I get similar results.
How can I get the results to still match my search query even if only one word matches?
UPDATE:
Thanks to @gitaarik for pointing me in the right direction. I've been able to do it in two ways.
---- METHOD A ----
The first way is formatting the words as a RAW search string and changing the "search_type" to "raw".
query_split = value.split(' ')
search_string = ' | '.join([f"'{q}'" for q in query_split])
vector = SearchVector('title', weight='A')
query = SearchQuery(search_string, search_type='raw')
results = Article.objects.annotate(rank=SearchRank(vector, query, cover_density=True)).order_by('-rank')
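One caveat with Method A: splitting on a single space produces empty terms when the input has repeated spaces, and a word containing an apostrophe would break the quoted term. Below is a minimal sketch of a more defensive string builder. `build_or_tsquery` is a hypothetical helper name (not part of Django), and the quote doubling is based on my understanding of how PostgreSQL escapes quoted lexemes, so check it against your PostgreSQL version.

```python
def build_or_tsquery(value):
    """Return a raw tsquery string that ORs every word in `value`.

    Hypothetical helper, not part of Django. Returns None for blank
    input, because an empty string is not a valid raw tsquery.
    """
    # str.split() with no argument collapses runs of whitespace,
    # so no empty terms are produced.
    words = value.split()
    if not words:
        return None
    quoted = []
    for word in words:
        # Assumption: inside a quoted lexeme, ' and \ are escaped by doubling.
        escaped = word.replace("\\", "\\\\").replace("'", "''")
        quoted.append(f"'{escaped}'")
    return " | ".join(quoted)


print(build_or_tsquery("fast  computer"))
```

The result would then be passed as SearchQuery(search_string, search_type='raw'), after checking that it is not None.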
---- METHOD B ----
The second method is to create a list of SearchQuery objects (one per word) and then combine them all using the | operator:
import functools
import re
vector = SearchVector('title', weight='A')
query_cleaned = re.sub(' +', ' ', value).strip()  # Collapse repeated spaces and trim the ends
query_split = query_cleaned.split(' ')
queries = [SearchQuery(word) for word in query_split]  # One SearchQuery per word
queries_combined = functools.reduce(lambda x, y: x | y, queries)  # OR them together
results = Article.objects.annotate(rank=SearchRank(vector, queries_combined, cover_density=True)).order_by('-rank')
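Method B can also be written without the regular expression, since `str.split()` with no argument already collapses repeated whitespace. The sketch below takes the query class as a parameter so that it runs without a database. `or_combine` and `make_query` are hypothetical names; in a Django project you would pass `SearchQuery` as `make_query`.

```python
import functools
import operator


def or_combine(value, make_query):
    """OR together one query object per whitespace-separated word.

    Hypothetical helper: `make_query` is any callable whose results
    support the | operator (for example Django's SearchQuery).
    Returns None when `value` contains no words.
    """
    parts = [make_query(word) for word in value.split()]
    if not parts:
        return None
    # reduce(operator.or_, [a, b, c]) evaluates (a | b) | c
    return functools.reduce(operator.or_, parts)


# Stand-in demo using sets, which also support the | operator.
combined = or_combine("fast   computer", lambda word: {word})
print(combined == {"fast", "computer"})
```

With Django this would be used as SearchRank(vector, or_combine(value, SearchQuery), cover_density=True), after handling the None case for blank input.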
I'm not sure which method is better, but both seem to work properly.
Did you see this part of the documentation? Did you try that?
https://docs.djangoproject.com/en/4.1/ref/contrib/postgres/search/#searchquery
So maybe you can do:
query = (SearchQuery('fast') | SearchQuery('computer'))
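As for why the plain search gives 0: with search_type='plain' the words are treated as separate keywords, but all of them have to match (as far as I know it maps to PostgreSQL's plainto_tsquery, which joins the terms with AND). Since no title contains "fast", nothing matches. Here is a rough pure-Python model of the difference. It ignores stemming, stop words and ranking, and `tokens`, `matches_all` and `matches_any` are made-up names for illustration only.

```python
def tokens(text):
    """Lowercase the text and split it into a set of words."""
    return set(text.lower().split())


def matches_all(query, title):
    """AND semantics: every query word must appear in the title."""
    return tokens(query) <= tokens(title)


def matches_any(query, title):
    """OR semantics: at least one query word must appear in the title."""
    return bool(tokens(query) & tokens(title))


title = "The most powerful computer in the world"
print(matches_all("fast computer", title))  # False: "fast" is missing
print(matches_any("fast computer", title))  # True: "computer" is present
```

Combining the SearchQuery objects with | gives you the second behaviour.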