I have a list of articles and I want to do a search using the PostgreSQL SearchQuery and SearchRank functionality. Here is the pseudo code:
from django.contrib.postgres.search import SearchVector, SearchQuery, SearchRank
from .models import Article
vector = SearchVector('title', weight='A')
query = SearchQuery(value)
results = Article.objects.annotate(rank=SearchRank(vector, query, cover_density=True)).order_by('-rank')
for r in results:
    print(r.rank, r.title)
For example, if I search for only "computer", this is what gets printed:
1 The most powerful computer in the world
0 This man built a plane for his family in his garden
0 Dogs can sense earthquakes before they happen
As you can see it all works as expected with the article that contains "computer" in the title getting a rank of 1.
But if I search for two words, such as "fast computer", every result gets a rank of 0:
0 Dogs can sense earthquakes before they happen
0 The most powerful computer in the world
0 This man built a plane for his family in his garden
According to the "SearchQuery" documentation:
If search_type is 'plain', which is the default, the terms are treated as separate keywords
So why is it still not matching the article with "computer" in the title?
I also tried without "cover_density=True" and got similar results.
How can I get the results to still match my search query even if only one word matches?
Thanks to @gitaarik for pointing me in the right direction. I've been able to do it in two ways:
---- METHOD A ----
The first way is formatting the words as a RAW search string and changing the "search_type" to "raw".
query_split = value.split()  # split on any whitespace so repeated spaces do not yield empty terms
search_string = ' | '.join(f"'{q}'" for q in query_split)  # e.g. "'fast' | 'computer'"
vector = SearchVector('title', weight='A')
query = SearchQuery(search_string, search_type='raw')
results = Article.objects.annotate(rank=SearchRank(vector, query, cover_density=True)).order_by('-rank')
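Because Method A pastes user input into a raw query string, a term containing a quote (for example "it's") could break the query. Below is a minimal sketch of a more defensive builder. The helper name `build_or_tsquery` is hypothetical, and the escaping rule (doubling single quotes and backslashes inside a quoted lexeme) is my reading of the PostgreSQL text search documentation, so verify it against your PostgreSQL version before relying on it.

```python
def build_or_tsquery(value: str) -> str:
    """Build a raw tsquery string that ORs every whitespace-separated term.

    Hypothetical helper, not part of Django. Assumes that quotes and
    backslashes inside a quoted lexeme are escaped by doubling them.
    """
    quoted_terms = []
    for term in value.split():  # split() drops empty strings from repeated spaces
        escaped = term.replace("\\", "\\\\").replace("'", "''")
        quoted_terms.append(f"'{escaped}'")
    return " | ".join(quoted_terms)


print(build_or_tsquery("fast  computer"))
```

The returned string would be passed as `SearchQuery(search_string, search_type='raw')`. For input that is empty or only whitespace the helper returns an empty string, so it is probably safest to skip the search entirely in that case.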
---- METHOD B ----
The second method is to create a list of SearchQuery objects (one per word) and then combine them all with the | operator, which ORs the queries together.
import functools
import re
query_cleaned = re.sub(' +', ' ', value).strip()  # collapse repeated spaces, drop leading/trailing ones
query_split = query_cleaned.split(' ')
queries = [SearchQuery(term) for term in query_split]  # one plain SearchQuery per word
queries_combined = functools.reduce(lambda x, y: x | y, queries)  # OR them all together
vector = SearchVector('title', weight='A')
results = Article.objects.annotate(rank=SearchRank(vector, queries_combined, cover_density=True)).order_by('-rank')
I'm not sure which method is best, but both seem to work properly.
Did you see this part of the documentation? Did you try that?
So maybe you can do:
query = (SearchQuery('fast') | SearchQuery('computer'))
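As for why the original two-word search ranked everything 0: with the default 'plain' search type the terms end up combined with AND (PostgreSQL's plainto_tsquery), so a title has to contain every term to match. Combining queries with | turns that into OR. Here is a rough pure-Python illustration of the difference, using the titles from the question. It is only a sketch: the helpers `matches_all` and `matches_any` are made up for this example, and it ignores the stemming and stop-word handling that PostgreSQL performs.

```python
titles = [
    "The most powerful computer in the world",
    "This man built a plane for his family in his garden",
    "Dogs can sense earthquakes before they happen",
]


def words(text):
    # Naive tokenizer: lowercase and split on whitespace (no stemming).
    return set(text.lower().split())


def matches_all(title, search):
    # AND semantics: every search term must appear in the title.
    return words(search) <= words(title)


def matches_any(title, search):
    # OR semantics: at least one search term must appear in the title.
    return bool(words(search) & words(title))


and_hits = [t for t in titles if matches_all(t, "fast computer")]
or_hits = [t for t in titles if matches_any(t, "fast computer")]

print(and_hits)  # no title contains both "fast" and "computer"
print(or_hits)   # only the "computer" title matches
```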