
Django SearchQuery and SearchRank not finding results when 1 word matches in a 2 word query


I have a list of articles and I want to do a search using the PostgreSQL SearchQuery and SearchRank functionality. Here is the pseudo code:

from django.contrib.postgres.search import SearchVector, SearchQuery, SearchRank
from .models import Article

vector = SearchVector('title', weight='A')
query = SearchQuery(value)  # value is the user's search string
results = Article.objects.annotate(rank=SearchRank(vector, query, cover_density=True)).order_by('-rank')

for r in results:
    print(r.rank, r.title)

For example, if I search for only "computer", this is what gets printed:

1 The most powerful computer in the world    
0 This man built a plane for his family in his garden
0 Dogs can sense earthquakes before they happen

As you can see, it all works as expected: the article that contains "computer" in the title gets a rank of 1.

But if I search for two words, such as "fast computer", every result has a rank of 0:

0 Dogs can sense earthquakes before they happen
0 The most powerful computer in the world    
0 This man built a plane for his family in his garden

According to the "SearchQuery" documentation:

If search_type is 'plain', which is the default, the terms are treated as separate keywords

So why is it still not matching the article with "computer" in the title?

I also tried without using "cover_density=True" and I get similar results.

How can I get the results to still match my search query even if only one word matches?


UPDATE:

Thanks to @gitaarik for pointing me in the right direction. I've been able to do it in 2 ways.

---- METHOD A ----

The first way is formatting the words as a RAW search string and changing the "search_type" to "raw".

query_split = value.split(' ')
search_string = ' | '.join([f"'{q}'" for q in query_split])
vector = SearchVector('title', weight='A')
query = SearchQuery(search_string, search_type='raw')
results = Article.objects.annotate(rank=SearchRank(vector, query, cover_density=True)).order_by('-rank')
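
One caveat with Method A: with search_type='raw' the string is handed to PostgreSQL's to_tsquery, so repeated spaces or quote characters in the user input can end up as empty or malformed terms. Below is a minimal sketch of a more defensive string builder, assuming it is acceptable to keep only word characters from the input. The helper name build_or_query_string is hypothetical and not part of Django.

```python
import re


def build_or_query_string(value):
    """Return a to_tsquery-style string that ORs together the words in value."""
    # Keep only runs of word characters, so empty terms, quotes and
    # tsquery operators (&, |, !, :, parentheses) never reach the query.
    terms = re.findall(r"\w+", value)
    return " | ".join(f"'{term}'" for term in terms)


print(build_or_query_string("fast   computer"))  # 'fast' | 'computer'
```

The result could then be passed as SearchQuery(build_or_query_string(value), search_type='raw'). For input with no word characters the helper returns an empty string, which the caller would need to handle separately.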

---- METHOD B ----

The second method is to create a list of SearchQuery objects (one per word) and then combine them all using the | bitwise operator.

import functools
import re

query_cleaned = re.sub(' +', ' ', value)  # Collapse repeated spaces
query_split = query_cleaned.split(' ')
vector = SearchVector('title', weight='A')
queries = [SearchQuery(query) for query in query_split]  # One SearchQuery per word
queries_combined = functools.reduce(lambda x, y: x | y, queries)  # OR them together
results = Article.objects.annotate(rank=SearchRank(vector, queries_combined, cover_density=True)).order_by('-rank')

I'm not sure which method is best, but they both seem to work properly.


Solution

  • Did you see this part of the documentation? Did you try that?

    https://docs.djangoproject.com/en/4.1/ref/contrib/postgres/search/#searchquery

    So maybe you can do:

    query = (SearchQuery('fast') | SearchQuery('computer'))
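
    The reason the two-word search ranks everything 0 is that the default 'plain' search type uses plainto_tsquery, which joins the terms with AND, so every word has to be present in the title. The difference from OR can be illustrated without a database. This is only a rough pure-Python simulation, not what PostgreSQL does internally (no stemming, stop words or ranking), and the function names are made up for the illustration:

```python
titles = [
    "The most powerful computer in the world",
    "This man built a plane for his family in his garden",
    "Dogs can sense earthquakes before they happen",
]


def words(text):
    # Naive tokenizer: lowercase and split on whitespace.
    return set(text.lower().split())


def matches_all(title, query):
    # Roughly the default 'plain' behaviour: 'fast' & 'computer'
    return words(query) <= words(title)


def matches_any(title, query):
    # Roughly SearchQuery('fast') | SearchQuery('computer')
    return bool(words(query) & words(title))


print([t for t in titles if matches_all(t, "fast computer")])
# []
print([t for t in titles if matches_any(t, "fast computer")])
# ['The most powerful computer in the world']
```

    With AND, no title contains both "fast" and "computer", so nothing matches. With OR, the article that contains "computer" matches again.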