
Stronger boosting by date in Solr


Boosting by a date field in Solr is usually defined as:

{!boost b=recip(ms(NOW,datefield),3.16e-11,1,1)}

I looked everywhere (for example, "Solr Dismax Config for Boost Scoring" and "Solr boost for multivalued date field"), and every answer references the SolrRelevancyFAQ and uses this same definition. However, I found that it does not boost my results sufficiently. How can I make this date boosting stronger?

User is searching for two keywords. Both items contain both keywords (in same order) in both title and description. Neither of the keywords is repeated.

And the Solr debug output is far too confusing for me to work out what the problem is.

Now, this is not a huge problem. 99% of queries work fine and produce the expected results, so it's not as if Solr is not working at all. I just found this one situation very confusing and don't know how to proceed.


Solution

  • User is searching for two keywords. Both items contain both keywords (in same order) in both title and description. Neither of the keywords is repeated.

    From your example, it is clear that your results have ended up in a tie. To make sense of the confusing debug output and devise a tie-breaker policy, it is important to understand DisMax.

    With DisMax queries, each term of the user input is executed against several fields. If a term hits in more than one field of the same document, the hit with the highest score is used. But what happens to the other sub-queries that hit in that document for the same term? That is what the tie parameter defines. DisMax calculates the score for a term query as:

    score = [score of the top scoring subquery] + tie * (sum of the scores of the other hitting subqueries)
    

    Consequently, the tie parameter is a value between 0 and 1 that determines whether DisMax considers only the highest-scoring hit for a term (tie=0), all the hits for a term (tie=1), or something between those two extremes.
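
    As a rough illustration of the formula above, here is a minimal Python sketch of how the tie parameter changes the score of a single term. The function name dismax_score and the field scores are made up for the example; real Solr scores come from the debug output.

```python
def dismax_score(subquery_scores, tie=0.0):
    """Combine the per-field scores of one term into one DisMax score.

    subquery_scores: scores of the sub-queries that hit (one per field).
    tie: value between 0 and 1.
    """
    if not subquery_scores:
        return 0.0
    top = max(subquery_scores)
    others = sum(subquery_scores) - top  # everything except the top hit
    return top + tie * others


# Hypothetical scores for one keyword hitting in title and description.
field_scores = [0.8, 0.3]
for tie in (0.0, 0.1, 1.0):
    print(f"tie={tie}: score={dismax_score(field_scores, tie):.2f}")
```

    With tie=0 only the best field counts, and with tie=1 the fields are simply summed, so two documents that match the same keywords in the same fields can easily end up with identical scores.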

    The boost parameter is very similar to the bf parameter, but instead of adding its result to the final score, it multiplies the final score by it. It is only available in the Extended DisMax Query Parser or the Lucid Query Parser.
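
    To get a feel for the numbers, the sketch below applies the date function from the question both ways. It relies on Solr's recip(x,m,a,b) function evaluating to a / (m*x + b), and on 3.16e-11 being roughly 1 divided by the number of milliseconds in a year. The additive case is simplified (it ignores query normalization), the tied score of 2.0 is invented, and the larger m is only an assumption about one way to make the decay steeper, not something taken from the FAQ.

```python
MS_PER_YEAR = 365.25 * 24 * 3600 * 1000  # roughly 3.156e10 ms


def recip(x, m, a, b):
    """Solr's recip function query: a / (m * x + b)."""
    return a / (m * x + b)


def date_factor(age_years, m=3.16e-11):
    """Value of recip(ms(NOW,datefield), m, 1, 1) for a document of the given age."""
    return recip(age_years * MS_PER_YEAR, m, 1.0, 1.0)


def with_bf(score, factor):
    """Additive boosting (bf): the function value is added to the score."""
    return score + factor


def with_boost(score, factor):
    """Multiplicative boosting (boost): the score is multiplied by the function value."""
    return score * factor


tied_score = 2.0  # hypothetical relevancy score shared by both documents
for age_years in (0.1, 3.0):
    f = date_factor(age_years)
    f_steep = date_factor(age_years, m=3.16e-10)  # hypothetical: 10x larger m
    print(
        f"age={age_years} y  bf={with_bf(tied_score, f):.3f}  "
        f"boost={with_boost(tied_score, f):.3f}  "
        f"boost (steeper m)={with_boost(tied_score, f_steep):.3f}"
    )
```

    With the default constant, a document that is one year old keeps about half of its score under multiplicative boosting. A larger m shrinks the factor for older documents faster, which widens the gap between a new and an old document that are otherwise tied.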

    There is also an interesting article, "Comparing Boost Methods of SOLR", which may be useful to you.

    Shishir