Tags: sorting, solr, lucene, relevance, solr-boost

Solr: boost query by field value, then by newest date within each value


We have the following setup in our schema.xml:

<field name="last_modified" type="date" indexed="true" stored="true" multiValued="false" omitTermFreqAndPositions="true"/>
...

<field name="prefix" type="string" indexed="true" stored="true" omitTermFreqAndPositions="true"/>

Our goal is to sort the docs by

  1. prefix=9999 with newest docs (last modified) first
  2. prefix=1004 or prefix=1005 with newest docs (last modified) first

Our code:

{!boost b=recip(ms(NOW,last_modified),3.16e-11,1,1)}prefix:9999^1000000 OR {!boost b=recip(ms(NOW,last_modified),3.16e-11,1,1)}prefix:1004^600000 OR {!boost b=recip(ms(NOW,last_modified),3.16e-11,1,1)}prefix:1005^600000

Result: The query above does not work as expected!

We assumed that omitTermFreqAndPositions="true" would keep the TF/IDF factors out of the score, so that the boosts alone decide the order. That does not seem to be the case! Please help us with this :-)
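For readers unfamiliar with the date boost used in the query: Solr's recip(x,m,a,b) function computes a/(m*x+b). With m=3.16e-11 (roughly 1 divided by the number of milliseconds in a year), a=1 and b=1, a brand-new document gets a factor of 1.0 and a one-year-old document gets about 0.5. The Java sketch below only reproduces that arithmetic outside Solr; the class and method names are made up for illustration and are not part of any Solr API.

```java
import java.util.concurrent.TimeUnit;

/** Illustration only: mirrors the arithmetic of Solr's recip(x,m,a,b) = a / (m*x + b). */
final class RecipDemo {

    // Constants taken from the query: m = 3.16e-11, a = 1, b = 1.
    static final double M = 3.16e-11;
    static final double A = 1.0;
    static final double B = 1.0;

    /** Factor that recip(ms(NOW,last_modified), M, A, B) yields for a document of the given age. */
    static double recip(long ageMillis) {
        return A / (M * ageMillis + B);
    }

    public static void main(String[] args) {
        long dayMillis = TimeUnit.DAYS.toMillis(1);
        long[] agesInDays = {0, 30, 365, 3650};
        for (long days : agesInDays) {
            // Older documents get a smaller factor; the factor never reaches 0.
            System.out.printf("age %5d days -> factor %.3f%n", days, recip(days * dayMillis));
        }
    }
}
```

The factor shrinks smoothly with age, which is why it can order documents by recency as long as nothing else in the score differs between them.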


Solution

  • So we found a solution!

    1. Create your own Similarity (a simple Java class). For a clearer, step-by-step description of how to do this, see "How to compile a custom similarity class for SOLR / Lucene using Eclipse".

    The class we used:

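    package com.luxactive;

    import org.apache.lucene.index.FieldInvertState;
    import org.apache.lucene.search.similarities.DefaultSimilarity;

    /**
     * Similarity that neutralizes every TF-IDF factor by returning 1.0,
     * so that only the query boosts influence the score.
     */
    public class MyNewSimilarityClass extends DefaultSimilarity {

        // Ignore how many of the query clauses a document matches.
        @Override
        public float coord(int overlap, int maxOverlap) {
            return 1.0f;
        }

        // Ignore how rare the term is across the index.
        @Override
        public float idf(long docFreq, long numDocs) {
            return 1.0f;
        }

        // Ignore field length normalization.
        @Override
        public float lengthNorm(FieldInvertState state) {
            return 1.0f;
        }

        // Ignore how often the term occurs in the document.
        @Override
        public float tf(float freq) {
            return 1.0f;
        }
    }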
    
    1. Build a simple jar containing your Similarity class.
    2. Copy the jar into any folder on your Solr server. We used: SOLRFOLDER/solr-4.8.0/example/solr/dih

    The following steps must be repeated for every collection you have!

    1. Edit the solrconfig.xml at: SOLRFOLDER/solr-4.8.0/example/solr/collection/conf/solrconfig.xml
      Add <lib dir="../dih" regex=".*\.jar" /> to import the custom jar
    2. Edit the schema.xml in the same folder

    Add the following

    <!-- SchemaSimilarityFactory lets each fieldType declare its own similarity, here com.luxactive.MyNewSimilarityClass -->
    <similarity class="solr.SchemaSimilarityFactory"/>
    
    <!-- TYPE String -->
     <fieldType name="no_term_frequency_string" class="solr.StrField" sortMissingLast="true" >
        <similarity class="com.luxactive.MyNewSimilarityClass"/>
    </fieldType>
    
    <!-- TYPE Date -->
    <fieldType name="no_term_frequency_date" class="solr.TrieDateField" sortMissingLast="true" >
        <similarity class="com.luxactive.MyNewSimilarityClass"/>
    </fieldType>
    
    <!-- TYPE Int-->
    <fieldType name="no_term_frequency_int" class="solr.TrieIntField" sortMissingLast="true" >
        <similarity class="com.luxactive.MyNewSimilarityClass"/>
    </fieldType>
    

    This defines your own field types (int, string and date) that use the new Similarity class. For fields of these types, tf, idf, coord and lengthNorm all evaluate to 1.0, as defined in MyNewSimilarityClass, so the query boosts are what differentiates the scores.
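To see why this helps, here is a deliberately simplified model of the resulting score. It assumes that, with every similarity factor fixed at 1.0, a document's score is just the clause boost (1000000 or 600000) multiplied by the recip date factor. That is an assumption made for illustration, not Lucene's exact formula, and the Doc class and sample ages are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Simplified model (an assumption, not Lucene's real scoring): score = clause boost * recip(age). */
final class BoostOrderSketch {

    /** Hypothetical stand-in for an indexed document; not a Solr type. */
    static final class Doc {
        final String id;
        final String prefix;
        final long ageDays;

        Doc(String id, String prefix, long ageDays) {
            this.id = id;
            this.prefix = prefix;
            this.ageDays = ageDays;
        }
    }

    /** recip(ms, 3.16e-11, 1, 1), with the age given in days. */
    static double recip(long ageDays) {
        double ageMillis = ageDays * 86_400_000.0;
        return 1.0 / (3.16e-11 * ageMillis + 1.0);
    }

    /** Boost attached to each prefix clause in the query. */
    static double clauseBoost(String prefix) {
        switch (prefix) {
            case "9999": return 1_000_000.0;
            case "1004":
            case "1005": return 600_000.0;
            default:     return 0.0;
        }
    }

    static double score(Doc d) {
        return clauseBoost(d.prefix) * recip(d.ageDays);
    }

    /** Returns document ids ordered by descending modeled score. */
    static List<String> rank(List<Doc> docs) {
        List<Doc> sorted = new ArrayList<>(docs);
        Comparator<Doc> byScore = Comparator.comparingDouble(BoostOrderSketch::score);
        sorted.sort(byScore.reversed());
        List<String> ids = new ArrayList<>();
        for (Doc d : sorted) {
            ids.add(d.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc("c", "1004", 5),
                new Doc("b", "9999", 200),
                new Doc("d", "1005", 100),
                new Doc("a", "9999", 10));
        System.out.println(rank(docs));
    }
}
```

Under this model the two prefix 9999 documents come first, newest first, followed by the 1004/1005 documents, newest first. One caveat under the same assumption: because recip keeps shrinking with age, a prefix 9999 document older than roughly eight months would score below a brand-new 1004/1005 document with these boost values, so a larger boost ratio may be needed if the groups must never interleave.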

    1. Now edit the fields that should use your custom Similarity by setting their type to one of the types you created.
      From: <field name="last_modified" type="date" indexed="true" stored="true" multiValued="false" />
      To: <field name="last_modified" type="no_term_frequency_date" indexed="true" stored="true" multiValued="false" />
    2. Restart the Solr server and enjoy your boosting :)
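After the restart, one way to check that the similarity is being applied is to run a query with debugQuery=true and look at the score explanation for each document. The sketch below only builds such a request URL. The host, port and collection path are assumptions about a local example install, and the class name is made up; q, fl and debugQuery are standard Solr request parameters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Illustration only: builds a /select URL that asks Solr to explain its scoring. */
final class DebugQueryUrl {

    /**
     * @param baseUrl collection URL, e.g. http://localhost:8983/solr/collection (assumed, adjust to your setup)
     * @param query   raw Solr query string; it is URL-encoded here
     */
    static String build(String baseUrl, String query) {
        try {
            return baseUrl + "/select"
                    + "?q=" + URLEncoder.encode(query, "UTF-8")
                    // Return the computed score next to the two fields of interest.
                    + "&fl=" + URLEncoder.encode("prefix,last_modified,score", "UTF-8")
                    // Ask Solr to include the per-document score explanation.
                    + "&debugQuery=true";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("http://localhost:8983/solr/collection", "prefix:9999^1000000"));
    }
}
```

If the custom similarity is active, the explanation for the affected fields should show tf and idf values of 1.0.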