elasticsearch, ranking-functions

How to set the default pivot on a painless saturation function in Elasticsearch?


I want to normalize my Elasticsearch query scores into the range (0, 1). For this I'm using Painless's predefined saturation() function, which takes the value to normalize (here _score) as its first argument and the pivot as its second, as in this example:

{
  "_source": ["id", "first_name", "last_name"],
  "from": 0,
  "size": 10000,
  "query": {
    "script_score": {
      "query": {
        // my query here
      },
      "script": {
        "source": "saturation(_score, 10)"
      }
    }
  }
}

When I use 10 or another number for the pivot, it works as expected. But more in-depth docs, such as the Saturation section of the Rank feature query page, say:

If a pivot value is not provided, Elasticsearch computes a default value equal to the approximate geometric mean of all rank feature values in the index. We recommend using this default value if you haven’t had the opportunity to train a good pivot value.

I haven't trained a good pivot. Running a few queries, I noticed that the maximum _score varies: for some samples it was as low as 10, and for others it was above 100. So an arbitrary pivot could work well for some cases and poorly for others...

How do I make the query above use the default pivot value? I tried a few things and got exceptions: saturation(_score), saturation(_score,), saturation(_score, null), etc.


Solution

  • I suspect it's not possible for this use case. Here is the source of ScoreScriptUtils, which only defines a two-argument saturation():

    public final class ScoreScriptUtils {
    
        /****** STATIC FUNCTIONS that can be used by users for score calculations **/
    
        public static double saturation(double value, double k) {
            return value / (k + value);
        }
    
        //...
    

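    It seems the default pivot is only available in the rank_feature query, where it is derived from the indexed values of a rank_feature or rank_features field. That is not the same as the search _score, so saturation() in a script appears to always need an explicit pivot.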
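
To make the role of the pivot concrete, here is a minimal Java sketch of the same formula. The pivot is the value that maps to exactly 0.5, so with a pivot of 10 a score of 10 becomes 0.5 and a score of 100 becomes about 0.91. The class name SaturationSketch, the geometricMean helper, and the sample scores are all hypothetical and not part of Elasticsearch. Taking the geometric mean of a sample of scores only imitates the idea behind the rank_feature default; it is an assumption, not something Elasticsearch does for _score.

```java
import java.util.Locale;

// Hypothetical helper class for illustration; not part of Elasticsearch.
final class SaturationSketch {

    // Same formula as the ScoreScriptUtils.saturation excerpt above.
    static double saturation(double value, double pivot) {
        return value / (pivot + value);
    }

    // Geometric mean of strictly positive scores, computed through
    // logarithms so that large products do not overflow.
    static double geometricMean(double[] scores) {
        double logSum = 0.0;
        for (double s : scores) {
            logSum += Math.log(s);
        }
        return Math.exp(logSum / scores.length);
    }

    public static void main(String[] args) {
        // Made-up raw _score values; replace with scores sampled from real queries.
        double[] sampleScores = {2.0, 8.0, 32.0};

        // Assumption: use the sample's geometric mean as the pivot.
        double sampledPivot = geometricMean(sampleScores);

        for (double s : sampleScores) {
            System.out.printf(Locale.ROOT,
                "score=%.1f  pivot=10 -> %.3f  sampled pivot -> %.3f%n",
                s, saturation(s, 10.0), saturation(s, sampledPivot));
        }
    }
}
```

A pivot obtained this way would still have to be passed to the script explicitly, for example as a script parameter in place of the hard-coded 10.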