solrsolrcloud

Average of a Calculated column in Solr


I have a solr collection and I need some calculation done based on data. I have TotalLapTime and DrivingTime , completion rate for a driver is DrivingTime*100/TotalLapTime. I would like to find out average completion time for each team. How can I query the Solr for this ? I have been able only able to get sum of DrivingTime using stats. I am on Solr5

Solr collection:

    ID |DrivingTime|TotalLapTime| Team
    1  |   50    |    100       |  A
    2  |   25    |    100       |  A
    3  |   30    |    60        |  B
    4  |   50    |    60        |  B
    5  |   25    |    200       |  C

Expected output:

TEAM | AvgCompletionRate
  A  |      37%
  B  |      66.5%
  C  |      12.5%

Solution

  • First - one of the things with document based search indexes and analytics is that you should usually do as much pre-processing as possible, since your data usually is read once, then analyze in many different contexts. So in this case, adding completion_time as a separate field and calculate it when submitting the document to the index would be the best thing to do.

    That makes it far easier to do any further analysis on the field, and we can then use the JSON Facet API to create buckets for each time, and then calculate the average completion time for those entries that fall into that bucket.

    Adopted from Solr's example for sorting facets by nested functions example:

    {
      "query": "*:*",
      "facet": {
        "categories":{
          "type": "terms",
          "field": "TEAM",
          "limit": 3,
          "sort": "avg_completion_time asc",
          "facet": {
            "avg_completion_time": "avg(completion_time)",
          }
        }
      }
    }