Spring Data Elasticsearch 4.4.x: How to get Aggregations from SearchHits?


I'm new to Spring Data Elasticsearch. I'm working on a project in which I index the bugs found in different projects (just as an example).

I want to fetch all projects, with the number of bugs in each project.

Here is my document:

@Data
@Document(indexName = "all_bugs")
public class Bug{
    @Id
    private String recordId;
    private Project project;
    private String bugSummary;
    private String status;
    // other fields omitted for brevity
}

This is the Project class:

@Data
public class Project {
    private String projectId;
    private String name;
}

All the bugs are now in Elasticsearch, and I can run this query in the Kibana console to get all projects with the number of bugs in each:

GET /all_bugs/_search
{
  "size": 0,
  "aggs": {
    "distinct_projects": {
      "terms": {
        "field": "project.projectId",
        "size": 10
      },
      "aggs": {
        "project_details": {
          "top_hits": {
            "size": 1,
            "_source": {
              "includes": ["project.projectId", "project.name"]
            }
          }
        }
      }
    }
  }
}

I know the query itself could be improved, but the problem I'm facing is on the Spring Data Elasticsearch side. This is my method for building the aggregation:

    @Autowired
    private ElasticsearchOperations elasticsearchOperations;

    public List<DistinctProject> getDistinctProjects() {
        TermsAggregationBuilder aggregation = AggregationBuilders
                .terms("distinct_projects")
                .field("projects.projectId")
                .size(10)
                .subAggregation(AggregationBuilders
                        .topHits("project_details")
                        .size(1)
                        .fetchSource(new String[]{"project.name", "project.projectId"}, null));

        NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
                .withAggregations(aggregation)
                .build();

        SearchHits<DistinctProject> searchHits = elasticsearchOperations.search(searchQuery, DistinctProject.class);

        // I don't know what to do from here...
    }

Now, I have the SearchHits<DistinctProject> with me. The question is, how do I get the aggregations from here to construct my response? In this case DistinctProject is simply a DTO in which I want to store projectId, name and docCount so that I can create a List and return it to the caller.
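
For reference, a DTO of that shape might look like the following. This is only a sketch: the post does not show the class, so the field types (in particular long for docCount) and the plain-Java style instead of Lombok's @Data are assumptions.

```java
import java.util.Objects;

// Hypothetical DTO: the three fields follow the description in the question,
// but the actual DistinctProject class is not shown in the post.
final class DistinctProject {
    private final String projectId;
    private final String name;
    private final long docCount;

    DistinctProject(String projectId, String name, long docCount) {
        this.projectId = Objects.requireNonNull(projectId, "projectId");
        this.name = name; // may be null if the top hit carries no project name
        this.docCount = docCount;
    }

    String getProjectId() {
        return projectId;
    }

    String getName() {
        return name;
    }

    long getDocCount() {
        return docCount;
    }

    @Override
    public String toString() {
        return "DistinctProject{projectId=" + projectId
                + ", name=" + name
                + ", docCount=" + docCount + "}";
    }
}
```

Each bucket of the terms aggregation would then map to one instance, with the bucket key as projectId and the bucket's document count as docCount.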

The problem is that all the documentation I've gone through so far suggests calling searchHits.getAggregations().get("distinct_projects"), but that method is not available in Spring Data Elasticsearch 4.4.11, which we're using. According to the documentation:

The SearchHits class does not contain the org.elasticsearch.search.aggregations.Aggregations anymore. Instead it now contains an instance of the org.springframework.data.elasticsearch.core.AggregationsContainer class

So searchHits.getAggregations().get("distinct_projects") does not compile, and I'm unable to proceed beyond this point.

I also looked at this answer by P.J.Meisch, but it too refers to an older version of Spring Data Elasticsearch.

I would really appreciate it if someone could help me get unblocked.

For information, my Spring Boot version is 2.7.11 and the Spring Data Elasticsearch version is 4.4.11.

Thanks, Sriram


Solution

  • I've tested your code. Sadly, Spring Data Elasticsearch does not provide its own data model for aggregations. However, you can treat the aggregation data as JSON and parse it yourself.

        @Test
        public void testCreate() {
            TermsAggregationBuilder aggregation = AggregationBuilders
                    .terms("distinct_projects")
                    // the question used "projects.projectId", which is not a field of Bug
                    .field("project.projectId")
                    .size(10)
                    .subAggregation(AggregationBuilders
                            .topHits("project_details")
                            .size(1)
                            .fetchSource(new String[]{"project.name", "project.projectId"}, null));

            NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
                    .withAggregations(aggregation)
                    .build();

            // "all_bugs2" is the index used for this test
            SearchHits<DistinctProject> searchHits = elasticsearchOperations.search(searchQuery, DistinctProject.class, IndexCoordinates.of("all_bugs2"));

            // dump the aggregations as a JSON string
            System.out.println(JSONObject.toJSONString(searchHits.getAggregations()));
        }

    This prints:

        {
                "asMap": {
                        "distinct_projects": {
                                "buckets": [{
                                        "aggregations": {
                                                "asMap": {
                                                        "project_details": {
                                                                "fragment": true,
                                                                "hits": {
                                                                        "fragment": true,
                                                                        "hits": [{
                                                                                "documentFields": {},
                                                                                "fields": {},
                                                                                "fragment": false,
                                                                                "highlightFields": {},
                                                                                "id": "tqpfM4gBOyQu5gYl2sOB",
                                                                                "matchedQueries": [],
                                                                                "metadataFields": {},
                                                                                "primaryTerm": 0,
                                                                                "rawSortValues": [],
                                                                                "score": 1.0,
                                                                                "seqNo": -2,
                                                                                "sortValues": [],
                                                                                "sourceAsMap": {
                                                                                        "project": [{
                                                                                                "name": "my project",
                                                                                                "projectId": 10
                                                                                        }]
                                                                                },
                                                                                "sourceAsString": "{\"project\":[{\"name\":\"my project\",\"projectId\":10}]}",
                                                                                "sourceRef": {
                                                                                        "fragment": true
                                                                                },
                                                                                "type": "_doc",
                                                                                "version": -1
                                                                        }],
                                                                        "maxScore": 1.0,
                                                                        "totalHits": {
                                                                                "relation": 0,
                                                                                "value": 1
                                                                        }
                                                                },
                                                                "name": "project_details",
                                                                "type": "top_hits"
                                                        }
                                                },
                                                "fragment": true
                                        },
                                        "docCount": 1,
                                        "docCountError": 0,
                                        "fragment": true,
                                        "key": 10,
                                        "keyAsNumber": 10,
                                        "keyAsString": "10"
                                }],
                                "docCountError": 0,
                                "fragment": true,
                                "name": "distinct_projects",
                                "sumOfOtherDocCounts": 0,
                                "type": "lterms"
                        }
                },
                "fragment": true
        }
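
One way to "parse it yourself" is sketched below, under a few stated assumptions: the dump above has already been deserialized into nested java.util.Map and java.util.List objects by whatever JSON library is in use, and the result type is a trimmed-down, hypothetical DistinctProject with just the three fields the question mentions. The class name AggregationJsonWalker and its method names are made up for illustration. The key names ("asMap", "buckets", "keyAsString", "docCount", "sourceAsMap") are taken from the output above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical DTO; the real DistinctProject class is not shown in the post.
final class DistinctProject {
    final String projectId;
    final String name; // null when the top hit carries no project name
    final long docCount;

    DistinctProject(String projectId, String name, long docCount) {
        this.projectId = projectId;
        this.name = name;
        this.docCount = docCount;
    }
}

// Hypothetical helper that walks the map/list structure of the dump above.
final class AggregationJsonWalker {

    private AggregationJsonWalker() {
    }

    // One DistinctProject per bucket of the "distinct_projects" terms aggregation.
    static List<DistinctProject> toDistinctProjects(Map<String, Object> root) {
        List<DistinctProject> result = new ArrayList<>();
        Map<String, Object> terms = child(child(root, "asMap"), "distinct_projects");
        for (Object rawBucket : asList(terms.get("buckets"))) {
            Map<String, Object> bucket = asMap(rawBucket);
            Object key = bucket.containsKey("keyAsString")
                    ? bucket.get("keyAsString")
                    : bucket.get("key");
            Object count = bucket.get("docCount");
            long docCount = count instanceof Number ? ((Number) count).longValue() : 0L;
            result.add(new DistinctProject(String.valueOf(key), firstHitProjectName(bucket), docCount));
        }
        return result;
    }

    // Path: aggregations -> asMap -> project_details -> hits -> hits[0]
    //       -> sourceAsMap -> project -> name
    private static String firstHitProjectName(Map<String, Object> bucket) {
        Map<String, Object> topHits =
                child(child(child(bucket, "aggregations"), "asMap"), "project_details");
        List<Object> hits = asList(child(topHits, "hits").get("hits"));
        if (hits.isEmpty()) {
            return null;
        }
        Object project = child(asMap(hits.get(0)), "sourceAsMap").get("project");
        // In the dump "project" is an array; in the question's model it is a single object.
        if (project instanceof List) {
            List<?> projects = (List<?>) project;
            project = projects.isEmpty() ? null : projects.get(0);
        }
        Object name = asMap(project).get("name");
        return name == null ? null : name.toString();
    }

    private static Map<String, Object> child(Map<String, Object> parent, String key) {
        return asMap(parent.get(key));
    }

    // Returns an empty map for null or non-map values so lookups never throw.
    @SuppressWarnings("unchecked")
    private static Map<String, Object> asMap(Object value) {
        return value instanceof Map
                ? (Map<String, Object>) value
                : Collections.<String, Object>emptyMap();
    }

    // Returns an empty list for null or non-list values.
    @SuppressWarnings("unchecked")
    private static List<Object> asList(Object value) {
        return value instanceof List
                ? (List<Object>) value
                : Collections.<Object>emptyList();
    }
}
```

Missing or unexpectedly typed nodes resolve to empty maps and lists rather than exceptions, so a bucket without a usable top hit still yields an entry with a null name. Whether that leniency is appropriate depends on how strictly the caller wants to validate the response.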