java elasticsearch elasticsearch-rest-client

Pass a list of IDs as a parameter of a Multi-Get request using the Java High Level REST Client


I am trying to find out which documents from a given list of IDs are present in the index, using the Java High Level REST Client.

The sample index data is -

PUT /my-index/_doc/1
{
  "account_number": 1,
  "balance": 28838
}

PUT /my-index/_doc/2
{
  "account_number": 2,
  "balance": 28838
}

PUT /my-index/_doc/3
{
  "account_number": 3,
  "balance": 28838
}

To retrieve multiple JSON documents by ID, I am using the multi-get API, as shown below:

GET /_mget
{
  "ids": [
    "2",
    "3",
    "4"
  ]
}

The response is:

{
  "docs": [
    {
      "_index": "my-index",
      "_type": "_doc",
      "_id": "2",
      "_version": 1,
      "_seq_no": 4,
      "_primary_term": 4,
      "found": true,
      "_source": {
        "account_number": 2,
        "balance": 28838
      }
    },
    {
      "_index": "my-index",
      "_type": "_doc",
      "_id": "3",
      "_version": 2,
      "_seq_no": 5,
      "_primary_term": 4,
      "found": true,
      "_source": {
        "account_number": 3,
        "balance": 28838
      }
    },
    {
      "_index": "my-index",
      "_type": "_doc",
      "_id": "4",
      "found": false
    }
  ]
}

Now I need to parse the response returned by the multi-get request and collect the IDs of all documents that were found in the index.

Java Code

I am able to get the list of IDs, i.e. [2, 3], which is the expected result. However, as the Java code below shows, I add one document ID at a time from the ids list inside the for loop and call mget on every iteration, so a new multi-get request is sent each time.

public List<Integer> verifyDocuments(final VerifyScrollRequest request) throws IOException {
        RestHighLevelClient es7Client = buildES7Client(request.getEs7Node(), request.getEs7Port());
        List<String> ids = new ArrayList<>();
        ids.add("2");
        ids.add("3");
        ids.add("4");
        List<Integer> documents = new ArrayList<>();
        MultiGetRequest getRequest = new MultiGetRequest();
        for (int i = 0; i < ids.size(); i++) {
            String element = ids.get(i);
            getRequest.add(new MultiGetRequest.Item(request.getEs7IndexName(), element));
            MultiGetResponse response = es7Client.mget(getRequest, RequestOptions.DEFAULT);
            if (response.getResponses()[i].getResponse().isExists()) {
                documents.add(Integer.parseInt(element));
            }
        }
        return documents;
    }

Is there any way to pass the complete list of IDs to the multi-get request, so that the request is sent only once?


Solution

  • You don't need to call es7Client.mget() on each iteration. Add all the IDs to the request first, then call mget once. This is how I'd do it:

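        import java.io.IOException;
        import java.util.ArrayList;
        import java.util.List;
        import org.elasticsearch.action.get.MultiGetItemResponse;
        import org.elasticsearch.action.get.MultiGetRequest;
        import org.elasticsearch.action.get.MultiGetResponse;
        import org.elasticsearch.client.RequestOptions;
        import org.elasticsearch.client.RestHighLevelClient;

        public List<Integer> verifyDocuments(final VerifyScrollRequest request) throws IOException {
            RestHighLevelClient es7Client = buildES7Client(request.getEs7Node(), request.getEs7Port());
            // build the list of IDs
            List<String> ids = new ArrayList<>();
            ids.add("2");
            ids.add("3");
            ids.add("4");
            List<Integer> documents = new ArrayList<>();
    
            // build the mget request with all IDs
            MultiGetRequest getRequest = new MultiGetRequest();
            for (String id : ids) {
                getRequest.add(new MultiGetRequest.Item(request.getEs7IndexName(), id));
            }
    
            // call mget once for the whole list
            MultiGetResponse response = es7Client.mget(getRequest, RequestOptions.DEFAULT);
    
            // iterate over the resulting documents; items come back in request order
            for (int i = 0; i < ids.size(); i++) {
                MultiGetItemResponse item = response.getResponses()[i];
                // getResponse() is null for a failed item, so check isFailed() first
                if (!item.isFailed() && item.getResponse().isExists()) {
                    documents.add(Integer.parseInt(ids.get(i)));
                }
            }
    
            return documents;
        }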
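
The filtering step can also be looked at on its own. The following is only a minimal sketch and does not use the Elasticsearch client: `ItemResult` and `FoundIdsSketch` are hypothetical stand-ins, not library classes, where `failed` plays the role of `MultiGetItemResponse.isFailed()` and `exists` the role of `GetResponse.isExists()`. It assumes, as in the question, that every ID is numeric. It shows the rule applied to each item: skip items that failed, skip items that were not found, and read the ID from the item itself rather than from a loop index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for one entry of an mget response; not an Elasticsearch class.
final class ItemResult {
    final String id;
    final boolean failed; // the item could not be fetched at all (for example, a missing index)
    final boolean exists; // the document was found in the index

    ItemResult(String id, boolean failed, boolean exists) {
        this.id = id;
        this.failed = failed;
        this.exists = exists;
    }
}

class FoundIdsSketch {
    // Keep only the IDs of items that were fetched without failure and exist.
    // Assumes every ID is numeric; Integer.parseInt throws NumberFormatException otherwise.
    static List<Integer> foundIds(List<ItemResult> items) {
        List<Integer> found = new ArrayList<>();
        for (ItemResult item : items) {
            if (!item.failed && item.exists) {
                found.add(Integer.parseInt(item.id));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Mirrors the response in the question: 2 and 3 are found, 4 is not.
        List<ItemResult> items = Arrays.asList(
                new ItemResult("2", false, true),
                new ItemResult("3", false, true),
                new ItemResult("4", false, false));
        System.out.println(foundIds(items)); // prints [2, 3]
    }
}
```

Keeping this step separate from the network call makes it easy to test without a running cluster. With the real client, the same loop would run over the items of the `MultiGetResponse`.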