Tags: .net, reindex, elasticsearch-indices

How to manage scroll context on reindex with Elasticsearch and Dotnet


I have a method in .NET which reindexes indices in Elasticsearch.

When I run it, the generated tasks fail with the following reason:

Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.

I've tried to use

await this._elasticSearchClient.Value.ClearScrollAsync(c => c.ScrollId(default));

but I still get the same error.

Here is the code of my method:

// indiceDictionary is a dictionary with < destination_indice, source_indices >
foreach( var destIndice in indiceDictionary.Keys )
{
   var reIndexResponse = await this._elasticSearchClient.Value.ReindexOnServerAsync(r => r
      .Source(src =>
         src.Index(indiceDictionary[destIndice].ToArray())
      )
      .Destination(dest =>
         dest.Index(destIndice)
      )
      .WaitForCompletion(false)
   );

   var taskId = reIndexResponse.Task;
   this.logger.LogInformation("TriggerReorganization Reindex for dest {destinationIndice} with {numberOfSrcIndice} source indices has a Task with id : {taskId}"
   , destIndice, indiceDictionary[destIndice].Count, taskId);
   await this._elasticSearchClient.Value.ClearScrollAsync(c => c.ScrollId(default)); // :(
}

Is there a way to get the scroll ID from each reindex so that it can be cleared afterwards?


Solution

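  • WaitForCompletion(false) seems to be the problem. With it, ReindexOnServerAsync returns as soon as the task has been created, so the loop starts every reindex without waiting for the previous one to finish. Fire-and-forget is almost never a good idea.

    Currently, your code is throwing reindexing requests at ES as quickly as it can. Each running reindex reads its source indices through scroll contexts that it opens itself, so many concurrent tasks can exceed the search.max_open_scroll_context limit of 500. Calling ClearScrollAsync from the client is unlikely to help, because those scroll contexts belong to the reindex tasks, not to your code. You'll probably need to throttle the requests, for example by waiting for each task to complete before starting the next one.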
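
    One possible way to throttle, as a minimal sketch under assumptions: start one reindex, poll its task until it reports completion, and only then start the next. The helper is deliberately independent of the Elasticsearch client so that it runs on its own. ThrottledReindex, RunSequentiallyAsync, startReindex and isCompleted are hypothetical names used for illustration; they are not part of NEST.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ThrottledReindex
{
    // Runs one reindex at a time: the next one is started only after the
    // previous task has reported completion.
    //
    // startReindex (hypothetical delegate): starts a reindex for the given
    //   destination index and returns the id of the server-side task.
    // isCompleted (hypothetical delegate): returns true once the task with
    //   that id has finished.
    public static async Task RunSequentiallyAsync<TTaskId>(
        IEnumerable<string> destinationIndices,
        Func<string, Task<TTaskId>> startReindex,
        Func<TTaskId, Task<bool>> isCompleted,
        TimeSpan pollInterval)
    {
        foreach (var destIndice in destinationIndices)
        {
            TTaskId taskId = await startReindex(destIndice);

            // Poll until the task is done, pausing between checks so the
            // cluster is not flooded with status requests.
            while (!await isCompleted(taskId))
            {
                await Task.Delay(pollInterval);
            }
        }
    }
}
```

    In the method from the question, startReindex would wrap the existing ReindexOnServerAsync call and return reIndexResponse.Task, and isCompleted would ask the Tasks API whether that task has finished. In NEST 7.x this is assumed to be Tasks.GetTaskAsync(taskId) together with the Completed property of its response; check that against your client version. Running strictly one at a time is the most conservative choice; a small fixed number of concurrent reindexes may also stay under the limit.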