Tags: elasticsearch, elastic-stack, elasticsearch-5, elasticsearch-plugin, elasticsearch-dsl

Elasticsearch: How can I make the _reindex API keep indexing documents while the source index is still receiving new ones?


I have indices that are created daily. They are filled by an agent that collects logs every second of the day, and I am reindexing them (by field) into new indices using the _reindex API.

How can I make the _reindex API keep reindexing while the source index is still receiving new documents?

Any help would be really appreciated!

Thank you


Solution

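  • You cannot make the reindex API stay online and pick up documents as they arrive. It copies the documents that match at the time it runs and then finishes.

    There is a workaround, though. Add a date field (index_time) to the documents in your source index, then write an hourly cron job that runs the reindex API with a query that selects the documents indexed in the last hour via index_time.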

    POST _reindex
    {
      "source": {
        "index": "my-index-000001",
        "query": {
          "range": {
            "index_time": { "gte": "now-1h" }
          }
        }
      },
      "dest": {
        "index": "my-new-index-000001"
      }
    }
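
If the cron job is a script rather than a raw request, the hourly run could look roughly like the sketch below. It is only an illustration, not a tested setup: the helper names (`previous_hour_window`, `build_reindex_body`, `run_reindex`, `main`), the `http://localhost:9200` address, and the assumption that `index_time` is a date field with the default date format are all mine. It uses an explicit window (start of the previous hour up to, but not including, the start of the current hour) instead of `now-1h`, so that a cron run that starts a few seconds late does not skip documents.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone

DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # assumes UTC datetimes and a default date mapping


def previous_hour_window(now):
    """Return (start, end) of the last complete hour before `now`."""
    end = now.replace(minute=0, second=0, microsecond=0)
    return end - timedelta(hours=1), end


def build_reindex_body(source_index, dest_index, start, end, time_field="index_time"):
    """Build a _reindex body copying docs with start <= time_field < end."""
    return {
        "source": {
            "index": source_index,
            "query": {
                "range": {
                    time_field: {
                        "gte": start.strftime(DATE_FORMAT),
                        "lt": end.strftime(DATE_FORMAT),
                    }
                }
            },
        },
        "dest": {"index": dest_index},
    }


def run_reindex(body, base_url="http://localhost:9200"):
    """POST the body to the _reindex endpoint (base_url is an assumed address)."""
    request = urllib.request.Request(
        base_url + "/_reindex",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def main():
    """Entry point a cron job could call once per hour."""
    start, end = previous_hour_window(datetime.now(timezone.utc))
    body = build_reindex_body("my-index-000001", "my-new-index-000001", start, end)
    print(run_reindex(body))
```

Because _reindex keeps the original document IDs by default, rerunning a window that was already copied should overwrite the same documents in the destination rather than duplicate them, so an occasional overlapping run is harmless.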