elasticsearch, logstash, logstash-configuration, logstash-forwarder

Defining multiple outputs in Logstash whilst handling potential unavailability of an Elasticsearch instance


I have two outputs configured for Logstash as I need the data to be delivered to two separate Elasticsearch nodes in different locations.

A snippet of the configuration is below (redacted where required):

output {
  elasticsearch {
    hosts => [ "https://host1.local:9200" ]
    cacert => '/etc/logstash/config/certs/ca.crt'
    user => "XXXXX"
    password => "XXXXX"
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
  }
}

output {
  elasticsearch {
    hosts => [ "https://host2.local:9200" ]
    cacert => '/etc/logstash/config/certs/ca.crt'
    user => "XXXXX"
    password => "XXXXX"
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
  }
}

During testing I've noticed that if one of the ES instances (host1.local or host2.local) is unavailable, Logstash fails to process/deliver data to the other, even though it's available.

Is there a modification I can make to the configuration that will allow data to be delivered to the available Elasticsearch instance, even if the other dies?


Solution

  • Logstash has an at-least-once delivery model. If persistent queues are not enabled, data can be lost across a restart; otherwise, Logstash will deliver events to every configured output at least once. As a result, when one output becomes unreachable, the pipeline's queue (in-memory or persistent) fills up and blocks processing for all outputs in that pipeline. To stop one unavailable output from stalling the other, use persistent queues together with pipeline-to-pipeline communication in the output isolator pattern: each Elasticsearch output gets its own downstream pipeline with its own queue, so a backlog on one does not block delivery to the other.
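
A sketch of that output isolator pattern in `pipelines.yml`, adapted from the question's configuration (the `elasticsearch` blocks are abbreviated; the `cacert`, `user`, `password`, and `index` settings would be carried over from the original outputs, and the `beats` input on port 5044 is an assumption standing in for whatever input the real pipeline uses):

```yaml
# pipelines.yml -- one intake pipeline fanning out to two isolated output pipelines
- pipeline.id: intake
  queue.type: persisted
  config.string: |
    input { beats { port => 5044 } }
    output { pipeline { send_to => [es_host1, es_host2] } }

- pipeline.id: buffered-host1
  queue.type: persisted
  config.string: |
    input { pipeline { address => es_host1 } }
    output { elasticsearch { hosts => [ "https://host1.local:9200" ] } }

- pipeline.id: buffered-host2
  queue.type: persisted
  config.string: |
    input { pipeline { address => es_host2 } }
    output { elasticsearch { hosts => [ "https://host2.local:9200" ] } }
```

With this layout, if host1.local goes down only the `buffered-host1` pipeline backs up into its persistent queue; `buffered-host2` keeps delivering to host2.local. Events for the down instance are retained in the persisted queue (up to its configured size limit) and delivered once it recovers.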