logstashlogstash-groklogstash-configurationelklogstash-file

Logstash stopped processing because of an error: (SystemExit) exit


We are trying to index Nginx access and error log separately in Elasticsearch. for that we have created Filbeat and Logstash config as below.

Below is our /etc/filebeat/filebeat.yml configuration

filebeat.inputs:
- type: log
  paths:
    - /var/log/nginx/*access*.log
  exclude_files: ['\.gz$']
  exclude_lines: ['*ELB-HealthChecker*']
  fields:
    log_type: type1 
- type: log
  paths:
    - /var/log/nginx/*error*.log
  exclude_files: ['\.gz$']
  exclude_lines: ['*ELB-HealthChecker*']
  fields:
    log_type: type2

output.logstash:
  hosts: ["10.227.XXX.XXX:5400"]

Our logstash file /etc/logstash/conf.d/logstash-nginx-es.conf config is as below

input {
    beats {
        port => 5400
    }
}

filter {
  if ([fields][log_type] == "type1") {
    grok {
      match => [ "message" , "%{NGINXACCESS}+%{GREEDYDATA:extra_fields}"]
      overwrite => [ "message" ]
    }
    mutate {
      convert => ["response", "integer"]
      convert => ["bytes", "integer"]
      convert => ["responsetime", "float"]
    }
    geoip {
      source => "clientip"
      target => "geoip"
      add_tag => [ "nginx-geoip" ]
    }
    date {
      match => [ "timestamp" , "dd/MMM/YYYY:HH:mm:ss Z" ]
      remove_field => [ "timestamp" ]
    }
    useragent {
      source => "user_agent"
    }
  } else {
      grok {
        match => [ "message" , "(?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER:threadid}\: \*%{NUMBER:connectionid} %{GREEDYDATA:message}, client: %{IP:client}, server: %{GREEDYDATA:server}, request: "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion}))"(, upstream: "%{GREEDYDATA:upstream}")?, host: "%{DATA:host}"(, referrer: "%{GREEDYDATA:referrer}")?"]
        overwrite => [ "message" ]
      }
      mutate {
        convert => ["response", "integer"]
        convert => ["bytes", "integer"]
        convert => ["responsetime", "float"]
      }
      geoip {
        source => "clientip"
        target => "geoip"
        add_tag => [ "nginx-geoip" ]
      }
      date {
        match => [ "timestamp" , "dd/MMM/YYYY:HH:mm:ss Z" ]
        remove_field => [ "timestamp" ]
      }
      useragent {
        source => "user_agent"
      }
    }
}

output {
  if ([fields][log_type] == "type1") {
    amazon_es {
      hosts => ["vpc-XXXX-XXXX.ap-southeast-1.es.amazonaws.com"]
      region => "ap-southeast-1"
      aws_access_key_id => 'XXXX'
      aws_secret_access_key => 'XXXX'
      index => "nginx-access-logs-%{+YYYY.MM.dd}"
    }
} else {
    amazon_es {
      hosts => ["vpc-XXXX-XXXX.ap-southeast-1.es.amazonaws.com"]
      region => "ap-southeast-1"
      aws_access_key_id => 'XXXX'
      aws_secret_access_key => 'XXXX'
      index => "nginx-error-logs-%{+YYYY.MM.dd}"
    }
  }
    stdout { 
      codec => rubydebug 
    }
}

And we are receiving below error while starting logstash.

[2020-10-12T06:05:39,183][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.9.2", "jruby.version"=>"jruby 9.2.13.0 (2.5.7) 2020-08-03 9a89c94bcc OpenJDK 64-Bit Server VM 25.265-b01 on 1.8.0_265-b01 +indy +jit [linux-x86_64]"}
[2020-10-12T06:05:39,861][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-10-12T06:05:41,454][ERROR][logstash.agent           ] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of [ \\t\\r\\n], \"#\", \"{\", \",\", \"]\" at line 32, column 263 (byte 918) after filter {\n  if ([fields][log_type] == \"type1\") {\n    grok {\n      match => [ \"message\" , \"%{NGINXACCESS}+%{GREEDYDATA:extra_fields}\"]\n      overwrite => [ \"message\" ]\n    }\n    mutate {\n      convert => [\"response\", \"integer\"]\n      convert => [\"bytes\", \"integer\"]\n      convert => [\"responsetime\", \"float\"]\n    }\n    geoip {\n      source => \"clientip\"\n      target => \"geoip\"\n      add_tag => [ \"nginx-geoip\" ]\n    }\n    date {\n      match => [ \"timestamp\" , \"dd/MMM/YYYY:HH:mm:ss Z\" ]\n      remove_field => [ \"timestamp\" ]\n    }\n    useragent {\n      source => \"user_agent\"\n    }\n  } else {\n      grok {\n        match => [ \"message\" , \"(?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME}) \\[%{LOGLEVEL:severity}\\] %{POSINT:pid}#%{NUMBER:threadid}\\: \\*%{NUMBER:connectionid} %{GREEDYDATA:message}, client: %{IP:client}, server: %{GREEDYDATA:server}, request: \"", :backtrace=>["/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:32:in `compile_imperative'", "org/logstash/execution/AbstractPipelineExt.java:183:in `initialize'", "org/logstash/execution/JavaBasePipelineExt.java:69:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:44:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:52:in `execute'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:357:in `block in converge_state'"]}
[2020-10-12T06:05:41,795][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2020-10-12T06:05:46,685][INFO ][logstash.runner          ] Logstash shut down.
[2020-10-12T06:05:46,706][ERROR][org.logstash.Logstash    ] java.lang.IllegalStateException: Logstash stopped processing because of an error: (SystemExit) exit

There seems to be some formatting issue. Please help what is the problem

=================================UPDATE===================================

For all those who are looking for a robust grok filter for nginx access and error logs ... please try below filter patterns.

Access_Logs - %{IPORHOST:remote_ip} - %{DATA:user_name} \[%{HTTPDATE:access_time}\] \"%{WORD:http_method} %{URIPATHPARAM:url} HTTP/%{NUMBER:http_version}\" %{NUMBER:response_code} %{NUMBER:body_sent_bytes} \"%{SPACE:referrer}\" \"%{DATA:agent}\" %{NUMBER:duration} req_header:\"%{DATA:req_header}\" req_body:\"%{DATA:req_body}\" resp_header:\"%{DATA:resp_header}\" resp_body:\"%{GREEDYDATA:resp_body}\"

Error_Logs - (?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER:threadid}\: \*%{NUMBER:connectionid} %{DATA:errormessage}, client: %{IP:client}, server: %{IP:server}, request: \"(?<httprequest>%{WORD:httpcommand} %{NOTSPACE:httpfile} HTTP/(?<httpversion>[0-9.]*))\", host: \"%{NOTSPACE:host}\"(, referrer: \"%{NOTSPACE:referrer}\")?


Solution

  • Grok pattern on line 32 is the issue. Need to escape all " characters. Below is an escaped version of the GROK.

    grok {
            match => [ "message" , "(?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME})\[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER:threadid}\: \*%{NUMBER:connectionid} %{GREEDYDATA:message}, client: %{IP:client}, server: %{GREEDYDATA:server}, request: \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion}))\"(, upstream: \"%{GREEDYDATA:upstream}\")?, host: \"%{DATA:host}\"(, referrer: \"%{GREEDYDATA:referrer}\")?"]
            overwrite => [ "message" ]
          }