Logstash variable in pipeline config


I am setting up Logstash to ingest Airflow logs. The following config is giving me the output I need:

input {
   file {
      path => "/my_path/logs/**/*.log"
      start_position => "beginning"
      sincedb_path => "/dev/null"
   }
}

filter {
   if [path] =~ /\/my_path\/logs\/containers\/.*/ or [path] =~ /\/my_path\/logs\/scheduler\/.*/ {
      drop{}
   }
   else {
      grok {
        "match" => [ "message", "\[%{TIMESTAMP_ISO8601:log_task_execution_datetime}\]%{SPACE}\{%{DATA:log_file_line}\}%{SPACE}%{WORD:log_level}%{SPACE}-%{SPACE}%{GREEDYDATA:log_message}" ]
        "remove_field" => [ "message" ]
      }
      date {
        "match" => [ "log_task_execution_datetime", "ISO8601" ]
        "target" => "log_task_execution_datetime"
        "timezone" => "UTC"
      }
      dissect {
        "mapping" => { "path" => "/my_path/logs/%{dag_id}/%{task_id}/%{dag_execution_datetime}/%{try_number}.%{}" }
        "add_field" => { "log_id_template" => "{%{dag_id}}-{%{task_id}}-{%{dag_execution_datetime}}-{%{try_number}}" }
      }
   }
}

output {
       stdout {codec => rubydebug{metadata => true}}
}

But I do not like having to specify the path "/my_path/logs/" multiple times. In my input section, I tried to use:

add_field => { "[@metadata][base_path]" => "/my_path/logs/" }

and then, in the filter section:

if [path] =~ /[@metadata][base_path].*/ or [path] =~ /[@metadata][base_path].*/ {
      drop{}
   }
...
dissect {
        "mapping" => { "path" => "[@metadata][base_path]%{dag_id}/%{task_id}/%{dag_execution_datetime}/%{try_number}.%{}" } 

But it doesn't seem to work for the regex in the filter or in the dissect mapping. I get a similar issue when trying to use an environment variable as described here.

I have the - maybe naïve - notion that I should be able to use one variable for all references to the base path. Is there a way?


Solution

  • Using an environment variable in a conditional is not supported. A GitHub issue requesting it as an enhancement has been open since 2016. The workaround is to use mutate+add_field to add a field to [@metadata] and then test that field.

    For the dissect filter,

    "mapping" => { "path" => "${[@metadata][base_path]}%{dag_id}/%{task_id} ...

    should work. The terms in a conditional are not sprintf'd, so you cannot use %{} there, but you can do a substring match. If FOO is set to /home/user/dir then

        mutate { add_field => { "[@metadata][base_path]" => "${FOO}" } }
        mutate { add_field => { "[path]" => "/home/user/dir/file" } }
        if [@metadata][base_path] in [path] {  mutate { add_field => { "matched" => true } } }
    

    results in the [matched] field getting added. I do not know of a way to anchor the string match, so the test succeeds wherever the value appears in [path]: if FOO were set to /dir/ instead, that would also match.
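
    Putting the pieces together for the config in the question, a minimal sketch might look like the following. It assumes FOO is exported as /my_path/logs/ (with the trailing slash) before Logstash starts, and that ${FOO} is substituted into plugin option values when the config is loaded. The field names [@metadata][containers_path] and [@metadata][scheduler_path] are made up for illustration. The grok and date filters from the question are unchanged and omitted here.

        input {
           file {
              # ${FOO} is replaced when the config is loaded
              path => "${FOO}**/*.log"
              start_position => "beginning"
              sincedb_path => "/dev/null"
           }
        }

        filter {
           # Conditionals cannot read ${FOO} directly, so copy the values
           # into [@metadata] fields (hypothetical names) and test those
           mutate {
              add_field => {
                 "[@metadata][containers_path]" => "${FOO}containers/"
                 "[@metadata][scheduler_path]" => "${FOO}scheduler/"
              }
           }
           # Unanchored substring tests, as described above
           if [@metadata][containers_path] in [path] or [@metadata][scheduler_path] in [path] {
              drop {}
           }
           else {
              dissect {
                 "mapping" => { "path" => "${FOO}%{dag_id}/%{task_id}/%{dag_execution_datetime}/%{try_number}.%{}" }
              }
           }
        }

    The same caveat applies: the "in" tests match anywhere in [path], not only at the start, so this only behaves like the original regexes if the base path cannot reappear deeper in the directory tree.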