logstash kibana elastic-stack monolog gelf

Logstash and nested JSON from Monolog; why are arrays converted to JSON strings?


I am using PHP with Monolog. I am writing logs to a JSON file and also sending them via GELF to Logstash, which then forwards them to Elasticsearch.

The problem I have is that the `extra` object is missing in Kibana, and the `tags` field gets indexed as a string instead of a nested object.

Any idea how to convince Logstash/Kibana to parse the inner JSON fields as fields/objects rather than as a JSON string?

This is how it looks in Kibana:

{
   "_index":"logstash-2018.08.30",
   "_type":"doc",
   "_id":"TtHbiWUBc7g5w1yM8X6f",
   "_version":1,
   "_score":null,
   "_source":{
      "ctxt_task":"taskName",
      "@version":"1",
      "http_method":"GET",
      "user_agent":"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:61.0) Gecko/20100101 Firefox/61.0",
      "level":6,
      "message":"Finished task",
      "tags":"{\"hostname\":\"28571f0dc7e1\",\"region\":\"eu-west-1\",\"environment\":\"local\",\"processUniqueId\":\"5b87a4d843c20\"}",
      "url":"/assets/Logo.jpg",
      "ctxt_controller":"ControllerName",
      "memory_usage":"4 MB",
      "referrer":"https://local.project.net/account/login",
      "facility":"logger",
      "memory_peak_usage":"4 MB",
      "ctxt_timeElapsed":0.05187487602233887,
      "@timestamp":"2018-08-30T08:03:37.386Z",
      "ip":"172.18.0.1",
      "ctxt_start":1535616217.33417,
      "type":"gelf",
      "host":"18571f0dc7e9",
      "source_host":"172.18.0.8",
      "server":"local.project.net",
      "ctxt_end":1535616217.386045,
      "version":"1.0"
   },
   "fields":{
      "@timestamp":[
         "2018-08-30T08:03:37.386Z"
      ]
   },
   "sort":[
      1535616217386
   ]
}
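As a workaround on the Logstash side, the stringified `tags` field could be re-parsed with the `json` filter plugin. This is only a sketch based on the event above; note that `tags` is a field Logstash itself uses for an array of string tags, so parsing into a separate target such as `tags_parsed` (a name chosen here for illustration) avoids a clash:

    filter {
        if [tags] {
            json {
                source => "tags"
                target => "tags_parsed"
            }
        }
    }

This would turn the JSON string back into a nested object before it reaches Elasticsearch, though it treats the symptom rather than the cause (the formatter serializing arrays, as described below).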

My log entry looks like:

{
   "message":"Finished task",
   "context":{
      "controller":"ControllerName",
      "task":"taskName",
      "timeElapsed":0.02964186668395996,
      "start":1535614742.840069,
      "end":1535614742.869711,
      "content":""
   },
   "level":200,
   "level_name":"INFO",
   "channel":"logger",
   "datetime":{
      "date":"2018-08-30 08:39:02.869850",
      "timezone_type":3,
      "timezone":"Europe/London"
   },
   "extra":{
      "memory_usage":"14 MB",
      "memory_peak_usage":"14 MB",
      "tags":{
         "hostname":"28571f0dc7e1",
         "region":"eu-west-1",
         "environment":"local",
         "processUniqueId":"5b879f16be3f1"
      }
   }
}

My Logstash config:

input {
    tcp {
        port => 5000
    }
    gelf {
        port => 12201
        type => gelf
        codec => "json"
    }
}

output {
    elasticsearch {
        hosts => "172.17.0.1:9201"
    }
}

My Monolog config:

$gelfTransport = new \Gelf\Transport\UdpTransport(LOG_GELF_HOST, LOG_GELF_PORT);
$gelfPublisher = new \Gelf\Publisher($gelfTransport);
$gelfHandler = new \Monolog\Handler\GelfHandler($gelfPublisher, static::$logVerbosity);
$gelfHandler->setFormatter(new \Monolog\Formatter\GelfMessageFormatter());

// This is to prevent the application from failing if `GelfHandler` fails for some reason
$ignoreErrorHandlers = new \Monolog\Handler\WhatFailureGroupHandler([
    $gelfHandler
]);
$logger->pushHandler($ignoreErrorHandlers);

EDIT: So far my finding is that this is caused by `GelfMessageFormatter` converting the arrays to JSON strings:

$val = is_scalar($val) || null === $val ? $val : $this->toJson($val);
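This behavior makes sense given that the GELF spec only allows scalar values for additional fields. One possible workaround (a sketch, not tested against every Monolog version; the subclass name and the dot-separated key convention are my own assumptions) is to flatten nested arrays into dot-separated keys before the parent formatter serializes them, so each leaf value stays scalar:

```php
<?php

use Monolog\Formatter\GelfMessageFormatter;

// Hypothetical formatter that flattens nested context/extra arrays into
// dot-separated keys (e.g. "tags.hostname") instead of letting the parent
// serialize them to JSON strings. GELF additional fields must be scalar,
// and Elasticsearch maps dotted field names as nested objects.
class FlattenedGelfMessageFormatter extends GelfMessageFormatter
{
    public function format(array $record): \Gelf\Message
    {
        $record['context'] = $this->flatten($record['context'] ?? []);
        $record['extra']   = $this->flatten($record['extra'] ?? []);

        return parent::format($record);
    }

    private function flatten(array $data, string $prefix = ''): array
    {
        $flat = [];
        foreach ($data as $key => $value) {
            $newKey = $prefix === '' ? $key : $prefix . '.' . $key;
            if (is_array($value)) {
                // Recurse into nested arrays, extending the key path
                $flat += $this->flatten($value, $newKey);
            } else {
                $flat[$newKey] = $value;
            }
        }
        return $flat;
    }
}
```

With this, `extra.tags.hostname` would arrive as its own scalar field rather than being buried inside a JSON string, at the cost of losing the literal nested structure in the GELF payload itself.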

When the nested JSON is sent directly with netcat, e.g.:

echo -n '{
"field": 1,
"nestedField1": {"nf1": 1.1, "nf2": 1.2, "2nestedfield":{"2nf1":1.11, "2nf2":1.12}}
}' | gzip -c | nc -u -w1 bomcheck-logstash 12201

then the data in Kibana looks fine, so Logstash itself handles nested JSON correctly.


Solution

  • It looks like GELF doesn't support nested data structures out of the box. I've decided to use the native Logstash UDP input plugin instead:

    input {
        udp {
            port => 12514
            codec => "json"
        }
    }
    

    along with Monolog's `LogstashFormatter`:

    $connectionString = sprintf("udp://%s:%s", LOG_UDP_LOGSTASH_HOST, LOG_UDP_LOGSTASH_PORT);
    $handler = new \Monolog\Handler\SocketHandler($connectionString);
    $handler->setFormatter(new \Monolog\Formatter\LogstashFormatter('project', null, null, 'ctxt_', \Monolog\Formatter\LogstashFormatter::V1));
    $logger->pushHandler($handler);
    

    The nested data ends up correctly in Kibana.