elasticsearch kubernetes fluentd opensearch

fluentd with OpenSearch - where does the @timestamp field come from?


I am running fluentd as a DaemonSet in a Kubernetes cluster, and fluentd writes the log entries to OpenSearch. For reference, see https://github.com/fluent/fluentd-kubernetes-daemonset

Some background before my question: Kubernetes pods write to stdout, and the container runtime writes this output to files under /var/log/pods/<pod_specific_location>. Each line in these log files has the following format:

31-12-23T12:00:00.123456Z    stdout    F     my great log message

fluentd is configured to read these files and, using the cri parser plugin, transforms each line into:

{
"time": "31-12-23T12:00:00.123456Z",
"stream": "stdout",
"logtag": "F",
"message": "my great log message"
}
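
To make the transformation above concrete, here is a rough Python approximation of what the cri parser does to a single line. It is only a sketch: the regular expression and the function name parse_cri_line are my own assumptions for illustration, not the plugin's actual implementation.

```python
import re

# Assumed layout of a CRI log line: <time> <stream> <logtag> <message>
# \s+ is used so that both single-space and padded lines are accepted.
CRI_LINE = re.compile(
    r"^(?P<time>\S+)\s+(?P<stream>stdout|stderr)\s+(?P<logtag>[FP])\s+(?P<message>.*)$"
)

def parse_cri_line(line: str) -> dict:
    """Split one CRI-formatted log line into a record (hypothetical helper)."""
    match = CRI_LINE.match(line.rstrip("\n"))
    if match is None:
        raise ValueError(f"not a CRI-formatted log line: {line!r}")
    return match.groupdict()

record = parse_cri_line("2023-06-06T08:47:35.874884092Z stdout F my great log message")
print(record)
```

The resulting dictionary has the same four keys as the record shown above: time, stream, logtag and message.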

Now, say I run a pod in my cluster that writes the following log message:

hello

Later in the pipeline, the kubernetes metadata plugin enriches this record with Kubernetes metadata such as the namespace name and pod name, so it looks something like:

{
"stream":"stdout",
"logtag":"F",
"time":"31-12-23T12:00:00.123456Z",
"message": "my great log message",
"docker":
{"container_id":"9077644273956d3f3e9d171240f412b3b6e959984a5fd99adfcb77f9b998a370"},
"kubernetes":
{"container_name":"demo-app",
"namespace_name":"foo",
"pod_name":"foo-ns-app",
"container_image":"docker.io/yoavklein3/net-tools:latest",
"container_image_id":"docker.io/yoavklein3/net-tools@sha256:3fd9646a14d97ecc2d236a5bebd88faf617bc6045f1e4f32c49409f1c930879a",
"pod_id":"a69fb942-c0ab-457d-b752-ffa3fa27e574",
"pod_ip":"10.0.2.224",
"host":"ip-10-0-2-5.ec2.internal",
"master_url":"https://172.20.0.1:443/api",
"namespace_id":"6bdf5fe9-9a5a-4501-ab6c-deddd241e071",
"namespace_labels":{"kubernetes.io/metadata.name":"foo"}}}

Finally, the opensearch output plugin sends the record to OpenSearch.

When I open OpenSearch Dashboards, I can see a field called @timestamp, and I can't figure out where this field comes from.

This is a document in OpenSearch (apologies for not sticking to the example above exactly, but the concept remains the same):

{
  "_index": "logstash-2023.06.06",
  "_type": "_doc",
  "_id": "sVHjj4gByMQm1Wc45hv2",
  "_version": 1,
  "_score": null,
  "_source": {
    "stream": "stdout",
    "logtag": "F",
    "time": "2023-06-06T08:47:35.874884092Z",
    "docker": {
      "container_id": "9077644273956d3f3e9d171240f412b3b6e959984a5fd99adfcb77f9b998a370"
    },
    "kubernetes": {
      "container_name": "demo-app",
      "namespace_name": "foo",
      "pod_name": "foo-ns-app",
      "container_image": "docker.io/yoavklein3/net-tools:latest",
      "container_image_id": "docker.io/yoavklein3/net-tools@sha256:3fd9646a14d97ecc2d236a5bebd88faf617bc6045f1e4f32c49409f1c930879a",
      "pod_id": "a69fb942-c0ab-457d-b752-ffa3fa27e574",
      "pod_ip": "10.0.2.224",
      "host": "ip-10-0-2-5.ec2.internal",
      "master_url": "https://172.20.0.1:443/api",
      "namespace_id": "6bdf5fe9-9a5a-4501-ab6c-deddd241e071",
      "namespace_labels": {
        "kubernetes.io/metadata.name": "foo"
      }
    },
    "data": "This is from FOO namespace",
    "@timestamp": "2023-06-06T08:47:35.882677347+00:00",
    "tag": "kubernetes.var.log.containers.foo-ns-app_foo_demo-app-9077644273956d3f3e9d171240f412b3b6e959984a5fd99adfcb77f9b998a370.log"
  },
  "fields": {
    "@timestamp": [
      "2023-06-06T08:47:35.882Z"
    ],
    "time": [
      "2023-06-06T08:47:35.874Z"
    ]
  },
  "sort": [
    1686041255882
  ]
}

NOTE: the message field is missing and there is a data field instead. This is because the message field is parsed as JSON. It is irrelevant to the question; I mention it only to avoid confusion.

EDIT

I don't think that the source of this @timestamp field is the opensearch plugin, because when I run fluentd with OpenSearch outside a Kubernetes cluster, using other input plugins, I don't see this field.


Solution

  • I can see a field called @timestamp, and I just can't figure out where this field comes from...

    This field is added by the opensearch plugin. Its value is the point in time when the message is ingested.

    The field is only added if either logstash_format or include_timestamp is set to true.
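
    The rule above can be modelled in a few lines of Python. This is a hedged sketch of the behaviour, not the plugin's code: the function name model_opensearch_output is hypothetical, and it assumes that the timestamp is taken from the fluentd event time, that an existing @timestamp is left alone, and that the default "logstash" index prefix is in use (which matches the logstash-2023.06.06 index in the question).

```python
from datetime import datetime, timezone

def model_opensearch_output(record, event_time, logstash_format=False, include_timestamp=False):
    """Hypothetical model of when the output plugin adds @timestamp."""
    doc = dict(record)
    if logstash_format or include_timestamp:
        # Assumption: the value comes from the fluentd event time, and an
        # @timestamp already present in the record is not overwritten.
        doc.setdefault("@timestamp", event_time.isoformat())
    # Assumption: logstash_format also selects a date-based index name.
    index = f"logstash-{event_time:%Y.%m.%d}" if logstash_format else None
    return index, doc

event_time = datetime(2023, 6, 6, 8, 47, 35, 882677, tzinfo=timezone.utc)
record = {"stream": "stdout", "logtag": "F", "data": "This is from FOO namespace"}

index, doc = model_opensearch_output(record, event_time, logstash_format=True)
print(index, doc["@timestamp"])

_, plain = model_opensearch_output(record, event_time)
print("@timestamp" in plain)  # False: neither option is enabled
```

    This would also explain the observation in the EDIT: a setup that enables neither option produces documents without @timestamp, regardless of which input plugin is used.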