I have been trying to use the fluent-operator to deploy fluentbit and fluentd in a multi-tenant scenario in an EKS cluster.
The goal is to collect logs with fluentbit and forward them to fluentd, which processes them and sends them to OpenSearch.
The logs are being collected by fluentbit, but the fluentbit pod logs the following errors when trying to communicate with fluentd:
[2023/02/10 17:54:57] [error] [net] TCP connection failed: fluentd.fluent.svc:24224 (Connection refused)
[2023/02/10 17:54:57] [error] [output:forward:forward.0] no upstream connections available
[2023/02/10 17:54:57] [error] [engine] chunk '12-1676051688.632628964.flb' cannot be retried: task_id=16, input=tail.1 > output=forward.0
[2023/02/10 17:54:57] [ warn] [engine] failed to flush chunk '12-1676051696.570563472.flb', retry in 6 seconds: task_id=7, input=tail.1 > output=forward.0 (out_id=0)
[2023/02/10 17:54:57] [error] [engine] chunk '12-1676051685.661115204.flb' cannot be retried: task_id=8, input=tail.1 > output=forward.0
[2023/02/10 17:54:57] [ warn] [engine] failed to flush chunk '12-1676051696.742618827.flb', retry in 6 seconds: task_id=10, input=tail.1 > output=forward.0 (out_id=0)
[2023/02/10 17:54:57] [ info] [input:tail:tail.1] inode=45094081 handle rotation(): /var/log/containers/fluent-bit-dj2j8_fluent_fluent-bit-a1d1b1304f8a9f66bb394f20e2400898f9dbe354992f4190e44d2f6b2d48d80f.log => /var/log/pods/fluent_fluent-bit-dj2j8_b907b949-bc53-47e6-91f0-709647fd7733/fluent-bit/0.log.20230210-175457
[2023/02/10 17:54:57] [ info] [input:tail:tail.1] inotify_fs_remove(): inode=45094081 watch_fd=966
Fluentd starts up fine, but then it can't connect to OpenSearch:
level=info msg="Fluentd started"
2023-02-14 21:22:23 +0000 [info]: init supervisor logger path=nil rotate_age=nil rotate_size=nil
2023-02-14 21:22:23 +0000 [info]: parsing config file is succeeded path="/fluentd/etc/fluent.conf"
2023-02-14 21:22:24 +0000 [info]: gem 'fluentd' version '1.15.3'
2023-02-14 21:22:24 +0000 [info]: gem 'fluent-plugin-aws-elasticsearch-service' version '2.4.1'
2023-02-14 21:22:24 +0000 [info]: gem 'fluent-plugin-dedot_filter' version '1.0.0'
2023-02-14 21:22:24 +0000 [info]: gem 'fluent-plugin-detect-exceptions' version '0.0.14'
2023-02-14 21:22:24 +0000 [info]: gem 'fluent-plugin-elasticsearch' version '5.2.4'
2023-02-14 21:22:24 +0000 [info]: gem 'fluent-plugin-grafana-loki' version '1.2.20'
2023-02-14 21:22:24 +0000 [info]: gem 'fluent-plugin-kafka' version '0.18.1'
2023-02-14 21:22:24 +0000 [info]: gem 'fluent-plugin-label-router' version '0.2.10'
2023-02-14 21:22:24 +0000 [info]: gem 'fluent-plugin-multi-format-parser' version '1.0.0'
2023-02-14 21:22:24 +0000 [info]: gem 'fluent-plugin-opensearch' version '1.0.10'
2023-02-14 21:22:24 +0000 [info]: gem 'fluent-plugin-oss' version '0.0.2'
2023-02-14 21:22:24 +0000 [info]: gem 'fluent-plugin-record-modifier' version '2.1.1'
2023-02-14 21:22:24 +0000 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '2.4.0'
2023-02-14 21:22:24 +0000 [info]: gem 'fluent-plugin-s3' version '1.7.2'
2023-02-14 21:22:24 +0000 [info]: gem 'fluent-plugin-sumologic_output' version '1.8.0'
2023-02-14 21:22:25 +0000 [info]: using configuration file: <ROOT>
<system>
rpc_endpoint "127.0.0.1:24444"
log_level info
workers 1
</system>
<source>
@type forward
bind "0.0.0.0"
port 24224
</source>
<match **>
@id main
@type label_router
<route>
@label "@d2d59c6c703bc71418b747e394ea26bb"
<match>
namespaces fluent,kube-system,kyverno,observability-system
</match>
</route>
</match>
<label @d2d59c6c703bc71418b747e394ea26bb>
<match **>
@id ClusterFluentdConfig-cluster-fluentd-config::cluster::clusteroutput::fluentd-output-opensearch-0
@type opensearch
host "vpc-XXXXX-us-west-2-XXXXXXX.us-west-2.es.amazonaws.com"
logstash_format true
logstash_prefix "logs"
port 9200
</match>
</label>
<match **>
@type null
@id main-no-output
</match>
<label @FLUENT_LOG>
<match fluent.*>
@type null
@id main-fluentd-log
</match>
</label>
</ROOT>
2023-02-14 21:22:25 +0000 [info]: starting fluentd-1.15.3 pid=13 ruby="3.1.3"
2023-02-14 21:22:25 +0000 [info]: spawn command to main: cmdline=["/usr/bin/ruby", "-Eascii-8bit:ascii-8bit", "/usr/bin/fluentd", "-c", "/fluentd/etc/fluent.conf", "-p", "/fluentd/plugins", "--under-supervisor"]
2023-02-14 21:22:25 +0000 [info]: init supervisor logger path=nil rotate_age=nil rotate_size=nil
2023-02-14 21:22:27 +0000 [info]: #0 init worker0 logger path=nil rotate_age=nil rotate_size=nil
2023-02-14 21:22:27 +0000 [info]: adding match in @d2d59c6c703bc71418b747e394ea26bb pattern="**" type="opensearch"
2023-02-14 21:22:36 +0000 [warn]: #0 [ClusterFluentdConfig-cluster-fluentd-config::cluster::clusteroutput::fluentd-output-opensearch-0] Could not communicate to OpenSearch, resetting connection and trying again. connect_write timeout reached
2023-02-14 21:22:36 +0000 [warn]: #0 [ClusterFluentdConfig-cluster-fluentd-config::cluster::clusteroutput::fluentd-output-opensearch-0] Remaining retry: 14. Retry to communicate after 2 second(s).
2023-02-14 21:22:45 +0000 [warn]: #0 [ClusterFluentdConfig-cluster-fluentd-config::cluster::clusteroutput::fluentd-output-opensearch-0] Could not communicate to OpenSearch, resetting connection and trying again. connect_write timeout reached
The configurations of the fluentd-output-opensearch ClusterOutput, the fluentd and fluent-bit Services, the fluent-bit ClusterOutput, and the fluentd and fluent-bit Pods all seem OK:
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterOutput
metadata:
annotations:
meta.helm.sh/release-name: fluent-operator
meta.helm.sh/release-namespace: fluent
creationTimestamp: "2023-02-10T14:28:57Z"
generation: 1
labels:
app.kubernetes.io/managed-by: Helm
output.fluentd.fluent.io/enabled: "true"
name: fluentd-output-opensearch
resourceVersion: "8982613"
uid: dcacb711-72b5-4fb3-9ec8-fab78f85e171
spec:
outputs:
- buffer:
path: /buffers/opensearch
type: file
opensearch:
host: vpc-XXXX-us-west-2-XXXXXXXXXX.us-west-2.es.amazonaws.com
logstashFormat: true
logstashPrefix: logs
port: 9200
apiVersion: v1
kind: Service
metadata:
creationTimestamp: "2023-02-10T12:29:53Z"
labels:
app.kubernetes.io/component: fluentd
app.kubernetes.io/instance: fluentd
app.kubernetes.io/name: fluentd
name: fluentd
namespace: fluent
ownerReferences:
- apiVersion: fluentd.fluent.io/v1alpha1
blockOwnerDeletion: true
controller: true
kind: Fluentd
name: fluentd
uid: 98e29fa5-c0c0-4239-a7d8-61eb3ff59c18
resourceVersion: "8902659"
uid: 62273018-9921-41b9-a38a-32c703264a4c
spec:
clusterIP: 10.100.195.123
clusterIPs:
- 10.100.195.123
internalTrafficPolicy: Cluster
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
ports:
- name: forward
port: 24224
protocol: TCP
targetPort: forward
selector:
app.kubernetes.io/component: fluentd
app.kubernetes.io/instance: fluentd
app.kubernetes.io/name: fluentd
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}
apiVersion: v1
kind: Service
metadata:
creationTimestamp: "2023-02-13T18:44:57Z"
labels:
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: fluent-bit
name: fluent-bit
namespace: fluent
ownerReferences:
- apiVersion: fluentbit.fluent.io/v1alpha2
blockOwnerDeletion: true
controller: true
kind: FluentBit
name: fluent-bit
uid: 4fae4404-bea4-4cdd-aaf3-52b97d758bff
resourceVersion: "12053875"
uid: 89fa21db-cd70-4bcd-81f6-a1bd47cab74c
spec:
clusterIP: 10.100.253.128
clusterIPs:
- 10.100.253.128
internalTrafficPolicy: Cluster
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
ports:
- name: metrics
port: 2020
protocol: TCP
targetPort: 2020
selector:
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: fluent-bit
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterOutput
metadata:
annotations:
meta.helm.sh/release-name: fluent-operator
meta.helm.sh/release-namespace: fluent
creationTimestamp: "2023-02-10T12:29:44Z"
generation: 1
labels:
app.kubernetes.io/managed-by: Helm
fluentbit.fluent.io/component: logging
fluentbit.fluent.io/enabled: "true"
name: fluentd
resourceVersion: "8902495"
uid: b333b5e4-128d-419c-a726-cd8a8edeb4cf
spec:
forward:
host: fluentd.fluent.svc
port: 24224
matchRegex: (?:kube|service)\.(.*)
apiVersion: v1
kind: Pod
metadata:
annotations:
kubernetes.io/psp: eks.privileged
creationTimestamp: "2023-02-13T18:44:58Z"
generateName: fluentd-
labels:
app.kubernetes.io/component: fluentd
app.kubernetes.io/instance: fluentd
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: fluentd
controller-revision-hash: fluentd-d8ddb8bd9
statefulset.kubernetes.io/pod-name: fluentd-0
name: fluentd-0
namespace: fluent
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: StatefulSet
name: fluentd
uid: 7c239d83-7421-4ed6-88a8-2e5f6c76facd
resourceVersion: "12054209"
uid: 2a3b0d84-78e6-4ae1-a90c-4a3d6fccba71
spec:
containers:
- env:
- name: BUFFER_PATH
value: /buffers
image: kubesphere/fluentd:v1.15.3
imagePullPolicy: IfNotPresent
name: fluentd
ports:
- containerPort: 2021
name: metrics
protocol: TCP
- containerPort: 24224
name: forward
protocol: TCP
resources:
limits:
cpu: 500m
memory: 500Mi
requests:
cpu: 100m
memory: 128Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /fluentd/etc
name: config
readOnly: true
- mountPath: /buffers
name: fluentd-buffer-pvc
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-n7vbs
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
hostname: fluentd-0
nodeName: ip-172-23-137-214.us-west-2.compute.internal
preemptionPolicy: PreemptLowerPriority
priority: 0
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: fluentd
serviceAccountName: fluentd
terminationGracePeriodSeconds: 30
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
- name: fluentd-buffer-pvc
persistentVolumeClaim:
claimName: fluentd-buffer-pvc-fluentd-0
- name: config
secret:
defaultMode: 420
secretName: fluentd-config
- name: kube-api-access-n7vbs
projected:
defaultMode: 420
sources:
- serviceAccountToken:
expirationSeconds: 3607
path: token
- configMap:
items:
- key: ca.crt
path: ca.crt
name: kube-root-ca.crt
- downwardAPI:
items:
- fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
path: namespace
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2023-02-13T18:45:02Z"
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: "2023-02-13T18:45:14Z"
status: "True"
type: Ready
- lastProbeTime: null
lastTransitionTime: "2023-02-13T18:45:14Z"
status: "True"
type: ContainersReady
- lastProbeTime: null
lastTransitionTime: "2023-02-13T18:45:02Z"
status: "True"
type: PodScheduled
containerStatuses:
- containerID: containerd://729776240915f3377c6a9bf06a7e19a5213672da96468cd9c8b599f157d6386c
image: docker.io/kubesphere/fluentd:v1.15.3
imageID: docker.io/kubesphere/fluentd@sha256:58caf053b0f903ce3d0fc86b7bc748839e1a4aed6c7d8c1d3285d28553e93bce
lastState: {}
name: fluentd
ready: true
restartCount: 0
started: true
state:
running:
startedAt: "2023-02-13T18:45:13Z"
hostIP: 172.23.137.214
phase: Running
podIP: 172.30.43.227
podIPs:
- ip: 172.30.43.227
qosClass: Burstable
startTime: "2023-02-13T18:45:02Z"
apiVersion: v1
kind: Pod
metadata:
annotations:
kubernetes.io/psp: eks.privileged
creationTimestamp: "2023-02-13T18:44:57Z"
generateName: fluent-bit-
labels:
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: fluent-bit
controller-revision-hash: 7b98cd9f49
pod-template-generation: "1"
name: fluent-bit-2sx6v
namespace: fluent
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: DaemonSet
name: fluent-bit
uid: d33dcff3-2e04-42dd-816c-0edb3ea63a19
resourceVersion: "12053982"
uid: 296d44ba-b761-47f0-a4ec-ed55dfa507dd
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchFields:
- key: metadata.name
operator: In
values:
- ip-172-23-137-29.us-west-2.compute.internal
containers:
- env:
- name: NODE_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: spec.nodeName
- name: HOST_IP
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: status.hostIP
image: kubesphere/fluent-bit:v2.0.9
imagePullPolicy: IfNotPresent
name: fluent-bit
ports:
- containerPort: 2020
name: metrics
protocol: TCP
resources:
limits:
cpu: 500m
memory: 200Mi
requests:
cpu: 10m
memory: 25Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /containers
name: varlibcontainers
readOnly: true
- mountPath: /fluent-bit/config
name: config
readOnly: true
- mountPath: /var/log/
name: varlogs
readOnly: true
- mountPath: /var/log/journal
name: systemd
readOnly: true
- mountPath: /fluent-bit/tail
name: positions
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-gzqz8
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
nodeName: ip-172-23-137-29.us-west-2.compute.internal
preemptionPolicy: PreemptLowerPriority
priority: 0
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: fluent-bit
serviceAccountName: fluent-bit
terminationGracePeriodSeconds: 30
tolerations:
- operator: Exists
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
- effect: NoSchedule
key: node.kubernetes.io/disk-pressure
operator: Exists
- effect: NoSchedule
key: node.kubernetes.io/memory-pressure
operator: Exists
- effect: NoSchedule
key: node.kubernetes.io/pid-pressure
operator: Exists
- effect: NoSchedule
key: node.kubernetes.io/unschedulable
operator: Exists
volumes:
- hostPath:
path: /containers
type: ""
name: varlibcontainers
- name: config
secret:
defaultMode: 420
secretName: fluent-bit-config
- hostPath:
path: /var/log
type: ""
name: varlogs
- hostPath:
path: /var/log/journal
type: ""
name: systemd
- hostPath:
path: /var/lib/fluent-bit/
type: ""
name: positions
- name: kube-api-access-gzqz8
projected:
defaultMode: 420
sources:
- serviceAccountToken:
expirationSeconds: 3607
path: token
- configMap:
items:
- key: ca.crt
path: ca.crt
name: kube-root-ca.crt
- downwardAPI:
items:
- fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
path: namespace
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2023-02-13T18:44:57Z"
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: "2023-02-13T18:44:59Z"
status: "True"
type: Ready
- lastProbeTime: null
lastTransitionTime: "2023-02-13T18:44:59Z"
status: "True"
type: ContainersReady
- lastProbeTime: null
lastTransitionTime: "2023-02-13T18:44:57Z"
status: "True"
type: PodScheduled
containerStatuses:
- containerID: containerd://7e5101ec69c0b0f4749f3462306801b41ff41e6c288eff74a75e253e79626720
image: docker.io/kubesphere/fluent-bit:v2.0.9
imageID: docker.io/kubesphere/fluent-bit@sha256:7b66bfc157e60f17e26c5e1dbbe1ae79768446ffaad06b4a013a3efb65815cce
lastState: {}
name: fluent-bit
ready: true
restartCount: 0
started: true
state:
running:
startedAt: "2023-02-13T18:44:58Z"
hostIP: 172.23.137.29
phase: Running
podIP: 172.30.30.141
podIPs:
- ip: 172.30.30.141
qosClass: Burstable
startTime: "2023-02-13T18:44:57Z"
Also, the Fluentd resource's globalInputs seem correct for the forward input:
apiVersion: fluentd.fluent.io/v1alpha1
kind: Fluentd
metadata:
annotations:
meta.helm.sh/release-name: fluent-operator
meta.helm.sh/release-namespace: fluent
creationTimestamp: "2023-02-13T20:13:59Z"
finalizers:
- fluentd.fluent.io
generation: 1
labels:
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: fluentd
name: fluentd
namespace: fluent
resourceVersion: "12115920"
uid: f1448972-45d5-4a36-8d0d-ed2cf65ff730
spec:
fluentdCfgSelector:
matchLabels:
config.fluentd.fluent.io/enabled: "true"
globalInputs:
- forward:
bind: 0.0.0.0
port: 24224
image: kubesphere/fluentd:v1.15.3
replicas: 1
resources:
limits:
cpu: 500m
memory: 500Mi
requests:
cpu: 100m
memory: 128Mi
status:
messages: all matched cfgs is valid
state: active
All fluentbit, fluentd and fluent-operator pods are up and running in the same namespace.
I also exec'd into both the fluentbit and fluentd pods; running ping from the fluentbit container to fluentd's podIP seems to work:
root@fluent-bit-gtslr:/# ping 172.30.30.141
PING 172.30.30.141 (172.30.30.141) 56(84) bytes of data.
64 bytes from 172.30.30.141: icmp_seq=1 ttl=253 time=0.742 ms
64 bytes from 172.30.30.141: icmp_seq=2 ttl=253 time=0.711 ms
64 bytes from 172.30.30.141: icmp_seq=3 ttl=253 time=0.693 ms
64 bytes from 172.30.30.141: icmp_seq=4 ttl=253 time=0.730 ms
64 bytes from 172.30.30.141: icmp_seq=5 ttl=253 time=0.730 ms
^C
--- 172.30.30.141 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4093ms
rtt min/avg/max/mdev = 0.693/0.721/0.742/0.017 ms
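Note that ping only proves ICMP reachability between the pods; it says nothing about whether TCP port 24224 is actually accepting connections. Assuming a python3 interpreter is available inside the container (it may not be in these images), a minimal TCP-level probe could look like this:

```python
import socket

def tcp_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or unresolvable
        return False

# Hypothetical checks mirroring the failing forward output:
# tcp_open("fluentd.fluent.svc", 24224)  # the address fluentbit is using
# tcp_open("172.30.43.227", 24224)       # fluentd's podIP from the Pod dump above
```

A `Connection refused` on the service address, with the pod itself reachable, would point at the listener rather than at pod networking.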
Why am I getting this error?
I installed the fluent-operator via Helm:
helm install fluent-operator --create-namespace -n fluent https://github.com/fluent/fluent-operator/releases/download/v2.0.1/fluent-operator.tgz --values values.yaml
The values.yaml has the following configuration:
# Default values for fluentbit-operator.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.
#Set this to containerd or crio if you want to collect CRI format logs
containerRuntime: docker
# If you want to deploy a default Fluent Bit pipeline (including Fluent Bit Input, Filter, and output) to collect Kubernetes logs, you'll need to set the Kubernetes parameter to true
# see https://github.com/fluent/fluent-operator/tree/master/manifests/logging-stack
Kubernetes: true
operator:
# The init container is to get the actual storage path of the docker log files so that it can be mounted to collect the logs.
# see https://github.com/fluent/fluent-operator/blob/master/manifests/setup/fluent-operator-deployment.yaml#L26
initcontainer:
repository: "docker"
tag: "20.10"
container:
repository: "kubesphere/fluent-operator"
tag: "latest"
# FluentBit operator resources. Usually user needn't to adjust these.
resources:
limits:
cpu: 100m
memory: 60Mi
requests:
cpu: 100m
memory: 20Mi
# Specify custom annotations to be added to each Fluent Operator pod.
annotations: {}
## Reference to one or more secrets to be used when pulling images
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
imagePullSecrets: []
# - name: "image-pull-secret"
# Reference one more key-value pairs of labels that should be attached to fluent-operator
labels: {}
# myExampleLabel: someValue
logPath:
# The operator currently assumes a Docker container runtime path for the logs as the default, for other container runtimes you can set the location explicitly below.
# crio: /var/log
containerd: /var/log
fluentbit:
image:
repository: "kubesphere/fluent-bit"
tag: "v2.0.9"
# fluentbit resources. If you do want to specify resources, adjust them as necessary
#You can adjust it based on the log volume.
resources:
limits:
cpu: 500m
memory: 200Mi
requests:
cpu: 10m
memory: 25Mi
# Specify custom annotations to be added to each FluentBit pod.
annotations: {}
## Request to Fluent Bit to exclude or not the logs generated by the Pod.
# fluentbit.io/exclude: "true"
## Prometheus can use this tag to automatically discover the Pod and collect monitoring data
# prometheus.io/scrape: "true"
# Specify additional custom labels for fluentbit-pods
labels: {}
## Reference to one or more secrets to be used when pulling images
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
##
imagePullSecrets: [ ]
# - name: "image-pull-secret"
secrets: []
# List of volumes that can be mounted by containers belonging to the pod.
additionalVolumes: []
# Pod volumes to mount into the container's filesystem.
additionalVolumesMounts: []
# Remove the above empty volumes and volumesMounts, and then set additionalVolumes and additionalVolumesMounts as below if you want to collect node exporter metrics
# additionalVolumes:
# - name: hostProc
# hostPath:
# path: /proc/
# - name: hostSys
# hostPath:
# path: /sys/
# additionalVolumesMounts:
# - mountPath: /host/sys
# mountPropagation: HostToContainer
# name: hostSys
# readOnly: true
# - mountPath: /host/proc
# mountPropagation: HostToContainer
# name: hostProc
# readOnly: true
#Set a limit of memory that Tail plugin can use when appending data to the Engine.
# You can find more details here: https://docs.fluentbit.io/manual/pipeline/inputs/tail#config
#If the limit is reach, it will be paused; when the data is flushed it resumes.
#if the inbound traffic is less than 2.4Mbps, setting memBufLimit to 5MB is enough
#if the inbound traffic is less than 4.0Mbps, setting memBufLimit to 10MB is enough
#if the inbound traffic is less than 13.64Mbps, setting memBufLimit to 50MB is enough
input:
tail:
memBufLimit: 5MB
nodeExporterMetrics: {}
# uncomment below nodeExporterMetrics section if you want to collect node exporter metrics
# nodeExporterMetrics:
# tag: node_metrics
# scrapeInterval: 15s
# path:
# procfs: /host/proc
# sysfs: /host/sys
#Configure the output plugin parameter in FluentBit.
#You can set enable to true to output logs to the specified location.
output:
# You can find more supported output plugins here: https://github.com/fluent/fluent-operator/tree/master/docs/plugins/fluentbit/clusteroutput
es:
enable: false
host: "<Elasticsearch url like elasticsearch-logging-data.kubesphere-logging-system.svc>"
port: 9200
logstashPrefix: ks-logstash-log
# path: ""
# bufferSize: "4KB"
# index: "fluent-bit"
# httpUser:
# httpPassword:
# logstashFormat: true
# replaceDots: false
# enableTLS: false
# tls:
# verify: On
# debug: 1
# caFile: "<Absolute path to CA certificate file>"
# caPath: "<Absolute path to scan for certificate files>"
# crtFile: "<Absolute path to private Key file>"
# keyFile: "<Absolute path to private Key file>"
# keyPassword:
# vhost: "<Hostname to be used for TLS SNI extension>"
kafka:
enable: false
brokers: "<kafka broker list like xxx.xxx.xxx.xxx:9092,yyy.yyy.yyy.yyy:9092>"
topics: ks-log
opentelemetry: {}
# You can configure the opentelemetry-related configuration here
opensearch: {}
# You can configure the opensearch-related configuration here
stdout:
enable: true
forward:
enable: true
host: fluentd
port: 24224
#Configure the default filters in FluentBit.
# The `filter` will filter and parse the collected log information and output the logs into a uniform format. You can choose whether to turn this on or not.
filter:
kubernetes:
enable: true
labels: true
annotations: true
containerd:
# This is customized lua containerd log format converter, you can refer here:
# https://github.com/fluent/fluent-operator/blob/master/charts/fluent-operator/templates/fluentbit-clusterfilter-containerd.yaml
# https://github.com/fluent/fluent-operator/blob/master/charts/fluent-operator/templates/fluentbit-containerd-config.yaml
enable: true
systemd:
enable: true
fluentd:
enable: true
name: fluentd
port: 24224
image:
repository: "kubesphere/fluentd"
tag: "v1.15.3"
replicas: 1
forward:
port: 24224
watchedNamespaces:
- default
- kube-system
- test-namespace
- fluent
resources:
limits:
cpu: 500m
memory: 500Mi
requests:
cpu: 100m
memory: 128Mi
# Configure the output plugin parameter in Fluentd.
# Fluentd is disabled by default, if you enable it make sure to also set up an output to use.
output:
es:
enable: false
host: elasticsearch-logging-data.kubesphere-logging-system.svc
port: 9200
logstashPrefix: ks-logstash-log
buffer:
enable: false
type: file
path: /buffers/es
kafka:
enable: false
brokers: "my-cluster-kafka-bootstrap.default.svc:9091,my-cluster-kafka-bootstrap.default.svc:9092,my-cluster-kafka-bootstrap.default.svc:9093"
topicKey: kubernetes_ns
buffer:
enable: false
type: file
path: /buffers/kafka
stdout:
enable: true
opensearch:
enable: true
host: vpc-XXX-us-west-2-XXXXXXXX.us-west-2.es.amazonaws.com
port: 9200
logstashPrefix: logs
buffer:
enable: true
type: file
path: /buffers/opensearch
nameOverride: ""
fullnameOverride: ""
namespaceOverride: ""
I have found a solution.
It turns out fluentd refuses the fluentbit connection if it can't connect to OpenSearch first.
I was sending logs to OpenSearch over HTTP on port 9200, so I tested port 443 instead.
Connecting to OpenSearch on port 443, from both the node and the pod, was the only request that worked.
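A quick way to check which scheme/port combination the endpoint actually accepts is a small probe like this (a sketch only; the redacted VPC hostname above is a placeholder, and python3 is assumed to be available):

```python
import urllib.error
import urllib.request

def probe(url: str, timeout: float = 5.0):
    """Return the HTTP status code for `url`, or the raised error on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # got an HTTP response, so the endpoint is reachable
    except (urllib.error.URLError, OSError) as exc:
        return exc       # connection-level failure: timeout, refused, TLS error

# Hypothetical checks against the (redacted) domain endpoint:
# probe("https://vpc-XXXXX-us-west-2-XXXXXXX.us-west-2.es.amazonaws.com:443")
# probe("http://vpc-XXXXX-us-west-2-XXXXXXX.us-west-2.es.amazonaws.com:9200")
```

Here one would expect the `https://…:443` URL to return a status code while the `http://…:9200` URL fails at the connection level, matching the behaviour described above.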
So, I just added port 443 and scheme https to values.yaml. After that, logs started showing up in OpenSearch Dashboards (Kibana). The final values.yaml looks like this:
# Default values for fluentbit-operator.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.
#Set this to containerd or crio if you want to collect CRI format logs
containerRuntime: docker
# If you want to deploy a default Fluent Bit pipeline (including Fluent Bit Input, Filter, and output) to collect Kubernetes logs, you'll need to set the Kubernetes parameter to true
# see https://github.com/fluent/fluent-operator/tree/master/manifests/logging-stack
Kubernetes: true
operator:
# The init container is to get the actual storage path of the docker log files so that it can be mounted to collect the logs.
# see https://github.com/fluent/fluent-operator/blob/master/manifests/setup/fluent-operator-deployment.yaml#L26
initcontainer:
repository: "docker"
tag: "20.10"
container:
repository: "kubesphere/fluent-operator"
tag: "latest"
# FluentBit operator resources. Usually user needn't to adjust these.
resources:
limits:
cpu: 100m
memory: 60Mi
requests:
cpu: 100m
memory: 20Mi
# Specify custom annotations to be added to each Fluent Operator pod.
annotations: {}
## Reference to one or more secrets to be used when pulling images
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
imagePullSecrets: []
# - name: "image-pull-secret"
# Reference one more key-value pairs of labels that should be attached to fluent-operator
labels: {}
# myExampleLabel: someValue
logPath:
# The operator currently assumes a Docker container runtime path for the logs as the default, for other container runtimes you can set the location explicitly below.
# crio: /var/log
containerd: /var/log
fluentbit:
image:
repository: "kubesphere/fluent-bit"
tag: "v2.0.9"
# fluentbit resources. If you do want to specify resources, adjust them as necessary
#You can adjust it based on the log volume.
resources:
limits:
cpu: 500m
memory: 200Mi
requests:
cpu: 10m
memory: 25Mi
# Specify custom annotations to be added to each FluentBit pod.
annotations: {}
## Request to Fluent Bit to exclude or not the logs generated by the Pod.
# fluentbit.io/exclude: "true"
## Prometheus can use this tag to automatically discover the Pod and collect monitoring data
# prometheus.io/scrape: "true"
# Specify additional custom labels for fluentbit-pods
labels: {}
## Reference to one or more secrets to be used when pulling images
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
##
imagePullSecrets: [ ]
# - name: "image-pull-secret"
secrets: []
# List of volumes that can be mounted by containers belonging to the pod.
additionalVolumes: []
# Pod volumes to mount into the container's filesystem.
additionalVolumesMounts: []
# Remove the above empty volumes and volumesMounts, and then set additionalVolumes and additionalVolumesMounts as below if you want to collect node exporter metrics
# additionalVolumes:
# - name: hostProc
# hostPath:
# path: /proc/
# - name: hostSys
# hostPath:
# path: /sys/
# additionalVolumesMounts:
# - mountPath: /host/sys
# mountPropagation: HostToContainer
# name: hostSys
# readOnly: true
# - mountPath: /host/proc
# mountPropagation: HostToContainer
# name: hostProc
# readOnly: true
#Set a limit of memory that Tail plugin can use when appending data to the Engine.
# You can find more details here: https://docs.fluentbit.io/manual/pipeline/inputs/tail#config
#If the limit is reach, it will be paused; when the data is flushed it resumes.
#if the inbound traffic is less than 2.4Mbps, setting memBufLimit to 5MB is enough
#if the inbound traffic is less than 4.0Mbps, setting memBufLimit to 10MB is enough
#if the inbound traffic is less than 13.64Mbps, setting memBufLimit to 50MB is enough
input:
tail:
memBufLimit: 5MB
nodeExporterMetrics: {}
# uncomment below nodeExporterMetrics section if you want to collect node exporter metrics
# nodeExporterMetrics:
# tag: node_metrics
# scrapeInterval: 15s
# path:
# procfs: /host/proc
# sysfs: /host/sys
#Configure the output plugin parameter in FluentBit.
#You can set enable to true to output logs to the specified location.
output:
# You can find more supported output plugins here: https://github.com/fluent/fluent-operator/tree/master/docs/plugins/fluentbit/clusteroutput
es:
enable: false
host: "<Elasticsearch url like elasticsearch-logging-data.kubesphere-logging-system.svc>"
port: 9200
logstashPrefix: ks-logstash-log
# path: ""
# bufferSize: "4KB"
# index: "fluent-bit"
# httpUser:
# httpPassword:
# logstashFormat: true
# replaceDots: false
# enableTLS: false
# tls:
# verify: On
# debug: 1
# caFile: "<Absolute path to CA certificate file>"
# caPath: "<Absolute path to scan for certificate files>"
# crtFile: "<Absolute path to private Key file>"
# keyFile: "<Absolute path to private Key file>"
# keyPassword:
# vhost: "<Hostname to be used for TLS SNI extension>"
kafka:
enable: false
brokers: "<kafka broker list like xxx.xxx.xxx.xxx:9092,yyy.yyy.yyy.yyy:9092>"
topics: ks-log
opentelemetry: {}
# You can configure the opentelemetry-related configuration here
opensearch: {}
# You can configure the opensearch-related configuration here
stdout:
enable: true
# forward: # {{- if .Values.Kubernetes -}} {{- if .Values.fluentd.enable -}}
# host: fluentd.fluent.svc.cluster.local # host: {{ .Values.fluentd.name }}.{{ .Release.Namespace }}.svc on fluentbit-output-forward.yaml
# port: 24224 # {{ .Values.fluentd.forward.port }}
#Configure the default filters in FluentBit.
# The `filter` will filter and parse the collected log information and output the logs into a uniform format. You can choose whether to turn this on or not.
filter:
kubernetes:
enable: true
labels: true
annotations: true
containerd:
# This is customized lua containerd log format converter, you can refer here:
# https://github.com/fluent/fluent-operator/blob/master/charts/fluent-operator/templates/fluentbit-clusterfilter-containerd.yaml
# https://github.com/fluent/fluent-operator/blob/master/charts/fluent-operator/templates/fluentbit-containerd-config.yaml
enable: false
systemd:
enable: false
fluentd:
enable: true
name: fluentd
port: 24224 # port: {{ .Values.fluentd.port }} on fluentd-fluentd.yaml
image:
repository: "kubesphere/fluentd"
tag: "v1.15.3"
replicas: 1
forward:
port: 24224 # port: {{ .Values.fluentd.forward.port }} on fluentbit-output-forward.yaml
watchedNamespaces:
- fluent
- observability-system
- default
resources:
limits:
cpu: 500m
memory: 500Mi
requests:
cpu: 100m
memory: 128Mi
# Configure the output plugin parameter in Fluentd.
# Fluentd is disabled by default, if you enable it make sure to also set up an output to use.
output:
es:
enable: false
host: elasticsearch-logging-data.kubesphere-logging-system.svc
port: 9200
logstashPrefix: ks-logstash-log
buffer:
enable: false
type: file
path: /buffers/es
kafka:
enable: false
brokers: "my-cluster-kafka-bootstrap.default.svc:9091,my-cluster-kafka-bootstrap.default.svc:9092,my-cluster-kafka-bootstrap.default.svc:9093"
topicKey: kubernetes_ns
buffer:
enable: false
type: file
path: /buffers/kafka
stdout:
enable: true
opensearch:
enable: true
host: vpc-XXXXX-us-west-2-XXXXXXXX.us-west-2.es.amazonaws.com
port: 443
logstashPrefix: logs
scheme: https
# buffer:
# enable: false
# type: file
# path: /buffers/opensearch
nameOverride: ""
fullnameOverride: ""
namespaceOverride: ""
Keep in mind that fluentd is running on a Kubernetes cluster (EKS).
Another issue I had to face was that after upgrading the fluent-operator release, the changes weren't applied to the fluentd pod.
This is because the fluentd template doesn't handle parameters like scheme, but the CRD does: https://github.com/fluent/helm-charts/blob/main/charts/fluent-operator/crds/fluentd.fluent.io_clusteroutputs.yaml#L1411
So I had to apply this change manually and then kill the fluentd pod. After that, the pod picked up the change and rendered the https scheme:
kubectl get clusteroutput fluentd-output-opensearch -o yaml
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterOutput
metadata:
annotations:
meta.helm.sh/release-name: fluent-operator
meta.helm.sh/release-namespace: fluent
creationTimestamp: "2023-02-15T20:35:26Z"
generation: 2
labels:
app.kubernetes.io/managed-by: Helm
output.fluentd.fluent.io/enabled: "true"
name: fluentd-output-opensearch
resourceVersion: "14073767"
uid: 9705d00f-5c10-4b32-916c-f6a487a3ac70
spec:
outputs:
- opensearch:
host: vpc-XXXXX-us-west-2-XXXXXX.us-west-2.es.amazonaws.com
logstashFormat: true
logstashPrefix: logs
port: 443
scheme: https