Describe the issue
When Fluent Operator is deployed on a Kubernetes cluster, the operator process consumes ~27 of the 32 CPU cores available on a single node. This is far beyond expected usage and causes resource contention.
Deleting the fluent pod does not help; the high CPU usage returns once the new pod is created. I need help identifying the cause of the high CPU load, thanks.
fluent-operator-df4555f76-ht6f2.log
2025-09-15T04:49:19Z INFO setup starting manager
2025-09-15T04:49:19Z INFO controller-runtime.metrics Starting metrics server
2025-09-15T04:49:19Z INFO starting server {"name": "health probe", "addr": "[::]:8081"}
2025-09-15T04:49:19Z INFO controller-runtime.metrics Serving metrics server {"bindAddress": ":8080", "secure": false}
2025-09-15T04:49:19Z INFO Starting EventSource {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1alpha2.FluentBit"}
Trace[833216769]: [208.550494ms] [208.550494ms] END
I0915 04:49:45.438100 1 trace.go:236] Trace[442998739]: "DeltaFIFO Pop Process" ID:XXXXX,Depth:16,Reason:slow event handlers blocking the queue (15-Sep-2025 04:49:45.204) (total time: 186ms):
Trace[442998739]: [186.678635ms] [186.678635ms] END
I0915 04:49:45.830454 1 trace.go:236] Trace[989326455]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.30.3/tools/cache/reflector.go:232 (15-Sep-2025 04:49:19.799) (total time: 25965ms):
Trace[989326455]: ---"Objects listed" error:<nil> 25832ms (04:49:45.631)
Trace[989326455]: [25.965573843s] [25.965573843s] END
2025-09-15T04:57:02Z INFO controllers.FluentBitConfig Fluent Bit main configuration has updated {"logging-control-plane": "monitoring", "fluentbitconfig": "fluent-bit-config", "secret": "fluent-bit-config"}
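For triage, the slow client-go traces in a log like the one above can be pulled out with a small script. This is a minimal sketch, not part of Fluent Operator: the function name `slow_traces`, the 1000 ms default threshold, and the regular expression are assumptions based only on the two trace header lines shown in this report.

```python
import re

# Matches the header line of a client-go trace as it appears in this log, e.g.
#   ... Trace[989326455]: "Reflector ListAndWatch" name:... (total time: 25965ms):
# The pattern is an assumption derived from the lines in this report.
TRACE_HEADER = re.compile(
    r'Trace\[(?P<id>\d+)\]: "(?P<name>[^"]+)".*\(total time: (?P<ms>\d+)ms\)'
)


def slow_traces(log_text, threshold_ms=1000):
    """Return (name, trace_id, total_ms) for traces at or above threshold_ms,
    slowest first."""
    found = []
    for line in log_text.splitlines():
        match = TRACE_HEADER.search(line)
        if match and int(match.group("ms")) >= threshold_ms:
            found.append(
                (match.group("name"), match.group("id"), int(match.group("ms")))
            )
    return sorted(found, key=lambda item: item[2], reverse=True)


# Two header lines copied from the log in this report.
SAMPLE = '''I0915 04:49:45.438100 1 trace.go:236] Trace[442998739]: "DeltaFIFO Pop Process" ID:XXXXX,Depth:16,Reason:slow event handlers blocking the queue (15-Sep-2025 04:49:45.204) (total time: 186ms):
I0915 04:49:45.830454 1 trace.go:236] Trace[989326455]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.30.3/tools/cache/reflector.go:232 (15-Sep-2025 04:49:19.799) (total time: 25965ms):'''

if __name__ == "__main__":
    for name, trace_id, total_ms in slow_traces(SAMPLE):
        print(f"{name} (Trace[{trace_id}]): {total_ms} ms")
```

With the default threshold, only the "Reflector ListAndWatch" trace (about 26 s for the initial list) is reported; the 186 ms "DeltaFIFO Pop Process" trace is filtered out.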
Deployed fluent-operator 3.4.0 in the "operators" namespace using the fluent-operator Helm chart. The configuration is below:
Fluent Operator configurations
fluent-operator:
  enabled: true
  fluent-operator:
    containerRuntime: containerd
    operator:
      # The init container is to get the actual storage path of the docker log files so that it can be mounted to collect the logs.
      # see https://github.com/fluent/fluent-operator/blob/master/manifests/setup/fluent-operator-deployment.yaml#L26
      logPath:
        # The operator currently assumes a Docker container runtime path for the logs as the default, for other container runtimes you can set the location explicitly below.
        containerd: /var/log/containers/*.log
      disableComponentControllers: "fluentd"
    fluentbit:
      # Installs a sub chart carrying the CRDs for the fluent-bit controller. The sub chart is enabled by default.
      crdsEnable: true
      enable: false
    fluentd:
      # Installs a sub chart carrying the CRDs for the fluentd controller. The sub chart is enabled by default.
      crdsEnable: false
Deployed fluent-bit in the "monitoring" namespace using the fluent-operator Helm chart. The configuration is below:
Fluent bit configurations
fluent-operator:
  containerRuntime: containerd
  operator:
    enable: false
    disableComponentControllers: "fluentd"
  fluentbit:
    # Installs a sub chart carrying the CRDs for the fluent-bit controller. The sub chart is enabled by default.
    enable: true
Please let me know if you want me to provide any other details.
Old bug ref: #1717
To Reproduce
Deploy Fluent Operator version v3.4.0 on a Kubernetes cluster.
Observe CPU usage on the Node where the operator is running.
Notice that the operator consumes ~27 cores out of 32.
Expected behavior
Fluent Operator should run normally with minimal CPU usage, not consuming nearly all available cores.
Your Environment
Fluent Operator version: 3.4.0
Container Runtime: containerd
Operating System: Ubuntu 22.04
Kubernetes version: 1.35
How did you install fluent operator?
No response
Additional context
No response