How to Fix Kubernetes DiskPressure Pod Evictions Caused by Container Log Accumulation
Threat/Impact Level: HIGH | Downtime Risk: HIGH | Time to Fix: 15–30 mins
TL;DR
- What broke: Kubernetes worker nodes crossed the imagefs.available or nodefs.available eviction threshold because container log files under /var/log/pods/ or /var/lib/docker/containers/ were never rotated, consuming 80%+ of node disk.
- How to fix it: Enforce containerLogMaxSize and containerLogMaxFiles in the kubelet config, and immediately purge orphaned log files with find + truncate.
- Fast path: Use our Client-Side Sandbox below to auto-refactor your kubelet ConfigMap or container spec: paste your config and get corrected YAML instantly.
The Incident (What Does the Error Mean?)
Raw kubelet event from kubectl describe node <node-name>:
Conditions:
Type Status
---- ------
DiskPressure True
Events:
Warning Evicted pod/api-server-7d9f4b8c6-xk2p9
The node had condition: DiskPressure
Raw eviction event from kubectl get events --field-selector reason=Evicted -A:
WARNING Evicting pod api-server-7d9f4b8c6-xk2p9
because it exceeded its local ephemeral storage limit
or the node is under DiskPressure.
Immediate consequence: The kubelet's eviction manager fires. Pods are terminated without a graceful drain. Deployments restart pods onto other nodes — which are likely suffering the same log bloat — and the eviction cascades cluster-wide. Stateful workloads without PVCs lose ephemeral data permanently.
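To make the threshold crossing concrete, here is a minimal Python sketch of how a percentage or absolute eviction threshold compares against a node's free space. The function names and the sample figures are hypothetical, and the kubelet's real eviction manager also weighs soft thresholds, grace periods, and inode signals, so treat this as an illustration of the comparison only.

```python
# Illustrative only: mirrors the comparison behind signals such as
# nodefs.available, not the kubelet's actual implementation.

_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def threshold_bytes(threshold: str, capacity: int) -> float:
    """Convert a threshold like "10%" or "1Gi" into bytes."""
    if threshold.endswith("%"):
        return capacity * float(threshold[:-1]) / 100
    for suffix, factor in _UNITS.items():
        if threshold.endswith(suffix):
            return float(threshold[: -len(suffix)]) * factor
    return float(threshold)  # plain byte count


def under_disk_pressure(available: int, capacity: int, threshold: str) -> bool:
    """True when free space has fallen below the eviction threshold."""
    return available < threshold_bytes(threshold, capacity)


if __name__ == "__main__":
    capacity = 100 * 2**30  # hypothetical 100 GiB node filesystem
    print(under_disk_pressure(8 * 2**30, capacity, "10%"))   # 8 GiB free
    print(under_disk_pressure(12 * 2**30, capacity, "10%"))  # 12 GiB free
```

With a 10% threshold on a 100 GiB filesystem, the node reports pressure once less than 10 GiB is free, so the first call reports pressure and the second does not.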
The Attack Vector / Blast Radius
This is a cascading resource exhaustion failure, not a single-node problem.
Why it escalates fast:
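- Log rotation is often missing or ineffective. The kubelet's containerLogMaxSize and containerLogMaxFiles (defaults: 10Mi / 5) apply to CRI runtimes such as containerd; Docker's json-file driver does not rotate unless max-size is set in its log options. Where rotation is absent or the limits are set too high, a single verbose container (e.g., Istio sidecar, a Java app dumping stack traces) can write gigabytes in hours.
- With Docker, entries under /var/log/pods/ are symlinks into /var/lib/docker/containers/; with containerd, the log files are written under /var/log/pods/ directly. Deleting the pod does NOT immediately reclaim disk: the space stays allocated until the runtime garbage-collects the dead container and nothing holds the log file open.
- Evicted pods leave orphaned log directories. /var/log/pods/<namespace>_<pod>_<uid>/ persists after eviction. On a busy node cycling through evictions, these accumulate rapidly.
- Eviction threshold breach triggers a feedback loop: Eviction → new pod scheduled → new logs → disk fills faster → more evictions. Nodes can become NotReady within minutes.
- DaemonSet pods come straight back to the same node. They tolerate the disk-pressure taint, so an evicted DaemonSet pod is recreated in place, and pods running at a system-critical priority are among the last the kubelet evicts. Your log shippers (Fluentd, Filebeat) therefore keep writing their own buffers to the same full disk.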
Blast radius: Full node NotReady, cluster autoscaler spinning up replacement nodes that inherit the same misconfiguration, SLA breach, potential data loss for non-PVC workloads.
How to Fix It
Step 0: Immediate Triage (Run This First)
# Identify top disk consumers on the affected node (run via node shell or SSH)
du -sh /var/log/pods/* | sort -rh | head -20
du -sh /var/lib/containerd/* | sort -rh | head -10
# Truncate active log files over 500MB (do NOT delete them: space held by an open file is not freed until the writer closes it)
find /var/log/pods/ -name '*.log' -size +500M -exec truncate -s 0 {} \;
# Delete Failed (including evicted) pod objects so the kubelet can clean up their containers and log directories
kubectl delete pod --field-selector=status.phase=Failed -A
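If you would rather script the triage than eyeball du output, the following Python sketch ranks pod log directories by size. The function name is hypothetical, and it sums apparent file sizes (st_size) rather than allocated blocks, so the totals can differ slightly from du.

```python
import os


def pod_log_usage(root: str) -> list[tuple[str, int]]:
    """Return (pod_directory, total_bytes) pairs, largest first."""
    usage = []
    for entry in os.scandir(root):
        if not entry.is_dir(follow_symlinks=False):
            continue
        total = 0
        for dirpath, _dirs, files in os.walk(entry.path):
            for name in files:
                path = os.path.join(dirpath, name)
                try:
                    # os.stat follows symlinks, which matters where log
                    # entries are links into the runtime's directory.
                    total += os.stat(path).st_size
                except OSError:
                    continue  # file rotated away or dangling link
        usage.append((entry.name, total))
    return sorted(usage, key=lambda item: item[1], reverse=True)
```

On a node you would call pod_log_usage("/var/log/pods") as root and print the first 20 entries; the directory names follow the <namespace>_<pod>_<uid> pattern, so the worst offender is easy to map back to a workload.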
Basic Fix: Kubelet Log Rotation Config
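Edit the kubelet config on each affected node. The file is whatever the kubelet's --config flag points to: typically /var/lib/kubelet/config.yaml on kubeadm clusters (generated from the kubelet-config ConfigMap in kube-system), or a path such as /etc/kubernetes/kubelet-config.yaml on other setups.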
 apiVersion: kubelet.config.k8s.io/v1beta1
 kind: KubeletConfiguration
-# containerLogMaxSize and containerLogMaxFiles not set (defaults: 10Mi / 5)
+containerLogMaxSize: "50Mi"
+containerLogMaxFiles: 3
 evictionHard:
-  nodefs.available: "5%"
+  nodefs.available: "10%"
+  nodefs.inodesFree: "5%"
+imageGCHighThresholdPercent: 75
+imageGCLowThresholdPercent: 70
After editing, restart kubelet:
systemctl daemon-reload && systemctl restart kubelet
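It helps to know what ceiling a given pair of values actually buys you. The Python sketch below multiplies containerLogMaxSize by containerLogMaxFiles to get an approximate upper bound per container and per node. The helper names and the figure of 100 containers are assumptions for illustration, and the bound is approximate: the kubelet checks log sizes periodically, so a fast writer can overshoot before rotation kicks in.

```python
_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_quantity(value: str) -> int:
    """Parse a binary-suffixed quantity such as "50Mi" into bytes."""
    for suffix, factor in _UNITS.items():
        if value.endswith(suffix):
            return int(value[: -len(suffix)]) * factor
    return int(value)


def max_log_bytes(max_size: str, max_files: int, containers: int = 1) -> int:
    """Approximate upper bound on log data the kubelet retains."""
    return parse_quantity(max_size) * max_files * containers


if __name__ == "__main__":
    mib = 2**20
    print(max_log_bytes("10Mi", 5) // mib)       # kubelet defaults, one container
    print(max_log_bytes("50Mi", 3) // mib)       # values from the config above
    print(max_log_bytes("50Mi", 3, 100) // mib)  # hypothetical 100 containers
```

This prints 50, 150, and 15000 (MiB). Note that 50Mi × 3 permits more per container than the 10Mi × 5 defaults; the value of setting both explicitly is a known ceiling you can size the node disk against, so check that ceiling times your container density fits comfortably under the eviction threshold.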
Enterprise Best Practice: Enforce via DaemonSet + Logrotate + Resource Quotas
1. Container-level log limits in the Pod spec (defense-in-depth):
 containers:
 - name: api-server
   image: myapp:v2.1.0
+  resources:
+    limits:
+      ephemeral-storage: "2Gi"
+    requests:
+      ephemeral-storage: "512Mi"
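2. Namespace-level enforcement via LimitRange (a LimitRange only applies to the namespace it is created in, so deploy one per namespace you want covered):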
+apiVersion: v1
+kind: LimitRange
+metadata:
+  name: ephemeral-storage-limits
+  namespace: production
+spec:
+  limits:
+  - type: Container
+    max:
+      ephemeral-storage: "4Gi"
+    default:
+      ephemeral-storage: "1Gi"
+    defaultRequest:
+      ephemeral-storage: "256Mi"
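To see what this LimitRange does to an incoming container, here is a simplified Python model of the defaulting and max check. It is a sketch under stated assumptions, not the API server's admission code: the function names are hypothetical, only ephemeral-storage is handled, and min and ratio constraints are ignored.

```python
_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def to_bytes(value: str) -> int:
    """Parse a binary-suffixed quantity such as "4Gi" into bytes."""
    for suffix, factor in _UNITS.items():
        if value.endswith(suffix):
            return int(value[: -len(suffix)]) * factor
    return int(value)


def apply_limit_range(container: dict, item: dict) -> dict:
    """Fill in missing ephemeral-storage values, then enforce the max."""
    key = "ephemeral-storage"
    resources = container.setdefault("resources", {})
    limits = resources.setdefault("limits", {})
    requests = resources.setdefault("requests", {})
    had_limit = key in limits
    limits.setdefault(key, item["default"][key])
    if key not in requests:
        # An explicit limit with no request makes the request match the limit;
        # otherwise the LimitRange's defaultRequest applies.
        requests[key] = limits[key] if had_limit else item["defaultRequest"][key]
    if to_bytes(limits[key]) > to_bytes(item["max"][key]):
        raise ValueError(f"{container['name']}: limit {limits[key]} exceeds max")
    return container


if __name__ == "__main__":
    item = {
        "max": {"ephemeral-storage": "4Gi"},
        "default": {"ephemeral-storage": "1Gi"},
        "defaultRequest": {"ephemeral-storage": "256Mi"},
    }
    print(apply_limit_range({"name": "api-server"}, item)["resources"])
```

A container that declares nothing ends up with a 1Gi limit and a 256Mi request, while one that asks for more than 4Gi is rejected.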
3. Node-level logrotate DaemonSet for legacy containerd/docker log paths:
+apiVersion: apps/v1
+kind: DaemonSet
+metadata:
+  name: log-purger
+  namespace: kube-system
+spec:
+  selector:
+    matchLabels:
+      app: log-purger
+  template:
+    metadata:
+      labels:
+        app: log-purger   # must match spec.selector or the API server rejects the DaemonSet
+    spec:
+      tolerations:
+      - operator: Exists   # Run even on tainted/pressured nodes
+      hostPID: true
+      containers:
+      - name: log-purger
+        image: alpine:3.19
+        command:
+        - /bin/sh
+        - -c
+        - |
+          while true; do
+            # 204800k = 200 MiB; written in k because BusyBox find may not accept an M suffix
+            find /host/var/log/pods -name '*.log' -size +204800k \
+              -exec truncate -s 0 {} \;
+            find /host/var/log/pods -mindepth 1 -maxdepth 3 -type d -empty -delete
+            sleep 300
+          done
+        volumeMounts:
+        - name: varlog
+          mountPath: /host/var/log
+      volumes:
+      - name: varlog
+        hostPath:
+          path: /var/log
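Before letting a purge loop like this loose on production nodes, it is worth rehearsing it in dry-run mode. The Python sketch below performs the same truncation pass with a dry_run switch; the function name and parameters are hypothetical and it is not a drop-in replacement for the DaemonSet.

```python
import os


def truncate_large_logs(root: str, max_bytes: int, dry_run: bool = True) -> list[str]:
    """Truncate *.log files under root that exceed max_bytes.

    Returns the paths that were (or, in dry-run mode, would be) truncated.
    """
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(".log"):
                continue
            path = os.path.join(dirpath, name)
            try:
                if os.path.getsize(path) <= max_bytes:
                    continue
                if not dry_run:
                    # Truncating keeps the inode, so the writer's open
                    # file handle stays valid.
                    os.truncate(path, 0)
            except OSError:
                continue  # rotated away between listing and stat
            hits.append(path)
    return hits
```

Run it first with dry_run=True against /var/log/pods and review the returned paths; truncation discards the log content, so make sure your shipper has already forwarded anything you need.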
Prevention in CI/CD
1. OPA/Gatekeeper policy — block pods with no ephemeral-storage limit:
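package kubernetes.admission

# Written against a raw AdmissionReview (input.request), as used by a plain OPA admission webhook.
# Gatekeeper ConstraintTemplates use violation[{"msg": msg}] and input.review.object instead.
deny[msg] {
  input.request.kind.kind == "Pod"
  container := input.request.object.spec.containers[_]
  not container.resources.limits["ephemeral-storage"]
  msg := sprintf("Container '%v' must define ephemeral-storage limit", [container.name])
}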
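The same rule is easy to mirror in a pipeline script so that manifests fail fast before they ever reach the admission controller. This Python sketch is an assumption-laden equivalent, not part of OPA: the function name is hypothetical, it takes an already-parsed Pod manifest as a dict, and unlike the Rego above it also covers initContainers.

```python
def missing_ephemeral_limits(pod: dict) -> list[str]:
    """Names of containers (init containers included) with no ephemeral-storage limit."""
    spec = pod.get("spec", {})
    containers = spec.get("containers", []) + spec.get("initContainers", [])
    return [
        c["name"]
        for c in containers
        if "ephemeral-storage" not in ((c.get("resources") or {}).get("limits") or {})
    ]


if __name__ == "__main__":
    pod = {
        "kind": "Pod",
        "spec": {
            "containers": [
                {"name": "api-server",
                 "resources": {"limits": {"ephemeral-storage": "2Gi"}}},
                {"name": "sidecar"},  # no limits at all
            ]
        },
    }
    print(missing_ephemeral_limits(pod))
```

For the sample Pod this prints ['sidecar']; a CI step can exit non-zero whenever the returned list is non-empty.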
2. Checkov scan in your pipeline:
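# Fails the pipeline if any of the selected built-in Checkov checks fail.
# Confirm what each ID covers with `checkov --list` before relying on it: in the built-in catalog CKV_K8S_11 is the CPU-limits check and CKV_K8S_20 the allowPrivilegeEscalation check, so an ephemeral-storage rule likely needs a custom policy.
checkov -d ./k8s-manifests \
  --check CKV_K8S_20 \
  --check CKV_K8S_11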
3. Prometheus alerting — fire BEFORE eviction threshold is hit:
- alert: NodeDiskPressureImminent
  expr: |
    (node_filesystem_avail_bytes{mountpoint="/"} /
     node_filesystem_size_bytes{mountpoint="/"}) < 0.15
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Node {{ $labels.instance }} disk below 15%, eviction imminent"
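A static 15% threshold tells you the node is close; it does not tell you how long you have. The Python sketch below does a simple linear extrapolation from two free-space samples, similar in spirit to PromQL's predict_linear. The function name, parameters, and sample figures are hypothetical, and real log growth is rarely linear, so treat the result as a rough lead-time estimate.

```python
def hours_until_threshold(avail_then, avail_now, hours_between,
                          capacity, threshold_percent=10):
    """Linear estimate of hours until free space drops below the threshold.

    Returns 0.0 if already below it, or None if free space is not shrinking.
    """
    floor = capacity * threshold_percent / 100
    if avail_now <= floor:
        return 0.0
    rate = (avail_then - avail_now) / hours_between  # bytes consumed per hour
    if rate <= 0:
        return None
    return (avail_now - floor) / rate


if __name__ == "__main__":
    gib = 2**30
    # Hypothetical node: 100 GiB disk, 30 GiB free two hours ago, 20 GiB free now.
    print(hours_until_threshold(30 * gib, 20 * gib, 2, 100 * gib))
```

In this example the node is losing 5 GiB per hour and has 10 GiB of headroom above the 10% threshold, so the estimate is 2.0 hours.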
4. Kubelet config drift detection with kube-bench:
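kube-bench node --check 4.2.10,4.2.11
# CIS check IDs change between benchmark versions, so confirm these IDs map to the controls you expect.
# The CIS kubelet checks focus on security settings; verify containerLogMaxSize and containerLogMaxFiles separately against the live kubelet config.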
Set containerLogMaxSize: 50Mi and containerLogMaxFiles: 3 as your org-wide baseline. Enforce it via your node bootstrap scripts (Terraform user_data, Ansible, or EKS managed node group launch templates). Any node that drifts from this config should be flagged by your configuration management pipeline before it joins the cluster.
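One way to flag drift is to compare each node's live kubelet configuration against the baseline. The kubelet exposes its effective config as JSON under a kubeletconfig key at the /configz endpoint, reachable with kubectl get --raw "/api/v1/nodes/<node-name>/proxy/configz". The Python sketch below checks that JSON; the function name and the BASELINE dict are hypothetical, and fetching the JSON is left to your pipeline.

```python
import json

# Org-wide baseline from this article; adjust to your own values.
BASELINE = {"containerLogMaxSize": "50Mi", "containerLogMaxFiles": 3}


def log_rotation_drift(configz_json: str, baseline: dict = BASELINE) -> dict:
    """Return {setting: (expected, actual)} for every setting that differs."""
    config = json.loads(configz_json).get("kubeletconfig", {})
    return {
        key: (expected, config.get(key))
        for key, expected in baseline.items()
        if config.get(key) != expected
    }


if __name__ == "__main__":
    # Hypothetical response from a node still running the kubelet defaults.
    sample = json.dumps(
        {"kubeletconfig": {"containerLogMaxSize": "10Mi", "containerLogMaxFiles": 5}}
    )
    print(log_rotation_drift(sample))
```

An empty dict means the node matches the baseline; anything else lists the expected and actual value per setting, which is enough to fail a bootstrap check or raise an alert.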