
How to Fix Kubernetes 'Volume Mount Failed: Read-Only File System' Error

Threat/Impact Level: HIGH | Exploitability/Downtime Risk: HIGH | Time to Fix: 10 mins


TL;DR

  • What broke: A container attempted a write to a volume (or its own root filesystem) that the kubelet mounted as read-only — either via an explicit readOnly: true flag, securityContext.readOnlyRootFilesystem: true, a ReadOnlyMany PVC access mode mismatch, or a node-level filesystem that remounted itself ro after an I/O error.
  • How to fix it: Audit the volumeMounts[].readOnly flag, the container-level securityContext.readOnlyRootFilesystem setting, and the PVC accessModes. If the node filesystem went ro, treat it as a likely disk failure or filesystem corruption event and cordon the node immediately.

The Incident (What Does the Error Mean?)

Raw error surface — you will see one or more of these:

# kubelet event
Warning  Failed     5s    kubelet  Error: failed to start container "app": 
  Error response from daemon: OCI runtime create failed: 
  container_linux.go:380: starting container process caused: 
  process_linux.go:545: container init caused: 
  rootfs_linux.go:76: mounting "/var/lib/kubelet/pods/.../volumes/..." 
  to rootfs at "/data" caused: 
  mount through procfd: 
  mount /var/lib/kubelet/pods/.../volumes/...:/data (via /proc/self/fd/6), 
  flags: 0x5001: read-only file system: unknown

# or inside the container at runtime
OSError: [Errno 30] Read-only file system: '/data/output.log'

# or dmesg on the node
[1234567.890] EXT4-fs error (device nvme0n1p1): ...
[1234567.891] EXT4-fs (nvme0n1p1): Remounting filesystem read-only

Immediate consequence: The container either fails to start (the pod typically cycles through RunContainerError or CrashLoopBackOff) or starts and hits EROFS on its first write, which usually terminates the workload process. Persistent queues, log pipelines, and stateful apps stay blocked until the mount is writable again.
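The three error surfaces above come from different layers, so a useful first triage step is to sort a captured log line by which one it resembles. The following is a minimal bash sketch, not a complete classifier: the function name classify_erofs and the category labels are hypothetical, and the match strings are taken only from the sample messages shown above, so real logs may need more patterns.

```shell
#!/usr/bin/env bash
# classify_erofs: map one log line to a coarse hint about where to look next.
# Function name and category labels are hypothetical, chosen for this sketch.
classify_erofs() {
  local line="$1"
  case "$line" in
    # Kernel log: the node filesystem itself went read-only
    *"Remounting filesystem read-only"*|*"EXT4-fs error"*)
      echo "node-remount-ro" ;;
    # Kubelet event: the runtime failed while setting up the mount
    *"OCI runtime create failed"*)
      echo "container-start-mount" ;;
    # Application log: the process started, then a write was refused
    *"Read-only file system"*|*"read-only file system"*)
      echo "runtime-write-denied" ;;
    *)
      echo "unrelated" ;;
  esac
}

# Example: classify_erofs "$(kubectl logs <pod-name> --tail=1)"
```

The kernel-log patterns are checked first because a node-level remount can also produce the other two messages as side effects.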


The Attack Vector / Blast Radius

There are four distinct root causes — misidentifying them wastes hours:

Root Cause | Blast Radius
volumeMounts[].readOnly: true set by mistake | Single container, single volume. Fast fix.
securityContext.readOnlyRootFilesystem: true | Entire container rootfs is read-only. Any write (temp files, PID files, log rotation) fails.
PVC accessModes: [ReadOnlyMany] when ReadWriteOnce is needed | All pods mounting that PVC are blocked from writing.
Node-level ext4/xfs remount-ro (disk I/O error) | Node is dying. All pods on that node are affected. Data loss risk is real. This is your 3am page.

The node-level scenario is the dangerous one. When a degraded NVMe or EBS volume returns I/O errors, the kernel can remount the filesystem read-only to limit corruption (for ext4 this is the errors=remount-ro behavior). Once that happens the kubelet itself cannot write state. Cordon and drain the node before anything else.
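To judge whether a node has hit the remount-ro case, it can help to count filesystem error lines per device in a saved kernel log. This is a minimal sketch, assuming ext4 messages in the format shown in the dmesg sample earlier; the function name count_ext4_errors is hypothetical, and XFS messages are not handled.

```shell
#!/usr/bin/env bash
# count_ext4_errors: read kernel log lines on stdin and print "<device> <count>"
# for every device that logged at least one "EXT4-fs error" line.
# Hypothetical helper; assumes the "(device NAME)" format shown in the sample log.
count_ext4_errors() {
  awk '
    /EXT4-fs error \(device / {
      dev = $0
      sub(/.*EXT4-fs error \(device /, "", dev)  # strip everything before the device name
      sub(/\).*/, "", dev)                       # strip the closing paren and the rest
      count[dev]++
    }
    END { for (d in count) print d, count[d] }
  ' | sort
}

# Example on a node: sudo dmesg | count_ext4_errors
```

A device that appears here is a reason to cordon the node, not proof of how much data is damaged.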


How to Fix It (The Solution)

Diagnosis Checklist — Run These First

# 1. Get the exact event
kubectl describe pod <pod-name> -n <namespace> | grep -A 20 "Events:"

# 2. Check the node the pod landed on
kubectl get pod <pod-name> -n <namespace> -o wide

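# 3. Check node conditions. DiskPressure means low disk space, not a read-only disk;
#    a ReadonlyFilesystem condition is reported only if node-problem-detector is installed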
kubectl describe node <node-name> | grep -A 10 "Conditions:"

# 4. SSH to node and check dmesg for remount-ro events
sudo dmesg | grep -i "read-only\|remounting\|EXT4-fs error\|XFS.*error"

# 5. Verify actual mount flags on the node
grep <volume-path> /proc/mounts
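For step 5, a plain text search for "ro" can mislead, because an option such as errors=remount-ro contains the same letters on a mount that is still writable. A minimal sketch that checks the options field exactly; the function name list_ro_mounts is hypothetical.

```shell
#!/usr/bin/env bash
# list_ro_mounts: read /proc/mounts-style lines on stdin and print
# "<mountpoint> <fstype>" for each entry whose option list contains exactly "ro".
# Hypothetical helper name; the field layout is the standard /proc/mounts format.
list_ro_mounts() {
  awk '{
    n = split($4, opts, ",")          # field 4 is the comma-separated option list
    for (i = 1; i <= n; i++)
      if (opts[i] == "ro") { print $2, $3; break }
  }'
}

# Example on a node: list_ro_mounts < /proc/mounts
```

Some read-only entries are expected on a healthy node (for example ConfigMap or Secret mounts), so the output is a list to review rather than a list of faults.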

Fix 1 — Remove Erroneous readOnly: true on volumeMount

apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: myapp:1.0
    volumeMounts:
    - name: data-vol
      mountPath: /data
-     readOnly: true
+     readOnly: false   # or remove the line entirely; default is false
  volumes:
  - name: data-vol
    persistentVolumeClaim:
      claimName: app-pvc

Fix 2 — readOnlyRootFilesystem: true with Writable EmptyDir Overlays (Enterprise Best Practice)

readOnlyRootFilesystem: true is a correct security hardening control — do not simply disable it. Instead, mount emptyDir volumes over the specific paths your app needs to write.

apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
  containers:
  - name: app
    image: myapp:1.0
    securityContext:
-     readOnlyRootFilesystem: false  # DO NOT do this — removes hardening
+     readOnlyRootFilesystem: true   # Keep this. Add targeted writable mounts instead.
    volumeMounts:
    - name: data-pvc
      mountPath: /data
+   - name: tmp-dir
+     mountPath: /tmp              # app writes temp files here
+   - name: run-dir
+     mountPath: /var/run          # PID files, sockets
+   - name: log-dir
+     mountPath: /var/log/app      # log rotation target
  volumes:
  - name: data-pvc
    persistentVolumeClaim:
      claimName: app-pvc
+ - name: tmp-dir
+   emptyDir: {}
+ - name: run-dir
+   emptyDir: {}
+ - name: log-dir
+   emptyDir:
+     sizeLimit: 500Mi
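Before deciding which paths need an emptyDir overlay, it can help to probe the candidate directories from inside the running container. A minimal sketch, assuming the image ships a shell; the function name probe_writable and the probe file name are hypothetical.

```shell
#!/usr/bin/env bash
# probe_writable: for each directory argument, try to create and remove a
# small probe file, then report "writable <dir>" or "not-writable <dir>".
# Hypothetical helper; a missing directory is also reported as not-writable.
probe_writable() {
  local dir probe
  for dir in "$@"; do
    probe="$dir/.write-probe.$$"
    # The subshell keeps a failed redirection from aborting the caller
    if ( : > "$probe" ) 2>/dev/null; then
      rm -f "$probe"
      echo "writable $dir"
    else
      echo "not-writable $dir"
    fi
  done
}

# Example inside the container: probe_writable /tmp /var/run /var/log/app /data
```

Any path the app writes to that reports not-writable is a candidate for one of the targeted mounts in the spec above.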

Fix 3 — PVC Access Mode Mismatch

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-pvc
spec:
  accessModes:
-   - ReadOnlyMany   # Wrong if the app needs to write
+   - ReadWriteOnce  # For single-node write workloads
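# Use ReadWriteMany only if your CSI driver supports it (EFS, NFS, Longhorn)
# accessModes cannot be edited on an existing PVC: back up the data, then delete and recreate the claim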
  resources:
    requests:
      storage: 10Gi
  storageClassName: gp3
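A quick pre-deploy check can confirm that a claim requests at least one write-capable access mode. The sketch below is hedged: the function name pvc_allows_write is hypothetical, and access modes are mainly used to match claims to volumes, so whether a read-only mode is actually enforced at mount time depends on the storage driver.

```shell
#!/usr/bin/env bash
# pvc_allows_write: exit 0 if any access mode passed as an argument permits
# writes, exit 1 otherwise. Hypothetical helper for illustration.
pvc_allows_write() {
  local mode
  for mode in "$@"; do
    case "$mode" in
      ReadWriteOnce|ReadWriteOncePod|ReadWriteMany) return 0 ;;
    esac
  done
  return 1
}

# Example (claim name taken from the manifest above, namespace is a placeholder):
#   modes=$(kubectl get pvc app-pvc -n <namespace> -o jsonpath='{.spec.accessModes[*]}')
#   pvc_allows_write $modes || echo "app-pvc requests no write-capable access mode"
```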

Fix 4 — Node-Level Remount-ro (Emergency Response)

# IMMEDIATE: Cordon the node — stop scheduling new pods
kubectl cordon <node-name>

# Drain with grace period — evict existing pods
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data --grace-period=60

# On the node itself: attempt remount rw ONLY if you have confirmed no corruption
# and this is a transient I/O blip (e.g., EBS multi-attach timeout)
sudo mount -o remount,rw /

# Verify: the options field for / should now start with rw
grep ' / ' /proc/mounts

# If EBS: never fsck a mounted filesystem. Detach the volume, attach it to a rescue
# instance, and run fsck there while it is unmounted (device name below is an example)
sudo fsck.ext4 -y /dev/nvme0n1p1

# If corruption is confirmed: replace the node, do NOT remount rw
# Terminate the instance and let the ASG/node group provision a clean replacement



Prevention in CI/CD

1. OPA/Gatekeeper Policy — Enforce emptyDir overlays when readOnlyRootFilesystem is true

# opa/policies/readonly-rootfs-requires-tmp-mount.rego
package kubernetes.admission

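# Gatekeeper ConstraintTemplates require a rule named violation that returns an object with a msg field
violation[{"msg": msg}] {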
  container := input.review.object.spec.containers[_]
  container.securityContext.readOnlyRootFilesystem == true
  not has_emptydir_for_tmp(input.review.object.spec, container)
  msg := sprintf("Container '%v' has readOnlyRootFilesystem but no emptyDir for /tmp. App will crash on first write.", [container.name])
}

has_emptydir_for_tmp(spec, container) {
  mount := container.volumeMounts[_]
  mount.mountPath == "/tmp"
  vol := spec.volumes[_]
  vol.name == mount.name
  vol.emptyDir
}

2. Checkov — Scan for readOnly misconfiguration in IaC

# Install
pip install checkov

# Scan your Helm-rendered manifests or raw YAML
checkov -d ./k8s-manifests --framework kubernetes \
  --check CKV_K8S_22  # readOnlyRootFilesystem check

# In CI (GitHub Actions)
- name: Checkov Kubernetes Scan
  uses: bridgecrewio/checkov-action@master
  with:
    directory: k8s-manifests/
    framework: kubernetes
    soft_fail: false

3. Kyverno Policy — Block PVCs with ReadOnlyMany for stateful workloads

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: block-readonly-pvc-for-stateful
spec:
  validationFailureAction: Enforce
  rules:
  - name: check-pvc-access-mode
    match:
      resources:
        kinds: [StatefulSet]
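    validate:
      message: "StatefulSet volumeClaimTemplates must not use ReadOnlyMany."
      deny:
        conditions:
          any:
          # accessModes is a list, so test membership instead of pattern-matching a scalar
          - key: "{{ request.object.spec.volumeClaimTemplates[].spec.accessModes[] || `[]` }}"
            operator: AnyIn
            value:
            - ReadOnlyMany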

4. Node Health Monitoring — Alert as soon as remount-ro happens

# Prometheus alerting rule
groups:
- name: node-disk-health
  rules:
  - alert: NodeFilesystemReadOnly
    expr: node_filesystem_readonly{fstype!~"tmpfs|overlay"} == 1
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Node {{ $labels.instance }} filesystem {{ $labels.mountpoint }} is READ-ONLY"
      description: "Kernel remounted filesystem ro due to I/O errors. Cordon immediately."

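Install node_exporter on all nodes. node_filesystem_readonly reports 1 once a filesystem is mounted read-only, so this rule fires after the remount has happened, but often before every pod on the node has failed. Intentionally read-only mounts also match, so extend the fstype exclusion to fit your nodes.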
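The same check the alert performs can be run by hand against a node's metrics output, which is useful when validating the rule. A minimal sketch, assuming label values contain no spaces; the function name readonly_from_metrics is hypothetical.

```shell
#!/usr/bin/env bash
# readonly_from_metrics: read Prometheus text-format metrics on stdin and print
# the mountpoint of every node_filesystem_readonly sample whose value is 1,
# skipping tmpfs and overlay to mirror the alert expression above.
# Hypothetical helper for illustration.
readonly_from_metrics() {
  grep '^node_filesystem_readonly{' \
    | grep -Ev 'fstype="(tmpfs|overlay)"' \
    | awk '$NF == 1' \
    | sed -E 's/.*mountpoint="([^"]*)".*/\1/'
}

# Example, assuming node_exporter listens on its default port 9100:
#   curl -s http://<node-ip>:9100/metrics | readonly_from_metrics
```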
