How to Fix PVC Resize Stuck in FileSystemResizePending: Volume Expansion Requires Pod Restart
Threat/Impact Level: HIGH | Exploitability/Downtime Risk: HIGH | Time to Fix: 10–20 mins
TL;DR
- What broke: Kubernetes expanded the underlying block volume at the storage layer, but the filesystem inside the PVC was never resized because the pod consuming it was never restarted. The node-level resize agent (kubelet) only triggers resize2fs/xfs_growfs when the volume is remounted.
- How to fix it: Delete and reschedule the consuming pod (or perform a rolling restart of the workload) so kubelet remounts the volume and completes the filesystem expansion.
- Use our Client-Side Sandbox above to paste your PVC YAML and StorageClass manifest. It will auto-detect the missing allowVolumeExpansion: true flag and generate the corrected rollout patch.
The Incident (What Does the Error Mean?)
You ran kubectl get pvc and saw this:
NAME         STATUS   VOLUME         CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-pvc-0   Bound    pvc-a1b2c3d4   20Gi       RWO            gp2            14d
But you patched it to 50Gi. Now kubectl get pvc still reports 20Gi, and kubectl describe pvc shows a pending condition:
NAME         STATUS   VOLUME         CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-pvc-0   Bound    pvc-a1b2c3d4   20Gi       RWO            gp2            14d
Conditions:
  Type                      Status   Message
  FileSystemResizePending   True     Waiting for user to (re)start a pod to finish file system resize of volume on node.
Immediate consequence: The block device is already 50Gi at the cloud provider level (EBS, GCE PD, Azure Disk), but the filesystem the pod sees is still 20Gi. Any write that needs space beyond the original 20Gi fails, even though the disk behind it has room. Stateful workloads such as Postgres, Kafka, and Elasticsearch will crash with ENOSPC or hang on write.
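To make the gap between the requested and the usable size concrete, here is a minimal Python sketch that reads a PVC object (the dict you would get from kubectl get pvc -o json) and reports how many bytes were requested but are not yet reflected in status.capacity. The helper names parse_quantity and resize_gap are illustrative, and the parser only handles plain integer quantities such as 20Gi, not the full Kubernetes quantity grammar.

```python
import re

# Binary and decimal suffixes used by Kubernetes storage quantities.
_SUFFIXES = {
    "": 1,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50, "Ei": 2**60,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12, "P": 10**15, "E": 10**18,
}


def parse_quantity(quantity: str) -> int:
    """Convert an integer quantity such as '20Gi' to bytes."""
    match = re.fullmatch(r"(\d+)([A-Za-z]*)", quantity.strip())
    if not match or match.group(2) not in _SUFFIXES:
        raise ValueError(f"unsupported quantity: {quantity!r}")
    return int(match.group(1)) * _SUFFIXES[match.group(2)]


def resize_gap(pvc: dict) -> int:
    """Bytes requested in spec that status.capacity does not show yet."""
    requested = parse_quantity(pvc["spec"]["resources"]["requests"]["storage"])
    actual = parse_quantity(pvc["status"]["capacity"]["storage"])
    return max(0, requested - actual)


# The PVC from the incident above: 50Gi requested, 20Gi visible to the pod.
pvc = {
    "metadata": {"name": "data-pvc-0"},
    "spec": {"resources": {"requests": {"storage": "50Gi"}}},
    "status": {"capacity": {"storage": "20Gi"}},
}
print(resize_gap(pvc) // 2**30)  # 30
```

A non-zero gap together with the FileSystemResizePending condition is the signature of this incident: the request was accepted, but the pod cannot use the extra space yet.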
The Attack Vector / Blast Radius
This is a two-phase resize and Kubernetes only completed phase 1.
Phase 1 (control plane / cloud API): the external-resizer CSI sidecar calls the driver's ControllerExpandVolume, which in turn calls the cloud API (ModifyVolume on EBS, etc.). Block device is now 50Gi. PV object is updated. ✅ Done.
Phase 2 (node / filesystem): kubelet must detect that the mounted volume has a larger block device, then run resize2fs (ext4) or xfs_growfs (XFS) on the node; for CSI volumes this goes through the driver's NodeExpandVolume call. When online expansion is not available for the volume, this only happens on pod restart / remount. If the pod is still running, kubelet never gets the trigger.
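The two phases map onto the two resize-related condition types a PVC can carry, Resizing and FileSystemResizePending. The sketch below classifies a PVC object by those conditions; the function name resize_phase and the returned labels are made up for illustration, and the input is assumed to be the parsed JSON of a single PVC.

```python
def resize_phase(pvc: dict) -> str:
    """Classify where a PVC is in the two-phase resize, based on its conditions."""
    conditions = {
        c.get("type"): c.get("status")
        for c in (pvc.get("status", {}).get("conditions") or [])
    }
    if conditions.get("FileSystemResizePending") == "True":
        # Phase 1 finished (block device grown); the node-side filesystem resize has not run.
        return "filesystem-pending"
    if conditions.get("Resizing") == "True":
        # Phase 1 is still running in the control plane.
        return "controller-resizing"
    return "idle"


stuck = {
    "status": {
        "conditions": [
            {"type": "FileSystemResizePending", "status": "True"},
        ]
    }
}
print(resize_phase(stuck))  # filesystem-pending
```

A PVC that stays in filesystem-pending is the case this article is about; controller-resizing usually clears on its own once the cloud API call completes.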
Blast radius:
- StatefulSets with volumeClaimTemplates will have every replica in this state: one bad rollout and all replicas are capacity-starved simultaneously.
- If the StorageClass has allowVolumeExpansion: false (or leaves it unset), the resize never starts. On a default cluster the PersistentVolumeClaimResize admission plugin rejects the PVC patch at the API server, so the PVC keeps its old request and must be patched again once the StorageClass is fixed.
- On ReadWriteOnce volumes, the block device is node-locked. A pod rescheduled to a different node will force a detach/reattach, which can take 6–10 minutes on AWS EBS (the multi-attach error window).
How to Fix It
Step 0 — Verify StorageClass has expansion enabled
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2
provisioner: ebs.csi.aws.com
parameters:
  type: gp2
-# allowVolumeExpansion not set (defaults to false)
+allowVolumeExpansion: true
If this flag is missing, patch it:
kubectl patch storageclass gp2 -p '{"allowVolumeExpansion": true}'
⚠️ Patching an existing StorageClass only affects future resize requests. If the PVC resize was already rejected, re-apply the PVC patch after fixing the StorageClass.
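The Step 0 check can also be scripted. The following is a small sketch, not a definitive tool: it takes a PVC object and a name-to-object map of StorageClasses (both as parsed JSON from kubectl) and answers whether the PVC's class explicitly allows expansion. The function names expansion_allowed and can_resize are illustrative.

```python
def expansion_allowed(storage_class: dict) -> bool:
    """True only when the StorageClass explicitly sets allowVolumeExpansion: true."""
    return storage_class.get("allowVolumeExpansion") is True


def can_resize(pvc: dict, storage_classes: dict) -> bool:
    """Look up the PVC's StorageClass by name and check its expansion flag."""
    sc_name = pvc["spec"].get("storageClassName")
    storage_class = storage_classes.get(sc_name)
    return storage_class is not None and expansion_allowed(storage_class)


# The gp2 class from Step 0, before and after the fix.
gp2 = {"kind": "StorageClass", "metadata": {"name": "gp2"}, "provisioner": "ebs.csi.aws.com"}
pvc = {"metadata": {"name": "data-pvc-0"}, "spec": {"storageClassName": "gp2"}}

print(can_resize(pvc, {"gp2": gp2}))  # False: flag not set
gp2["allowVolumeExpansion"] = True
print(can_resize(pvc, {"gp2": gp2}))  # True
```

Treating a missing flag the same as false mirrors the default noted in the manifest above.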
Basic Fix — Force pod remount by restarting the workload
For a Deployment:
kubectl rollout restart deployment/<your-deployment> -n <namespace>
For a StatefulSet (do this carefully — one pod at a time):
kubectl rollout restart statefulset/<your-statefulset> -n <namespace>
Verify kubelet completed phase 2:
kubectl describe pvc data-pvc-0 -n <namespace>
# Conditions block should be empty (FileSystemResizePending gone)
kubectl exec -it <pod> -n <namespace> -- df -h /data
# Should now show 50G
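If you would rather wait for phase 2 in a script than re-run describe by hand, a polling loop along these lines is one option. This is a sketch under stated assumptions: wait_for_fs_resize, fs_resize_done, and kubectl_fetch are hypothetical helper names, kubectl is assumed to be on PATH with access to the cluster, and the comparison assumes the request and the reported capacity are written with the same unit (for example both as 50Gi).

```python
import json
import subprocess
import time
from typing import Callable


def fs_resize_done(pvc: dict) -> bool:
    """True when no FileSystemResizePending condition remains and capacity matches the request."""
    status = pvc.get("status", {})
    pending = any(
        c.get("type") == "FileSystemResizePending" and c.get("status") == "True"
        for c in (status.get("conditions") or [])
    )
    requested = pvc["spec"]["resources"]["requests"]["storage"]
    capacity = (status.get("capacity") or {}).get("storage")
    # Assumption: plain string comparison, so "50Gi" vs "51200Mi" would not match.
    return not pending and capacity == requested


def wait_for_fs_resize(
    fetch_pvc: Callable[[], dict],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll fetch_pvc until the resize is complete; return False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        if fs_resize_done(fetch_pvc()):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


def kubectl_fetch(name: str, namespace: str) -> Callable[[], dict]:
    """Build a fetcher that shells out to kubectl for one PVC."""
    def fetch() -> dict:
        out = subprocess.check_output(
            ["kubectl", "get", "pvc", name, "-n", namespace, "-o", "json"]
        )
        return json.loads(out)
    return fetch
```

A call such as wait_for_fs_resize(kubectl_fetch("data-pvc-0", "<namespace>")) would block for up to five minutes after the pod restart. Passing the fetcher in as a parameter keeps the loop testable without a cluster.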
Enterprise Best Practice — Zero-downtime resize for StatefulSets
For production StatefulSets where you cannot afford a full rollout restart:
# 1. Optional: cordon the node the pod is on. A cordoned node is unschedulable,
#    so the replacement pod will start on a different node and the volume will
#    detach and reattach. Skip this step to let the pod return to the same node.
kubectl cordon <node-name>
# 2. Delete only the specific pod; the StatefulSet controller recreates it
kubectl delete pod <statefulset-name>-0 -n <namespace>
# 3. Watch kubelet complete fs resize on pod startup
kubectl get events -n <namespace> --field-selector reason=FileSystemResizeSuccessful -w
# 4. Uncordon (only if you cordoned in step 1)
kubectl uncordon <node-name>
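For a StatefulSet with several replicas, the per-pod delete can be repeated one pod at a time. The sketch below only builds the ordered list of kubectl delete commands and does not run them; the function names are illustrative, the highest-ordinal-first order simply copies what a StatefulSet rolling update does, and waiting for each pod to become Ready before moving on is left to the caller.

```python
from typing import List


def restart_order(statefulset: str, replicas: int) -> List[str]:
    """Pod names of a StatefulSet, highest ordinal first."""
    return [f"{statefulset}-{ordinal}" for ordinal in range(replicas - 1, -1, -1)]


def delete_commands(statefulset: str, replicas: int, namespace: str) -> List[List[str]]:
    """One 'kubectl delete pod' argv per replica, in restart order."""
    return [
        ["kubectl", "delete", "pod", pod, "-n", namespace]
        for pod in restart_order(statefulset, replicas)
    ]


for argv in delete_commands("data", 3, "prod"):
    print(" ".join(argv))
# kubectl delete pod data-2 -n prod
# kubectl delete pod data-1 -n prod
# kubectl delete pod data-0 -n prod
```

Here data and prod are placeholder names. Run each command only after the previous pod is back and its PVC no longer reports FileSystemResizePending, so that at most one replica is down at any time.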
For CSI drivers that support online filesystem resize (e.g., ebs.csi.aws.com >= v1.11):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-online-resize
+  # CSI driver annotation to attempt online resize without pod restart
+  # Only works if kernel supports it AND volume is ext4/xfs on Linux >= 5.4
+  # Verify this annotation key against your driver version's documentation before relying on it
+  annotations:
+    ebs.csi.aws.com/fs-resize-online: "true"
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
allowVolumeExpansion: true
Note: Online resize without pod restart is driver and kernel version dependent. Do not rely on it for critical workloads without testing.
Prevention in CI/CD
1. OPA/Gatekeeper — Enforce allowVolumeExpansion: true on all StorageClasses
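package storageclass

# Written for plain OPA / conftest, where input is the manifest itself.
# In a Gatekeeper ConstraintTemplate the object is at input.review.object
# and the rule must be named violation[{"msg": msg}].
deny[msg] {
    input.kind == "StorageClass"
    not input.allowVolumeExpansion == true
    msg := sprintf("StorageClass '%v' must have allowVolumeExpansion: true", [input.metadata.name])
}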
2. Checkov — Block non-expandable StorageClass in Terraform
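CKV_K8S_STORAGE_ALLOW_EXPANSION does not follow the numeric CKV_K8S_<n> scheme of Checkov's built-in Kubernetes checks, so treat it as a custom check that you define yourself and load with --external-checks-dir:
checkov -d ./terraform --external-checks-dir <your-checks-dir> --check CKV_K8S_STORAGE_ALLOW_EXPANSION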
If using the Kubernetes Terraform provider:
resource "kubernetes_storage_class" "gp3" {
  metadata {
    name = "gp3"
  }
  storage_provisioner = "ebs.csi.aws.com"
+  allow_volume_expansion = true
  parameters = {
    type = "gp3"
  }
}
3. Helm pre-upgrade hook — Validate PVC resize will succeed before rollout
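apiVersion: batch/v1
kind: Job
metadata:
  name: pvc-resize-preflight
  annotations:
    "helm.sh/hook": pre-upgrade
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  template:
    spec:
      # A Job pod must use Never or OnFailure; the default (Always) is rejected.
      restartPolicy: Never
      # The pod's service account needs RBAC permission to get PVCs and StorageClasses.
      containers:
        - name: checker
          image: bitnami/kubectl:latest
          env:
            - name: PVC_NAME
              value: data-pvc-0  # PVC to check; adjust per release
          command:
            - /bin/sh
            - -c
            - |
              SC=$(kubectl get pvc "$PVC_NAME" -o jsonpath='{.spec.storageClassName}')
              EXPAND=$(kubectl get sc "$SC" -o jsonpath='{.allowVolumeExpansion}')
              [ "$EXPAND" = "true" ] || { echo "StorageClass $SC does not allow expansion. Aborting."; exit 1; }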
4. AlertManager rule — Alert on FileSystemResizePending before it causes ENOSPC
- alert: PVCFilesystemResizePending
  expr: kube_persistentvolumeclaim_status_condition{condition="FileSystemResizePending",status="true"} == 1
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "PVC {{ $labels.persistentvolumeclaim }} filesystem resize pending: pod restart required"