How to Fix ContainerCreating Stuck: Timeout Expired Waiting for Volumes to Attach or Mount in Kubernetes

Threat/Impact Level: HIGH | Exploitability/Downtime Risk: HIGH | Time to Fix: 15–30 mins

TL;DR

  • What broke: The kubelet on the target node could not attach or mount the PersistentVolume within the CSI/in-tree driver's timeout, leaving the pod indefinitely in ContainerCreating.
  • How to fix it: Force-delete stale VolumeAttachment objects, verify AZ/node affinity alignment between PV and node, and confirm the CSI driver pod is healthy on the affected node.

The Incident (What Does the Error Mean?)

Raw event output from kubectl describe pod <pod-name>:

Warning  FailedMount  3m    kubelet  Unable to attach or mount volumes: 
  unmounted volumes=[data-vol], unattached volumes=[data-vol kube-api-access-xxxxx]: 
  timed out waiting for the condition

Warning  FailedAttachVolume  3m  attachdetach-controller  
  AttachVolume.Attach failed for volume "pvc-a1b2c3d4" : 
  context deadline exceeded

Immediate consequence: The pod never exits ContainerCreating. All containers inside are blocked. If this is a Deployment, the rollout stalls. If it's a StatefulSet using the default OrderedReady pod management policy, the entire ordinal sequence halts: every subsequent pod waits behind this one.


The Attack Vector / Blast Radius

This is not a soft degradation. It is a hard availability failure with the following cascade:

  1. StatefulSet deadlock: Pod-0 stuck → Pod-1 never created (under the default OrderedReady policy) → quorum-dependent workloads (Kafka, etcd, Postgres replicas) lose a member and may lose write quorum entirely.
  2. Node drain amplification: If the volume was previously attached to a node that was drained or terminated without a clean detach, the cloud provider's block device (EBS, GCE PD, Azure Disk) is still marked in-use. The new node cannot attach it. The VolumeAttachment object in the Kubernetes API is left pointing at the old node and often does not clear quickly enough without intervention.
  3. AZ mismatch silent killer: A zonal ReadWriteOnce PV provisioned in us-east-1a cannot attach to a node in us-east-1b. The scheduler keeps a pod in the right zone only when the PV carries correct nodeAffinity or zone labels, and with volumeBindingMode: Immediate the zone is chosen before the pod is scheduled, with no regard for where the pod can run. Setting volumeBindingMode: WaitForFirstConsumer on the StorageClass closes that gap. This is one of the most common root causes in auto-scaling clusters.
  4. CSI driver pod eviction: If the csi-node DaemonSet pod was evicted or OOMKilled on the target node, every mount request on that node silently times out. No obvious error surfaces until you check the CSI pod logs directly.
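
As a rough first pass, the two event reasons shown in the raw output earlier can be used to decide where to look first. The sketch below is a heuristic, not a diagnosis: the function name `triage` and the hint wording are illustrative assumptions, and it only reads the reason column of Warning events pasted from kubectl describe pod.

```python
import re

# Matches lines such as "Warning  FailedMount  3m  kubelet  ..." and captures the reason.
WARNING_RE = re.compile(r"^\s*Warning\s+(\S+)")


def triage(describe_output: str) -> list[str]:
    """Map Warning event reasons to a first place to look (heuristic only)."""
    reasons = set()
    for line in describe_output.splitlines():
        match = WARNING_RE.match(line)
        if match:
            reasons.add(match.group(1))

    hints = []
    if "FailedAttachVolume" in reasons:
        # The attach step itself failed: stale attachment or zone mismatch are likely.
        hints.append("attach failed: inspect VolumeAttachment objects and PV/node zones")
    elif "FailedMount" in reasons:
        # Only the kubelet complained: the node-side CSI driver is the usual suspect.
        hints.append("mount timed out: inspect the CSI node pod on the affected node")
    return hints
```

If both reasons appear, the attach failure is reported first, since a volume that never attached cannot be mounted.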

How to Fix It

Step 1: Pinpoint the root cause

# Get the full event stream
kubectl describe pod <pod-name> -n <namespace>

# Check VolumeAttachment objects for orphans
kubectl get volumeattachment
kubectl describe volumeattachment <va-name>

# Check CSI node driver health on the affected node
kubectl get pods -n kube-system -o wide | grep csi
kubectl logs -n kube-system <csi-node-pod> -c <driver-container> --tail=100

# Verify the PV's zone constraint (nodeAffinity) against the node's zone labels
kubectl get pv <pv-name> -o yaml | grep -A10 'nodeAffinity'
kubectl get node <node-name> -o yaml | grep topology
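
On a busy cluster, scanning the VolumeAttachment list by eye is error-prone. The following is a minimal sketch, assuming you have saved the output of `kubectl get volumeattachment -o json` and `kubectl get nodes -o json` and loaded each with `json.load`; the function name `orphaned_attachments` is hypothetical. It reports attachments whose `spec.nodeName` no longer matches any node in the cluster.

```python
def orphaned_attachments(va_list: dict, node_list: dict) -> list[tuple[str, str, str]]:
    """Return (attachment name, PV name, missing node) for each attachment
    that references a node absent from the current node list."""
    live_nodes = {node["metadata"]["name"] for node in node_list.get("items", [])}

    orphans = []
    for va in va_list.get("items", []):
        node_name = va["spec"]["nodeName"]
        if node_name not in live_nodes:
            # spec.source.persistentVolumeName is unset for inline volumes.
            pv_name = va["spec"].get("source", {}).get("persistentVolumeName", "")
            orphans.append((va["metadata"]["name"], pv_name, node_name))
    return orphans
```

An attachment listed here is only a candidate for cleanup; confirm the node is really gone before deleting anything.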

Basic Fix: Force-delete the stale VolumeAttachment

When a node was terminated without cleanly detaching the volume, the VolumeAttachment object persists. Delete it; the attach-detach controller then creates a fresh attachment for the node where the pod is now scheduled.

# Identify the stale attachment
kubectl get volumeattachment | grep <pv-name>

# Force delete (safe only after confirming the old node is gone)
kubectl delete volumeattachment <va-name>

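If the object is stuck in Terminating due to a finalizer, clear the finalizer as a last resort. This skips the driver's own detach step, so first confirm the disk is actually detached at the cloud provider: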

kubectl patch volumeattachment <va-name> \
  -p '{"metadata":{"finalizers":null}}' \
  --type=merge

Enterprise Best Practice: Fix the StorageClass to prevent AZ mismatch

The root architectural fix is enforcing WaitForFirstConsumer binding mode so the PV is always provisioned in the same AZ as the scheduled pod.

 apiVersion: storage.k8s.io/v1
 kind: StorageClass
 metadata:
   name: gp3-encrypted
 provisioner: ebs.csi.aws.com
 parameters:
   type: gp3
   encrypted: "true"
-volumeBindingMode: Immediate
+volumeBindingMode: WaitForFirstConsumer
+allowVolumeExpansion: true
 reclaimPolicy: Retain

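Why this matters: Immediate binding provisions the EBS volume the moment the PVC is created, before any pod is scheduled, so the AZ is chosen without regard to where the pod can run. WaitForFirstConsumer delays provisioning until the scheduler picks a node, then provisions the volume in that node's AZ. This removes cross-AZ attach failures for newly provisioned volumes; existing PVs stay in their original AZ. Note that volumeBindingMode cannot be changed on an existing StorageClass, so applying this diff means deleting and recreating the StorageClass (PVs already provisioned from it are unaffected).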
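
For volumes that already exist, it can help to check whether a given node is even eligible for a given PV. The sketch below assumes the PV and node have been fetched with `kubectl get ... -o json`; the function name `pv_fits_node` is hypothetical, and only the `In` operator is evaluated (any other operator is conservatively treated as a non-match).

```python
def pv_fits_node(pv: dict, node: dict) -> bool:
    """Return True if the node's labels satisfy the PV's required nodeAffinity."""
    labels = node["metadata"].get("labels", {})
    terms = (
        pv["spec"]
        .get("nodeAffinity", {})
        .get("required", {})
        .get("nodeSelectorTerms", [])
    )
    if not terms:
        return True  # the PV records no topology constraint

    # nodeSelectorTerms are ORed; matchExpressions within one term are ANDed.
    for term in terms:
        expressions = term.get("matchExpressions", [])
        if expressions and all(
            expr.get("operator") == "In"
            and labels.get(expr["key"]) in expr.get("values", [])
            for expr in expressions
        ):
            return True
    return False
```

A False result for the node the pod landed on points to the AZ mismatch case rather than a stale attachment.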


Enterprise Best Practice: StatefulSet topology spread

For StatefulSets that must survive node failure without multi-attach errors, spread replicas evenly across AZs using topologySpreadConstraints in the pod template:

 spec:
   replicas: 3
   template:
     spec:
+      topologySpreadConstraints:
+      - maxSkew: 1
+        topologyKey: topology.kubernetes.io/zone
+        whenUnsatisfiable: DoNotSchedule
+        labelSelector:
+          matchLabels:
+            app: my-stateful-app
       containers:
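
To see what maxSkew: 1 with DoNotSchedule means in practice, here is a simplified model of the check for a single constraint. It is a sketch only: the function name `placement_allowed` is hypothetical, and it ignores everything else the real scheduler considers (node filtering, minDomains, other constraints).

```python
def placement_allowed(pods_per_zone: dict[str, int], zone: str, max_skew: int = 1) -> bool:
    """Simplified DoNotSchedule check: would placing one more matching pod
    in `zone` keep the skew within max_skew?"""
    # Skew is the candidate zone's count (including the new pod)
    # minus the smallest count across all zones before placement.
    global_min = min(pods_per_zone.values())
    skew = pods_per_zone[zone] + 1 - global_min
    return skew <= max_skew
```

With one replica already in zone a and none in b or c, a second replica is rejected in a but accepted in b or c, which is how three replicas end up one per zone.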


Prevention in CI/CD

1. Enforce WaitForFirstConsumer with OPA/Gatekeeper

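package k8sstorage

# Gatekeeper ConstraintTemplates expect a `violation` rule rather than `deny`.
violation[{"msg": msg}] {
  input.review.object.kind == "StorageClass"
  # `not ... ==` also fires when the field is omitted (the API default is Immediate).
  not input.review.object.volumeBindingMode == "WaitForFirstConsumer"
  msg := sprintf(
    "StorageClass '%v' must use volumeBindingMode: WaitForFirstConsumer to prevent AZ attach failures.",
    [input.review.object.metadata.name]
  )
}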

Deploy as a ConstraintTemplate + K8sStorageBindingMode constraint. This blocks any StorageClass using Immediate from being applied to the cluster.
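
The same rule can also run earlier, as a pre-merge lint, so a bad StorageClass is caught before it ever reaches the admission controller. This is a minimal sketch, assuming manifests have already been rendered to JSON and loaded as Python dicts; the function name `binding_mode_violations` is hypothetical.

```python
def binding_mode_violations(objects: list[dict]) -> list[str]:
    """Return one message per StorageClass that does not use WaitForFirstConsumer."""
    messages = []
    for obj in objects:
        if obj.get("kind") != "StorageClass":
            continue
        # An omitted volumeBindingMode defaults to Immediate in the API.
        mode = obj.get("volumeBindingMode", "Immediate")
        if mode != "WaitForFirstConsumer":
            name = obj.get("metadata", {}).get("name", "<unnamed>")
            messages.append(
                f"StorageClass '{name}' uses volumeBindingMode: {mode}"
            )
    return messages
```

A CI step can fail the build whenever the returned list is non-empty.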

2. Checkov static scan in your Terraform pipeline

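If you provision StorageClasses or EBS volumes via Terraform:

checkov -d ./terraform --external-checks-dir ./custom_checks

Checkov's built-in CKV_K8S_28 covers the NET_RAW container capability, not storage, and we are not aware of a built-in check for binding mode. Write a custom check (Python or YAML) that requires volume_binding_mode = "WaitForFirstConsumer" on kubernetes_storage_class resources and load it with --external-checks-dir; ./custom_checks is a placeholder path.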

3. Alert on stuck ContainerCreating in your monitoring stack

# Prometheus alerting rule
- alert: PodStuckContainerCreating
  expr: |
    kube_pod_container_status_waiting_reason{reason="ContainerCreating"} == 1
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: "Pod {{ $labels.pod }} stuck in ContainerCreating for >5m"
    runbook_url: "https://your-wiki/runbooks/volume-attach-timeout"
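
Where Prometheus is not available, a similar check can be run ad hoc. The sketch below assumes the output of `kubectl get pods -A -o json` loaded as a dict; the function name `stuck_pods` is hypothetical. Unlike the alert rule's `for: 5m`, it uses pod age as a proxy for time spent waiting, so treat the result as approximate.

```python
from datetime import datetime, timedelta, timezone


def stuck_pods(pod_list: dict, now: datetime,
               threshold: timedelta = timedelta(minutes=5)) -> list[str]:
    """Return namespace/name of pods older than `threshold` that still have
    a container waiting with reason ContainerCreating."""
    stuck = []
    for pod in pod_list.get("items", []):
        meta = pod["metadata"]
        created = datetime.strptime(
            meta["creationTimestamp"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        waiting = any(
            cs.get("state", {}).get("waiting", {}).get("reason") == "ContainerCreating"
            for cs in pod.get("status", {}).get("containerStatuses", [])
        )
        if waiting and now - created > threshold:
            stuck.append(f"{meta['namespace']}/{meta['name']}")
    return stuck
```

Pass `datetime.now(timezone.utc)` as `now` for a live check.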

4. Node termination lifecycle hook (AWS)

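For EKS with Cluster Autoscaler or Karpenter, make sure nodes are cordoned and drained before termination so volumes can detach cleanly. With Karpenter's legacy v1alpha5 Provisioner API, the TTL fields below expire nodes after 30 days and reclaim empty nodes after 30 seconds; Karpenter drains a node before terminating it: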

 apiVersion: karpenter.sh/v1alpha5
 kind: Provisioner
 spec:
+  ttlSecondsUntilExpired: 2592000
+  ttlSecondsAfterEmpty: 30
   provider:
     instanceProfile: KarpenterNodeInstanceProfile

Pair this with the AWS Node Termination Handler DaemonSet to intercept Spot interruption and ASG lifecycle events, cordon/drain the node, and allow the attach-detach controller to cleanly release volumes before the instance disappears.
