Fixing Longhorn 'Failed to Provision Volume with StorageClass' CSI Driver Mismatch Error
Threat/Impact Level: HIGH | Downtime Risk: HIGH | Time to Fix: 5–10 mins
TL;DR
- What broke: Your StorageClass has the wrong provisioner field. It does not match driver.longhorn.io, so the Longhorn CSI plugin never receives the provision request.
- How to fix it: Set the provisioner value in your StorageClass to exactly driver.longhorn.io. The provisioner field is immutable, so delete the StorageClass and re-create it from the corrected manifest. Existing PVCs in Pending state are retried by the controller and should bind shortly afterwards.
The Incident (What Does the Error Mean?)
Raw event output from kubectl describe pvc <pvc-name>:
Warning ProvisioningFailed 3s persistentvolume-controller
failed to provision volume with StorageClass "longhorn":
no volume plugin matched name: driver.longhorn.io.csi
or alternatively:
Warning ProvisioningFailed persistentvolume-controller
failed to provision volume with StorageClass "longhorn":
CSI driver "io.rancher.longhorn" not found, waiting for it to be registered
Immediate consequence: Every PVC that references this StorageClass stays in Pending indefinitely. Any Pod that mounts one of those PVCs, including Pods created from StatefulSet volumeClaimTemplates (databases, message queues), will not schedule. Your workload is dead in the water.
The Attack Vector / Blast Radius
This is not a transient failure. The external-provisioner sidecar inside the Longhorn CSI controller pod watches for PVCs whose storageClassName maps to a provisioner it owns. The matching is a pure string equality check. One character off — a trailing .csi, a reversed domain like io.rancher.longhorn instead of driver.longhorn.io — and the provisioner loop silently skips the PVC forever.
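The effect of an exact-string comparison can be illustrated with a short shell sketch. This only mimics the comparison described above; it is not the external-provisioner's actual code, and provisioner_matches is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Illustrative only: mimics an exact-string comparison between a
# StorageClass provisioner and a registered CSI driver name.
# provisioner_matches is a hypothetical helper, not part of Longhorn or kubectl.
provisioner_matches() {
  local sc_provisioner="$1" registered_driver="$2"
  if [ "$sc_provisioner" = "$registered_driver" ]; then
    echo "match"
  else
    echo "skip"
  fi
}

# Compare the correct name and the two broken variants from the events above.
for candidate in driver.longhorn.io driver.longhorn.io.csi io.rancher.longhorn; do
  printf '%-24s %s\n' "$candidate" "$(provisioner_matches "$candidate" driver.longhorn.io)"
done
```

On a live cluster, the two inputs could come from kubectl get storageclass longhorn -o jsonpath='{.provisioner}' and kubectl get csidriver -o jsonpath='{.items[*].metadata.name}'.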
Blast radius in a production cluster:
- All StatefulSets referencing this StorageClass fail to produce ready Pods.
- Helm releases that auto-create PVCs (Prometheus, Loki, PostgreSQL operators) will show 0/1 ready with no obvious application-layer error.
- If this StorageClass is set as the cluster default, every unspecified PVC across all namespaces is affected.
- Node storage pressure does not increase — Longhorn never touches the disk — so infrastructure monitoring stays green while the application layer is fully degraded. This is the dangerous part. Ops teams waste 30–60 minutes checking node health before looking at the provisioner string.
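To size the blast radius quickly, a small filter over the PVC list may help. The sketch below assumes four whitespace-separated columns (namespace, name, phase, StorageClass); pending_pvcs_for_class is a hypothetical helper and the sample rows are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "namespace name phase storageclass" rows on stdin
# and prints namespace/name for each PVC that is Pending on the given StorageClass.
pending_pvcs_for_class() {
  local class="$1"
  awk -v sc="$class" '$3 == "Pending" && $4 == sc { print $1 "/" $2 }'
}

# Sample rows standing in for real cluster output (made-up names).
printf '%s\n' \
  'monitoring prometheus-data Pending longhorn' \
  'db pg-data-0 Bound longhorn' \
  'default scratch Pending local-path' |
  pending_pvcs_for_class longhorn
# prints: monitoring/prometheus-data
```

Against a real cluster, the rows could be produced with kubectl get pvc -A --no-headers -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,PHASE:.status.phase,SC:.spec.storageClassName and piped into the function.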
How to Fix It (The Solution)
Basic Fix — Correct the StorageClass Provisioner
The canonical Longhorn CSI driver name registered with the Kubernetes API server is driver.longhorn.io. Verify what is currently registered:
kubectl get csidriver
Expected output:
NAME                 ATTACHREQUIRED   PODINFOONMOUNT
driver.longhorn.io   true             true
Now inspect your broken StorageClass:
kubectl get storageclass longhorn -o yaml
Save the corrected manifest as storageclass-longhorn-fixed.yaml. Only the provisioner line changes:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
# was: io.rancher.longhorn
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880"
  fromBackup: ""
  fsType: "ext4"
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowVolumeExpansion: true
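# The provisioner field is immutable, so delete the old StorageClass before re-creating it
kubectl delete storageclass longhorn
kubectl apply -f storageclass-longhorn-fixed.yaml
# PVCs in Pending state will be retried automatically by the controller
kubectl get pvc -A -w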
Enterprise Best Practice — Validate at Admission Time
Do not rely on humans getting the provisioner string right. Enforce it at the API server level with an OPA/Gatekeeper constraint or a Kyverno policy.
Kyverno ClusterPolicy (recommended for Longhorn clusters):
# Note: this rule matches every StorageClass. Narrow the match if the
# cluster also uses non-Longhorn provisioners.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: validate-longhorn-storageclass
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-longhorn-provisioner
      match:
        resources:
          kinds:
            - StorageClass
      validate:
        message: >-
          StorageClass provisioner must be 'driver.longhorn.io' for Longhorn volumes.
        pattern:
          provisioner: "driver.longhorn.io"
This blocks any kubectl apply or Helm install that ships a malformed StorageClass before it ever reaches etcd.
Prevention in CI/CD
1. Conftest + OPA in your GitOps pipeline
Add a Rego rule to your conftest policy bundle that runs on every PR touching StorageClass manifests:
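package main

# rego.v1 syntax (OPA 0.59+); the default from OPA 1.0 onwards
import rego.v1

# Reject any StorageClass whose provisioner is not the Longhorn CSI driver name.
deny contains msg if {
    input.kind == "StorageClass"
    input.provisioner != "driver.longhorn.io"
    msg := sprintf(
        "StorageClass '%v' has invalid provisioner '%v'. Must be 'driver.longhorn.io'.",
        [input.metadata.name, input.provisioner]
    )
}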
Wire it into CI:
conftest test ./k8s/storage/ --policy ./policies/
2. Checkov for Helm chart scanning
If you're rendering Longhorn via Helm values, add a custom Checkov check or use helm template | conftest test - in your pipeline stage.
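Where conftest is not yet available in a pipeline stage, a crude line-based scan of the rendered manifests can serve as a stopgap. The sketch below is an assumption-laden fallback, not a replacement for a real policy check: check_storageclass_provisioners is a hypothetical helper, it is not a YAML parser, and it only sees StorageClass objects that appear as top-level documents in the rendered output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: scans multi-document YAML on stdin and exits non-zero
# if any StorageClass has a top-level provisioner other than the expected one.
# Line-based and deliberately naive; it is not a YAML parser.
check_storageclass_provisioners() {
  local expected="$1"
  awk -v want="$expected" '
    # Evaluate the document collected so far, then reset state for the next one.
    function finish_doc() {
      if (is_sc && prov != want) { printf "bad provisioner: %s\n", prov; bad = 1 }
      is_sc = 0; prov = ""
    }
    /^---/                            { finish_doc(); next }
    /^kind:[ \t]*StorageClass[ \t]*$/ { is_sc = 1 }
    /^provisioner:/                   { prov = $2; gsub(/"/, "", prov) }
    END                               { finish_doc(); exit bad }
  '
}

# Demo with a deliberately wrong provisioner.
printf '%s\n' 'kind: StorageClass' 'provisioner: io.rancher.longhorn' |
  check_storageclass_provisioners driver.longhorn.io || echo "check failed"
```

In a pipeline, the output of helm template could be piped into the function in the same way it would be piped into conftest test -.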
3. Argo CD / Flux diff alerts
Enable diff notifications on StorageClass resources. A provisioner string change should trigger a Slack alert and require manual sync approval — it is never a safe auto-sync target.
4. Pin the Longhorn Helm chart version
Longhorn changed its CSI driver name from io.rancher.longhorn to driver.longhorn.io between versions. Unpinned chart upgrades in older clusters are the most common source of this exact regression. Always pin:
# helmrelease.yaml
spec:
  chart:
    spec:
      chart: longhorn
      version: "1.6.2" # pin explicitly, never use '*' or 'latest'
      sourceRef:
        kind: HelmRepository
        name: longhorn
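The pinning rule can also be checked mechanically in CI. A minimal sketch, assuming the chart version has already been extracted from the HelmRelease as a plain string; is_pinned_version is a hypothetical helper and the strict MAJOR.MINOR.PATCH pattern is an assumption about what counts as pinned.

```shell
#!/usr/bin/env bash
# Hypothetical CI guard: succeeds only for an exact MAJOR.MINOR.PATCH version,
# so '*', 'latest', and ranges such as '1.6.x' or '>=1.6.0' are rejected.
is_pinned_version() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}

for v in 1.6.2 '*' latest 1.6.x '>=1.6.0'; do
  if is_pinned_version "$v"; then
    echo "$v: pinned"
  else
    echo "$v: not pinned"
  fi
done
```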