How to Fix Flannel 'Failed to Allocate Subnet: Out of Space' in Kubernetes

Threat/Impact Level: CRITICAL | Exploitability/Downtime Risk: HIGH | Time to Fix: 15–30 mins


TL;DR

  • What broke: Flannel's subnet pool (Network CIDR) is exhausted — it cannot carve out a new /24 (or configured SubnetLen) for an incoming node, so that node's pods never get IP addresses and stay in ContainerCreating indefinitely.
  • How to fix it: Expand the Network CIDR in the kube-flannel-cfg ConfigMap and restart the Flannel DaemonSet, OR raise SubnetLen (for example, 24 to 26) so each node gets a smaller subnet and more subnets fit in the existing pool.

The Incident (What Does the Error Mean?)

Raw log output from flanneld:

E0612 03:14:55.310420       1 network.go:116] failed to allocate subnet: out of space
FATAL failed to acquire lease: out of space

Flannel allocates one subnet per node from a global Network CIDR defined in its ConfigMap. With default settings (Network: 10.244.0.0/16, SubnetLen: 24), the pool holds at most 2^(24-16) = 256 node subnets. Once the pool is full, whether from live nodes or from stale leases left behind by decommissioned nodes, every new node's flanneld pod crashes at startup. Every pod scheduled to that node stays in ContainerCreating. The node is effectively dead to the scheduler.
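The capacity arithmetic can be checked with a few lines of Python. This is a minimal sketch; the function name subnet_capacity is ours, not part of Flannel, and it only counts how many /SubnetLen blocks fit in the Network CIDR.

```python
import ipaddress


def subnet_capacity(network: str, subnet_len: int) -> int:
    """Return how many /subnet_len blocks fit inside the given Network CIDR."""
    net = ipaddress.ip_network(network)
    if subnet_len < net.prefixlen:
        raise ValueError("SubnetLen must not be shorter than the Network prefix")
    return 2 ** (subnet_len - net.prefixlen)


print(subnet_capacity("10.244.0.0/16", 24))  # 256
```

Treat the result as an upper bound: subnets held by stale leases or reserved by your configuration reduce what is actually available.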


The Attack Vector / Blast Radius

This is a cascading cluster availability failure, not a slow degradation:

  1. New node joins (autoscaler event, rolling upgrade, spot replacement) → flanneld fails to acquire lease → node never reaches Ready.
  2. Scheduler sees a non-Ready node → refuses to place pods → workload replicas don't scale out during a traffic spike.
  3. If the failing node was replacing an existing one (e.g., during a node group rolling update), you lose capacity and cannot replace it. Rolling updates stall indefinitely.
  4. Stale lease accumulation is the silent killer: Kubernetes node objects get deleted, but Flannel's etcd lease entries under /coreos.com/network/subnets/ are not always GC'd, consuming slots from dead nodes.
  5. In autoscaling clusters on EKS/GKE with custom CNI, this can trigger a thundering herd: autoscaler keeps provisioning nodes, each fails, autoscaler retries, burning cloud budget with zero capacity gain.

How to Fix It

Basic Fix: Expand the Network CIDR

Step 1 — Check current allocation:

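# Count active subnet leases in etcd (flannel v0.x with etcd backend).
# etcdctl v3 prints a blank line after each key with --keys-only, so count key lines only.
etcdctl get /coreos.com/network/subnets --prefix --keys-only | grep -c '^/coreos.com/network/subnets/'

# For Kubernetes backend, check the flannel ConfigMap
kubectl get cm kube-flannel-cfg -n kube-flannel -o jsonpath='{.data.net-conf\.json}'

# Count registered nodes; a lease count well above this number points to stale leases
kubectl get nodes --no-headers | wc -l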

Step 2 — Purge stale leases (immediate relief, zero downtime):

# List nodes that carry flannel annotations, with the public IP flannel recorded for each
kubectl get nodes -o json | jq -r '.items[] | select(.metadata.annotations["flannel.alpha.coreos.com/backend-data"] != null) | "\(.metadata.name)\t\(.metadata.annotations["flannel.alpha.coreos.com/public-ip"])"'

# Delete NotReady nodes that have stale flannel annotations
kubectl delete node <stale-node-name>

Step 3 — Expand the CIDR (requires Flannel restart, brief per-node CNI interruption):

# kubectl edit cm kube-flannel-cfg -n kube-flannel
 # net-conf.json value:
-{
-  "Network": "10.244.0.0/16",
-  "SubnetLen": 24,
-  "Backend": {
-    "Type": "vxlan"
-  }
-}
+{
+  "Network": "10.244.0.0/14",
+  "SubnetLen": 24,
+  "Backend": {
+    "Type": "vxlan"
+  }
+}

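A /14 gives you 1,024 /24 subnets, 4× the default capacity. If Flannel runs with the Kubernetes subnet manager (--kube-subnet-mgr), each node's subnet comes from its spec.podCIDR, which kube-controller-manager assigns from --cluster-cidr, so widen that flag to the same range as well.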

# Rolling restart of flannel DaemonSet to pick up new config
kubectl rollout restart daemonset/kube-flannel-ds -n kube-flannel
kubectl rollout status daemonset/kube-flannel-ds -n kube-flannel
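Before applying a wider Network, it helps to confirm that the new range fully contains the old one, so subnets already leased to running nodes stay inside the pool. A minimal sketch, assuming you only want to widen the range; is_safe_expansion is a hypothetical helper name:

```python
import ipaddress


def is_safe_expansion(old_network: str, new_network: str) -> bool:
    """True if new_network is strictly larger than old_network and contains it."""
    old = ipaddress.ip_network(old_network)
    new = ipaddress.ip_network(new_network)
    return new.prefixlen < old.prefixlen and old.subnet_of(new)


print(is_safe_expansion("10.244.0.0/16", "10.244.0.0/14"))  # True
print(is_safe_expansion("10.244.0.0/16", "10.240.0.0/14"))  # False
```

The second call fails because 10.240.0.0/14 ends at 10.243.255.255 and would strand every existing 10.244.x.0/24 lease outside the pool.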

Enterprise Best Practice: Right-Size at Cluster Bootstrap

Never accept Flannel defaults for production. Calculate upfront:

Max subnets = 2^(SubnetLen - Network prefix bits)
For /14 Network + SubnetLen 24: 2^(24-14) = 1024 nodes max
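The same formula can be inverted to find the Network prefix you need for a target node count. A minimal sketch; longest_network_prefix is a hypothetical helper, and it ignores any headroom you may want for stale leases or surge nodes:

```python
def longest_network_prefix(target_nodes: int, subnet_len: int = 24) -> int:
    """Longest Network prefix (smallest pool) that still holds target_nodes subnets."""
    if target_nodes < 1:
        raise ValueError("target_nodes must be at least 1")
    # Bits needed to number target_nodes distinct subnets (ceil(log2(n)) without floats).
    bits = (target_nodes - 1).bit_length()
    if bits > subnet_len:
        raise ValueError("target_nodes does not fit with this SubnetLen")
    return subnet_len - bits


print(longest_network_prefix(1000))  # 14
print(longest_network_prefix(256))   # 16
```

Pick a prefix at least one bit shorter than the result if the cluster autoscales, since a pool sized exactly to the node count leaves no room for replacements.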

Raise SubnetLen (a longer prefix means a smaller per-node subnet) if you need more nodes with fewer IPs per node:

-{
-  "Network": "10.244.0.0/16",
-  "SubnetLen": 24
-}
+{
+  "Network": "10.244.0.0/14",
+  "SubnetLen": 26
+}
# /26 per node = 64 addresses per node, but 2^(26-14) = 4096 possible nodes
# Trade pod density for node count: keep the kubelet's max pods per node (default 110) below the per-node address count

Verify no CIDR overlap with VPC/on-prem ranges before applying.
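The overlap check can be scripted. A minimal sketch, assuming you maintain a list of VPC and on-prem ranges yourself; the function name and the sample ranges are illustrative only:

```python
import ipaddress


def overlapping_ranges(pod_network: str, reserved: list) -> list:
    """Return the reserved CIDRs that overlap the proposed flannel Network."""
    pod = ipaddress.ip_network(pod_network)
    return [cidr for cidr in reserved if pod.overlaps(ipaddress.ip_network(cidr))]


# Hypothetical reserved ranges: a VPC, a peered network, and an office LAN.
reserved = ["10.0.0.0/16", "10.246.0.0/16", "192.168.0.0/24"]
print(overlapping_ranges("10.244.0.0/14", reserved))  # ['10.246.0.0/16']
```

An empty list means the proposed Network is clear of every range you listed; it says nothing about ranges missing from the list.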


Prevention in CI/CD

1. Enforce minimum CIDR size at cluster provisioning with OPA/Gatekeeper:

# ConstraintTemplate Rego: reject a flannel Network prefix longer than /15.
# Gatekeeper expects the rule to be named "violation" and to return an object with a msg field.
package flannel.cidr

violation[{"msg": msg}] {
  input.review.object.kind == "ConfigMap"
  input.review.object.metadata.name == "kube-flannel-cfg"
  net_conf := json.unmarshal(input.review.object.data["net-conf.json"])
  cidr_prefix := to_number(split(net_conf.Network, "/")[1])
  cidr_prefix > 15
  msg := sprintf("Flannel Network CIDR /%v is too small for production. Use a /15 or shorter prefix.", [cidr_prefix])
}
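The same rule can run as a pre-merge check outside the cluster. A minimal sketch in Python, assuming the pipeline hands you the raw net-conf.json string; check_net_conf and MAX_PREFIX are hypothetical names, and the /15 threshold mirrors the policy above:

```python
import ipaddress
import json
from typing import List

MAX_PREFIX = 15  # longest Network prefix accepted, matching the admission policy


def check_net_conf(net_conf_json: str, max_prefix: int = MAX_PREFIX) -> List[str]:
    """Return a list of violation messages; an empty list means the config passes."""
    conf = json.loads(net_conf_json)
    network = conf.get("Network", "")
    try:
        prefix = ipaddress.ip_network(network).prefixlen
    except ValueError:
        return [f"Network {network!r} is not a valid CIDR"]
    if prefix > max_prefix:
        return [f"Network prefix /{prefix} is longer than /{max_prefix}"]
    return []


print(check_net_conf('{"Network": "10.244.0.0/16", "SubnetLen": 24}'))
print(check_net_conf('{"Network": "10.244.0.0/14", "SubnetLen": 24}'))  # []
```

Fail the pipeline whenever the returned list is non-empty.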

2. Alert on subnet exhaustion before it hits 100%:

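# Prometheus alerting rule
# flannel_subnet_len and flannel_network_prefix_len are placeholder metric names, not
# metrics flannel is known to export; publish them yourself or substitute constants.
groups:
- name: flannel
  rules:
  - alert: FlannelSubnetExhaustionWarning
    expr: |
      count(kube_node_info) / (2 ^ (scalar(flannel_subnet_len) - scalar(flannel_network_prefix_len))) > 0.75
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Flannel subnet pool >75% utilized. Expand Network CIDR before cluster scales further."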

3. Terraform: parameterize CIDR, never hardcode:

-  pod_cidr = "10.244.0.0/16"
+  pod_cidr = var.flannel_network_cidr  # default = "10.244.0.0/14"

4. Add a checkov custom check or conftest policy in your Helm values pipeline to reject any kube-flannel chart values where network prefix length exceeds /15.

5. Periodically run a stale lease audit as a CronJob: leases in etcd whose node no longer appears in kubectl get nodes should be removed automatically by a cleanup script scheduled inside your cluster maintenance window.
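The core of such an audit is a set comparison between leases and live nodes. The sketch below is a starting point, not a finished cleanup job: it assumes each lease value is a JSON document with a PublicIP field and that you have already collected the leases and the node IPs; verify both assumptions against your flannel version before deleting anything. The name stale_leases and the sample data are hypothetical.

```python
import json


def stale_leases(leases: dict, node_ips: set) -> list:
    """Return lease keys whose PublicIP does not belong to any live node.

    leases maps an etcd key to the raw JSON lease value (assumed format).
    node_ips holds the IPs of nodes currently registered in the cluster.
    """
    stale = []
    for key, value in leases.items():
        public_ip = json.loads(value).get("PublicIP")
        if public_ip not in node_ips:
            stale.append(key)
    return sorted(stale)


# Illustrative data only.
leases = {
    "/coreos.com/network/subnets/10.244.1.0-24": '{"PublicIP": "10.0.0.11"}',
    "/coreos.com/network/subnets/10.244.2.0-24": '{"PublicIP": "10.0.0.12"}',
}
print(stale_leases(leases, {"10.0.0.11"}))
```

Log the candidate keys and review them for a few runs before letting the job delete leases on its own.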
