Fixing EKS Fargate 'Pod Sandbox Creation Failed' Due to ENI IP Exhaustion

Threat/Impact Level: HIGH | Downtime Risk: HIGH | Time to Fix: 15–45 mins depending on subnet re-IP strategy

TL;DR

  • What broke: Fargate can't allocate an ENI for the pod sandbox because every available IP in the configured subnet(s) is consumed. New pods are dead on arrival — FailedCreatePodSandBox in events, nothing scheduled.
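  • How to fix it: Add a secondary CIDR block to your VPC if its address space is used up, provision new larger subnets, and update your Fargate profile to include them. VPC CNI prefix delegation does not help here: it applies to EC2 nodes, while each Fargate pod takes its own ENI and IP.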
  • Shortcut: Use our Client-Side Sandbox above to auto-refactor your Fargate profile and Terraform VPC config — it detects exhausted CIDRs and outputs corrected subnet allocations without sending your ARNs anywhere.

The Incident (What Does the Error Mean?)

You'll see this in kubectl describe pod <pod-name>:

Warning  FailedCreatePodSandBox  kubelet  Failed to create pod sandbox:
  rpc error: code = Unknown desc = failed to set up sandbox container
  "abc123" network for pod "my-service-7d9f8b-xkqzp":
  networkPlugin cni failed to set up pod
  "my-service-7d9f8b-xkqzp_production" network:
  add cmd: failed to assign an IP address to container

The matching AWS-side error, InsufficientFreeAddressesInSubnet, typically surfaces in CloudTrail on the failed CreateNetworkInterface call. Don't look for it in VPC Flow Logs: they record traffic, not API errors.

Immediate consequence: Every new Fargate pod targeting the exhausted subnet fails to start. Deployments hang at 0/1 Running. HPA scale-out events silently fail. If this is your only AZ subnet in a Fargate profile, your entire workload loses the ability to self-heal or scale.
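
To gauge how many pods are hit, one option is to tally FailedCreatePodSandBox events per pod. The sketch below assumes you feed it the parsed output of kubectl get events -A -o json; the function name sandbox_failures is ours, not part of any tool.

```python
from collections import Counter


def sandbox_failures(events: dict) -> Counter:
    """Count FailedCreatePodSandBox events per (namespace, pod name).

    `events` is the parsed JSON from `kubectl get events -A -o json`.
    """
    counts: Counter = Counter()
    for ev in events.get("items", []):
        if ev.get("reason") != "FailedCreatePodSandBox":
            continue
        obj = ev.get("involvedObject", {})
        key = (obj.get("namespace", "?"), obj.get("name", "?"))
        # `count` can be missing or null on some events; treat that as one occurrence
        counts[key] += ev.get("count") or 1
    return counts


if __name__ == "__main__":
    # Tiny hand-written sample in the shape kubectl emits
    sample = {
        "items": [
            {
                "reason": "FailedCreatePodSandBox",
                "count": 3,
                "involvedObject": {
                    "namespace": "production",
                    "name": "my-service-7d9f8b-xkqzp",
                },
            },
            {
                "reason": "Scheduled",
                "involvedObject": {"namespace": "production", "name": "other-pod"},
            },
        ]
    }
    for (namespace, pod), n in sandbox_failures(sample).most_common():
        print(f"{namespace}/{pod}: {n} sandbox failures")
```

In practice you would load the real data with json.load(sys.stdin) and pass it in. If the failures cluster in one namespace, start with the subnets of the Fargate profile that selects it.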


The Attack Vector / Blast Radius

Fargate is not EC2. There is no node-level IP sharing. Each Fargate pod consumes at minimum 1 ENI and 1 primary IP from your subnet. A /24 has 256 addresses, of which AWS reserves 5 per subnet, leaving 251 usable IPs. At scale:

  • 200-pod namespace + ALB + NAT + other ENIs = subnet full, zero warning
  • No pre-exhaustion alerting exists by default. AWS does not publish a native CloudWatch metric for subnet IP utilization; you have to build it yourself
  • Cascading failure: A rolling deployment tries to bring up new pods before terminating old ones (maxSurge). Both old and new pods compete for IPs. The deployment deadlocks. Old pods won't terminate because Kubernetes waits for new ones to be Ready. New ones can't start. You're stuck.
  • Multi-AZ configs where only one AZ subnet is exhausted produce intermittent failures that are hell to diagnose — some pods start, some don't, depending on which AZ the scheduler targets
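
The arithmetic behind the bullets above can be sketched in a few lines. This assumes one IP per pod, the 5 AWS-reserved addresses per subnet, and a rolling update with maxSurge of 25% (the Kubernetes default, rounded up); the function names and the figure of 10 other ENIs are illustrative only.

```python
import ipaddress
import math

AWS_RESERVED_PER_SUBNET = 5  # AWS holds back 5 addresses in every subnet


def usable_ips(cidr: str) -> int:
    """Addresses AWS will actually hand out in a subnet of this size."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_PER_SUBNET


def peak_pod_ips(replicas: int, max_surge: float = 0.25) -> int:
    """Worst-case pod IPs during a rolling update: old pods plus surged new ones."""
    return replicas + math.ceil(replicas * max_surge)


def headroom(cidr: str, replicas: int, other_enis: int, max_surge: float = 0.25) -> int:
    """Free IPs left at the peak of a rollout; negative means the rollout stalls."""
    return usable_ips(cidr) - other_enis - peak_pod_ips(replicas, max_surge)


if __name__ == "__main__":
    print(usable_ips("10.0.1.0/24"))         # 251
    print(peak_pod_ips(200))                 # 250
    print(headroom("10.0.1.0/24", 200, 10))  # -9
```

A 200-replica deployment fits in a /24 at steady state, but the rollout peak needs 250 pod IPs on top of everything else in the subnet. That is the deadlock described above.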

How to Fix It

Basic Fix — Add a New Subnet and Update the Fargate Profile

The fastest path out of a live outage: provision a new subnet in the same AZ with a larger CIDR, then add it to your Fargate profile.

# terraform/vpc.tf

 resource "aws_subnet" "fargate_primary" {
   vpc_id            = aws_vpc.main.id
   cidr_block        = "10.0.1.0/24"   # 251 usable IPs, exhausted (left unchanged)
   availability_zone = "us-east-1a"
 }

+ resource "aws_subnet" "fargate_overflow" {
+   vpc_id            = aws_vpc.main.id
+   cidr_block        = "10.0.16.0/20"  # 4091 IPs — room to grow
+   availability_zone = "us-east-1a"
+   tags = {
+     Name = "fargate-overflow-1a"
+   }
+ }

# terraform/fargate_profile.tf

 resource "aws_eks_fargate_profile" "default" {
   cluster_name           = aws_eks_cluster.main.name
   fargate_profile_name   = "default"
   pod_execution_role_arn = aws_iam_role.fargate_pod_execution.arn

   subnet_ids = [
-    aws_subnet.fargate_primary.id
+    aws_subnet.fargate_primary.id,
+    aws_subnet.fargate_overflow.id
   ]

   selector {
     namespace = "production"
   }
 }

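⚠️ Fargate profiles are immutable: changing subnet_ids forces Terraform to destroy and recreate the profile, and deleting a profile deletes the pods that were scheduled through it. Plan for a brief scheduling gap, or create a second profile with the new subnets first and retire the old one afterward (blue/green profile swap).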


Enterprise Best Practice: Secondary CIDR + Subnet IP Alarms

If your VPC CIDR space is itself exhausted (common in large orgs with RFC 1918 carve-outs), attach a secondary CIDR from the 100.64.0.0/10 CG-NAT range (AWS supports this) and build Fargate subnets there.

# terraform/vpc.tf

 resource "aws_vpc" "main" {
   cidr_block = "10.0.0.0/16"
 }

+ resource "aws_vpc_ipv4_cidr_block_association" "cgnat_secondary" {
+   vpc_id     = aws_vpc.main.id
+   cidr_block = "100.64.0.0/16"  # CG-NAT range, AWS-supported secondary
+ }
+
+ resource "aws_subnet" "fargate_cgnat_1a" {
+   vpc_id            = aws_vpc.main.id
+   cidr_block        = "100.64.0.0/18"  # 16,379 IPs
+   availability_zone = "us-east-1a"
+   depends_on        = [aws_vpc_ipv4_cidr_block_association.cgnat_secondary]
+ }
+
+ resource "aws_subnet" "fargate_cgnat_1b" {
+   vpc_id            = aws_vpc.main.id
+   cidr_block        = "100.64.64.0/18"
+   availability_zone = "us-east-1b"
+   depends_on        = [aws_vpc_ipv4_cidr_block_association.cgnat_secondary]
+ }
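
Before committing the CIDRs above, it can help to check the carve-up offline. This is a minimal sketch using Python's standard ipaddress module; carve_subnets and in_cgnat are hypothetical helper names, and the AZ list simply mirrors the Terraform above.

```python
import ipaddress
from itertools import islice

CGNAT = ipaddress.ip_network("100.64.0.0/10")


def in_cgnat(cidr: str) -> bool:
    """True if the block sits entirely inside the 100.64.0.0/10 range."""
    return ipaddress.ip_network(cidr).subnet_of(CGNAT)


def carve_subnets(block_cidr: str, new_prefix: int, azs: list[str]) -> dict[str, str]:
    """Assign the first len(azs) subnets of size /new_prefix to the given AZs."""
    block = ipaddress.ip_network(block_cidr)
    # islice avoids materializing every subnet when the block is large
    pieces = list(islice(block.subnets(new_prefix=new_prefix), len(azs)))
    if len(pieces) < len(azs):
        raise ValueError(f"{block_cidr} cannot hold {len(azs)} /{new_prefix} subnets")
    return {az: str(net) for az, net in zip(azs, pieces)}


if __name__ == "__main__":
    plan = carve_subnets("100.64.0.0/16", 18, ["us-east-1a", "us-east-1b"])
    for az, cidr in plan.items():
        print(az, cidr, "in CG-NAT range:", in_cgnat(cidr))
```

For the /16 above this yields 100.64.0.0/18 for us-east-1a and 100.64.64.0/18 for us-east-1b, leaving two more /18 blocks free for additional AZs.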

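Also add a CloudWatch alarm on available IPs so you never get blindsided again. AWS publishes no such metric natively, so this alarm assumes you publish AvailableIPAddresses per subnet to a custom namespace yourself (step 4 under Prevention in CI/CD); "Custom/VPC" below is a placeholder:

+ resource "aws_cloudwatch_metric_alarm" "subnet_ip_exhaustion" {
+   # One alarm per Fargate subnet; the map keys are static so for_each can plan
+   for_each = {
+     "1a" = aws_subnet.fargate_cgnat_1a.id
+     "1b" = aws_subnet.fargate_cgnat_1b.id
+   }
+
+   alarm_name          = "fargate-subnet-ip-low-${each.key}"
+   comparison_operator = "LessThanThreshold"
+   evaluation_periods  = 2
+   metric_name         = "AvailableIPAddresses"
+   namespace           = "Custom/VPC"   # placeholder custom namespace, not AWS/EC2
+   period              = 300
+   statistic           = "Minimum"
+   threshold           = 50   # Alert at 50 IPs remaining
+   dimensions = {
+     SubnetId = each.value
+   }
+   alarm_actions = [aws_sns_topic.ops_alerts.arn]
+ }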

Prevention in CI/CD

1. Checkov — block small Fargate subnets at plan time:

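# checkov custom check: enforce minimum /20 for Fargate subnets
# Note: as written this applies to every aws_subnet, not only Fargate ones.
import ipaddress

from checkov.common.models.enums import CheckCategories, CheckResult
from checkov.terraform.checks.resource.base_resource_check import BaseResourceCheck


class FargateSubnetSizeCheck(BaseResourceCheck):
    def __init__(self):
        super().__init__(
            name="Fargate subnets must be /20 or larger",
            id="CKV_CUSTOM_FARGATE_SUBNET_SIZE",
            categories=[CheckCategories.NETWORKING],
            supported_resources=["aws_subnet"],
        )

    def scan_resource_conf(self, conf):
        # checkov wraps each attribute value in a list
        cidr = conf.get("cidr_block", [""])[0]
        try:
            network = ipaddress.IPv4Network(cidr, strict=False)
        except (ValueError, TypeError):
            # Unresolved variable or missing value: cannot evaluate
            return CheckResult.UNKNOWN
        # A smaller prefix length means a larger subnet, so /20 or larger passes
        return CheckResult.PASSED if network.prefixlen <= 20 else CheckResult.FAILED


# checkov registers the check when the class is instantiated
check = FargateSubnetSizeCheck()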

2. OPA/Conftest — gate Fargate profile PRs:

# policy/fargate_subnet.rego
package fargate

deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "aws_eks_fargate_profile"
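  # Counts subnets, not AZs: pair this with a review that the subnets span AZs
  count(resource.change.after.subnet_ids) < 2
  msg := sprintf(
    "Fargate profile '%v' must reference at least 2 subnets (spread across AZs) to reduce the risk of single-subnet IP exhaustion.",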
    [resource.name]
  )
}

3. GitHub Actions gate — run before terraform apply:

- name: Conftest Fargate Policy Check
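  run: |
    # Assumes an earlier step ran: terraform plan -out=plan.tfplan
    terraform show -json plan.tfplan > plan.json
    # The policy is in package fargate; conftest only evaluates package main by default
    conftest test plan.json --policy policy/fargate_subnet.rego --namespace fargate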

4. Ongoing: Set up a Lambda on a schedule (for example, an EventBridge rule) that reads AvailableIpAddressCount for each subnet from ec2:DescribeSubnets and publishes it as AvailableIPAddresses to a custom CloudWatch namespace. Feed it into your runbook alerting. Treat subnet IP utilization above 70% as a P2 ticket, not a P0 fire.
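
A minimal sketch of such a Lambda, assuming boto3 is available in the runtime (it is in the standard Python Lambda runtimes). The namespace "Custom/VPC" and the event shape {"subnet_ids": [...]} are placeholders we made up; adjust both to your setup.

```python
import ipaddress
from typing import Any

NAMESPACE = "Custom/VPC"     # hypothetical custom namespace
AWS_RESERVED_PER_SUBNET = 5  # AWS holds back 5 addresses in every subnet


def build_metric_data(subnets: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Turn ec2.describe_subnets()['Subnets'] entries into PutMetricData items."""
    return [
        {
            "MetricName": "AvailableIPAddresses",
            "Dimensions": [{"Name": "SubnetId", "Value": s["SubnetId"]}],
            "Value": float(s["AvailableIpAddressCount"]),
            "Unit": "Count",
        }
        for s in subnets
    ]


def utilization(subnet: dict[str, Any]) -> float:
    """Fraction of usable IPv4 addresses currently taken (0.0 to 1.0)."""
    usable = ipaddress.ip_network(subnet["CidrBlock"]).num_addresses - AWS_RESERVED_PER_SUBNET
    return 1 - subnet["AvailableIpAddressCount"] / usable


def handler(event, context):
    # Imported here so the pure helpers above can be tested without AWS access
    import boto3

    ec2 = boto3.client("ec2")
    cloudwatch = boto3.client("cloudwatch")

    # Hypothetical event shape: {"subnet_ids": ["subnet-...", ...]}
    subnets = ec2.describe_subnets(SubnetIds=event["subnet_ids"])["Subnets"]
    data = build_metric_data(subnets)

    # Publish in small batches to stay well inside PutMetricData request limits
    for i in range(0, len(data), 20):
        cloudwatch.put_metric_data(Namespace=NAMESPACE, MetricData=data[i : i + 20])

    return {
        "published": len(data),
        "over_70_percent": [s["SubnetId"] for s in subnets if utilization(s) > 0.70],
    }
```

The Lambda's role needs ec2:DescribeSubnets and cloudwatch:PutMetricData. The over_70_percent list in the return value is just a convenience for opening the P2 ticket; the alarm itself should key off the published metric.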
