How to Fix Docker MTU Mismatch Causing Packet Drop on the docker0 Bridge

Threat/Impact Level: HIGH | Exploitability/Downtime Risk: HIGH | Time to Fix: 10 mins

TL;DR

  • What broke: Docker's docker0 bridge defaults to MTU 1500. If the path out of the host has a smaller effective MTU (for example 1450 behind a VXLAN overlay or 1420 inside a WireGuard tunnel), packets larger than that are dropped. In many cloud environments no ICMP "fragmentation needed" message comes back, so the drop is silent.
  • How to fix it: Set "mtu": <correct_value> in /etc/docker/daemon.json to the host interface MTU minus any encapsulation overhead (or lower), then restart Docker.

The Incident (What Does the Error Mean?)

You won't get a clean error. This is what makes it brutal. Symptoms look like:

# curl from inside container to external service hangs on large payloads
curl: (18) transfer closed with outstanding read data remaining

# ping works, but TCP sessions stall
ping 8.8.8.8   # succeeds
curl https://api.example.com  # hangs at TLS handshake or after headers

# tcpdump on docker0 shows full-size segments being retransmitted
tcpdump -i docker0 -n 'tcp and greater 1400'
# ... the same large segments repeat with no ACK, while small packets (SYN, ACK, ping) pass

Immediate consequence: Any TCP session that sends packets larger than the underlay MTU minus encapsulation overhead will stall or fail. This kills HTTPS API calls, database connections, and inter-service gRPC streams silently and intermittently, because only full-size packets are affected while small requests keep working. On-call engineers waste hours chasing application bugs that are actually a 50-byte MTU delta.


The Attack Vector / Blast Radius

This is a cascading infrastructure failure, not a one-container problem.

Why it's invisible: Path MTU Discovery (PMTUD) depends on routers returning ICMP Type 3 Code 4 ("Fragmentation Needed"). In many cloud and corporate networks those messages are filtered by security groups, firewalls, or intermediate routers, so PMTUD breaks. Oversized packets are simply dropped, TCP keeps retransmitting the same segment until it times out, and your application sees a hung connection.

Blast radius by environment:

Environment                      Host MTU   Encap Overhead   Safe Container MTU
Bare metal / LAN                 1500       0                1500
AWS VPC (jumbo frames)           9001       0                1500 (safe default)
AWS VPC + VXLAN (EKS, Flannel)   9001       50 bytes         1450
WireGuard VPN tunnel             1500       80 bytes         1420
GCP default                      1460       0                1460
OpenStack VXLAN                  1450       50 bytes         1400
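
The arithmetic behind the table can be sketched in a few lines of Python. This is an illustrative helper, not part of Docker: the names safe_container_mtu and tcp_mss are hypothetical, the optional path_cap models a smaller limit somewhere beyond the host NIC (for example 1500 once traffic leaves a jumbo-frame network), and the 40-byte figure assumes IPv4 and TCP headers without options.

```python
from typing import Optional

IPV4_MIN_MTU = 68          # smallest MTU any IPv4 link must support
IPV4_TCP_HEADERS = 40      # 20-byte IPv4 header + 20-byte TCP header, no options


def safe_container_mtu(host_mtu: int, encap_overhead: int = 0,
                       path_cap: Optional[int] = None) -> int:
    """Largest container MTU whose encapsulated packets still fit the path."""
    if host_mtu <= 0 or encap_overhead < 0:
        raise ValueError("host_mtu must be positive and encap_overhead non-negative")
    # The encapsulated (outer) packet must fit the tightest link on the path.
    link_limit = host_mtu if path_cap is None else min(host_mtu, path_cap)
    mtu = link_limit - encap_overhead
    if mtu < IPV4_MIN_MTU:
        raise ValueError(f"resulting MTU {mtu} is below the IPv4 minimum")
    return mtu


def tcp_mss(mtu: int) -> int:
    """TCP payload bytes per segment for a given MTU (IPv4, no options)."""
    return mtu - IPV4_TCP_HEADERS
```

For instance, safe_container_mtu(1500, 50) gives 1450, and a 9001-byte NIC with 50 bytes of VXLAN overhead and a 1500-byte path cap also gives 1450, which matches the rows above.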

Cascading failure path:

  1. Container MTU 1500 > tunnel MTU 1450
  2. Large packet hits docker0, exits host NIC, gets dropped at underlay router
  3. No ICMP fragmentation-needed returned (cloud firewall blocks it)
  4. TCP sender never learns the path MTU, keeps sending oversized segments
  5. Connection hangs → application timeout → retry storm → upstream service rate-limits your IPs → full service degradation
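
One way to confirm this failure path is to probe the path MTU directly instead of waiting for ICMP feedback. The sketch below binary-searches the largest packet size that gets through. The function names are hypothetical, the ping flags assume Linux iputils ping (-M do sets the Don't Fragment bit, -s sets the ICMP payload size), and the search bounds of 576 and 1500 are assumptions you may need to change.

```python
import subprocess
from typing import Callable

ICMP_IPV4_OVERHEAD = 28  # 20-byte IPv4 header + 8-byte ICMP header


def ping_probe(host: str) -> Callable[[int], bool]:
    """Build a probe that sends one DF-flagged ping sized to a given MTU."""
    def probe(mtu: int) -> bool:
        payload = mtu - ICMP_IPV4_OVERHEAD
        result = subprocess.run(
            ["ping", "-c", "1", "-W", "2", "-M", "do", "-s", str(payload), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return result.returncode == 0
    return probe


def find_path_mtu(probe: Callable[[int], bool], low: int = 576, high: int = 1500) -> int:
    """Return the largest MTU in [low, high] for which probe(mtu) succeeds."""
    if not probe(low):
        raise RuntimeError(f"even {low}-byte packets fail; path down or ICMP echo blocked")
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            low = mid       # mid fits, so the answer is mid or larger
        else:
            high = mid - 1  # mid is too big, so the answer is smaller
    return low
```

Running find_path_mtu(ping_probe("8.8.8.8")) from inside a container and again from the host shows whether the container sees a smaller usable size than docker0 advertises. The result is only meaningful if ICMP echo is allowed along the path.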

How to Fix It (The Solution)

Step 1: Identify Your Actual Host MTU

# Find the host's primary outbound interface MTU
ip link show eth0
# or
ip route get 8.8.8.8 | grep -oP 'dev \K\S+' | xargs ip link show

# Example output:
# 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 ...
# If you're on a VPN or VXLAN overlay, subtract 50-80 bytes from that value.

# Confirm docker0 current MTU
ip link show docker0
# 3: docker0: <...> mtu 1500 ...
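
If you would rather script this comparison than read ip output by eye, the same values are exposed under /sys/class/net on Linux. The following is a minimal sketch, assuming the interface names eth0 and docker0 from the commands above; read_mtu and bridge_fits are hypothetical helper names, and sysfs_root is a parameter only so the functions can be pointed at a test directory.

```python
from pathlib import Path


def read_mtu(iface: str, sysfs_root: str = "/sys/class/net") -> int:
    """Read an interface MTU from sysfs (Linux only)."""
    return int((Path(sysfs_root) / iface / "mtu").read_text().strip())


def bridge_fits(host_iface: str, bridge: str = "docker0",
                encap_overhead: int = 0,
                sysfs_root: str = "/sys/class/net") -> bool:
    """True when the bridge MTU fits in the host MTU after encapsulation overhead."""
    host_mtu = read_mtu(host_iface, sysfs_root)
    bridge_mtu = read_mtu(bridge, sysfs_root)
    return bridge_mtu <= host_mtu - encap_overhead
```

On a VXLAN host, bridge_fits("eth0", encap_overhead=50) returning False is the signal that docker0 needs a lower MTU.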

Basic Fix — /etc/docker/daemon.json

# /etc/docker/daemon.json
 {
-  "log-driver": "json-file"
+  "log-driver": "json-file",
+  "mtu": 1450
 }

# Apply it
sudo systemctl restart docker

# Verify
ip link show docker0
# Should now show mtu 1450

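⚠️ Restarting Docker stops all running containers unless live-restore is enabled in daemon.json. Schedule this during a maintenance window or use a rolling restart strategy on orchestrated workloads. Note that the daemon-level mtu applies to the default docker0 bridge; user-defined networks, including the ones Compose creates, need the per-network option described in the next section.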
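
When daemon.json already carries other settings, hand-editing it is an easy way to introduce a missing comma. A small merge script avoids that. The sketch below is one possible approach, not an official Docker tool: set_daemon_mtu is a hypothetical name, and the 68 to 65535 range check is an assumption about what counts as a plausible value.

```python
import json
from pathlib import Path


def set_daemon_mtu(path: str, mtu: int) -> dict:
    """Merge an "mtu" key into daemon.json, keeping every other setting."""
    if isinstance(mtu, bool) or not isinstance(mtu, int) or not 68 <= mtu <= 65535:
        raise ValueError(f"implausible MTU: {mtu!r}")
    cfg_path = Path(path)
    # A missing or empty file is treated as an empty configuration.
    text = cfg_path.read_text() if cfg_path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    config["mtu"] = mtu
    cfg_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling set_daemon_mtu("/etc/docker/daemon.json", 1450) as root rewrites the file in place; Docker still has to be restarted for the change to take effect.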

Enterprise Best Practice — docker-compose per-network MTU

For teams that cannot touch the daemon globally, scope the MTU to specific Compose networks:

# docker-compose.yml
 version: '3.8'
 services:
   api:
     image: my-api:latest
     networks:
-      - default
+      - internal
 
+networks:
+  internal:
+    driver: bridge
+    driver_opts:
+      com.docker.network.driver.mtu: '1450'

Enterprise Best Practice — Kubernetes / CNI (Flannel, Calico, Weave)

If this is surfacing in a Kubernetes cluster using Docker as the runtime:

# Flannel ConfigMap (kube-flannel-cfg)
 data:
   net-conf.json: |
     {
       "Network": "10.244.0.0/16",
       "Backend": {
-        "Type": "vxlan"
+        "Type": "vxlan",
+        "MTU": 1450
       }
     }

For Calico, set the FELIX_IPINIPMTU and FELIX_VXLANMTU environment variables on the calico-node DaemonSet, or set the matching ipipMTU and vxlanMTU fields in the FelixConfiguration resource:

 apiVersion: projectcalico.org/v3
 kind: FelixConfiguration
 metadata:
   name: default
 spec:
+  vxlanMTU: 1450


Prevention in CI/CD

The goal: Catch MTU misconfiguration before it reaches production.

1. Checkov — Scan daemon.json in IaC pipelines

Checkov doesn't have a built-in Docker MTU rule, so write a custom check:

# checkov/custom_checks/docker_mtu.py
from checkov.common.models.enums import CheckResult, CheckCategories
from checkov.json_doc.base_json_check import BaseJsonCheck

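class DockerMTUCheck(BaseJsonCheck):
    def __init__(self):
        # The 1450 threshold assumes a 1500-byte underlay with 50 bytes of VXLAN overhead
        name = "Ensure Docker daemon MTU is set to 1450 or lower for overlay environments"
        id = "CKV_DOCKER_MTU_001"
        super().__init__(name=name, id=id, categories=[CheckCategories.NETWORKING], supported_entities=['*'])

    def get_resource_id(self, conf):
        return "daemon.json"

    def scan_resource_conf(self, conf):
        # A missing key means Docker falls back to its 1500 default, which fails the check
        mtu = conf.get("mtu", 1500)
        # bool is a subclass of int in Python, so reject true/false explicitly
        if isinstance(mtu, int) and not isinstance(mtu, bool) and mtu <= 1450:
            return CheckResult.PASSED
        return CheckResult.FAILED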

2. CI Pipeline Gate (GitHub Actions)

# .github/workflows/docker-config-lint.yml
name: docker-config-lint
on: [pull_request]
jobs:
  lint-docker-config:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate daemon.json MTU
        run: |
          MTU=$(jq '.mtu // 1500' infra/daemon.json)
          if [ "$MTU" -gt 1450 ]; then
            echo "ERROR: daemon.json MTU ($MTU) exceeds safe threshold for overlay networks. Set to <= 1450."
            exit 1
          fi

3. Runtime Monitoring — Alert on MTU Mismatch

# Add to your node bootstrap / cloud-init script
HOST_MTU=$(ip link show eth0 | grep -oP 'mtu \K[0-9]+')
DOCKER_MTU=$(docker network inspect bridge --format '{{json .Options}}' | jq -r '.["com.docker.network.driver.mtu"] // "1500"')

# Equal MTUs are fine; only a bridge MTU larger than the host MTU drops packets
if [ "$DOCKER_MTU" -gt "$HOST_MTU" ]; then
  echo "WARN: docker0 MTU ($DOCKER_MTU) > host MTU ($HOST_MTU). Packet drops likely."
  # Send to PagerDuty / Datadog / Slack webhook here
fi
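
If your monitoring agent is written in Python rather than shell, the same comparison can be made against the JSON that docker network inspect bridge prints. This is a rough sketch under a few assumptions: bridge_mtu and mtu_mismatch are hypothetical names, the input is the raw JSON array from the inspect command, and a missing option is treated as Docker's 1500 default.

```python
import json

MTU_OPTION = "com.docker.network.driver.mtu"


def bridge_mtu(inspect_json: str, default: int = 1500) -> int:
    """Extract the bridge MTU from `docker network inspect <name>` output."""
    networks = json.loads(inspect_json)   # inspect prints a JSON array
    options = networks[0].get("Options") or {}
    return int(options.get(MTU_OPTION, default))


def mtu_mismatch(docker_mtu: int, host_mtu: int, encap_overhead: int = 0) -> bool:
    """True when container packets can exceed what the host path carries."""
    return docker_mtu > host_mtu - encap_overhead
```

The encap_overhead argument lets the alert account for a tunnel: with a 1500-byte host NIC and 50 bytes of VXLAN overhead, a 1500-byte docker0 is flagged while 1450 is not.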

4. Terraform — Enforce MTU in User Data

# terraform/modules/ec2-docker-host/main.tf
 resource "aws_instance" "docker_host" {
   user_data = <<-EOF
     #!/bin/bash
     cat > /etc/docker/daemon.json <<'DOCKERCFG'
     {
-      "log-driver": "json-file"
+      "log-driver": "json-file",
+      "mtu": 1450
     }
     DOCKERCFG
     systemctl restart docker
   EOF
 }

Enforce this with a Sentinel or OPA policy in your Terraform Cloud workspace to reject any daemon.json template missing the mtu key when the instance is tagged overlay: true.
