How to Fix Docker MTU Mismatch Causing Packet Drop on the docker0 Bridge
Threat/Impact Level: HIGH | Exploitability/Downtime Risk: HIGH | Time to Fix: 10 mins
TL;DR
- What broke: Docker's docker0 bridge defaults to MTU 1500. If your host NIC or overlay network (VPN, VXLAN, AWS VPC with encapsulation) has a smaller effective MTU (e.g., 1450 or 1420), packets exceeding that size are silently dropped, because many cloud environments never return ICMP fragmentation-needed.
- How to fix it: Set "mtu": <correct_value> in /etc/docker/daemon.json to match or stay below your host interface MTU, then restart Docker.
- Shortcut: Use our Client-Side Sandbox below to auto-refactor your daemon.json or docker-compose.yml network block with the correct MTU without pasting secrets into a third-party AI.
The Incident (What Does the Error Mean?)
You won't get a clean error. This is what makes it brutal. Symptoms look like:
# curl from inside container to external service hangs on large payloads
curl: (18) transfer closed with outstanding read data remaining
# ping works, but TCP sessions stall
ping 8.8.8.8 # succeeds
curl https://api.example.com # hangs at TLS handshake or after headers
# tcpdump on docker0 shows retransmits of full-size segments
tcpdump -i docker0 -n 'tcp and greater 1400'
# ... the same large data segments retransmitted repeatedly, while small packets (SYN, ACK) get through
Immediate consequence: Any TCP session that sends segments larger than the underlay MTU minus encapsulation overhead will stall or fail. This kills HTTPS API calls, database connections, and inter-service gRPC streams silently and intermittently, because only packets near full size are affected. On-call engineers waste hours chasing application bugs that are actually a 50-byte MTU delta.
The Attack Vector / Blast Radius
This is a cascading infrastructure failure, not a one-container problem.
Why it's invisible: In many cloud environments (AWS, GCP, Azure), ICMP Type 3 Code 4 ("Fragmentation Needed") messages never reach the sender, typically because a security group, network ACL, or intermediate firewall filters them. Path MTU Discovery (PMTUD) then breaks: oversized packets are simply dropped, TCP keeps retransmitting the same segment until it times out, and your application sees a hung connection.
Blast radius by environment:
| Environment | Host MTU | Encap Overhead | Safe Container MTU |
|---|---|---|---|
| Bare metal / LAN | 1500 | 0 | 1500 |
| AWS VPC (jumbo frames, no overlay) | 9001 | 0 | 1500 (safe default) |
| AWS VPC + VXLAN (EKS, Flannel) | 9001 | 50 bytes | 1450 |
| WireGuard VPN tunnel | 1500 | 80 bytes | 1420 |
| GCP default | 1460 | 0 | 1460 |
| OpenStack VXLAN | 1450 | 50 bytes | 1400 |
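The last column is mostly subtraction. A minimal Python sketch of that arithmetic, assuming the container MTU should be the host MTU minus the encapsulation overhead and that TCP over IPv4 without options costs 40 bytes of headers. The function names are illustrative and not part of any Docker tooling:

```python
IPV4_TCP_HEADERS = 40  # 20-byte IPv4 header + 20-byte TCP header, no options


def safe_container_mtu(host_mtu: int, encap_overhead: int = 0) -> int:
    """Largest container MTU that still fits the host link after encapsulation."""
    if encap_overhead < 0 or host_mtu <= encap_overhead:
        raise ValueError("overhead must be non-negative and smaller than the host MTU")
    return host_mtu - encap_overhead


def tcp_mss(mtu: int) -> int:
    """Maximum TCP payload per segment for a given MTU (IPv4, no options)."""
    return mtu - IPV4_TCP_HEADERS


if __name__ == "__main__":
    # VXLAN adds 50 bytes on a standard 1500-byte link.
    print(safe_container_mtu(1500, 50))  # 1450
    # A further 50-byte overlay on a tenant network that is already at 1450.
    print(safe_container_mtu(1450, 50))  # 1400
    print(tcp_mss(1450))                 # 1410
```

The AWS rows choose a value lower than this subtraction would give; the sketch only covers the subtraction itself.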
Cascading failure path:
- Container MTU 1500 > tunnel MTU 1450
- Large packet hits docker0, exits host NIC, gets dropped at underlay router
- No ICMP fragmentation-needed returned (cloud firewall blocks it)
- TCP sender never learns the path MTU, keeps sending oversized segments
- Connection hangs → application timeout → retry storm → upstream service rate-limits your IPs → full service degradation
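The path above also explains why small packets succeed while large ones vanish. A toy model in Python, assuming every packet carries the Don't Fragment bit and the drop happens at a single hop; `deliver` and its parameters are hypothetical names used only for illustration:

```python
from enum import Enum


class Outcome(Enum):
    DELIVERED = "delivered"
    DROPPED_WITH_ICMP = "dropped, sender told to shrink packets"
    BLACKHOLED = "dropped silently"


def deliver(packet_size: int, path_mtu: int, icmp_filtered: bool) -> Outcome:
    """Model one hop that cannot fragment because the DF bit is set."""
    if packet_size <= path_mtu:
        return Outcome.DELIVERED
    # Oversized packet: the hop drops it and tries to send ICMP Type 3 Code 4.
    # If that ICMP message is filtered, the sender never learns the path MTU.
    return Outcome.BLACKHOLED if icmp_filtered else Outcome.DROPPED_WITH_ICMP


if __name__ == "__main__":
    path_mtu = 1450
    # Default ping: 56-byte payload + 8-byte ICMP header + 20-byte IPv4 header.
    print(deliver(84, path_mtu, icmp_filtered=True).value)    # delivered
    # Full-size packet from a container whose MTU is 1500.
    print(deliver(1500, path_mtu, icmp_filtered=True).value)  # dropped silently
```

With ICMP unfiltered, the same oversized packet would trigger a fragmentation-needed reply and the sender could recover on its own.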
How to Fix It (The Solution)
Step 1: Identify Your Actual Host MTU
# Find the host's primary outbound interface MTU
ip link show eth0
# or
ip route get 8.8.8.8 | grep -oP 'dev \K\S+' | xargs ip link show
# Example output:
# 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 ...
# If you're on a VPN or VXLAN overlay, subtract 50-80 bytes from that value.
# Confirm docker0 current MTU
ip link show docker0
# 3: docker0: <...> mtu 1500 ...
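If you want to script this comparison instead of reading the output by eye, the MTU can be pulled out of the `ip link show` text. A small Python sketch, assuming the output format shown above; `parse_mtu` is a hypothetical helper name:

```python
import re
from typing import Optional

_MTU_RE = re.compile(r"\bmtu (\d+)\b")


def parse_mtu(ip_link_output: str) -> Optional[int]:
    """Return the first MTU found in `ip link show <iface>` output, or None."""
    match = _MTU_RE.search(ip_link_output)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    host = parse_mtu("2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 ...")
    bridge = parse_mtu("3: docker0: <...> mtu 1500 ...")
    print(host, bridge)  # 9001 1500
```

On a real host the input would come from something like `subprocess.run(["ip", "link", "show", "eth0"], capture_output=True, text=True).stdout`.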
Basic Fix — /etc/docker/daemon.json
# /etc/docker/daemon.json
{
- "log-driver": "json-file"
+ "log-driver": "json-file",
+ "mtu": 1450
}
# Apply it
sudo systemctl restart docker
# Verify
ip link show docker0
# Should now show mtu 1450
⚠️ Restarting Docker stops all running containers. Schedule this during a maintenance window or use a rolling restart strategy on orchestrated workloads.
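On hosts that already carry a populated daemon.json, a hand edit risks dropping a comma or an existing key. One way to apply the change is a merge that preserves everything else. This is a sketch in Python; `with_mtu` is a hypothetical helper, and the 68 to 65535 bounds are only a sanity check based on IPv4 limits:

```python
import json


def with_mtu(daemon_json_text: str, mtu: int) -> str:
    """Return daemon.json content with "mtu" set, keeping all other keys."""
    if not 68 <= mtu <= 65535:
        raise ValueError(f"implausible MTU: {mtu}")
    # Treat an empty or missing file as an empty configuration object.
    config = json.loads(daemon_json_text) if daemon_json_text.strip() else {}
    if not isinstance(config, dict):
        raise ValueError("daemon.json must contain a JSON object")
    config["mtu"] = mtu
    return json.dumps(config, indent=2) + "\n"


if __name__ == "__main__":
    print(with_mtu('{"log-driver": "json-file"}', 1450))
```

The result still has to be written back to /etc/docker/daemon.json and followed by the Docker restart described above.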
Enterprise Best Practice — docker-compose per-network MTU
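The daemon-level mtu setting applies to the default bridge (docker0); user-defined networks, including the ones Compose creates, take their MTU from the network definition. Setting it per network also suits teams that cannot touch the daemon globally: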
# docker-compose.yml
version: '3.8'
services:
  api:
    image: my-api:latest
    networks:
-      - default
+      - internal
+networks:
+  internal:
+    driver: bridge
+    driver_opts:
+      com.docker.network.driver.mtu: '1450'
Enterprise Best Practice — Kubernetes / CNI (Flannel, Calico, Weave)
If this is surfacing in a Kubernetes cluster using Docker as the runtime:
# Flannel ConfigMap (kube-flannel-cfg)
data:
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan",
+        "MTU": 1450
      }
    }
For Calico, set FELIX_IPINIPMTU and FELIX_VXLANMTU environment variables on the calico-node DaemonSet, or use the FelixConfiguration CRD:
apiVersion: projectcalico.org/v3
kind: FelixConfiguration
metadata:
  name: default
spec:
+  ipipMTU: 1450
+  vxlanMTU: 1450
Prevention in CI/CD
The goal: Catch MTU misconfiguration before it reaches production.
1. Checkov — Scan daemon.json in IaC pipelines
Checkov doesn't have a built-in Docker MTU rule, so write a custom check:
# checkov/custom_checks/docker_mtu.py
from checkov.common.models.enums import CheckResult, CheckCategories
from checkov.json_doc.base_json_check import BaseJsonCheck


class DockerMTUCheck(BaseJsonCheck):
    def __init__(self):
        name = "Ensure Docker daemon MTU is at most 1450 for overlay environments"
        id = "CKV_DOCKER_MTU_001"
        # Verify these arguments against the BaseJsonCheck signature
        # in the Checkov version you have installed.
        super().__init__(name=name, id=id, categories=[CheckCategories.NETWORKING], supported_entities=['*'])

    def get_resource_id(self, conf):
        return "daemon.json"

    def scan_resource_conf(self, conf):
        # A missing key means Docker falls back to its 1500 default.
        mtu = conf.get("mtu", 1500)
        if isinstance(mtu, int) and mtu <= 1450:
            return CheckResult.PASSED
        return CheckResult.FAILED


# Checkov registers a custom check when the module instantiates it.
check = DockerMTUCheck()
2. CI Pipeline Gate (GitHub Actions)
# .github/workflows/docker-config-lint.yml
on: [pull_request]
jobs:
  lint-docker-config:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate daemon.json MTU
        run: |
          # A missing mtu key means Docker uses its 1500 default
          MTU=$(jq '.mtu // 1500' infra/daemon.json)
          if [ "$MTU" -gt 1450 ]; then
            echo "ERROR: daemon.json MTU ($MTU) exceeds safe threshold for overlay networks. Set to <= 1450."
            exit 1
          fi
3. Runtime Monitoring — Alert on MTU Mismatch
# Add to your node bootstrap / cloud-init script
HOST_MTU=$(ip link show eth0 | grep -oP 'mtu \K[0-9]+')
# Index into the Options object with .["key"]; fall back to Docker's default of 1500
DOCKER_MTU=$(docker network inspect bridge --format '{{json .Options}}' | jq -r '.["com.docker.network.driver.mtu"] // "1500"')
# Equal MTUs are fine; only a bridge MTU larger than the host MTU causes drops
if [ "$DOCKER_MTU" -gt "$HOST_MTU" ]; then
  echo "WARN: docker0 MTU ($DOCKER_MTU) > host MTU ($HOST_MTU). Packet drops likely."
  # Send to PagerDuty / Datadog / Slack webhook here
fi
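The same check can be expressed in Python for agents that already run a Python collector. This is a sketch that assumes the JSON array printed by `docker network inspect bridge` (without `--format`) and flags a mismatch only when the bridge MTU is strictly larger than the host MTU, since equal values match. The function names are hypothetical:

```python
import json

MTU_OPTION = "com.docker.network.driver.mtu"


def bridge_mtu(inspect_output: str, default: int = 1500) -> int:
    """Extract the MTU option from `docker network inspect <name>` JSON output."""
    networks = json.loads(inspect_output)
    # Options may be missing or null when no driver options were set.
    options = (networks[0].get("Options") or {}) if networks else {}
    return int(options.get(MTU_OPTION, default))


def mtu_mismatch(docker_mtu: int, host_mtu: int) -> bool:
    """True when containers can emit packets larger than the host interface carries."""
    return docker_mtu > host_mtu


if __name__ == "__main__":
    sample = '[{"Name": "bridge", "Options": {"com.docker.network.driver.mtu": "1500"}}]'
    docker_mtu = bridge_mtu(sample)
    print(docker_mtu, mtu_mismatch(docker_mtu, host_mtu=1450))  # 1500 True
```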
4. Terraform — Enforce MTU in User Data
# terraform/modules/ec2-docker-host/main.tf
resource "aws_instance" "docker_host" {
  user_data = <<-EOF
    #!/bin/bash
    cat > /etc/docker/daemon.json <<'DOCKERCFG'
    {
-      "log-driver": "json-file"
+      "log-driver": "json-file",
+      "mtu": 1450
    }
    DOCKERCFG
    systemctl restart docker
  EOF
}
Enforce this with a Sentinel or OPA policy in your Terraform Cloud workspace to reject any daemon.json template missing the mtu key when the instance is tagged overlay: true.
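The rule itself is small. Here is a sketch of the decision logic, written in Python rather than Sentinel or Rego, assuming the policy input has already been reduced to the instance tags and the parsed daemon.json. `violates_policy` and its parameters are hypothetical and would need translating into your policy language:

```python
from typing import Any, Mapping


def violates_policy(tags: Mapping[str, str], daemon_config: Mapping[str, Any]) -> bool:
    """Reject overlay-tagged instances whose daemon.json has no integer "mtu" key."""
    if tags.get("overlay", "").lower() != "true":
        return False  # the rule only applies to overlay hosts
    mtu = daemon_config.get("mtu")
    # bool is a subclass of int in Python, so exclude it explicitly.
    return not (isinstance(mtu, int) and not isinstance(mtu, bool))


if __name__ == "__main__":
    print(violates_policy({"overlay": "true"}, {"log-driver": "json-file"}))  # True
    print(violates_policy({"overlay": "true"}, {"mtu": 1450}))                # False
```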