Why does Docker keep picking subnets that conflict with my VPN?

Docker's IPAM allocator scans its configured address pool (by default 172.17.0.0/16 through 172.31.0.0/16, then 192.168.0.0/16 carved into /20 blocks). When it auto-allocates a subnet it skips blocks that overlap existing Docker networks or routes present on the host at that moment, but it never re-checks afterward. When your VPN client adds a route for 172.16.0.0/12 after a network has already been created, Docker has no awareness of it. Fix: set `default-address-pools` in `/etc/docker/daemon.json` to a CIDR range outside your VPN's tunnel space, and restart the daemon.

How do I find which existing Docker network is consuming the conflicting subnet?

Run: `docker network ls --format '{{.Name}}' | xargs -I{} docker network inspect {} --format '{{.Name}}: {{range .IPAM.Config}}{{.Subnet}}{{end}}'`. This dumps every network's assigned CIDR in one shot. Cross-reference with `ip route show` to find the exact collision. Then either remove the stale Docker network with `docker network rm NETWORK_NAME` or assign your new network a different subnet.

Is it safe to use `docker network prune` in production?

Yes, with one caveat: `docker network prune` removes only networks that no container references, running or stopped. A network with a stopped (but not removed) container attached is therefore retained. To reclaim those too, run `docker container prune -f` first to remove stopped containers, then `docker network prune -f`. In production, verify with `docker network ls` before and after.

Fixing Docker 'failed to allocate gateway' Subnet Overlap Errors in Custom Bridge Networks

Threat/Impact Level: HIGH | Downtime Risk: HIGH | Time to Fix: 5–15 mins

TL;DR

What broke: docker network create failed because the requested (or auto-picked) subnet overlaps an existing Docker network, a host interface route, or a VPN tunnel CIDR already present in the kernel routing table.
How to fix it: Explicitly assign a non-conflicting --subnet and --gateway from an unused RFC 1918 block, or prune stale Docker networks consuming the pool.

The Incident (What Does the Error Mean?)

Raw error output you'll see in the Docker daemon log or CLI:

Error response from daemon: Failed to Setup IP tables: Unable to enable NAT rule:  (iptables failed: iptables --wait -t nat -I DOCKER 0 ...
# or more commonly:
Error response from daemon: could not find an available, non-overlapping IPv4 address pool among the defaults to assign to the network
# or on explicit subnet:
Error response from daemon: failed to allocate gateway (172.18.0.1): Address already in use

Immediate consequence: The network is never created. Any docker-compose up or docker run --network referencing this network hard-fails. In a Compose stack, this kills every dependent service simultaneously — a full stack outage, not a partial one.

The Docker libnetwork IPAM driver walks a default pool (172.17.0.0/16 → 172.18.0.0/16 → ... → 172.31.0.0/16, then 192.168.0.0/20 blocks). If every block in that pool is either assigned to an existing Docker network or conflicts with a host route (VPN, corporate LAN, cloud VPC interface), allocation fails entirely.
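The pool walk described above can be approximated in a few lines. This is a minimal sketch, not Docker's actual allocator: `default_pool` and `first_free_subnet` are hypothetical names, the pool is rebuilt from the ranges quoted in this article rather than read from the daemon, and `in_use` stands in for both existing Docker subnets and host routes.

```python
import ipaddress


def default_pool():
    """Approximate the default pool quoted in the text (an assumption, not read from Docker)."""
    pool = [ipaddress.ip_network(f"172.{n}.0.0/16") for n in range(17, 32)]
    pool += list(ipaddress.ip_network("192.168.0.0/16").subnets(new_prefix=20))
    return pool


def first_free_subnet(pool, in_use):
    """Return the first pool block that overlaps nothing in `in_use`, or None if exhausted."""
    taken = [ipaddress.ip_network(cidr) for cidr in in_use]
    for candidate in pool:
        if not any(candidate.overlaps(t) for t in taken):
            return candidate
    return None


# One Docker network already exists, so the walk moves on to the next block.
print(first_free_subnet(default_pool(), ["172.17.0.0/16"]))
# A VPN route covering 172.16.0.0/12 pushes allocation into the 192.168 blocks.
print(first_free_subnet(default_pool(), ["172.16.0.0/12"]))
# Everything is covered: None here corresponds to the "could not find an available" error.
print(first_free_subnet(default_pool(), ["172.16.0.0/12", "192.168.0.0/16"]))
```

The last call shows why a single wide VPN or VPC route can exhaust the pool even when only a handful of Docker networks exist.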

The Attack Vector / Blast Radius

This is a networking misconfiguration with a wide blast radius in multi-tenant and CI/CD environments:

CI runner exhaustion: Ephemeral CI pipelines (GitLab Runner, GitHub Actions self-hosted) create and destroy Docker networks rapidly. Stale networks from crashed jobs are never pruned. After enough runs, the entire default IPAM pool is consumed. The next pipeline run fails at network creation — not at the app layer, making the error non-obvious.
VPN/corporate LAN collision: WireGuard and OpenVPN tunnels commonly use 10.0.0.0/8 or 172.16.0.0/12, and the latter is exactly the space Docker's default bridge pool draws from. When a developer connects to the VPN, Docker's gateway IP becomes unreachable or conflicts with the tunnel interface. Containers lose external connectivity silently or network creation fails outright.
Cloud VPC overlap: On EC2/GCE instances inside a 172.16.0.0/12 VPC, Docker's default pool collides with the VPC CIDR. Traffic for addresses in the overlapping range follows whichever route is more specific, so packets meant for VPC peers can end up on the local bridge, and container-bound packets can end up at the VPC router, which drops them. The result is intermittent, maddening connectivity failures.
Cascading Compose failures: A single failed network in a docker-compose.yml with depends_on chains kills the entire stack. Healthchecks never pass, orchestrators mark the deployment failed, and rollback triggers — for what is ultimately a one-line CIDR fix.

How to Fix It (The Solution)

Diagnose First

# See all Docker networks and their subnets
docker network ls --format '{{.Name}}' | xargs -I{} docker network inspect {} --format '{{.Name}}: {{range .IPAM.Config}}{{.Subnet}}{{end}}'

# See host routing table for conflicts
ip route show

# Nuclear option: list every IPv4 address assigned to a host interface
ip addr show | grep 'inet '
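Cross-referencing the two outputs by eye gets tedious once there are more than a few networks. The sketch below automates the comparison under some assumptions: `parse_routes` and `find_collisions` are hypothetical helpers, the route text is pasted in from `ip route show`, the name-to-subnet mapping is typed in from the `docker network inspect` one-liner, and Docker-managed bridges are assumed to be named `docker0` or `br-...` so that a network is not reported as colliding with its own route.

```python
import ipaddress


def parse_routes(ip_route_output):
    """Return (network, device) pairs from `ip route show` text, skipping the default route."""
    routes = []
    for line in ip_route_output.splitlines():
        fields = line.split()
        if not fields or fields[0] == "default":
            continue
        try:
            # A destination without a prefix length (a host route) parses as /32.
            network = ipaddress.ip_network(fields[0], strict=False)
        except ValueError:
            continue  # route-type keywords such as "blackhole" are skipped
        device = fields[fields.index("dev") + 1] if "dev" in fields[:-1] else ""
        routes.append((network, device))
    return routes


def find_collisions(docker_subnets, ip_route_output):
    """Map each Docker network name to the non-Docker host routes its subnet overlaps."""
    routes = parse_routes(ip_route_output)
    collisions = {}
    for name, cidr in docker_subnets.items():
        subnet = ipaddress.ip_network(cidr)
        hits = [
            str(network)
            for network, device in routes
            # Assumption: Docker-managed bridges are named docker0 or br-<id>.
            if not device.startswith(("docker", "br-")) and subnet.overlaps(network)
        ]
        if hits:
            collisions[name] = hits
    return collisions


# Illustrative input, not captured from a real host.
SAMPLE_ROUTES = """\
default via 10.0.0.1 dev eth0
10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.5
172.16.0.0/12 dev tun0 scope link
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
172.18.0.0/16 dev br-1a2b3c proto kernel scope link src 172.18.0.1
"""
SAMPLE_NETWORKS = {
    "bridge": "172.17.0.0/16",
    "my_app_net": "172.18.0.0/16",
    "safe_net": "192.168.200.0/24",
}
print(find_collisions(SAMPLE_NETWORKS, SAMPLE_ROUTES))
```

In the sample, both 172.x networks collide with the tunnel route on `tun0`, while `safe_net` is reported clean.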

Basic Fix — Explicit Non-Conflicting Subnet

Stop relying on Docker's auto-allocation. Always declare your subnet explicitly.

- docker network create my_app_net
+ docker network create \
+   --driver bridge \
+   --subnet 192.168.200.0/24 \
+   --gateway 192.168.200.1 \
+   my_app_net

For docker-compose.yml:

 networks:
   my_app_net:
     driver: bridge
+    ipam:
+      config:
+        - subnet: 192.168.200.0/24
+          gateway: 192.168.200.1

Prune Stale Networks (If Pool Is Exhausted)

# Remove all networks not used by at least one container
docker network prune -f

# Verify the pool freed up
docker network ls

Enterprise Best Practice — Restrict Docker's Default Address Pool in `daemon.json`

This is the real fix. Force Docker to only draw from a CIDR range you own and have confirmed is unused across your VPC, VPN, and host interfaces.

# /etc/docker/daemon.json
 {
-  "log-driver": "json-file"
+  "log-driver": "json-file",
+  "default-address-pools": [
+    {"base": "192.168.128.0/17", "size": 24}
+  ]
 }

Then restart the daemon:

sudo systemctl restart docker

What this does: Docker now only auto-allocates /24 subnets from 192.168.128.0 through 192.168.255.0 — 128 possible networks, all in a range you've explicitly carved out and verified doesn't conflict with your VPN (172.x) or VPC (10.x). Every docker network create without an explicit subnet draws from this pool only.
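The arithmetic is easy to check with Python's standard `ipaddress` module. This is only a sketch of the carving rule; `carve_pool` is a hypothetical helper, and the order in which Docker actually hands out the blocks is not modeled.

```python
import ipaddress


def carve_pool(base, size):
    """List the subnets of prefix length `size` contained in `base`."""
    return list(ipaddress.ip_network(base).subnets(new_prefix=size))


subnets = carve_pool("192.168.128.0/17", 24)
print(len(subnets), subnets[0], subnets[-1])
# prints: 128 192.168.128.0/24 192.168.255.0/24
```

Running the same function against your VPN and VPC CIDRs before committing the pool to daemon.json is a cheap way to confirm that the range really is unused.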

Prevention in CI/CD

1. Automated Network Pruning in Pipeline Teardown

Add this to every CI job's after_script / post step:

# .gitlab-ci.yml
after_script:
  - docker compose down --volumes --remove-orphans
  - docker network prune -f

2. OPA/Conftest Policy: Enforce Explicit Subnets in Compose

Checkov doesn't natively lint Compose IPAM, so write a custom OPA/Conftest policy:

# policy/docker_network_subnet.rego
package docker.network

deny[msg] {
  net := input.networks[name]
  not net.ipam.config
  msg := sprintf("Network '%v' has no explicit IPAM subnet. Auto-allocation risks VPC/VPN overlap.", [name])
}

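Run in CI. Conftest evaluates only the main namespace by default, so pass the policy's package name explicitly:

conftest test docker-compose.yml --policy policy/ --namespace docker.network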
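To sanity-check the rule's intent without installing Conftest, the same test can be mirrored in plain Python. This is a rough equivalent rather than a translation of the Rego: `networks_without_ipam` is a hypothetical helper, it expects the Compose file already parsed into a dict, and it is slightly stricter than the policy because it also flags an empty `config` list.

```python
def networks_without_ipam(compose):
    """Return names of top-level networks that declare no `ipam.config` entries."""
    missing = []
    for name, net in (compose.get("networks") or {}).items():
        # A network declared with no body parses as None, so treat it as empty.
        ipam = (net or {}).get("ipam") or {}
        if not ipam.get("config"):
            missing.append(name)
    return missing


# Hypothetical parsed Compose content for illustration.
compose = {
    "networks": {
        "auto_net": {"driver": "bridge"},
        "pinned_net": {"ipam": {"config": [{"subnet": "192.168.200.0/24"}]}},
        "bare_net": None,
    }
}
print(networks_without_ipam(compose))
```

Only `pinned_net` passes; the other two would rely on auto-allocation.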

3. Terraform — Pin `docker_network` Resource Subnets

 resource "docker_network" "app" {
   name = "my_app_net"
+  ipam_config {
+    subnet  = "192.168.200.0/24"
+    gateway = "192.168.200.1"
+  }
 }

Neither checkov -d . nor tfsec . is known to ship a built-in rule for docker_network resources missing ipam_config, so do not rely on them to catch this; enforce it with a custom policy or in code review.

4. Pre-flight Check Script (Embed in Makefile or Entrypoint)

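#!/usr/bin/env bash
# check-subnet.sh: run before docker network create
set -eu
SUBNET="192.168.200.0/24"
# Fixed-string match on the network address followed by "/". This catches a
# route to the same network address only; it does not detect a wider route
# (e.g. 192.168.0.0/16) that contains the subnet.
routes="$(ip route show)"
if grep -qF -- "${SUBNET%/*}/" <<<"$routes"; then
  echo "ERROR: Subnet $SUBNET conflicts with host route. Aborting." >&2
  exit 1
fi
echo "Subnet $SUBNET is clean. Proceeding."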
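A string match on the network address cannot see a route that merely contains the subnet, or one that sits inside it. Where Python is available on the host, a true CIDR overlap test covers those cases. The sketch below assumes the route destinations have already been extracted from `ip route show`; `conflicting_routes` is a hypothetical name.

```python
import ipaddress


def conflicting_routes(candidate, route_destinations):
    """Return the route destinations whose address range intersects `candidate`."""
    subnet = ipaddress.ip_network(candidate)
    return [
        dest
        for dest in route_destinations
        if subnet.overlaps(ipaddress.ip_network(dest, strict=False))
    ]


# A wider route and a narrower route both conflict, yet neither contains the
# literal string "192.168.200.0/" that a text match would look for.
routes = ["192.168.0.0/16", "192.168.200.128/25", "10.0.0.0/8"]
print(conflicting_routes("192.168.200.0/24", routes))
```

An empty result means no listed route intersects the candidate, which is the condition the pre-flight check is really after.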

Bottom line: A restricted default-address-pools in daemon.json plus network pruning in CI teardown eliminates most of these failures. The remainder come from VPN clients added after deployment; handle those with the pre-flight check.