Fixing Nginx 'connect() failed (111: Connection refused)' in Docker Host Network Mode
Threat/Impact Level: HIGH | Downtime Risk: HIGH | Time to Fix: 5–15 mins
TL;DR
- What broke: Nginx cannot reach the upstream service because nothing is listening at the address it dials. The target process is down, on the wrong port, or bound to 127.0.0.1 inside its own container namespace. In host network mode, the usual Docker DNS tricks don't apply.
- How to fix it: Ensure the upstream service listens on an address Nginx can reach (0.0.0.0 or the correct host IP), verify the port matches, and confirm --network host is actually applied to the Nginx container.
- Use our Client-Side Sandbox below to paste your nginx.conf and docker-compose.yml and auto-generate the refactored config.
The Incident (What Does the Error Mean?)
Raw error from /var/log/nginx/error.log:
2024/01/15 03:42:17 [error] 29#29: *1 connect() failed (111: Connection refused)
while connecting to upstream, client: 10.0.0.5, server: _, request:
"GET /api/health HTTP/1.1", upstream: "http://127.0.0.1:8080/api/health",
host: "prod-gateway.internal"
errno 111 = ECONNREFUSED. The TCP SYN hit the host network stack and got a RST back. Nothing is listening on that socket. Nginx tried, the kernel rejected it. Every request hitting this upstream returns 502 Bad Gateway to your users — right now, in production.
In host network mode, the Nginx container shares the host's network namespace directly. There are no virtual bridges and no container IPs. localhost inside Nginx is the host's loopback. So if your upstream app is also on the host network and bound to 127.0.0.1:8080, this should work, unless the app isn't running, is listening on a different port, or crashed silently.
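To see which address Nginx actually dialed, it can help to pull the upstream field out of the log entry before checking listeners. The helper below is a minimal sketch, not an Nginx feature: extract_upstream is a hypothetical name, and it assumes the error-log format shown above.

```shell
# Print the host:port of the upstream named in an Nginx error-log entry.
# Reads the entry on stdin (wrapped lines are joined first) and prints
# nothing if no upstream field is present.
extract_upstream() {
  tr '\n' ' ' | sed -n 's|.*upstream: "[a-z]*://\([^/"]*\).*|\1|p'
}
```

One possible use is `tail -n 1 /var/log/nginx/error.log | extract_upstream`, which for the entry above would yield 127.0.0.1:8080, the address to look for on the host.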
The Attack Vector / Blast Radius
This isn't a security exploit — it's a cascading availability failure:
- Upstream process died or never started. Nginx came up, upstream didn't. No readiness gate caught it.
- Upstream bound to wrong interface or port. The app server was configured for 127.0.0.1 inside its own container namespace, then switched to host network without revisiting its bind settings. The loopback now exists on the host, but any ports: mapping is ignored in host mode, so the app listens on its internal port rather than the one Nginx dials.
- Port conflict on the host. Another process grabbed 8080 first. Your app failed to bind and exited non-zero, and your restart: always policy has it in a crash loop you haven't noticed.
- --network host missing from the Nginx container. You applied it to the upstream but not to Nginx. Nginx is still on the bridge network, trying to reach 127.0.0.1 in its own isolated namespace, which has nothing on port 8080.
Blast radius: 100% of traffic to this upstream returns 502. If Nginx has no fallback (backup server, error_page redirect), your entire service is down.
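The last cause in the list above is easy to rule out by asking Docker which network mode each container really got. This is a rough sketch that assumes the containers are named nginx and app (placeholders, substitute your own). uses_host_network and report_network_modes are hypothetical helper names. The docker inspect format string reads the container's HostConfig.NetworkMode field.

```shell
# Succeeds only if the named container runs with host networking.
# Returns 2 if the container cannot be inspected at all.
uses_host_network() {
  mode=$(docker inspect -f '{{.HostConfig.NetworkMode}}' "$1" 2>/dev/null) || return 2
  [ "$mode" = "host" ]
}

# Print one line per container and return non-zero if any is not on host.
report_network_modes() {
  rc=0
  for name in "$@"; do
    if uses_host_network "$name"; then
      echo "$name: host"
    else
      echo "$name: NOT host"
      rc=1
    fi
  done
  return "$rc"
}
```

Running `report_network_modes nginx app` after a deploy flags the case where only one side was moved to the host network.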
How to Fix It
Step 1: Confirm What's Actually Listening
# On the host — not inside a container
ss -tlnp | grep 8080
# or
netstat -tlnp | grep 8080
If that returns nothing, your upstream process is not running. Fix that first.
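If the grep does return lines, the bind address matters as much as the port. The sketch below reads `ss -tln` output on stdin and prints only the local addresses listening on a given port. listeners_for_port is a hypothetical helper, and it assumes the column layout that `ss -tln` prints (state first, local address:port fourth).

```shell
# Print the local addresses listening on PORT, one per line, from
# `ss -tln` output on stdin. Exits non-zero if nothing is listening.
listeners_for_port() {
  awk -v port="$1" '
    $1 == "LISTEN" {
      n = split($4, parts, ":")            # the last piece is the port
      if (parts[n] == port) {
        # strip ":<port>" to leave the bind address
        print substr($4, 1, length($4) - length(port) - 1)
        found = 1
      }
    }
    END { exit (found ? 0 : 1) }'
}
```

For example, `ss -tln | listeners_for_port 8080 || echo "nothing on 8080"`. An address of 127.0.0.1, 0.0.0.0, or * is reachable from Nginx on the host loopback. Any other single IP is not reachable via 127.0.0.1.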
Basic Fix — nginx.conf upstream address
In host network mode, use 127.0.0.1 explicitly (not a container hostname — DNS doesn't apply here):
 upstream backend {
-    server app-container:8080;  # WRONG: container DNS doesn't resolve in host network
+    server 127.0.0.1:8080;      # CORRECT: host loopback, shared namespace
 }
Basic Fix — docker-compose.yml
 services:
   nginx:
     image: nginx:1.25-alpine
-    # network_mode not set (defaults to bridge)
+    network_mode: host
     volumes:
       - ./nginx.conf:/etc/nginx/nginx.conf:ro
   app:
     image: myapp:latest
-    # network_mode not set
+    network_mode: host
     environment:
-      - BIND_ADDR=127.0.0.1:8080  # Only reachable within its own namespace if on bridge
+      - BIND_ADDR=0.0.0.0:8080    # All host interfaces; 127.0.0.1:8080 also works in host mode if only Nginx connects
Enterprise Best Practice — Health-Gated Startup + Upstream Keepalive
Don't let Nginx start routing until the upstream is confirmed alive. Add a healthcheck and a depends_on condition, plus keepalive to avoid connection churn:
 services:
   nginx:
     image: nginx:1.25-alpine
     network_mode: host
+    depends_on:
+      app:
+        condition: service_healthy
     volumes:
       - ./nginx.conf:/etc/nginx/nginx.conf:ro
   app:
     image: myapp:latest
     network_mode: host
+    healthcheck:
+      test: ["CMD", "curl", "-sf", "http://127.0.0.1:8080/healthz"]
+      interval: 5s
+      timeout: 3s
+      retries: 5
+      start_period: 10s
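Outside Compose (a plain docker run or a custom deploy script), the same gate can be approximated by polling the health status Docker records for the container. This is a sketch under assumptions: wait_healthy is a hypothetical helper, the default attempt count and delay are arbitrary, and the container must define a healthcheck, otherwise the inspect template has no Health field to read.

```shell
# Poll a container's Docker health status until it reports "healthy".
# Usage: wait_healthy <container> [attempts] [delay_seconds]
wait_healthy() {
  name=$1
  attempts=${2:-10}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Health.Status is one of: starting, healthy, unhealthy
    state=$(docker inspect -f '{{.State.Health.Status}}' "$name" 2>/dev/null) || state=unknown
    if [ "$state" = "healthy" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "$name not healthy after $attempts checks (last state: $state)" >&2
  return 1
}
```

A deploy script could then run `wait_healthy app 20 3 && docker start nginx`, so Nginx only comes up once the upstream is answering.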
And in nginx.conf, add upstream keepalive and failure handling:
 upstream backend {
     server 127.0.0.1:8080;
+    keepalive 32;
 }
 server {
     location / {
         proxy_pass http://backend;
+        proxy_next_upstream error timeout http_502 http_503;
+        proxy_connect_timeout 2s;
+        proxy_read_timeout 10s;
+        proxy_http_version 1.1;
+        proxy_set_header Connection "";
     }
 }
Prevention in CI/CD
1. Lint nginx.conf in your pipeline:
# In your GitHub Actions / GitLab CI step
docker run --rm -v "$(pwd)/nginx.conf:/etc/nginx/nginx.conf:ro" nginx:1.25-alpine nginx -t
This catches syntax errors but not runtime upstream failures. Pair it with an integration test.
2. Integration smoke test after deploy:
# Post-deploy health gate — fail the pipeline if upstream is unreachable
curl --retry 5 --retry-delay 3 --retry-connrefused -sf http://127.0.0.1:80/healthz || exit 1
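With -f, curl reports "Nginx itself is down" and "Nginx is up but the upstream is refusing connections" as the same kind of failure. When the pipeline log should say which one happened, the status code can be classified explicitly. The sketch below is illustrative only: classify_status and probe are hypothetical names, and the mapping relies on curl printing 000 for %{http_code} when no HTTP response was received.

```shell
# Map a curl %{http_code} value to a short diagnosis.
classify_status() {
  case "$1" in
    000)     echo "nginx unreachable (no HTTP response)" ;;
    502|504) echo "nginx up, upstream failing" ;;
    2??|3??) echo "ok" ;;
    *)       echo "unexpected status $1" ;;
  esac
}

# Probe a URL once, print the diagnosis, and fail unless it is "ok".
probe() {
  code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$1") || true
  result=$(classify_status "$code")
  echo "$result"
  [ "$result" = "ok" ]
}
```

Calling `probe http://127.0.0.1:80/healthz || exit 1` keeps the pipeline gate and also leaves a line in the log that separates a dead proxy from a dead upstream.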
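3. Checkov for image healthchecks (CKV_DOCKER_2 is a Dockerfile check, so point it at the app's Dockerfile rather than the compose file):
checkov -f Dockerfile --check CKV_DOCKER_2 # HEALTHCHECK instruction defined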
4. OPA policy to enforce healthcheck on host-network services:
deny[msg] {
    service := input.services[name]
    service.network_mode == "host"
    not service.healthcheck
    msg := sprintf("Service '%v' uses host network but has no healthcheck defined", [name])
}
5. Use a process supervisor on the host for the upstream app (systemd, supervisord) so it restarts before Nginx gives up. Don't rely solely on Docker's restart: always, whose increasing backoff delays create 502 windows.