How to Fix Nginx 'connect() to unix:/var/run/docker.sock failed (13: Permission denied)' in Docker Socket Proxy

Threat/Impact Level: HIGH | Exploitability/Downtime Risk: HIGH | Time to Fix: 5–15 mins

TL;DR

  • What broke: The Nginx process (or its container) is running as a non-root user without membership in the docker group, so the kernel rejects the Unix socket connection with EACCES (errno 13).
  • How to fix it: Add the host's docker GID to the Nginx container's user, or, better, front the socket with Tecnativa/docker-socket-proxy and never expose the raw socket to Nginx at all.

The Incident (What does the error mean?)

Raw error from Nginx error log:

2024/01/15 03:42:11 [crit] 29#29: *1 connect() to unix:/var/run/docker.sock failed
(13: Permission denied) while connecting to upstream, client: 172.18.0.1,
server: _, request: "GET /containers/json HTTP/1.1", upstream:
"http://unix:/var/run/docker.sock:/containers/json", host: "localhost"

What is immediately broken:

  • Every upstream request Nginx proxies to the Docker API returns a 502.
  • Any dashboard, reverse-proxy autodiscovery, or container introspection feature (Traefik-style label reads, Portainer agents, custom health UIs) is completely dead.
  • If this path backs a liveness probe, the container restarts in a crash loop; if it backs a readiness probe, the pod is pulled out of rotation and stops receiving traffic.

Root cause in one line: /var/run/docker.sock is owned by root:docker with mode 660. The Nginx worker process is UID 101 (nginx user) and is not in the docker group, so the kernel denies the connect() syscall before a single byte is exchanged.
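
The ownership mismatch described above can be checked directly. The following is a minimal sketch, assuming GNU or BusyBox stat is available; gid_can_reach_socket is a hypothetical helper name, not part of Docker or Nginx, and it compares GIDs only (it does not inspect the mode bits).

```shell
#!/bin/sh
# Hypothetical helper: succeed if any of the given GIDs equals the GID
# of the group that owns the socket path.
# Usage: gid_can_reach_socket <socket-path> <gid>...
gid_can_reach_socket() {
  sock_path=$1
  shift
  # %g prints the numeric ID of the file's owning group.
  sock_gid=$(stat -c '%g' "$sock_path") || return 2
  for gid in "$@"; do
    if [ "$gid" = "$sock_gid" ]; then
      return 0
    fi
  done
  return 1
}
```

For example, gid_can_reach_socket /var/run/docker.sock $(docker exec -u nginx nginx id -G) should return non-zero in the failing setup, where the second "nginx" is an assumed container name.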


The Attack Vector / Blast Radius

This error is a symptom of a correct security boundary—but the wrong fix will create a critical vulnerability.

The naive fix developers reach for first:

chmod 777 /var/run/docker.sock   # DO NOT DO THIS

or

docker run --user root ...       # DO NOT DO THIS

Why that is catastrophic: The Docker socket is an unauthenticated root shell. Any process that can write to /var/run/docker.sock can:

  1. POST /containers/create with HostConfig.Binds: ["/:/host"] and HostConfig.Privileged: true.
  2. POST /containers/{id}/start
  3. POST /containers/{id}/exec, then POST /exec/{id}/start → execute arbitrary commands as root on the host.

This is a full container escape to host root in a handful of API calls. Treat unrestricted Docker socket exposure as Critical: the 9.8 CVSSv3 figure is an estimate for this class of exposure, not a score assigned to a specific CVE. If your Nginx container is internet-facing and you chmod 777 the socket, you have handed the host to the first person who finds an Nginx RCE or SSRF.
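
To catch a socket that has already been loosened this way, a host-side audit can look at the "other" permission bits. This is a sketch, assuming GNU or BusyBox stat; is_world_writable is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical audit helper: succeed if "other" users can write to the path,
# which is the state chmod 777 (or 666) leaves the socket in.
is_world_writable() {
  mode=$(stat -c '%a' "$1") || return 2   # octal mode, e.g. 660 or 777
  other=${mode#"${mode%?}"}               # last digit = permissions for "other"
  [ $((other & 2)) -ne 0 ]                # bit value 2 is the write bit
}
```

A cron job or CI step could then run is_world_writable /var/run/docker.sock && echo "docker.sock is world-writable" and alert on any output.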

Blast radius of the permission denied error itself:

  • All Docker API proxy routes return 502.
  • Autodiscovery pipelines stall.
  • Monitoring agents lose container metadata.
  • CI/CD pipelines that query container state will fail their health gates.

How to Fix It (The Solution)

Basic Fix — Add Container to the Docker GID

Find the GID of the docker group on the host:

stat -c '%g' /var/run/docker.sock
# or
getent group docker | cut -d: -f3
# typical output: 999

Pass that GID into the container at runtime:

# docker-compose.yml
services:
  nginx:
    image: nginx:alpine
-   user: "101"
+   user: "101:999"   # 999 = host docker GID from stat above
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro

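⚠️ The GID must match the host. If you hardcode 999 and the host uses 998, it still fails. Resolve it at deploy time with $(stat -c '%g' /var/run/docker.sock) in the shell that launches Compose instead of hardcoding it. Note that :ro on the mount only protects the socket file itself; it does not make the Docker API read-only.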

Dynamic GID in compose (portable across hosts, but Nginx still gets full Docker API access):

services:
  nginx:
    image: nginx:alpine
-   volumes:
-     - /var/run/docker.sock:/var/run/docker.sock
+   group_add:
+     - "${DOCKER_GID}"
+   volumes:
+     - /var/run/docker.sock:/var/run/docker.sock:ro
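
# Compose and .env files do not evaluate $(...), so export the value in the
# shell (or CI step) that runs Compose:
export DOCKER_GID=$(stat -c '%g' /var/run/docker.sock)
docker compose up -d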
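
Compose reads .env values literally, so one option is to resolve the GID in a small wrapper and write the number into the env file before starting the stack. A sketch under that assumption; write_docker_gid_env is a hypothetical helper, and the env file path is whatever your project uses.

```shell
#!/bin/sh
# Hypothetical helper: write DOCKER_GID=<numeric gid> into an env file,
# replacing any previous DOCKER_GID entry and keeping all other lines.
# Usage: write_docker_gid_env <socket-path> <env-file>
write_docker_gid_env() {
  sock_path=$1
  env_file=$2
  gid=$(stat -c '%g' "$sock_path") || return 1
  touch "$env_file"
  # grep -v exits non-zero when nothing is left to print, hence "|| true".
  grep -v '^DOCKER_GID=' "$env_file" > "$env_file.tmp" || true
  mv "$env_file.tmp" "$env_file"
  printf 'DOCKER_GID=%s\n' "$gid" >> "$env_file"
}
```

Running write_docker_gid_env /var/run/docker.sock .env before docker compose up keeps the ${DOCKER_GID} reference in group_add in sync with the host.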

Enterprise Best Practice — Docker Socket Proxy (Zero Direct Exposure)

Never mount the raw socket into Nginx. Use Tecnativa/docker-socket-proxy as a sidecar that allowlists only the API endpoints Nginx actually needs.

# docker-compose.yml
+services:
+  socket-proxy:
+    image: tecnativa/docker-socket-proxy:latest
+    restart: unless-stopped
+    volumes:
+      - /var/run/docker.sock:/var/run/docker.sock:ro
+    environment:
+      CONTAINERS: 1       # allow GET /containers/json
+      SERVICES: 0
+      TASKS: 0
+      NETWORKS: 0
+      VOLUMES: 0
+      INFO: 0
+      EXEC: 0             # NEVER enable this
+      POST: 0             # block all write operations
+    networks:
+      - socket-proxy-net
+    # Do NOT expose port 2375 to host or internet

  nginx:
    image: nginx:alpine
-   volumes:
-     - /var/run/docker.sock:/var/run/docker.sock
+   environment:
+     - DOCKER_HOST=tcp://socket-proxy:2375
+   depends_on:
+     - socket-proxy
+   networks:
+     - socket-proxy-net
+     - public

+networks:
+  socket-proxy-net:
+    internal: true   # air-gapped from internet
+  public:

Nginx upstream config pointing to the proxy:

# nginx.conf upstream block
 upstream docker_api {
-    server unix:/var/run/docker.sock;
+    server socket-proxy:2375;
 }

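Now Nginx has no direct access to the Docker socket. The proxy enforces an allowlist at the HTTP layer, so even a full Nginx RCE cannot use the Docker API to create containers or exec into them. It can still read whatever the allowlisted endpoints return, so keep that list minimal.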
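
The allowlist is worth verifying after deployment. The sketch below assumes curl is available in a container attached to socket-proxy-net and that the proxy answers blocked requests with HTTP 403; http_status and verify_proxy_allowlist are hypothetical helper names.

```shell
#!/bin/sh
# Print only the HTTP status code for a request (curl prints 000 if the
# connection itself fails).
http_status() {  # usage: http_status <METHOD> <URL>
  curl -s -o /dev/null -w '%{http_code}' -X "$1" "$2"
}

# Hypothetical smoke test: reads should pass, writes should be rejected.
# Assumption: the proxy responds to blocked requests with 403.
verify_proxy_allowlist() {  # usage: verify_proxy_allowlist <base-url>
  base=$1
  read_code=$(http_status GET "$base/containers/json") || true
  write_code=$(http_status POST "$base/containers/create") || true
  if [ "$read_code" != "200" ]; then
    echo "FAIL: GET /containers/json returned $read_code"
    return 1
  fi
  if [ "$write_code" != "403" ]; then
    echo "FAIL: POST /containers/create returned $write_code, expected 403"
    return 1
  fi
  echo "OK: reads allowed, writes blocked"
}
```

Called as verify_proxy_allowlist http://socket-proxy:2375, it fails loudly if POST: 0 was dropped from the proxy's environment.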



Prevention in CI/CD

1. Checkov — Block Raw Socket Mounts in IaC

Checkov's built-in Dockerfile checks (CKV_DOCKER_1, for example, checks that port 22 is not exposed) do not look at socket mounts. Add a custom policy for them:

# .checkov/custom_policies/no_raw_docker_socket.yaml
metadata:
  name: "NoRawDockerSocketMount"
  id: "CKV2_DOCKER_CUSTOM_1"
  category: "GENERAL_SECURITY"
  severity: CRITICAL
definition:
  and:
    - cond_type: attribute
      resource_types: [docker_container, docker_service]
      attribute: volumes
      operator: not_contains
      value: "/var/run/docker.sock:/var/run/docker.sock"

Run in CI:

checkov -d . --external-checks-dir .checkov/custom_policies --check CKV2_DOCKER_CUSTOM_1 --soft-fail-on MEDIUM

2. OPA/Conftest — Enforce Socket Proxy Pattern

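# policy/docker_socket.rego
package docker

deny[msg] {
  service := input.services[name]
  volume := service.volumes[_]
  # Prefix match so mounts with options such as :ro are caught too.
  startswith(volume, "/var/run/docker.sock:")
  # Prefix match so tagged images (e.g. :latest) are recognised as the proxy.
  not startswith(service.image, "tecnativa/docker-socket-proxy")
  msg := sprintf("Service '%v' mounts raw Docker socket. Use tecnativa/docker-socket-proxy.", [name])
}

# The policy lives in package docker, so point conftest at that namespace:
conftest test docker-compose.yml --policy policy/ --namespace docker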

3. GitHub Actions Gate

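# .github/workflows/security.yml
- name: Lint Docker Compose for socket exposure
  run: |
    # The socket-proxy service itself has to mount the socket, so allow at most
    # one mount per file, and only when the proxy image is present in that file.
    for f in docker-compose*.yml; do
      [ -e "$f" ] || continue
      mounts=$(grep -c 'docker\.sock:' "$f" || true)
      if [ "$mounts" -gt 1 ] || { [ "$mounts" -eq 1 ] && ! grep -q 'tecnativa/docker-socket-proxy' "$f"; }; then
        echo "ERROR: Raw docker.sock mount detected outside socket-proxy service in $f."
        exit 1
      fi
    done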
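
A line-based grep cannot tell which service a mount belongs to. The sketch below tracks the enclosing service with awk instead. It assumes two-space indentation, services declared under a top-level services: key, and a proxy image whose name contains tecnativa/docker-socket-proxy; find_raw_socket_services is a hypothetical name, and a real YAML parser would be more robust.

```shell
#!/bin/sh
# Hypothetical helper: print each Compose service that references docker.sock
# but does not use the socket-proxy image.
# Usage: find_raw_socket_services <compose-file>
find_raw_socket_services() {
  awk '
    # Emit the service collected so far if it mounts the socket directly.
    function report() {
      if (svc != "" && has_sock && image !~ /tecnativa\/docker-socket-proxy/) {
        print svc
      }
      has_sock = 0
      image = ""
    }
    # Top-level key (no indentation): close the current service.
    /^[^ #]/ { report(); svc = ""; in_services = ($0 ~ /^services:/); next }
    # Two-space-indented key under services: starts a new service.
    in_services && /^  [^ #-][^:]*:/ { report(); svc = $1; sub(/:$/, "", svc); next }
    svc != "" && $1 == "image:" { image = $2 }
    svc != "" && /docker\.sock/ { has_sock = 1 }
    END { report() }
  ' "$1"
}
```

A CI step can fail the build whenever find_raw_socket_services docker-compose.yml prints anything, and the printed names say exactly which services to fix.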

4. Dockerfile — Never Run as Root

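 FROM nginx:alpine
+# Must match the host's docker GID; override with --build-arg DOCKER_GID=<gid>
+ARG DOCKER_GID=999
+RUN (addgroup -g "${DOCKER_GID}" docker 2>/dev/null || true) && \
+    adduser nginx docker
 USER nginx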

Summary of controls:

Layer       | Tool                | What it catches
------------+---------------------+--------------------------------------------
IaC scan    | Checkov             | Raw socket mounts in compose/Terraform
Policy gate | OPA/Conftest        | Enforces socket-proxy pattern
CI lint     | grep + bash         | Fast fail on any docker.sock reference
Runtime     | docker-socket-proxy | Allowlists API endpoints, blocks exec/write
Image build | Dockerfile USER     | Prevents accidental root process
