Why does worker_processes auto cause 502s only in Kubernetes and not on bare metal?

On bare metal, auto matches the CPUs Nginx can actually use, and the system ulimit is typically high enough for the resulting workers. In Kubernetes, the container still reports the host's full CPU count, because sysconf and /proc/cpuinfo are not adjusted for your resources.limits.cpu CFS quota. A node with 64 cores spawns 64 Nginx workers inside a pod that may be limited to 2 CPUs, and each worker runs under a per-process open-file limit that is often capped at 1024–4096. Once a worker runs out of file descriptors, its connections to upstream fail, and Nginx surfaces those failures as 502.

Does nginx -s reload guarantee zero downtime?

Only if your worker_connections, worker_rlimit_nofile, and upstream keepalive pool are correctly sized. During reload, old and new workers run side by side until the old workers finish draining, which temporarily doubles the number of worker processes and their open file descriptors. Without adequate limits and proxy_next_upstream retry logic, requests that land on a new worker while it is still opening fresh upstream connections can return 502. Zero-downtime reload depends on correct configuration, not just the reload signal.

What is the correct formula for setting worker_rlimit_nofile?

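A conservative formula is: worker_rlimit_nofile >= (worker_processes × worker_connections × 2). The ×2 accounts for the client-side and upstream-side socket of each proxied connection. Example: 4 workers × 4096 connections × 2 = 32,768, so set worker_rlimit_nofile to at least 32768. Because the limit is applied to each worker process individually, worker_connections × 2 is the strict per-worker minimum, and the formula leaves generous headroom on top of it. Also ensure the OS-level ulimit for the nginx user is equal or higher. Verify against a worker process, since that is where the directive takes effect: grep 'open files' /proc/$(pgrep -f 'nginx: worker' | head -n 1)/limits.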
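
As a quick check of the arithmetic above, here is a minimal shell sketch. The function name fd_budget is hypothetical and not part of Nginx; it only multiplies the three factors from the formula.

```shell
#!/bin/sh
# fd_budget WORKERS CONNECTIONS
# Hypothetical helper: prints worker_processes * worker_connections * 2,
# the conservative value from the formula above.
fd_budget() {
    echo $(( $1 * $2 * 2 ))
}

fd_budget 4 4096   # prints 32768
```

Compare the result with the 'open files' value of a running worker before you raise worker_connections.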

Fixing 502 Bad Gateway After Nginx Reload with worker_processes auto: Root Cause & Production Fix

Threat/Impact Level: HIGH | Downtime Risk: HIGH | Time to Fix: 10–20 mins

TL;DR

What broke: On nginx -s reload, new worker processes spawn while old workers drain. With worker_processes auto, CPU detection inside a container returns the host CPU count rather than the cgroup-limited count, so Nginx spawns excess workers that exhaust worker_connections and file descriptors and cause upstream connection failures (502).
How to fix it: Pin worker_processes to your actual cgroup CPU quota, set worker_rlimit_nofile explicitly, and add proxy_next_upstream error timeout with an upstream keepalive pool.

The Incident (What Does the Error Mean?)

Your monitoring fires. Upstream services are healthy. But clients are getting:

502 Bad Gateway
nginx/1.25.x

In /var/log/nginx/error.log:

2024/01/15 03:42:17 [error] 3847#3847: *198423 connect() failed (99: Cannot assign requested address) while connecting to upstream
2024/01/15 03:42:17 [error] 3851#3851: *198441 no live upstreams while connecting to upstream, client: 10.0.1.45
2024/01/15 03:42:18 [warn]  3847#3847: *198450 upstream server temporarily disabled while connecting to upstream

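This fires during and immediately after nginx -s reload. The upstream pods never went down; the failure originated inside Nginx. Note that errno 99 (EADDRNOTAVAIL) means a worker could not obtain a local address and port for its outbound connection. When the per-process file descriptor limit is what runs out, the log shows (24: Too many open files) instead, so check for both.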

The Attack Vector / Blast Radius

Why worker_processes auto is a trap in containerized environments:

Nginx resolves auto by calling sysconf(_SC_NPROCESSORS_ONLN), which reports the CPUs that are online on the machine. In a Kubernetes pod with resources.limits.cpu: "2", the limit is enforced as a CFS quota that this call does not see, so it returns the node's CPU count: say, 96 vCPUs on a c5.24xlarge. Nginx spawns 96 worker processes.

Each worker can hold up to worker_connections connections (default: 512 unless you configure otherwise), and each connection needs at least one file descriptor. 96 workers × 512 connections = 49,152 potential FDs, all served by a pod that is throttled to 2 CPUs. Your container's ulimit -n is probably 1024 or 4096, so a busy worker hits that ceiling and the OS starts rejecting socket() and connect() calls. Every proxied request that cannot open its upstream socket fails, and every such failure is a 502.
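
To see the mismatch on a running pod, a rough diagnostic like the one below can be run inside the container. It is a sketch under assumptions: the function names are hypothetical, getconf must be available in the image, and the cgroup paths are the common v2 and v1 locations, which may differ on your nodes.

```shell
#!/bin/sh
# Online CPUs as reported by sysconf(); a CFS quota does not change this value.
online_cpus() {
    getconf _NPROCESSORS_ONLN
}

# Raw CPU quota for this container, or "unknown" if no cgroup file is readable.
# cgroup v2 exposes "QUOTA PERIOD" (or "max PERIOD") in cpu.max;
# cgroup v1 exposes the quota alone in cpu.cfs_quota_us (-1 means unlimited).
raw_cpu_quota() {
    if [ -r /sys/fs/cgroup/cpu.max ]; then
        cat /sys/fs/cgroup/cpu.max
    elif [ -r /sys/fs/cgroup/cpu/cpu.cfs_quota_us ]; then
        cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us
    else
        echo "unknown"
    fi
}

echo "CPUs visible to the process: $(online_cpus)"
echo "cgroup CPU quota:            $(raw_cpu_quota)"
```

If the first number is the node's core count while the quota corresponds to only a couple of CPUs, worker_processes auto will oversubscribe the pod.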

Cascading failure chain:

1. nginx -s reload → the master starts new workers and tells the old ones to finish in-flight requests and exit
2. New workers spawn (96 of them) → immediately exhaust FD limits
3. Old workers still draining → total FD pressure doubles during overlap window
4. Each old worker's upstream keepalive connections close with it, and new workers start with none → cold connection storm hits upstream
5. Upstream's own connection queue fills → upstream starts returning 503/504
6. Now both layers are degraded. Recovery takes minutes, not seconds.
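
One way to watch the overlap window described in the chain above is to count open file descriptors per worker while a reload is in progress. This is a Linux-only sketch: open_fds is a hypothetical helper, and reading another process's /proc entry usually requires root or the same user as the workers.

```shell
#!/bin/sh
# open_fds PID: number of file descriptors currently open in that process.
# Prints 0 if the process does not exist or its fd directory is unreadable.
open_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Print one line per Nginx worker: PID and open FD count.
# Draining workers have the process title "nginx: worker process is shutting down".
for pid in $(pgrep -f 'nginx: worker'); do
    echo "$pid $(open_fds "$pid")"
done
```

Running it in a loop before, during, and after nginx -s reload shows how long old and new workers coexist and how close each one gets to its limit.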

On non-containerized bare metal, auto is usually correct. The blast radius is still real there if worker_rlimit_nofile is unset and the system ulimit is low.

How to Fix It

Basic Fix: Pin Worker Count and File Descriptor Limits

- worker_processes auto;
+ worker_processes 2;  # Match your cgroup CPU limit exactly
+ worker_rlimit_nofile 65535;  # Main context only; must be > (worker_processes * worker_connections * 2)

  events {
-     worker_connections 512;
+     worker_connections 4096;
+     use epoll;
+     multi_accept on;
  }

Set worker_processes to the integer value of your resources.limits.cpu. For fractional limits like "1500m", use 1.
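
The rounding rule can be sketched as a small shell function. The name workers_from_cpu_limit is hypothetical, it only handles whole-core ("2") and millicore ("1500m") quantities, and the floor of 1 for limits below one core is an assumption, since worker_processes cannot be 0.

```shell
#!/bin/sh
# workers_from_cpu_limit LIMIT
# Hypothetical helper: converts a Kubernetes CPU quantity ("2", "1500m", "500m")
# into a worker count by rounding down to whole cores, with a floor of 1.
workers_from_cpu_limit() {
    case "$1" in
        *m) millis=${1%m} ;;            # "1500m" -> 1500 millicores
        *)  millis=$(( $1 * 1000 )) ;;  # "2"     -> 2000 millicores
    esac
    workers=$(( millis / 1000 ))
    [ "$workers" -lt 1 ] && workers=1
    echo "$workers"
}

workers_from_cpu_limit 1500m   # prints 1
```

Rounding down rather than up keeps the worker count within the CPU time the pod is actually allowed to use.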

Enterprise Best Practice: Upstream Resilience + Keepalive Pool

The 502 window during reload is also caused by the upstream keepalive pool being torn down. Fix the proxy layer too:

  upstream backend {
      server 10.0.2.10:8080;
      server 10.0.2.11:8080;
+     keepalive 32;           # Idle upstream connections cached per worker; new workers rebuild it after reload
+     keepalive_requests 1000;
+     keepalive_timeout 60s;
  }

  server {
      location / {
          proxy_pass http://backend;
+         proxy_http_version 1.1;
+         proxy_set_header Connection "";
+         proxy_next_upstream error timeout http_502 http_503;
+         proxy_next_upstream_tries 3;
+         proxy_next_upstream_timeout 10s;
-         proxy_connect_timeout 60s;
+         proxy_connect_timeout 5s;   # Fail fast, let next_upstream retry
+         proxy_read_timeout 30s;
      }
  }

For Kubernetes deployments — set this in your nginx container spec:

  containers:
  - name: nginx
    resources:
      limits:
-       cpu: "2"
+       cpu: "2"          # worker_processes MUST match this integer value
+   # Kubernetes has no pod-spec field for ulimits, and sysctls do not change them;
+   # raise the FD limit with worker_rlimit_nofile in nginx.conf instead

If you must use auto (dynamic environments), detect the cgroup limit at startup:

- worker_processes auto;
+ # In your entrypoint.sh, before nginx starts (cgroup v1 paths; cgroup v2 exposes /sys/fs/cgroup/cpu.max instead):
+ # QUOTA=$(cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us)    # -1 means no CPU limit
+ # PERIOD=$(cat /sys/fs/cgroup/cpu/cpu.cfs_period_us)
+ # CPUS=$(( QUOTA / PERIOD )); [ "$CPUS" -lt 1 ] && CPUS=1   # round down, never below 1
+ # [ "$QUOTA" -gt 0 ] && sed -i "s/worker_processes auto/worker_processes ${CPUS}/" /etc/nginx/nginx.conf
+ worker_processes auto;  # Only safe if NOT running in CPU-limited containers
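
A slightly fuller version of that entrypoint logic, covering both cgroup v1 and v2, might look like the sketch below. The function names are hypothetical, the paths are the usual mount points and may differ on your nodes, and falling back to the online CPU count when no quota is set is an assumption about the behavior you want.

```shell
#!/bin/sh
# cpus_from_quota QUOTA PERIOD
# Whole CPUs allowed by a CFS quota, minimum 1. A QUOTA of "max" (cgroup v2)
# or -1 (cgroup v1) means "no limit", so fall back to the online CPU count.
cpus_from_quota() {
    quota=$1 period=$2
    if [ "$quota" = "max" ] || [ "$quota" -le 0 ] 2>/dev/null; then
        getconf _NPROCESSORS_ONLN
        return
    fi
    cpus=$(( quota / period ))
    [ "$cpus" -lt 1 ] && cpus=1
    echo "$cpus"
}

# Read the quota from whichever cgroup layout is mounted and convert it.
detect_workers() {
    if [ -r /sys/fs/cgroup/cpu.max ]; then
        # cgroup v2: a single line, "QUOTA PERIOD"
        read -r quota period < /sys/fs/cgroup/cpu.max
    elif [ -r /sys/fs/cgroup/cpu/cpu.cfs_quota_us ]; then
        quota=$(cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us)
        period=$(cat /sys/fs/cgroup/cpu/cpu.cfs_period_us)
    else
        quota=max period=100000
    fi
    cpus_from_quota "$quota" "$period"
}

echo "worker_processes $(detect_workers);"
```

In an entrypoint, the printed value would replace auto in nginx.conf (for example with the sed command shown above) before nginx starts.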

Prevention in CI/CD

1. Lint nginx.conf in your pipeline with gixy:

# Dockerfile.ci
RUN pip install gixy && gixy /etc/nginx/nginx.conf
# Catches security misconfigurations such as SSRF vectors, HTTP splitting, and alias traversal.
# gixy does not check worker_processes, so pair it with the policy check in step 2.

2. Enforce worker_processes policy with OPA/Conftest:

# policy/nginx_workers.rego
package nginx

deny[msg] {
    input.worker_processes == "auto"
    msg := "worker_processes 'auto' is banned in containerized deployments. Pin to CPU limit integer."
}

deny[msg] {
    not input.worker_rlimit_nofile
    msg := "worker_rlimit_nofile must be explicitly set. Default ulimit is unsafe."
}

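# conftest has no built-in nginx.conf parser, so convert the config to JSON first.
# nginx.json is a hypothetical file name; the policy above assumes top-level
# worker_processes and worker_rlimit_nofile keys in that JSON.
conftest test nginx.json --policy policy/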

3. Smoke-test reload in staging with connection hold:

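#!/bin/bash
# ci/reload_test.sh
# Send requests while a reload is in progress and assert that none return 502
set -u
FAILS=$(mktemp)
# Background load: 100 requests spread over a few seconds so they overlap the reload
(
  for i in $(seq 1 100); do
    STATUS=$(curl -s -o /dev/null -w "%{http_code}" http://localhost/healthz)
    [ "$STATUS" = "502" ] && echo "$i" >> "$FAILS"
    sleep 0.05
  done
) &
LOAD_PID=$!
sleep 1
nginx -s reload
wait "$LOAD_PID"
if [ -s "$FAILS" ]; then
  echo "FAIL: 502 detected during reload"
  exit 1
fi
echo "PASS: No 502s during reload window"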

4. Prometheus alert on reload-correlated 502 spikes:

# alerts/nginx_reload_502.yaml
# The metric name and status label depend on your exporter; adjust the selector to match it.
- alert: NginxReload502Spike
  expr: |
    increase(nginx_http_requests_total{status="502"}[2m]) > 10
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: "502 spike detected — likely nginx reload with worker starvation"