Initializing Enclave...

How to Fix Nginx 'upstream timed out (110) while SSL handshaking to upstream' Error

Threat/Impact Level: HIGH | Downtime Risk: HIGH | Time to Fix: 15–45 mins

TL;DR

  • What broke: Nginx cannot complete a TLS handshake with the upstream backend before proxy_read_timeout or proxy_connect_timeout expires — connection is killed mid-handshake with ETIMEDOUT (110).
  • How to fix it: Tune proxy_connect_timeout, fix upstream certificate chain validity, enable proxy_ssl_session_reuse on, and verify proxy_ssl_verify is not blocking on an unreachable CA.
  • Use our Client-Side Sandbox above to paste your Nginx upstream block and auto-refactor the SSL directives without leaking your config.

The Incident (What Does This Error Mean?)

You will see this in /var/log/nginx/error.log:

2024/01/15 03:42:17 [error] 1234#1234: *5678 upstream timed out (110: Connection timed out)
  while SSL handshaking to upstream, client: 10.0.1.45, server: api.internal,
  request: "POST /v2/data HTTP/1.1", upstream: "https://10.0.2.100:8443/v2/data",
  host: "api.internal"

Immediate consequence: Every request proxied to that upstream returns a 502 Bad Gateway to the client. The TLS handshake — TCP connect, ClientHello, ServerHello, certificate exchange, key agreement — did not complete within the timeout window. Nginx terminated the socket. Your upstream never processed the request.

This is not a network packet loss issue by default. The TCP connection typically succeeds. The handshake itself stalls — meaning the problem lives in one of: certificate chain resolution, OCSP stapling, cipher negotiation mismatch, overloaded upstream TLS terminator, or a misconfigured proxy_ssl_* directive on the Nginx side.


The Attack Vector / Blast Radius

This is a availability and cascading failure scenario, not a direct exploit vector — but the blast radius is severe:

  1. Connection pool exhaustion: Nginx worker processes hold open sockets waiting for the handshake to complete. Under load, this saturates worker_connections. New requests queue, then fail. The entire reverse proxy layer goes dark.

  2. Retry storms: Upstream clients (mobile apps, other services) retry on 502. Each retry spawns a new connection attempt. Each new attempt hits the same stalled handshake. You get an exponential pile-up.

  3. Hidden certificate expiry: A common trigger is an upstream certificate that has expired or has a broken intermediate chain. proxy_ssl_verify on causes Nginx to attempt OCSP/CRL validation against an external CA endpoint. If that CA endpoint is unreachable from your VPC (common in air-gapped or strict egress environments), every handshake hangs until timeout. This means a cert rotation event silently kills your proxy tier.

  4. Session reuse disabled: Without proxy_ssl_session_reuse on, every proxied request performs a full TLS handshake. At high RPS against a slow upstream TLS stack (e.g., Java-based services with expensive RSA key exchange), the upstream TLS thread pool saturates and stops responding to new ClientHellos in time.


How to Fix It

Basic Fix — Adjust Timeouts and Session Reuse

If you need an immediate stop-the-bleeding fix while you investigate the root cause:

 upstream backend_ssl {
     server 10.0.2.100:8443;
+    keepalive 32;
 }
 
 server {
     location /api/ {
         proxy_pass https://backend_ssl;
 
-        proxy_connect_timeout 5s;
-        proxy_read_timeout    30s;
+        proxy_connect_timeout 10s;
+        proxy_read_timeout    60s;
 
-        # proxy_ssl_session_reuse not set (defaults to on, but verify explicitly)
+        proxy_ssl_session_reuse on;
 
-        proxy_ssl_verify      on;
-        proxy_ssl_trusted_certificate /etc/ssl/certs/ca-certificates.crt;
+        proxy_ssl_verify      off;  # TEMPORARY — re-enable after cert chain is validated
     }
 }

⚠️ proxy_ssl_verify off is a temporary diagnostic step only. If this resolves the timeout, your upstream certificate chain is the root cause. Do not leave this in production.


Enterprise Best Practice — Full Hardened Upstream SSL Block

This is the production-grade configuration. It enables verification with a pinned CA bundle, explicit cipher control, session reuse, and keepalive to eliminate per-request full handshakes.

 upstream backend_ssl {
     server 10.0.2.100:8443;
+    keepalive 64;
+    keepalive_requests 1000;
+    keepalive_timeout 75s;
 }
 
 server {
     listen 443 ssl;
 
     location /api/ {
         proxy_pass               https://backend_ssl;
         proxy_http_version       1.1;
 
+        proxy_set_header Connection "";
 
-        proxy_connect_timeout    5s;
+        proxy_connect_timeout    10s;
 
-        proxy_ssl_verify         on;
-        proxy_ssl_trusted_certificate /etc/ssl/certs/ca-certificates.crt;
+        proxy_ssl_verify                  on;
+        proxy_ssl_trusted_certificate     /etc/ssl/private/internal-ca-chain.pem;
+        proxy_ssl_verify_depth            3;
 
-        # No session reuse configured
+        proxy_ssl_session_reuse           on;
 
+        proxy_ssl_protocols               TLSv1.2 TLSv1.3;
+        proxy_ssl_ciphers                 HIGH:!aNULL:!MD5:!RC4;
 
+        proxy_ssl_certificate             /etc/ssl/private/nginx-client.crt;
+        proxy_ssl_certificate_key         /etc/ssl/private/nginx-client.key;
     }
 }

Key directives explained:

  • proxy_ssl_session_reuse on — Nginx caches the TLS session ticket from the upstream. Subsequent requests skip the full handshake. Critical at high RPS.
  • proxy_ssl_trusted_certificate — Point this at your internal CA bundle, not the system bundle, for internal services. Pointing at the system bundle for an internal service signed by a private CA causes verify failures that manifest as timeouts.
  • proxy_ssl_verify_depth 3 — Prevents Nginx from giving up on chains with intermediate CAs.
  • keepalive 64 on the upstream block + proxy_http_version 1.1 + Connection "" — This is the trio that enables upstream keepalive. Without all three, Nginx opens a new TCP+TLS connection per request.

💡 Tired of pasting proprietary configs into ChatGPT? Generic AI tools log your company's ARNs, DB strings, and private keys. StackEngine is a zero-backend, pure Client-Side WASM utility. Drop your failing config into the sandbox above. We redact your secrets locally in the browser and auto-generate the refactored code using your own API key.


Prevention in CI/CD

1. Validate Nginx Configs in the Pipeline Before Deploy

# In your GitHub Actions / GitLab CI deploy step
nginx -t -c /path/to/nginx.conf
# Non-zero exit code blocks the deploy

This catches syntax errors but not semantic SSL misconfigs. You need the steps below for that.

2. Certificate Chain Validation in CI

Add a pre-deploy step that validates the upstream certificate chain is reachable and valid from within your deployment environment:

# Run this from inside your cluster/VPC during CI
openssl s_client \
  -connect 10.0.2.100:8443 \
  -CAfile /etc/ssl/private/internal-ca-chain.pem \
  -verify_return_error \
  -brief < /dev/null

# Exit code 0 = chain valid and handshake completes
# Non-zero = your Nginx WILL hit SSL handshake timeouts in prod

3. Checkov / OPA Policy — Enforce proxy_ssl_session_reuse

For teams using Nginx config linting via gixy or custom OPA policies:

# opa/nginx_ssl_policy.rego
package nginx.ssl

deny[msg] {
  input.proxy_ssl_session_reuse == "off"
  msg := "proxy_ssl_session_reuse must be 'on' for upstream SSL blocks to prevent handshake exhaustion"
}

deny[msg] {
  not input.proxy_ssl_trusted_certificate
  input.proxy_ssl_verify == "on"
  msg := "proxy_ssl_trusted_certificate must be explicitly set when proxy_ssl_verify is on"
}

4. Synthetic Monitoring on the Handshake Path

Don't wait for production alerts. Add a synthetic check that measures TLS handshake time specifically:

# Prometheus blackbox exporter probe config
modules:
  https_upstream_tls:
    prober: tcp
    timeout: 5s
    tcp:
      tls: true
      tls_config:
        ca_file: /etc/ssl/private/internal-ca-chain.pem
      query_response:
        - expect: ""

Alert if handshake time exceeds 2 seconds — that's your early warning before the 110 timeout fires in Nginx.

Related Diagnostics

"Part of the Performance Utility Matrix."

View all 219 Performance Tools →