How to Fix Nginx 'upstream timed out (110) while SSL handshaking to upstream' Error
Threat/Impact Level: HIGH | Downtime Risk: HIGH | Time to Fix: 15–45 mins
TL;DR
- What broke: Nginx cannot complete a TLS handshake with the upstream backend before
proxy_read_timeoutorproxy_connect_timeoutexpires — connection is killed mid-handshake with ETIMEDOUT (110). - How to fix it: Tune
proxy_connect_timeout, fix upstream certificate chain validity, enableproxy_ssl_session_reuse on, and verifyproxy_ssl_verifyis not blocking on an unreachable CA. - Use our Client-Side Sandbox above to paste your Nginx upstream block and auto-refactor the SSL directives without leaking your config.
The Incident (What Does This Error Mean?)
You will see this in /var/log/nginx/error.log:
2024/01/15 03:42:17 [error] 1234#1234: *5678 upstream timed out (110: Connection timed out)
while SSL handshaking to upstream, client: 10.0.1.45, server: api.internal,
request: "POST /v2/data HTTP/1.1", upstream: "https://10.0.2.100:8443/v2/data",
host: "api.internal"
Immediate consequence: Every request proxied to that upstream returns a 502 Bad Gateway to the client. The TLS handshake — TCP connect, ClientHello, ServerHello, certificate exchange, key agreement — did not complete within the timeout window. Nginx terminated the socket. Your upstream never processed the request.
This is not a network packet loss issue by default. The TCP connection typically succeeds. The handshake itself stalls — meaning the problem lives in one of: certificate chain resolution, OCSP stapling, cipher negotiation mismatch, overloaded upstream TLS terminator, or a misconfigured proxy_ssl_* directive on the Nginx side.
The Attack Vector / Blast Radius
This is a availability and cascading failure scenario, not a direct exploit vector — but the blast radius is severe:
Connection pool exhaustion: Nginx worker processes hold open sockets waiting for the handshake to complete. Under load, this saturates
worker_connections. New requests queue, then fail. The entire reverse proxy layer goes dark.Retry storms: Upstream clients (mobile apps, other services) retry on 502. Each retry spawns a new connection attempt. Each new attempt hits the same stalled handshake. You get an exponential pile-up.
Hidden certificate expiry: A common trigger is an upstream certificate that has expired or has a broken intermediate chain.
proxy_ssl_verify oncauses Nginx to attempt OCSP/CRL validation against an external CA endpoint. If that CA endpoint is unreachable from your VPC (common in air-gapped or strict egress environments), every handshake hangs until timeout. This means a cert rotation event silently kills your proxy tier.Session reuse disabled: Without
proxy_ssl_session_reuse on, every proxied request performs a full TLS handshake. At high RPS against a slow upstream TLS stack (e.g., Java-based services with expensive RSA key exchange), the upstream TLS thread pool saturates and stops responding to new ClientHellos in time.
How to Fix It
Basic Fix — Adjust Timeouts and Session Reuse
If you need an immediate stop-the-bleeding fix while you investigate the root cause:
upstream backend_ssl {
server 10.0.2.100:8443;
+ keepalive 32;
}
server {
location /api/ {
proxy_pass https://backend_ssl;
- proxy_connect_timeout 5s;
- proxy_read_timeout 30s;
+ proxy_connect_timeout 10s;
+ proxy_read_timeout 60s;
- # proxy_ssl_session_reuse not set (defaults to on, but verify explicitly)
+ proxy_ssl_session_reuse on;
- proxy_ssl_verify on;
- proxy_ssl_trusted_certificate /etc/ssl/certs/ca-certificates.crt;
+ proxy_ssl_verify off; # TEMPORARY — re-enable after cert chain is validated
}
}
⚠️
proxy_ssl_verify offis a temporary diagnostic step only. If this resolves the timeout, your upstream certificate chain is the root cause. Do not leave this in production.
Enterprise Best Practice — Full Hardened Upstream SSL Block
This is the production-grade configuration. It enables verification with a pinned CA bundle, explicit cipher control, session reuse, and keepalive to eliminate per-request full handshakes.
upstream backend_ssl {
server 10.0.2.100:8443;
+ keepalive 64;
+ keepalive_requests 1000;
+ keepalive_timeout 75s;
}
server {
listen 443 ssl;
location /api/ {
proxy_pass https://backend_ssl;
proxy_http_version 1.1;
+ proxy_set_header Connection "";
- proxy_connect_timeout 5s;
+ proxy_connect_timeout 10s;
- proxy_ssl_verify on;
- proxy_ssl_trusted_certificate /etc/ssl/certs/ca-certificates.crt;
+ proxy_ssl_verify on;
+ proxy_ssl_trusted_certificate /etc/ssl/private/internal-ca-chain.pem;
+ proxy_ssl_verify_depth 3;
- # No session reuse configured
+ proxy_ssl_session_reuse on;
+ proxy_ssl_protocols TLSv1.2 TLSv1.3;
+ proxy_ssl_ciphers HIGH:!aNULL:!MD5:!RC4;
+ proxy_ssl_certificate /etc/ssl/private/nginx-client.crt;
+ proxy_ssl_certificate_key /etc/ssl/private/nginx-client.key;
}
}
Key directives explained:
proxy_ssl_session_reuse on— Nginx caches the TLS session ticket from the upstream. Subsequent requests skip the full handshake. Critical at high RPS.proxy_ssl_trusted_certificate— Point this at your internal CA bundle, not the system bundle, for internal services. Pointing at the system bundle for an internal service signed by a private CA causes verify failures that manifest as timeouts.proxy_ssl_verify_depth 3— Prevents Nginx from giving up on chains with intermediate CAs.keepalive 64on the upstream block +proxy_http_version 1.1+Connection ""— This is the trio that enables upstream keepalive. Without all three, Nginx opens a new TCP+TLS connection per request.
💡 Tired of pasting proprietary configs into ChatGPT? Generic AI tools log your company's ARNs, DB strings, and private keys. StackEngine is a zero-backend, pure Client-Side WASM utility. Drop your failing config into the sandbox above. We redact your secrets locally in the browser and auto-generate the refactored code using your own API key.
Prevention in CI/CD
1. Validate Nginx Configs in the Pipeline Before Deploy
# In your GitHub Actions / GitLab CI deploy step
nginx -t -c /path/to/nginx.conf
# Non-zero exit code blocks the deploy
This catches syntax errors but not semantic SSL misconfigs. You need the steps below for that.
2. Certificate Chain Validation in CI
Add a pre-deploy step that validates the upstream certificate chain is reachable and valid from within your deployment environment:
# Run this from inside your cluster/VPC during CI
openssl s_client \
-connect 10.0.2.100:8443 \
-CAfile /etc/ssl/private/internal-ca-chain.pem \
-verify_return_error \
-brief < /dev/null
# Exit code 0 = chain valid and handshake completes
# Non-zero = your Nginx WILL hit SSL handshake timeouts in prod
3. Checkov / OPA Policy — Enforce proxy_ssl_session_reuse
For teams using Nginx config linting via gixy or custom OPA policies:
# opa/nginx_ssl_policy.rego
package nginx.ssl
deny[msg] {
input.proxy_ssl_session_reuse == "off"
msg := "proxy_ssl_session_reuse must be 'on' for upstream SSL blocks to prevent handshake exhaustion"
}
deny[msg] {
not input.proxy_ssl_trusted_certificate
input.proxy_ssl_verify == "on"
msg := "proxy_ssl_trusted_certificate must be explicitly set when proxy_ssl_verify is on"
}
4. Synthetic Monitoring on the Handshake Path
Don't wait for production alerts. Add a synthetic check that measures TLS handshake time specifically:
# Prometheus blackbox exporter probe config
modules:
https_upstream_tls:
prober: tcp
timeout: 5s
tcp:
tls: true
tls_config:
ca_file: /etc/ssl/private/internal-ca-chain.pem
query_response:
- expect: ""
Alert if handshake time exceeds 2 seconds — that's your early warning before the 110 timeout fires in Nginx.