Fixing Nginx 'SSL_do_handshake() failed' for HTTPS Upstreams: proxy_ssl_name SNI Mismatch Resolved
Threat/Impact Level: HIGH | Downtime Risk: HIGH | Time to Fix: 5–15 mins
TL;DR
- What broke: Nginx is proxying to an HTTPS upstream but sending the wrong SNI hostname (or none at all) during the TLS handshake. The upstream server rejects the ClientHello, killing every proxied request.
- How to fix it: Add proxy_ssl_name <exact-upstream-cn>; and proxy_ssl_server_name on; to the offending location block. If certificate verification is enabled, also point proxy_ssl_trusted_certificate at the correct CA bundle.
The Incident (What Does the Error Mean?)
Raw error from /var/log/nginx/error.log:
2024/07/15 03:41:22 [error] 1187#1187: *4821 SSL_do_handshake() failed
(SSL: error:14094410:SSL routines:ssl3_read_bytes:sslv3 alert handshake failure
:s3_pkt.c:1493:SSL alert number 40)
while SSL handshaking to upstream,
client: 10.0.1.55, server: api.internal.example.com,
request: "POST /v2/ingest HTTP/1.1",
upstream: "https://10.0.4.22:443/v2/ingest",
host: "api.internal.example.com"
Immediate consequence: Every request hitting that location block returns a 502 Bad Gateway to the client. With a single upstream server there is nothing to retry against and no fallback. If this upstream is a payment processor, auth service, or data pipeline, the impact is immediate and total: 100% of proxied traffic fails from the moment the upstream enforces SNI-based virtual hosting or strict certificate matching.
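The core mechanic: By default, Nginx sends no SNI extension at all to an HTTPS upstream, because proxy_ssl_server_name defaults to off. Even with it on, the name defaults to the host part of the proxy_pass URL ($proxy_host), and a literal IP address is never sent as SNI, so a proxy_pass that targets a bare IP or an upstream group name gives the upstream nothing it can match. The upstream TLS stack performs SNI-based certificate selection. When the SNI value is missing or doesn't match any certificate the upstream serves, it sends a handshake failure alert (40) and drops the connection before a single byte of HTTP is exchanged.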
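One way to observe this SNI-based selection directly is to handshake against the upstream with and without a server name. The sketch below is a diagnostic aid, not part of the Nginx setup. It assumes Python 3 with the standard ssl module; probe_sni is a hypothetical helper name, and it deliberately disables certificate verification so the result reflects only how the upstream reacts to the SNI value.

```python
import hashlib
import socket
import ssl
from typing import Optional


def probe_sni(ip: str, port: int, sni: Optional[str], timeout: float = 5.0) -> dict:
    """Attempt a TLS handshake to ip:port, sending `sni` (or no SNI when None).

    Hypothetical diagnostic helper. Verification is switched off on purpose so
    that trust-store problems do not mask the upstream's SNI behaviour. Never
    reuse this context for real traffic.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False          # must be disabled before verify_mode
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((ip, port), timeout=timeout) as raw:
            # server_hostname=None means the ClientHello carries no SNI extension
            with ctx.wrap_socket(raw, server_hostname=sni) as tls:
                der = tls.getpeercert(binary_form=True)
                return {
                    "sni": sni,
                    "ok": True,
                    "protocol": tls.version(),
                    # Fingerprint lets you compare which certificate each SNI selects
                    "cert_sha256": hashlib.sha256(der).hexdigest() if der else None,
                }
    except (ssl.SSLError, OSError) as exc:
        # Handshake alerts, refused connections, and timeouts all land here
        return {"sni": sni, "ok": False, "error": f"{type(exc).__name__}: {exc}"}


# Example usage with the addresses from this article (adjust for your environment):
#   for name in (None, "backend.internal.example.com"):
#       print(probe_sni("10.0.4.22", 443, name))
```

If the probe without SNI fails or returns a different certificate fingerprint than the probe with the expected hostname, the upstream is selecting certificates by SNI and Nginx needs to send that hostname.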
The Attack Vector / Blast Radius
This is not just an uptime problem. It has a security dimension:
Scenario A: Silent SNI bypass (the dangerous one). If you "fix" handshake errors by setting proxy_ssl_verify off; (which is also the default) without correcting the SNI, certificate validation is disabled entirely. Nginx will complete a handshake with any certificate, including a self-signed one presented by a man-in-the-middle on your internal network. An attacker who can redirect traffic destined for the upstream address (for example via ARP spoofing on a flat layer-2 network, a rogue route, or DNS tampering) can then intercept all proxied traffic, including credentials, tokens, and PII, because the TLS session terminates at the attacker.
Scenario B: Certificate pinning failures in regulated environments. In PCI-DSS and HIPAA environments, mutual TLS (mTLS) to payment or EHR upstreams is often required by policy. A broken proxy_ssl_name means the handshake never reaches the client-authentication phase, so the control you are relying on is not operating, even if nobody notices the error in the logs.
Scenario C: Cascading upstream pool failure. In an upstream {} block with multiple servers, a failed handshake counts as an unsuccessful attempt, so the peer is marked unavailable (by default, max_fails=1 takes it out for a fail_timeout of 10 seconds). Because the same proxy_ssl_name applies to every server in the pool, Nginx tries each peer in turn, each one fails the same way, and the client gets a 502 for the entire pool. A single misconfigured proxy_ssl_name can take down a weighted round-robin pool of 10 nodes.
How to Fix It (The Solution)
Basic Fix
The minimal change: tell Nginx exactly what hostname to send in the SNI extension and enable SNI.
  location /api/ {
      proxy_pass https://10.0.4.22:443;
-     # Missing: proxy_ssl_name and proxy_ssl_server_name
+     proxy_ssl_name backend.internal.example.com;
+     proxy_ssl_server_name on;
  }
proxy_ssl_server_name on is the critical switch: it instructs Nginx to populate the SNI extension in the ClientHello. Without it, proxy_ssl_name is never sent as SNI; it is used only as the name checked against the upstream certificate when proxy_ssl_verify is on.
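The interaction between the two directives can be summarized as a small decision function. This is a simplified model for reasoning about configs, not Nginx code: effective_sni and its parameter names are hypothetical, and it assumes the documented defaults (proxy_ssl_server_name off, proxy_ssl_name falling back to the host part of proxy_pass) plus the RFC 6066 rule that literal IP addresses are not sent as SNI.

```python
import ipaddress
from typing import Optional


def effective_sni(proxy_pass_host: str,
                  proxy_ssl_name: Optional[str] = None,
                  proxy_ssl_server_name: bool = False) -> Optional[str]:
    """Return the SNI value this model expects Nginx to send, or None for no SNI.

    proxy_pass_host is the host[:port] part of the proxy_pass URL, which may be
    an IP address, a hostname, or the name of an upstream {} group.
    """
    # proxy_ssl_name falls back to the proxy_pass host when it is not set
    name = proxy_ssl_name if proxy_ssl_name else proxy_pass_host

    # Strip an optional port; bracketed IPv6 literals keep their address part
    if name.startswith("[") and "]" in name:
        host = name[1:name.index("]")]
    else:
        host = name.split(":", 1)[0]

    # SNI is only sent when proxy_ssl_server_name is on
    if not proxy_ssl_server_name or not host:
        return None

    # Literal IP addresses are never sent as SNI
    try:
        ipaddress.ip_address(host)
        return None
    except ValueError:
        return host


# The three situations discussed in this article:
print(effective_sni("10.0.4.22:443"))                                  # defaults only
print(effective_sni("10.0.4.22:443", proxy_ssl_server_name=True))      # SNI on, bare IP
print(effective_sni("10.0.4.22:443", "backend.internal.example.com", True))
```

Note that with a named upstream group and no proxy_ssl_name, the model yields the group name itself, which is rarely a name present on the upstream certificate.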
Enterprise Best Practice
For production systems with certificate verification, mTLS, and named upstream pools:
  upstream backend_pool {
      server backend-1.internal.example.com:443;
      server backend-2.internal.example.com:443;
      keepalive 32;
  }
  server {
      listen 443 ssl;
      server_name api.gateway.example.com;
      location /v2/ {
          proxy_pass https://backend_pool;
-         # BAD: No SNI config, verify disabled to "make it work"
-         proxy_ssl_verify off;
+         # GOOD: Explicit SNI hostname matching the upstream certificate CN/SAN
+         proxy_ssl_name backend.internal.example.com;
+         proxy_ssl_server_name on;
+
+         # Verify the upstream cert against your internal CA or a pinned bundle
+         proxy_ssl_verify on;
+         proxy_ssl_verify_depth 3;
+         proxy_ssl_trusted_certificate /etc/nginx/certs/internal-ca-chain.pem;
+
+         # For mTLS: present Nginx's own client cert to the upstream
+         proxy_ssl_certificate /etc/nginx/certs/nginx-client.crt;
+         proxy_ssl_certificate_key /etc/nginx/certs/nginx-client.key;
+
+         # Resume TLS sessions on new upstream connections (on by default; shown for clarity)
+         proxy_ssl_session_reuse on;
          proxy_set_header Host $host;
          proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      }
  }
Key directives explained:
| Directive | Why it matters |
|---|---|
| proxy_ssl_name | The hostname placed in the SNI extension and checked against the upstream certificate during verification. Must match the CN or a SAN on the upstream's certificate. |
| proxy_ssl_server_name on | Enables sending SNI to the upstream. Without this, proxy_ssl_name is used only for certificate verification. |
| proxy_ssl_verify on | Validates the upstream cert chain. Disabling this is a security liability, not a fix. |
| proxy_ssl_trusted_certificate | Path to the CA bundle that signed the upstream cert. Use your internal PKI root for private services. |
| proxy_ssl_session_reuse on | Lets new upstream connections resume a previous TLS session instead of performing a full handshake. This is the default. |
Prevention in CI/CD
This class of misconfiguration is 100% preventable before it reaches production.
1. Nginx config linting with gixy (static analysis):
# Install gixy — the Nginx security static analyzer
pip install gixy
gixy /etc/nginx/nginx.conf
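# gixy's stock plugins target issues such as SSRF, host header spoofing, and add_header pitfalls.
# Checks for proxy_ssl_verify or proxy_ssl_name are not among them as far as we know,
# so cover those with the policy in step 3 or a custom plugin.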
2. Checkov IaC scanning (if Nginx config is templated via Terraform/Ansible):
checkov -d ./nginx-templates --framework ansible
# Add a custom check for proxy_ssl_server_name presence
3. OPA/Conftest policy for Nginx upstream TLS (add to your PR pipeline):
# policy/nginx_ssl_upstream.rego
package nginx.ssl

deny[msg] {
    block := input.locations[_]
    startswith(block.proxy_pass, "https://")
    not block.proxy_ssl_server_name == "on"
    msg := sprintf(
        "Location '%v' proxies to HTTPS but proxy_ssl_server_name is not 'on'. SNI will not be sent.",
        [block.path]
    )
}

deny[msg] {
    block := input.locations[_]
    startswith(block.proxy_pass, "https://")
    not block.proxy_ssl_name
    msg := sprintf(
        "Location '%v' proxies to HTTPS but proxy_ssl_name is not set. The name will default to the proxy_pass host, which is unusable as SNI for an IP address or upstream group.",
        [block.path]
    )
}
4. Integration test in staging with curl --resolve:
# Simulate the exact SNI that Nginx will send and verify the upstream accepts it
curl -v --resolve backend.internal.example.com:443:10.0.4.22 \
--cacert /etc/nginx/certs/internal-ca-chain.pem \
https://backend.internal.example.com/healthz
# If this fails, the upstream cert or CA bundle is the problem, not Nginx config.
5. Nginx proxy_ssl_verify on as a mandatory baseline in your nginx.conf defaults:
# In http {} block — applies as default to all server/location blocks
proxy_ssl_verify on;
proxy_ssl_trusted_certificate /etc/ssl/certs/ca-certificates.crt;
proxy_ssl_session_reuse on;
# Individual location blocks must explicitly set proxy_ssl_name
# — this forces engineers to think about SNI, not skip it.
Enforce this via a templating standard (Cookiecutter, Helm chart for Nginx, or Ansible role). Any proxy_pass https:// without an accompanying proxy_ssl_name should fail the PR lint step.
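The policy in step 3 reads input.locations, but the article does not show how that JSON is produced from an Nginx config. The sketch below is one possible way to generate it, assuming flat location blocks with no nested braces, no include expansion, and no '#' characters inside quoted values. extract_locations is a hypothetical helper; a real pipeline would likely want a proper Nginx config parser.

```python
import json
import re
import sys

# Matches "location <args> { ... }" where the body contains no nested braces
LOCATION_RE = re.compile(r"\blocation\s+([^{]+?)\s*\{([^{}]*)\}", re.S)
# Only the directives the policy inspects
DIRECTIVE_RE = re.compile(
    r"^\s*(proxy_pass|proxy_ssl_name|proxy_ssl_server_name)\s+([^;]+);", re.M
)


def extract_locations(conf_text: str) -> dict:
    """Build the {"locations": [...]} document that the policy expects."""
    # Drop comments so commented-out directives are not counted as present
    text = re.sub(r"#[^\n]*", "", conf_text)
    locations = []
    for match in LOCATION_RE.finditer(text):
        block = {"path": match.group(1).strip()}
        for name, value in DIRECTIVE_RE.findall(match.group(2)):
            block[name] = value.strip()
        locations.append(block)
    return {"locations": locations}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python extract_locations.py /etc/nginx/nginx.conf > input.json
    with open(sys.argv[1], encoding="utf-8") as fh:
        json.dump(extract_locations(fh.read()), sys.stdout, indent=2)
```

The resulting input.json can then be passed to the policy, for example with conftest test input.json --policy policy/ --namespace nginx.ssl, since the policy's package is nginx.ssl rather than conftest's default main.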