
Fixing Nginx 'upstream sent invalid header' CRLF Error from PHP-FPM (502 Bad Gateway)

Threat/Impact Level: HIGH | Exploitability/Downtime Risk: HIGH (immediate 502 outage) | Time to Fix: 5–15 mins

TL;DR

  • What broke: PHP-FPM is emitting a raw HTTP/1.1 200 OK status line as a FastCGI response header. Nginx's FastCGI parser does not expect an HTTP-line in that position and hard-rejects it, returning a 502 to every client.
  • How to fix it: Check the three usual suspects in order: a SCRIPT_FILENAME param that does not resolve to a real file, a fastcgi_pass directive pointing at the wrong socket or port, and cgi.rfc2616_headers = 1 in php.ini. Also replace any header('HTTP/1.1 ...') calls in PHP code with http_response_code(). Keep fastcgi_param HTTP_PROXY ""; in place, and do not reach for fastcgi_ignore_headers: it only disables processing of specific response fields such as Cache-Control and Set-Cookie, and will not make Nginx accept a malformed header.

The Incident (What Does the Error Mean?)

Raw Nginx error log entry:

2024/01/15 03:42:17 [error] 1234#1234: *5891 upstream sent invalid header
while reading response header from upstream,
client: 203.0.113.44, server: api.example.com,
request: "POST /api/checkout HTTP/1.1",
upstream: "fastcgi://unix:/run/php/php8.2-fpm.sock",
host: "api.example.com"

What is actually happening at the wire level:

The FastCGI protocol mandates that the upstream (PHP-FPM) returns CGI headers — plain Key: Value\r\n pairs like Content-Type: text/html\r\n. Nginx's ngx_http_fastcgi_module reads bytes off the socket and expects to parse those CGI headers.

Instead, PHP-FPM is sending back a full HTTP response line as the first bytes:

HTTP/1.1 200 OK\r\n
Content-Type: text/html; charset=UTF-8\r\n
...

Nginx sees HTTP/1.1 where it expects a CGI header name, flags the entire response as malformed, closes the upstream connection, and serves the client a 502 Bad Gateway. The PHP process itself completed successfully — the error is 100% in the protocol translation layer.

Immediate consequence: Every request hitting the affected location block returns 502. If this is a location / catch-all, your entire application is down.
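
To see which line of a captured response is the problem, the header block can be labelled line by line. The following is a minimal sketch, not Nginx's actual parser: the function name classify_header_lines and the two regular expressions are illustrative assumptions, and the header-name pattern only approximates the token grammar used for CGI headers.

```python
import re

# Approximate CGI header shape: a token name, a colon, then the value.
CGI_HEADER = re.compile(r"^[!#$%&'*+\-.^_`|~0-9A-Za-z]+:[ \t]*.*$")
# An HTTP status line such as "HTTP/1.1 200 OK".
STATUS_LINE = re.compile(r"^HTTP/\d\.\d \d{3}")


def classify_header_lines(raw: bytes) -> list[tuple[str, str]]:
    """Label each line of a FastCGI response header block.

    Returns (label, line) pairs where label is "cgi-header",
    "http-status-line", or "invalid". Only the bytes before the first
    blank line (the end of the header block) are inspected.
    """
    results = []
    head = raw.split(b"\r\n\r\n", 1)[0]
    for line in head.decode("latin-1").split("\r\n"):
        if STATUS_LINE.match(line):
            results.append(("http-status-line", line))
        elif CGI_HEADER.match(line):
            results.append(("cgi-header", line))
        else:
            results.append(("invalid", line))
    return results
```

Feeding it the response shown above flags the first line as "http-status-line", while a response that begins with Status: 200 OK is labelled "cgi-header" throughout.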


The Attack Vector / Blast Radius

This is a cascading availability failure, not a direct security exploit, but the blast radius is severe:

  1. Total application outage for the affected vhost. Nginx will not serve a single PHP-generated response.
  2. Health check death spiral: If your load balancer (ALB, HAProxy, Cloudflare) uses HTTP health checks against this endpoint, all instances get marked unhealthy simultaneously. Auto-scaling groups spin up new instances that are also misconfigured. You burn through your EC2/GKE quota with no recovery.
  3. Queue backpressure: Background job workers (Laravel Horizon, Symfony Messenger) that call internal APIs start timing out. Dead-letter queues fill. Database connection pools stay open waiting for responses that never come.
  4. Log flooding: Each 502 writes two log lines (Nginx error + access). Under production traffic, this can fill /var/log partitions in under an hour on small instances, causing secondary failures in unrelated services writing to the same disk.
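  5. Indirect security risk: When ops teams scramble during a 502 outage, they often reach for a directive that makes the error disappear instead of fixing the upstream. Nginx has no fastcgi_ignore_invalid_headers directive: ignore_invalid_headers applies only to client request headers, and fastcgi_ignore_headers only disables processing of specific response fields. Any workaround that hides the symptom can also mask HTTP Response Splitting vulnerabilities in PHP code that injects \r\n sequences into user-controlled header values.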
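
The response-splitting risk in item 5 comes down to control characters reaching a header value. As a rough illustration (written in Python, not the PHP your application uses, and with hypothetical function names), a guard might look like this:

```python
def contains_header_injection(value: str) -> bool:
    """Return True if a header value carries CR, LF, or NUL characters."""
    return any(ch in value for ch in ("\r", "\n", "\0"))


def safe_header_value(value: str) -> str:
    """Reject a value that could split the response; do not silently repair it."""
    if contains_header_injection(value):
        raise ValueError("header value contains CR, LF, or NUL")
    return value
```

Rejecting the value outright, as opposed to stripping the characters, keeps the bad input visible in logs, which is usually what you want while hunting for the source of an invalid header.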

Root causes ranked by frequency in production:

Rank | Root Cause | Frequency
1 | SCRIPT_FILENAME points to wrong path; PHP 404s internally and emits raw HTTP | 45%
2 | PHP script calls header('HTTP/1.1 200 OK') explicitly | 25%
3 | FastCGI socket/port mismatch (connecting to wrong FPM pool) | 15%
4 | cgi.rfc2616_headers = 1 set in php.ini | 10%
5 | Reverse proxy chain double-wrapping (Nginx → Nginx → FPM) | 5%

How to Fix It (The Solution)

Basic Fix — Correct SCRIPT_FILENAME and Remove RFC2616 Headers

The #1 cause is a wrong SCRIPT_FILENAME fastcgi param. PHP cannot find the file, generates an internal error response with a full HTTP status line, and FPM forwards it verbatim.

# /etc/nginx/sites-available/api.example.com

 location ~ \.php$ {
     include fastcgi_params;
 
-    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
+    fastcgi_param SCRIPT_FILENAME $realpath_root$fastcgi_script_name;
 
-    fastcgi_pass 127.0.0.1:9000;
+    fastcgi_pass unix:/run/php/php8.2-fpm.sock;
 
     fastcgi_index index.php;
     fastcgi_read_timeout 300;
+
+    # Fix the upstream instead of masking the error. Nginx has no directive
+    # that makes it accept a malformed FastCGI response header.
 }

# /etc/php/8.2/fpm/php.ini

-cgi.rfc2616_headers = 1
+cgi.rfc2616_headers = 0

After editing:

nginx -t && systemctl reload nginx
php-fpm8.2 -t && systemctl reload php8.2-fpm
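
The difference between $document_root and $realpath_root only matters when the root contains a symlink, as in release-directory deploys. The sketch below mimics how the two variables would build SCRIPT_FILENAME so you can compare them for your own layout. It is an approximation, and resolve_script_filename is a hypothetical helper, not part of Nginx or PHP.

```python
import os


def resolve_script_filename(root: str, script_name: str) -> tuple[str, str]:
    """Return SCRIPT_FILENAME built two ways.

    The first value joins the root exactly as configured (like
    $document_root); the second resolves symlinks in the root first
    (like $realpath_root).
    """
    literal = root.rstrip("/") + script_name
    resolved = os.path.realpath(root).rstrip("/") + script_name
    return literal, resolved


if __name__ == "__main__":
    # Example root taken from the server block in this article.
    literal, resolved = resolve_script_filename("/var/www/api/public", "/index.php")
    print("document_root style:", literal)
    print("realpath_root style:", resolved)
    print("resolved file exists:", os.path.isfile(resolved))
```

If the two paths differ and only the resolved one exists from PHP-FPM's point of view (for example, inside a chroot or container), that points at the SCRIPT_FILENAME cause.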

Enterprise Best Practice — Harden the Entire FastCGI Stack

Step 1: Audit PHP code for explicit HTTP-line header injection

# In any PHP file (framework bootstrap, middleware, legacy code)

- header('HTTP/1.1 200 OK');        # NEVER do this in FastCGI context
- header('HTTP/1.0 200 OK');        # Same problem
+ http_response_code(200);          # Correct: sets status code only
+ // Or in PSR-7: $response->withStatus(200)
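
To find the calls Step 1 asks you to replace, a line-based scan is usually enough for a first pass. This is a minimal sketch with a hypothetical function name; the regular expression is an assumption about how such calls are typically written and will miss status lines assembled from variables or string concatenation.

```python
import re

# Matches header('HTTP/1.1 ...') or header("HTTP/1.0 ...") with a literal string.
RAW_STATUS_CALL = re.compile(r"""\bheader\s*\(\s*['"]HTTP/\d(?:\.\d)?\s""", re.IGNORECASE)


def find_raw_status_headers(php_source: str) -> list[int]:
    """Return 1-based line numbers where header() is given a raw HTTP status line."""
    return [
        lineno
        for lineno, line in enumerate(php_source.splitlines(), start=1)
        if RAW_STATUS_CALL.search(line)
    ]
```

Run it over each file's contents and review the hits by hand; commented-out calls are reported too, which is deliberate for an audit.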

Step 2: Lock down the FPM pool config

# /etc/php/8.2/fpm/pool.d/www.conf

 [www]
 user = www-data
 group = www-data

-listen = 127.0.0.1:9000
+listen = /run/php/php8.2-fpm.sock
+listen.owner = www-data
+listen.group = www-data
+listen.mode = 0660

 pm = dynamic
 pm.max_children = 50
 pm.start_servers = 5
 pm.min_spare_servers = 5
 pm.max_spare_servers = 35

+; Catch slow scripts before they emit partial/corrupt responses
+request_slowlog_timeout = 10s
+slowlog = /var/log/php-fpm/www-slow.log
+
+; Hard kill runaway processes
+request_terminate_timeout = 60s
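
Root cause 3 (socket/port mismatch) can be checked mechanically by comparing the pool's listen value with every fastcgi_pass target. The sketch below assumes both files are passed in as text and uses hypothetical helper names. It does a plain string comparison, so it will not understand named upstream blocks, variables, or a bare port number such as listen = 9000.

```python
import re
from typing import Optional


def fpm_listen_address(pool_conf: str) -> Optional[str]:
    """Return the value of the pool's active 'listen =' line, if any."""
    match = re.search(r"^\s*listen\s*=\s*(\S+)", pool_conf, re.MULTILINE)
    return match.group(1) if match else None


def nginx_fastcgi_targets(nginx_conf: str) -> list[str]:
    """Return every fastcgi_pass target with the 'unix:' prefix removed."""
    targets = re.findall(r"^\s*fastcgi_pass\s+([^;\s]+)\s*;", nginx_conf, re.MULTILINE)
    return [t[len("unix:"):] if t.startswith("unix:") else t for t in targets]


def mismatched_targets(pool_conf: str, nginx_conf: str) -> list[str]:
    """Return fastcgi_pass targets that differ from the pool's listen address."""
    listen = fpm_listen_address(pool_conf)
    return [t for t in nginx_fastcgi_targets(nginx_conf) if t != listen]
```

An empty result means every fastcgi_pass in the file points at the address this pool listens on; anything else is worth a look before reloading.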

Step 3: Nginx upstream hardening with error interception

# /etc/nginx/sites-available/api.example.com

 server {
     listen 443 ssl http2;
     server_name api.example.com;
     root /var/www/api/public;

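+    # Serve your own error pages. Nginx generates the 502 itself;
+    # fastcgi_intercept_errors also routes error statuses returned by FPM
+    # (300 and above) through error_page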
+    fastcgi_intercept_errors on;
+    error_page 502 503 504 /50x.html;

     location ~ \.php$ {
         try_files $uri =404;
         include fastcgi_params;
         fastcgi_param SCRIPT_FILENAME $realpath_root$fastcgi_script_name;
         fastcgi_pass unix:/run/php/php8.2-fpm.sock;
         fastcgi_index index.php;

+        # Buffer FPM responses to prevent partial header delivery
+        fastcgi_buffering on;
+        fastcgi_buffer_size 16k;
+        fastcgi_buffers 16 16k;
+        fastcgi_busy_buffers_size 32k;

         # Keep this: an empty HTTP_PROXY stops a client-supplied Proxy
         # request header from reaching PHP as HTTP_PROXY (the "httpoxy" issue)
         fastcgi_param HTTP_PROXY "";
     }
 }



Prevention in CI/CD

1. Lint Nginx configs in your pipeline

# .github/workflows/nginx-lint.yml
jobs:
  nginx-lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate Nginx config
        run: |
          docker run --rm \
            -v ${{ github.workspace }}/nginx:/etc/nginx:ro \
            nginx:alpine nginx -t
      - name: Run gixy security linter
        run: |
          pip install gixy
          gixy nginx/sites-available/*.conf

2. Detect cgi.rfc2616_headers drift with Ansible/Chef

# ansible/roles/php-fpm/tasks/assert_headers.yml
- name: Assert cgi.rfc2616_headers is disabled
  ansible.builtin.lineinfile:
    path: /etc/php/{{ php_version }}/fpm/php.ini
    regexp: '^cgi\.rfc2616_headers'
    line: 'cgi.rfc2616_headers = 0'
    state: present
  notify: Reload php-fpm
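
If you would rather fail a pipeline than rewrite the file, the same drift can be detected read-only. This is a sketch under two assumptions: that a later cgi.rfc2616_headers line overrides an earlier one, and that the setting is off when absent. The function name is hypothetical.

```python
import re

SETTING = re.compile(r"^\s*cgi\.rfc2616_headers\s*=\s*([^;\s]+)")


def rfc2616_headers_enabled(php_ini: str) -> bool:
    """Return True if the last active cgi.rfc2616_headers line turns it on.

    Lines starting with ';' are comments in php.ini and never match.
    """
    value = "0"  # assumed default when the setting is absent
    for line in php_ini.splitlines():
        match = SETTING.match(line)
        if match:
            value = match.group(1).strip("\"'")
    return value.lower() in {"1", "on", "true", "yes"}
```

Read /etc/php/<version>/fpm/php.ini, pass its contents in, and exit non-zero when the function returns True.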

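3. Custom lint check for SCRIPT_FILENAME

# scripts/check_script_filename.py
# Checkov ships no Nginx framework or BaseNginxCheck class, so this is a
# standalone check. Usage: python check_script_filename.py nginx/sites-available/*
import re
import sys
from pathlib import Path

# Flag any fastcgi_param SCRIPT_FILENAME built from $document_root
PATTERN = re.compile(r"fastcgi_param\s+SCRIPT_FILENAME\s+\$document_root\$fastcgi_script_name")

def find_violations(paths):
    """Return (path, line_number) for every offending directive."""
    hits = []
    for path in paths:
        for lineno, line in enumerate(Path(path).read_text().splitlines(), start=1):
            # Ignore anything after a '#' so commented-out directives pass
            if PATTERN.search(line.split("#", 1)[0]):
                hits.append((path, lineno))
    return hits

if __name__ == "__main__":
    violations = find_violations(sys.argv[1:])
    for path, lineno in violations:
        print(f"{path}:{lineno}: SCRIPT_FILENAME uses $document_root, expected $realpath_root")
    sys.exit(1 if violations else 0)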

4. Smoke test in staging before deploy

#!/bin/bash
# scripts/smoke-test-fpm.sh — run this in your CD pipeline after deploy

set -euo pipefail

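# curl exits non-zero on a connection failure; "|| true" keeps set -e from
# aborting before the status check below can report it (curl prints 000)
RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" https://staging.example.com/health || true)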

if [ "$RESPONSE" != "200" ]; then
  echo "FATAL: FPM health check returned $RESPONSE — rolling back"
  # Trigger your rollback mechanism here
  exit 1
fi

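# Also check the Nginx error log for the specific string. This assumes Nginx
# logs to journald on the host running this script; otherwise grep
# /var/log/nginx/error.log instead. Avoid grep -q here: under pipefail its
# early exit can kill journalctl with SIGPIPE and turn a real match into a miss.
if journalctl -u nginx --since "5 minutes ago" | grep 'invalid header' > /dev/null; then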
  echo "FATAL: Nginx reporting invalid headers from upstream — deploy blocked"
  exit 1
fi

echo "Smoke test passed."

5. Alerting rule (Prometheus + Alertmanager)

# prometheus/rules/nginx.yml
groups:
  - name: nginx_fpm_health
    rules:
      - alert: NginxUpstreamInvalidHeader
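        # Assumes an exporter that exposes nginx_http_requests_total with a
        # status label; adjust the metric name to match your setup
        expr: increase(nginx_http_requests_total{status="502"}[5m]) > 10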
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Nginx 502 spike — likely PHP-FPM invalid header"
          description: "More than 10 502s in 5 minutes on {{ $labels.instance }}. Check /var/log/nginx/error.log for 'upstream sent invalid header'."
          runbook_url: "https://stackengine.dev/fix/nginx-upstream-invalid-header-crlf"
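
The alert fires on any 502 spike, so the on-call engineer still has to confirm the cause. A small log summary can do that step. The sketch below assumes one error-log entry per line in the format shown at the top of this article. The function name is hypothetical.

```python
import re
from collections import Counter

UPSTREAM = re.compile(r'upstream: "([^"]+)"')


def count_invalid_header_errors(log_text: str) -> Counter:
    """Count 'upstream sent invalid header' entries per upstream address."""
    counts = Counter()
    for line in log_text.splitlines():
        if "upstream sent invalid header" not in line:
            continue
        match = UPSTREAM.search(line)
        counts[match.group(1) if match else "unknown"] += 1
    return counts
```

Pass it the contents of /var/log/nginx/error.log; a single upstream dominating the counts suggests one misconfigured pool, not an application-wide bug.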
