How to Fix Nginx 'limiting requests, excess: 5.000 by zone' Rate Limit Errors in Production
Threat/Impact Level: HIGH | Exploitability/Downtime Risk: HIGH | Time to Fix: 10 mins
TL;DR
- What broke: Nginx's limit_req zone is rejecting legitimate requests because the burst allowance is exhausted. excess: 5.000 means the client was 5 requests ahead of the configured rate when the request arrived, which is more than the configured burst permits.
- How to fix it: Increase the burst parameter on your limit_req directive, and add nodelay so that requests within the burst are served immediately instead of being queued.
- Use our Client-Side Sandbox above to paste your nginx.conf and auto-refactor the limit_req_zone and limit_req directives instantly.
The Incident (What Does the Error Mean?)
Raw log output triggering this alert:
2024/01/15 03:42:17 [error] 1234#1234: *98231 limiting requests, excess: 5.000 by zone "api_limit", client: 203.0.113.47, server: api.example.com, request: "POST /v1/checkout HTTP/1.1"
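excess: 5.000 is Nginx's leaky-bucket counter for this client key. It is measured in requests, tracked with millisecond timing and reported to three decimal places: every request adds 1, and the counter drains at the zone's configured rate. A value of 5.000 means the client was exactly 5 requests ahead of that rate at the moment of evaluation. While excess stays within burst, the request is accepted (and delayed to match the rate unless nodelay is set). Once excess exceeds burst, the request is rejected with HTTP 503 by default and this message is logged.
The zone referenced (api_limit in this example) was defined with a rate like rate=10r/s, and the limit_req directive that applies it has either no burst (the default is 0) or a burst too small for the traffic, leaving little or no tolerance for real-world spikes.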
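The accounting described above can be sketched in a few lines of Python. This is a simplified model written for illustration, not Nginx's source: the class name LimitReqSim and its fields are hypothetical, and it assumes that a rejected request leaves the counter unchanged.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class LimitReqSim:
    """Simplified model of limit_req accounting for one key (illustrative only)."""
    rate: float                    # requests per second, e.g. 10 for rate=10r/s
    burst: int = 0                 # burst= on limit_req (0 when omitted)
    excess: float = 0.0            # requests currently ahead of the steady rate
    last_ms: Optional[int] = None  # arrival time of the last accepted request

    def request(self, now_ms: int) -> Tuple[bool, float]:
        """Return (accepted, excess) for a request arriving at now_ms."""
        if self.last_ms is None:
            # First request for this key: nothing has accumulated yet.
            self.last_ms = now_ms
            return True, 0.0
        elapsed_s = (now_ms - self.last_ms) / 1000
        # Drain at `rate` for the elapsed time, then count this request as +1.
        excess = max(self.excess - self.rate * elapsed_s + 1, 0.0)
        if excess > self.burst:
            # Rejected: assumed not to change the stored state.
            return False, excess
        self.excess = excess
        self.last_ms = now_ms
        return True, excess
```

In this model, with rate=10 and burst=4, six requests arriving in the same millisecond yield five acceptances and one rejection whose excess is 5.0, the same figure as in the log line above.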
The Attack Vector / Blast Radius
This is a self-inflicted denial of service vector. The blast radius:
- Legitimate users get 503'd during any bursty-but-normal traffic pattern: mobile retries, JS parallel fetches, webhook fan-outs.
- Your application never sees the rejection. If Nginx sits in front of a Node.js/Go API, the 503 is generated by Nginx and the request never reaches your application: no circuit breaker, no graceful degradation, and nothing logged in your app layer.
- Monitoring blindspot. Most teams alert on 5xx at the app layer. Nginx-level rate limit rejections never reach the app, so your APM (Datadog, New Relic) shows zero errors while users are screaming.
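- A missing nodelay adds latency. Without nodelay, requests within the burst are queued and held for (excess / rate) seconds before being passed upstream. At rate=10r/s and excess=5, that is a 500 ms artificial delay for a request that is eventually served. Requests beyond burst are not queued at all; they are rejected immediately, so a spike produces slow responses and 503s at the same time.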
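One way to close the monitoring blindspot is to count rejections straight from the Nginx error log and feed that number to your alerting. The sketch below is a minimal example, assuming the log message format shown in the sample above; the function name count_rejections is hypothetical.

```python
import re
from collections import Counter

# Matches the rejection message format from the sample log line.
LIMIT_RE = re.compile(
    r'limiting requests, excess: (?P<excess>\d+\.\d+) by zone "(?P<zone>[^"]+)"'
)


def count_rejections(lines):
    """Count rate-limit rejections per zone in Nginx error-log lines."""
    counts = Counter()
    for line in lines:
        match = LIMIT_RE.search(line)
        if match:
            counts[match.group("zone")] += 1
    return counts
```

Called with an open error-log file, for example count_rejections(open(path)), it returns a per-zone tally that a cron job or log shipper could export as a metric.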
How to Fix It (The Solution)
Basic Fix — Increase Burst Headroom
The immediate lever is burst. Set it to absorb your realistic peak concurrency above the steady-state rate.
http {
    # Zone definition (unchanged): 10 MB shared memory, 10 requests/sec per IP
    limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;

    server {
        location /v1/ {
-           limit_req zone=api_limit;
+           limit_req zone=api_limit burst=20 nodelay;
        }
    }
}
burst=20 allows up to 20 excess requests to be accepted instantly before rejection begins. nodelay processes them immediately rather than spacing them out over 2 seconds of artificial delay.
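The 2-second figure follows from the excess / rate arithmetic. As a rough worked example, the hypothetical helper below lists the delay each accepted request would see during a simultaneous spike when nodelay is absent, assuming the first request is served immediately and each later one waits for its own excess to drain.

```python
from typing import List


def queue_delays_ms(rate_per_s: float, spike: int, burst: int) -> List[float]:
    """Per-request delay (ms) for a simultaneous spike when nodelay is not set.

    Request k (0-based) has an excess of k and waits k / rate seconds.
    Requests beyond 1 + burst are rejected and are not in the result.
    """
    accepted = min(spike, burst + 1)
    return [k * 1000 / rate_per_s for k in range(accepted)]


# 30 simultaneous requests against rate=10r/s burst=20 (no nodelay):
delays = queue_delays_ms(rate_per_s=10, spike=30, burst=20)
print(len(delays), "accepted; slowest waits", delays[-1], "ms")
```

With nodelay the same requests are accepted without the wait, but the allowance still refills at only 10 requests per second, so a sustained overload is rejected either way.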
Enterprise Best Practice — Tiered Rate Limiting with Status Code Logging
Flat per-IP rate limiting destroys legitimate power users and internal services equally. Use key-based zoning and structured logging to distinguish abuse from burst.
http {
-   limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;
+   # Tier 1: Authenticated users, keyed by the X-User-Id header (set by the auth proxy from the JWT sub claim)
+   limit_req_zone $http_x_user_id zone=authed_limit:20m rate=50r/s;
+
+   # Tier 2: Unauthenticated / public endpoints, keyed by IP
+   limit_req_zone $binary_remote_addr zone=public_limit:10m rate=5r/s;
+
+   # Log rejections to the error log at warn level and return 429 instead of the default 503
+   limit_req_log_level warn;
+   limit_req_status 429;

    server {
        location /v1/public/ {
-           limit_req zone=api_limit;
+           limit_req zone=public_limit burst=10 nodelay;
        }

+       location /v1/authed/ {
+           limit_req zone=authed_limit burst=100 nodelay;
+       }
    }
}
Key changes:
- limit_req_status 429: Return RFC-correct 429 Too Many Requests instead of 503. This lets clients implement exponential backoff correctly.
- $http_x_user_id zone key: Rate limit by authenticated identity, not IP. Shared office NATs and CDN egress IPs will no longer cause false positives. Nginx does not account requests whose key is empty, so the auth proxy must always set this header on /v1/authed/ and strip any client-supplied value.
- Separate zones per endpoint class: Public endpoints stay locked down; authenticated API consumers get headroom proportional to their tier.
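On the client side, a 429 is only useful if callers actually back off. The following is a minimal sketch of exponential backoff with full jitter; the function names and the default base and cap values are illustrative assumptions, not part of any Nginx or client-library API.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  rng: Optional[random.Random] = None) -> float:
    """Seconds to wait before retry number `attempt` (0-based), with full jitter."""
    rng = rng or random.Random()
    # The ceiling doubles with every attempt, up to `cap` seconds.
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: pick uniformly in [0, ceiling] so clients do not retry in lockstep.
    return rng.uniform(0.0, ceiling)


def should_retry(status: int) -> bool:
    """Treat 429 as retryable; 503 too, in case limit_req_status is left at its default."""
    return status in (429, 503)
```

A caller would sleep for backoff_delay(attempt) after each response for which should_retry is true, and give up after a fixed number of attempts.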
Prevention in CI/CD
Stop misconfigured rate limit zones from reaching production:
1. nginx -t in your deploy pipeline (non-negotiable baseline)
# GitHub Actions step
- name: Validate Nginx Config
  run: docker run --rm -v "$(pwd)/nginx:/etc/nginx" nginx nginx -t
2. gixy static analysis — catches logic errors nginx -t misses
pip install gixy
gixy /etc/nginx/nginx.conf
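# Flags security misconfigurations such as SSRF, HTTP response splitting,
# Host header spoofing, and alias path traversal. The stock plugins do not
# inspect limit_req parameters, so burst/nodelay needs a separate check.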
3. Checkov policy for limit_req enforcement
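# Sketch of a custom check: require burst > 0 on all limit_req directives.
# CKV_NGINX_RATE_BURST is a custom ID, not a built-in Checkov check. To our
# knowledge Checkov does not parse nginx.conf natively, so treat the lines
# below as a policy outline for your .checkov.yaml policy set, not a working rule.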
check: CKV_NGINX_RATE_BURST
message: "limit_req must define burst > 0 and nodelay"
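If no policy engine in your pipeline understands nginx.conf, the same rule can be approximated with a small script. The sketch below is a hypothetical, line-based check: it assumes each limit_req directive sits on its own line, does not follow include files, and only verifies burst > 0 (not nodelay).

```python
import re

# Matches `limit_req ...;` but not limit_req_zone / limit_req_status / limit_req_log_level.
LIMIT_REQ_RE = re.compile(r"^\s*limit_req\s+([^;#]*);")
BURST_RE = re.compile(r"\bburst=(\d+)\b")


def find_unbuffered_limit_req(conf_text):
    """Return (line_number, directive) pairs for limit_req lines lacking burst > 0."""
    problems = []
    for number, line in enumerate(conf_text.splitlines(), start=1):
        match = LIMIT_REQ_RE.match(line)
        if not match:
            continue
        burst = BURST_RE.search(match.group(1))
        if burst is None or int(burst.group(1)) == 0:
            problems.append((number, line.strip()))
    return problems
```

In CI, a wrapper could read each config file, print the offending lines, and exit non-zero when the returned list is not empty.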
4. Load test gate in staging — Use k6 or vegeta to replay your P99 traffic spike against staging before every deploy. If Nginx starts returning 429/503 at expected concurrency, the deploy is blocked.
# vegeta: 50 req/s for 30s, then inspect the status codes in the report
echo "POST https://staging.api.example.com/v1/checkout" | \
vegeta attack -rate=50/s -duration=30s | \
vegeta report --type=text
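If the Status Codes line in the report shows any 429 responses (or 503, if limit_req_status is left at its default), your burst headroom is still insufficient for that traffic shape. Keep in mind that vegeta sends all 50 req/s from a single IP, so against a per-IP zone this models one very busy client, not many ordinary ones.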
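To turn the report into an actual gate, one option is to emit vegeta's JSON report (vegeta report -type=json) and fail the job on any rate-limited response. The sketch below assumes the report's status_codes map of status code to count; the function name rate_limited_count is hypothetical.

```python
import json


def rate_limited_count(report_json: str, statuses=("429", "503")) -> int:
    """Sum the responses in a vegeta JSON report that carry a rate-limit status."""
    report = json.loads(report_json)
    codes = report.get("status_codes", {})
    return sum(codes.get(status, 0) for status in statuses)
```

A pipeline step could save the report to a file, pass its contents to rate_limited_count, and block the deploy when the result is greater than zero.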