How to Fix 'Etcd mvcc: database space exceeded' and Restore Your Kubernetes Control Plane

Threat/Impact Level: CRITICAL | Exploitability/Downtime Risk: HIGH | Time to Fix: 15–30 mins

TL;DR

  • What broke: Etcd hit its --quota-backend-bytes ceiling (default 2GB). All Kubernetes API write operations — deployments, configmaps, secrets — are now rejected with rpc error: code = ResourceExhausted.
  • How to fix it: Compact old MVCC revisions, run etcdctl defrag on every member, disarm the NOSPACE alarm, then raise the quota if needed (editing the static pod manifest restarts etcd).

The Incident (What does the error mean?)

Raw error surface — you'll see this in kube-apiserver logs and from kubectl itself:

etcdserver: mvcc: database space exceeded
rpc error: code = ResourceExhausted desc = etcdserver: mvcc: database space exceeded

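Etcd keeps its data in an MVCC (Multi-Version Concurrency Control) keyspace backed by a bbolt database file. Every write creates a new revision, and old revisions stay in the file until they are compacted. Etcd does not compact on its own by default; in a Kubernetes cluster the kube-apiserver requests compaction periodically (every 5 minutes by default, via --etcd-compaction-interval). Compaction only frees space for reuse inside the file, though. The file on disk shrinks only after a defrag. When the database file crosses the --quota-backend-bytes threshold, etcd raises a cluster-wide NOSPACE alarm and enters a maintenance mode that accepts only reads and deletes. At that point: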

  • kubectl apply fails.
  • New pod scheduling stops.
  • The Kubernetes controller manager cannot reconcile any resources.
  • Your cluster is effectively frozen.
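
To see how close each member is to the quota before the alarm fires, the JSON printed by etcdctl endpoint status --write-out=json can be compared against the configured quota. The sketch below is illustrative only: it assumes the dbSize and dbSizeInUse fields reported by recent etcdctl releases (dbSizeInUse may be missing on older ones), and the function name and sample payload are hypothetical.

```python
import json

DEFAULT_QUOTA_BYTES = 2 * 1024**3  # etcd's default --quota-backend-bytes (2 GiB)

def quota_usage(status_json, quota_bytes=DEFAULT_QUOTA_BYTES):
    """Return {endpoint: (used_fraction, reclaimable_bytes)} for each member."""
    report = {}
    for entry in json.loads(status_json):
        status = entry["Status"]
        db_size = status["dbSize"]
        # Fall back to dbSize when the in-use figure is not reported.
        in_use = status.get("dbSizeInUse", db_size)
        report[entry["Endpoint"]] = (db_size / quota_bytes, db_size - in_use)
    return report

# Hypothetical sample: a member sitting exactly at the default quota.
sample = json.dumps([
    {"Endpoint": "https://127.0.0.1:2379",
     "Status": {"header": {"revision": 1200},
                "dbSize": 2147483648, "dbSizeInUse": 536870912}}
])
usage = quota_usage(sample)
```

A large gap between dbSize and dbSizeInUse suggests that a defrag would reclaim most of the file.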

The Attack Vector / Blast Radius

This is a full control-plane outage vector. The cascading failure path:

  1. Etcd alarm raised → etcd stops accepting writes.
  2. Kube-apiserver returns 503 or ResourceExhausted to all mutating requests.
  3. Controller Manager and Scheduler lose the ability to write status updates — running pods continue, but no new scheduling decisions are persisted.
  4. Operators and Helm releases begin timing out and retrying, generating even more write pressure on recovery.
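  5. In multi-member clusters, defrag acts only on the members it is pointed at. If you defrag a single member, the others keep their oversized database files and can raise the alarm again. This is a common mistake that extends the outage.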

High-churn workloads (HPA thrashing, frequent ConfigMap/Secret rotations, high-volume CRD controllers like Flux or ArgoCD) are the primary drivers. A single runaway controller writing status patches every few seconds can exhaust 2GB in hours.
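
A back-of-envelope calculation makes the "hours" estimate concrete. The sketch below is a deliberately crude upper bound: it assumes every write stores a full copy of the object, that nothing is compacted or reused in the meantime, and that the database starts empty. The workload numbers are hypothetical.

```python
def hours_to_fill(quota_bytes, objects, object_bytes, interval_seconds):
    """Hours until new revisions of `objects` objects add up to the quota."""
    bytes_per_second = objects * object_bytes / interval_seconds
    return quota_bytes / bytes_per_second / 3600

# Hypothetical workload: 100 objects of 20 KiB, each patched every 5 seconds.
hours = hours_to_fill(2 * 1024**3, objects=100,
                      object_bytes=20 * 1024, interval_seconds=5)
```

Under these assumptions the default 2GB quota lasts roughly an hour and a half. Real clusters will differ because compaction frees pages for reuse, but the shape of the arithmetic explains why write frequency matters more than object count.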


How to Fix It (The Solution)

Step 1 — Confirm the alarm

etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  alarm list
# Expected: memberID:XXXXXXXX alarm:NOSPACE

Step 2 — Compact old revisions

# Get current revision
REV=$(etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint status --write-out="json" | jq '.[0].Status.header.revision')

# Compact to current revision
etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  compact "$REV"
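
Compacting to the current revision discards all older history. If some recent history should stay readable, the target can be offset from the current revision instead. The following is a minimal sketch of that choice, reading the same endpoint status JSON as the jq expression above; keep_revisions is a hypothetical knob, not an etcdctl option.

```python
import json

def compaction_target(status_json, keep_revisions=0):
    """Pick the revision to pass to `etcdctl compact`.

    keep_revisions=0 mirrors the shell snippet: compact up to the current revision.
    """
    revision = json.loads(status_json)[0]["Status"]["header"]["revision"]
    # Never go below revision 1.
    return max(revision - keep_revisions, 1)

# Hypothetical status payload with the current revision at 1200.
status = json.dumps([{"Endpoint": "https://127.0.0.1:2379",
                      "Status": {"header": {"revision": 1200}}}])
target_all = compaction_target(status)
target_keep = compaction_target(status, keep_revisions=1000)
```

During an outage, compacting to the current revision reclaims the most space. The offset is more relevant for routine maintenance.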

Step 3 — Defrag ALL members (critical — do not skip members)

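# --cluster defrags every member returned by the member list.
# Defrag blocks a member while it runs, so expect brief unavailability per member.
etcdctl --endpoints=https://etcd1:2379,https://etcd2:2379,https://etcd3:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  defrag --cluster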
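
Because defrag blocks the member it runs on, some operators prefer to defrag one member at a time and check health in between rather than use a single cluster-wide call. The sketch below only builds the per-member command lines and does not run anything. It assumes the endpoint names and certificate paths used above, and the 60s --command-timeout value is an arbitrary choice (the etcdctl default is 5s, which a large database can exceed).

```python
TLS_FLAGS = [
    "--cacert=/etc/kubernetes/pki/etcd/ca.crt",
    "--cert=/etc/kubernetes/pki/etcd/server.crt",
    "--key=/etc/kubernetes/pki/etcd/server.key",
]

def defrag_commands(endpoints, timeout="60s"):
    """Build one etcdctl defrag invocation per endpoint, in the order given."""
    return [
        ["etcdctl", f"--endpoints={ep}", *TLS_FLAGS,
         f"--command-timeout={timeout}", "defrag"]
        for ep in endpoints
    ]

commands = defrag_commands([
    "https://etcd1:2379",
    "https://etcd2:2379",
    "https://etcd3:2379",
])
```

Each entry can be handed to subprocess.run(cmd, check=True) in turn. A common recommendation is to order the list so the current leader comes last.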

Step 4 — Disarm the alarm

etcdctl ... alarm disarm

Basic Fix — Raise the quota

# /etc/kubernetes/manifests/etcd.yaml (static pod)
  - command:
    - etcd
-   - --quota-backend-bytes=2147483648
+   - --quota-backend-bytes=8589934592

Warning: 8GB is the maximum the etcd documentation suggests for normal environments, not a limit etcd enforces. Larger databases make snapshots, defrags, and member recovery slower. If you're hitting 8GB, you have an architectural problem, not a quota problem.
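
The quota flag takes a plain byte count, so the values in the manifest are GiB multiples written out in full. A small helper (hypothetical name) keeps the arithmetic explicit when templating the flag:

```python
GIB = 1024**3  # bytes per GiB

def gib_to_quota_bytes(gib):
    """Convert whole GiB to the integer byte count passed to --quota-backend-bytes."""
    return gib * GIB

old_quota = gib_to_quota_bytes(2)  # the default quota shown in the diff
new_quota = gib_to_quota_bytes(8)  # the raised quota shown in the diff
```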


Enterprise Best Practice — Enable automatic compaction

# /etc/kubernetes/manifests/etcd.yaml
# Before: no etcd-side compaction flags configured
  - command:
    - etcd
+   - --auto-compaction-mode=periodic
+   - --auto-compaction-retention=1h
    - --quota-backend-bytes=8589934592

For kubeadm-managed clusters, patch via ClusterConfiguration:

# kubeadm-config.yaml
# Before: extraArgs had no compaction settings
 apiVersion: kubeadm.k8s.io/v1beta3
 kind: ClusterConfiguration
 etcd:
   local:
     extraArgs:
+      auto-compaction-mode: "periodic"
+      auto-compaction-retention: "1h"
+      quota-backend-bytes: "8589934592"


Prevention in CI/CD

1. Alert before you hit the wall. Prometheus rule:

- alert: EtcdDatabaseQuotaUsageHigh
  expr: etcd_mvcc_db_total_size_in_bytes / etcd_server_quota_backend_bytes > 0.80
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: "Etcd DB usage above 80% on {{ $labels.instance }}"

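2. OPA/Gatekeeper policy: flag or reject CRDs whose controllers are known to write high-frequency status patches without resourceVersion guards. Admission policies evaluate one request at a time, so the write frequency itself has to come from elsewhere, such as API server audit logs or request metrics.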

3. Enforce compaction in your etcd Helm chart or Ansible role — make auto-compaction-retention a required, non-defaultable parameter in your IaC. Fail the pipeline if it's absent:

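# Custom check sketch in plain Python (wire it into Checkov or any other pipeline step)
def check_compaction(etcd_args):
    # etcd_args: list of etcd command-line flags taken from the manifest
    if not any(arg.startswith("--auto-compaction-retention") for arg in etcd_args):
        raise SystemExit("Etcd compaction not configured")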

4. Regular defrag job — run etcdctl defrag as a CronJob on a maintenance window, not reactively during an outage.

5. Snapshot + size audit in CI — after every major deployment, run etcdctl snapshot save and assert db_size < threshold as a pipeline gate.
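
The size gate in item 5 can be as simple as comparing the snapshot file's size on disk with a budget. The sketch below assumes the snapshot has already been written with etcdctl snapshot save. The function names are hypothetical, and the budget (for example 80% of the quota) is a policy choice.

```python
import os

def within_budget(size_bytes, max_bytes):
    """True when the snapshot size does not exceed the budget."""
    return size_bytes <= max_bytes

def gate_snapshot(path, max_bytes):
    """Return the snapshot size, or stop the pipeline if it is over budget."""
    size = os.path.getsize(path)
    if not within_budget(size, max_bytes):
        raise SystemExit(
            f"etcd snapshot is {size} bytes, over the {max_bytes}-byte budget"
        )
    return size

# Example budget: 80% of an 8 GiB quota, as an integer byte count.
budget = 8 * 1024**3 * 80 // 100
```

Calling gate_snapshot("snapshot.db", budget) in a pipeline step fails the job with a non-zero exit code when the snapshot has grown past the budget. The file name here is a placeholder.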
