Fixing HPA 'failed to get cpu utilization: missing request for cpu' — Kubernetes Autoscaling Broken by Missing Resource Requests

Threat/Impact Level: HIGH | Downtime Risk: HIGH | Time to Fix: 5–10 mins


TL;DR

  • What broke: Your HPA targets CPU utilization, but the Deployment's container spec has no resources.requests.cpu defined — the metrics pipeline has nothing to divide against, so HPA emits this error and stops scaling entirely.
  • How to fix it: Add explicit resources.requests.cpu (and ideally limits.cpu) to every container in the Deployment spec.

The Incident (What Does the Error Mean?)

Raw error from kubectl describe hpa <name> or kubectl get events:

failed to get cpu utilization: missing request for cpu for container <container-name> in pod <pod-name>

The HPA controller calls the Metrics API to compute:

current_utilization% = (current_cpu_usage / requested_cpu) × 100

If requested_cpu is absent, the formula has no denominator. The controller refuses to produce a utilization percentage and sets the HPA condition ScalingActive to False, typically with reason FailedGetResourceMetric. Your workload is now running with zero autoscaling. Under load, pods will not be added. Existing pods will saturate CPU, or be OOMKilled if the backlog also drives memory up, with no relief.
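The arithmetic above can be sketched in a few lines of Python. This is a simplified model, not the controller's code: the function names and the pod dictionary shape are hypothetical, and it ignores the HPA tolerance band, pod readiness, and stabilization windows. The replica formula is the documented ceil(currentReplicas × currentValue / targetValue).

```python
import math


def parse_cpu_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity ("250m", "0.5", "120000000n") to millicores."""
    if quantity.endswith("n"):  # nanocores, as often reported by the Metrics API
        return int(quantity[:-1]) // 1_000_000
    if quantity.endswith("m"):  # millicores
        return int(quantity[:-1])
    return int(round(float(quantity) * 1000))  # whole or fractional cores


def cpu_utilization_percent(pods: list) -> int:
    """Sum usage and requests across pods, then express usage as a percent of requests.

    Each pod is a dict: {"name": str, "usage": str, "request": str or None}.
    """
    total_usage = 0
    total_request = 0
    for pod in pods:
        if not pod.get("request"):
            # Mirrors the failure described above: no request means no denominator.
            raise ValueError(f"missing request for cpu in pod {pod['name']}")
        total_usage += parse_cpu_millicores(pod["usage"])
        total_request += parse_cpu_millicores(pod["request"])
    return total_usage * 100 // total_request


def desired_replicas(current_replicas: int, current_percent: int, target_percent: int) -> int:
    """Documented HPA scaling formula, without the tolerance check."""
    return math.ceil(current_replicas * current_percent / target_percent)
```

With two pods using 200m and 300m against 250m requests each, utilization is 100%, and a 50% target asks for 4 replicas. Remove the request from either pod and the function raises instead of returning a number, which is the state the HPA is stuck in.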


The Attack Vector / Blast Radius

This is a silent failure. The Deployment runs. The HPA object exists. kubectl get hpa shows <unknown>/50% in the TARGETS column. Engineers assume autoscaling is working. It is not.

Cascading failure chain:

  1. Traffic spike hits the service.
  2. HPA evaluates — emits missing request for cpu — takes no action.
  3. Existing pods absorb load until CPU throttling kicks in (if limits exist) or until the node is saturated.
  4. If only memory limits are set (common pattern), there is no CPU throttling either. Pods consume unbounded CPU.
  5. Noisy neighbor effect degrades other workloads on the same node.
  6. Node memory pressure triggers pod evictions, and CPU starvation slows every co-located workload, across unrelated services.
  7. If the Deployment is behind an ingress with no circuit breaker, cascading 503s propagate to clients.

The blast radius is not limited to this Deployment. Node saturation is a cluster-wide event.
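Because the failure is silent, it helps to look for it deliberately. The sketch below assumes the parsed output of kubectl get hpa --all-namespaces -o json on a cluster serving autoscaling/v2, where conditions live under status.conditions. The function name is hypothetical.

```python
def find_inactive_hpas(hpa_list: dict) -> list:
    """Return (namespace, name, message) for every HPA whose ScalingActive condition is not True."""
    broken = []
    for item in hpa_list.get("items", []):
        meta = item.get("metadata", {})
        for cond in item.get("status", {}).get("conditions", []):
            if cond.get("type") == "ScalingActive" and cond.get("status") != "True":
                broken.append(
                    (meta.get("namespace", ""), meta.get("name", ""), cond.get("message", ""))
                )
    return broken
```

Feed it json.load(sys.stdin) from the kubectl command above and alert on any non-empty result. An HPA that reports no conditions at all is not flagged by this sketch.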


How to Fix It

Basic Fix

Add resources.requests.cpu to every container in the Deployment. This is the minimum required for HPA CPU metrics to function.

     spec:
       containers:
       - name: app
         image: my-app:latest
         resources:
+          requests:
+            cpu: "250m"
+            memory: "256Mi"
           limits:
             memory: "512Mi"
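To find which containers need the fix, including sidecars that are easy to overlook, a small check over the parsed manifest is enough. This is a sketch with a hypothetical function name. It expects the Deployment already loaded as a dictionary (for example from kubectl get deployment <name> -o json) and inspects only spec.template.spec.containers, not initContainers.

```python
def containers_missing_cpu_request(deployment: dict) -> list:
    """Return the names of containers that do not declare resources.requests.cpu."""
    pod_spec = deployment.get("spec", {}).get("template", {}).get("spec", {})
    missing = []
    for container in pod_spec.get("containers", []):
        requests = (container.get("resources") or {}).get("requests") or {}
        # Stricter than the API server: a container with only limits.cpu gets
        # its request copied from the limit, but is still reported here.
        if not requests.get("cpu"):
            missing.append(container.get("name", "<unnamed>"))
    return missing
```

An empty list means every container declares a CPU request explicitly, which is the condition the HPA needs.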

Enterprise Best Practice

Requests without limits create a different problem (noisy neighbor, no throttling). Set both. Use a LimitRange at the namespace level as a safety net: it injects default requests and limits into containers that omit them and rejects pods that exceed the configured maximum.

     spec:
       containers:
       - name: app
         image: my-app:latest
         resources:
-          limits:
-            memory: "512Mi"
+          requests:
+            cpu: "250m"
+            memory: "256Mi"
+          limits:
+            cpu: "1000m"
+            memory: "512Mi"

LimitRange to apply these defaults and caps at admission time:

apiVersion: v1
kind: LimitRange
metadata:
  name: enforce-resource-requests
  namespace: production
spec:
  limits:
  - type: Container
    default:
      cpu: "500m"
      memory: "256Mi"
    defaultRequest:
      cpu: "100m"
      memory: "128Mi"
    max:
      cpu: "2"
      memory: "2Gi"

A LimitRange with defaultRequest will inject CPU requests automatically for containers that omit them — but do not rely on this as your only control. It masks the misconfiguration rather than failing fast.
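The masking effect is easier to see with a small model of the defaulting. The sketch below is an approximation under stated assumptions, not the admission plugin itself: it assumes an explicit limit is first copied into a missing request, and that the LimitRange then fills only what is still absent. The function name is hypothetical, and max enforcement is left out.

```python
def apply_limit_range_defaults(resources: dict, limit_range_item: dict) -> dict:
    """Approximate the effective requests and limits of one container."""
    requests = dict(resources.get("requests", {}))
    limits = dict(resources.get("limits", {}))

    # Assumed step 1: an explicit limit with no matching request becomes the request.
    for name, value in limits.items():
        requests.setdefault(name, value)

    # Assumed step 2: the LimitRange fills in whatever is still missing.
    for name, value in limit_range_item.get("default", {}).items():
        limits.setdefault(name, value)
    for name, value in limit_range_item.get("defaultRequest", {}).items():
        requests.setdefault(name, value)

    return {"requests": requests, "limits": limits}


# Values taken from the LimitRange manifest above.
LIMIT_RANGE_ITEM = {
    "default": {"cpu": "500m", "memory": "256Mi"},
    "defaultRequest": {"cpu": "100m", "memory": "128Mi"},
}

# The original broken container: only a memory limit.
effective = apply_limit_range_defaults({"limits": {"memory": "512Mi"}}, LIMIT_RANGE_ITEM)
```

Under these assumptions the broken container ends up with a 100m CPU request it never declared. The HPA starts working, but it now computes utilization against a number nobody chose for this workload.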

Verify HPA is healthy after the fix:

kubectl describe hpa <hpa-name> -n <namespace> | grep -A5 Conditions
# Expect: AbleToScale: True, ScalingActive: True

kubectl get hpa <hpa-name> -n <namespace>
# TARGETS column should show actual utilization, e.g., 45%/50%



Prevention in CI/CD

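1. OPA admission policy (Rego): reject Deployments missing CPU requests at admission. The rule below reads the plain OPA admission-review input. Under Gatekeeper, the same logic goes inside a ConstraintTemplate as a violation rule that reads input.review.object instead of input.request.object: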

package kubernetes.admission

deny[msg] {
  input.request.kind.kind == "Deployment"
  container := input.request.object.spec.template.spec.containers[_]
  not container.resources.requests.cpu
  msg := sprintf("Container '%v' must define resources.requests.cpu", [container.name])
}

2. Checkov — Run in your pipeline against all manifests:

checkov -d ./k8s-manifests --check CKV_K8S_10,CKV_K8S_12
# CKV_K8S_10: CPU requests should be set
# CKV_K8S_12: Memory requests should be set

3. Kyverno ClusterPolicy — Simpler alternative to OPA for teams already running Kyverno:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-cpu-requests
spec:
  validationFailureAction: Enforce
  rules:
  - name: check-cpu-requests
    match:
      resources:
        kinds: [Deployment]
    validate:
      message: "CPU requests are required for all containers (HPA dependency)."
      pattern:
        spec:
          template:
            spec:
              containers:
              - resources:
                  requests:
                    cpu: "?*"

4. Helm lint + values schema — If your workloads are Helm-managed, enforce requests in values.schema.json:

"resources": {
  "type": "object",
  "required": ["requests"],
  "properties": {
    "requests": {
      "type": "object",
      "required": ["cpu", "memory"]
    }
  }
}

This fails helm lint, helm install, and helm upgrade before anything reaches the cluster.
