Fixing HPA 'failed to get cpu utilization: missing request for cpu' — Kubernetes Autoscaling Broken by Missing Resource Requests
Threat/Impact Level: HIGH | Downtime Risk: HIGH | Time to Fix: 5–10 mins
TL;DR
- What broke: Your HPA targets CPU utilization, but the Deployment's container spec has no resources.requests.cpu defined. The metrics pipeline has nothing to divide against, so the HPA emits this error and stops scaling entirely.
- How to fix it: Add explicit resources.requests.cpu (and ideally limits.cpu) to every container in the Deployment spec.
- Fast path: Use our Client-Side Sandbox above to auto-refactor this: paste your Deployment YAML and the tool generates the corrected spec locally without sending your config anywhere.
The Incident (What Does the Error Mean?)
Raw error from kubectl describe hpa <name> or kubectl get events:
failed to get cpu utilization: missing request for cpu for container <container-name> in pod <pod-name>
The HPA controller calls the Metrics API to compute:
current_utilization% = (current_cpu_usage / requested_cpu) * 100
If requested_cpu is zero or absent, the division is undefined. The controller refuses to produce a utilization percentage and sets the HPA condition ScalingActive: False, typically with reason FailedGetResourceMetric (AbleToScale usually stays True). Your workload is now running with zero autoscaling. Under load, pods will not be added, and the existing pods will saturate CPU with no relief.
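To make the arithmetic concrete, here is a minimal Python sketch of the utilization calculation and of the replica formula from the Kubernetes HPA documentation, ceil(currentReplicas * currentMetric / desiredMetric). The function names are hypothetical, and the sketch deliberately ignores what the real controller also does (averaging across pods, the tolerance band, stabilization windows, unready pods).

```python
import math
from typing import Optional


def parse_cpu_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as "250m" or "2" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(round(float(quantity) * 1000))


def cpu_utilization_percent(usage: str, request: Optional[str]) -> float:
    """Usage as a percentage of the request; fails when no request is set."""
    if request is None or parse_cpu_millicores(request) == 0:
        # Mirrors the HPA behaviour: no request means no utilization figure.
        raise ValueError("missing request for cpu")
    return 100.0 * parse_cpu_millicores(usage) / parse_cpu_millicores(request)


def desired_replicas(current_replicas: int, utilization: float, target: float) -> int:
    """Documented HPA rule: ceil(currentReplicas * currentMetric / desiredMetric)."""
    return math.ceil(current_replicas * utilization / target)
```

With a 250m request and 200m of usage, cpu_utilization_percent("200m", "250m") gives 80%, and desired_replicas(4, 80.0, 50.0) asks for 7 replicas against a 50% target. With request=None the first function raises, which is the point at which the real HPA stops scaling.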
The Attack Vector / Blast Radius
This is a silent failure. The Deployment runs. The HPA object exists. kubectl get hpa shows <unknown>/50% in the TARGETS column. Engineers assume autoscaling is working. It is not.
Cascading failure chain:
- Traffic spike hits the service.
- HPA evaluates, emits missing request for cpu, and takes no action.
- Existing pods absorb load until CPU throttling kicks in (if limits exist) or until the node is saturated.
- If only memory limits are set (common pattern), there is no CPU throttling either. Pods consume unbounded CPU.
- Noisy neighbor effect degrades other workloads on the same node.
- Node-level memory pressure or CPU starvation triggers OOM kills, failed health probes, and pod evictions across unrelated services.
- If the Deployment is behind an ingress with no circuit breaker, cascading 503s propagate to clients.
The blast radius is not limited to this Deployment. Node saturation is a cluster-wide event.
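Because the failure is silent, it helps to scan for it instead of waiting for a traffic spike. The sketch below is one hedged way to do that in Python: it takes the parsed output of kubectl get hpa -A -o json and lists every HPA whose ScalingActive condition is False. The function name is hypothetical, and it assumes the autoscaling/v2 status shape (status.conditions with type, status, reason, message).

```python
from typing import Dict, List


def find_inactive_hpas(hpa_list: Dict) -> List[str]:
    """Return "namespace/name: reason: message" for HPAs with ScalingActive=False."""
    broken = []
    for hpa in hpa_list.get("items", []):
        meta = hpa.get("metadata", {})
        for cond in hpa.get("status", {}).get("conditions", []):
            if cond.get("type") == "ScalingActive" and cond.get("status") == "False":
                broken.append(
                    f'{meta.get("namespace")}/{meta.get("name")}: '
                    f'{cond.get("reason", "")}: {cond.get("message", "")}'
                )
    return broken
```

Feed it json.load() of the kubectl output. Note that ScalingActive can also be False for other reasons (for example, a target scaled to zero replicas), so read the reason and message before acting on a hit.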
How to Fix It
Basic Fix
Add resources.requests.cpu to every container in the Deployment. This is the minimum required for HPA CPU metrics to function.
  spec:
    containers:
    - name: app
      image: my-app:latest
      resources:
+       requests:
+         cpu: "250m"
+         memory: "256Mi"
        limits:
          memory: "512Mi"
Enterprise Best Practice
Requests without limits create a different problem (noisy neighbor, no throttling). Set both. Use a LimitRange at the namespace level as a safety net: it injects defaults into containers that omit resource specs and rejects containers that exceed the configured maximum.
  spec:
    containers:
    - name: app
      image: my-app:latest
      resources:
-       limits:
-         memory: "512Mi"
+       requests:
+         cpu: "250m"
+         memory: "256Mi"
+       limits:
+         cpu: "1000m"
+         memory: "512Mi"
LimitRange to enforce this at admission time:
apiVersion: v1
kind: LimitRange
metadata:
  name: enforce-resource-requests
  namespace: production
spec:
  limits:
  - type: Container
    default:
      cpu: "500m"
      memory: "256Mi"
    defaultRequest:
      cpu: "100m"
      memory: "128Mi"
    max:
      cpu: "2"
      memory: "2Gi"
A LimitRange with defaultRequest will inject CPU requests automatically for containers that omit them — but do not rely on this as your only control. It masks the misconfiguration rather than failing fast.
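To see what that injection does to a spec, here is a simplified Python model of the defaulting. It is a sketch under stated assumptions, not the admission plugin's actual code: the function name is hypothetical, it ignores min/max validation, and it encodes the behaviour described in the Kubernetes LimitRange documentation, where a container that sets a limit but no request gets a request equal to that limit rather than the defaultRequest.

```python
from typing import Dict, Optional


def apply_limitrange_defaults(
    resources: Optional[Dict], default: Dict, default_request: Dict
) -> Dict:
    """Simplified model of LimitRange defaulting for a single container."""
    resources = resources or {}
    limits = dict(resources.get("limits", {}))
    requests = dict(resources.get("requests", {}))
    for name in sorted(set(default) | set(default_request)):
        explicit_limit = name in limits
        if not explicit_limit and name in default:
            limits[name] = default[name]  # inject the default limit
        if name not in requests:
            if explicit_limit:
                requests[name] = limits[name]  # request follows the explicit limit
            elif name in default_request:
                requests[name] = default_request[name]  # inject defaultRequest
    return {"requests": requests, "limits": limits}
```

Applied to the container from this article (only limits.memory: "512Mi") with the LimitRange values above, the model yields requests.cpu "100m" and limits.cpu "500m", while requests.memory becomes "512Mi". The HPA starts working, but against a 100m request that nobody chose deliberately, which is why the default should stay a safety net.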
Verify HPA is healthy after the fix:
kubectl describe hpa <hpa-name> -n <namespace> | grep -A5 Conditions
# Expect: AbleToScale: True, ScalingActive: True
kubectl get hpa <hpa-name> -n <namespace>
# TARGETS column should show actual utilization, e.g., 45%/50%
Prevention in CI/CD
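1. OPA admission policy (Rego): Reject Deployments missing CPU requests at admission. The rule is written for a plain OPA admission webhook; a Gatekeeper ConstraintTemplate would express the same check as violation[{"msg": msg}] over input.review.object: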
package kubernetes.admission

deny[msg] {
    input.request.kind.kind == "Deployment"
    container := input.request.object.spec.template.spec.containers[_]
    not container.resources.requests.cpu
    msg := sprintf("Container '%v' must define resources.requests.cpu", [container.name])
}
2. Checkov — Run in your pipeline against all manifests:
checkov -d ./k8s-manifests --check CKV_K8S_10,CKV_K8S_12
# CKV_K8S_10: CPU requests should be set
# CKV_K8S_12: Memory requests should be set
3. Kyverno ClusterPolicy — Simpler alternative to OPA for teams already running Kyverno:
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-cpu-requests
spec:
  validationFailureAction: Enforce
  rules:
  - name: check-cpu-requests
    match:
      resources:
        kinds: [Deployment]
    validate:
      message: "CPU requests are required for all containers (HPA dependency)."
      pattern:
        spec:
          template:
            spec:
              containers:
              - resources:
                  requests:
                    cpu: "?*"
4. Helm lint + values schema — If your workloads are Helm-managed, enforce requests in values.schema.json:
"resources": {
  "type": "object",
  "required": ["requests"],
  "properties": {
    "requests": {
      "type": "object",
      "required": ["cpu", "memory"]
    }
  }
}
This fails helm install and helm upgrade before anything reaches the cluster.
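If none of these tools is in the pipeline yet, the same rule is small enough to run as a script. The following is a minimal Python sketch, assuming the manifest has already been parsed into a dict (for example from kubectl get deployment -o json or a YAML loader of your choice); the function name is hypothetical.

```python
from typing import Dict, List


def containers_missing_cpu_request(deployment: Dict) -> List[str]:
    """Return the names of containers in a Deployment with no resources.requests.cpu."""
    pod_spec = deployment.get("spec", {}).get("template", {}).get("spec", {})
    missing = []
    for container in pod_spec.get("containers", []):
        # "resources" or "requests" may be absent or explicitly null.
        requests = (container.get("resources") or {}).get("requests") or {}
        if not requests.get("cpu"):
            missing.append(container.get("name", "<unnamed>"))
    return missing
```

A CI step can fail the build whenever the returned list is non-empty. Like the policies above, this check is stricter than the API server: a container that sets limits.cpu without requests.cpu gets its request defaulted to the limit at creation, but requiring an explicit request keeps the HPA's denominator a deliberate choice.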