Why does kubectl top pods show 'error: metrics not available yet' even though Metrics Server pods are Running?

A Running status only means the Metrics Server container started; it does not mean it is successfully scraping kubelets. The usual cause is an x509 certificate verification failure when Metrics Server connects to each kubelet on port 10250. Check the Metrics Server logs with `kubectl logs -n kube-system deploy/metrics-server` for 'certificate signed by unknown authority' errors. The fix is either adding --kubelet-insecure-tls (non-prod only) or giving the kubelets CA-signed serving certificates and pointing Metrics Server at that CA with --kubelet-certificate-authority (production).

Is --kubelet-insecure-tls safe to use in a production Kubernetes cluster?

No. --kubelet-insecure-tls disables verification of the serving certificate presented by every kubelet in the cluster. On a compromised internal network, this enables metrics poisoning: an attacker who can intercept the connection could serve fabricated CPU/memory metrics, causing HPAs to make incorrect scaling decisions. It also conflicts with the kubelet certificate controls in the CIS Kubernetes Benchmark (cited here as control 4.2.10; numbering varies by benchmark version) and is likely to be flagged in SOC2/PCI-DSS audits. Use it only in local dev clusters (kind, minikube) and enforce its absence in production via OPA/Conftest policies in CI.

How do I get Metrics Server working on a kubeadm cluster without disabling TLS verification?

Enable kubelet serving certificate bootstrap in kubelet-config.yaml by setting `serverTLSBootstrap: true`. After restarting the kubelets, review the pending CSRs and approve the kubelet-serving ones. The one-liner `kubectl get csr | grep Pending | awk '{print $1}' | xargs kubectl certificate approve` approves every pending request, so check the list first. Then mount /etc/kubernetes/pki/ca.crt into the Metrics Server pod and pass `--kubelet-certificate-authority=/etc/kubernetes/pki/ca.crt` as a container arg. This establishes a proper trust chain without bypassing TLS.

How to Fix kubectl top pods Showing Empty Metrics: Metrics Server kubelet-insecure-tls Configuration

Threat/Impact Level: MEDIUM | Exploitability/Downtime Risk: LOW (operational blind spot, not direct RCE) | Time to Fix: 5 mins

TL;DR

What broke: Metrics Server pods are running but kubectl top pods/nodes returns error: metrics not available yet or hangs empty — because Metrics Server cannot scrape kubelet /metrics/resource endpoints over TLS without valid cert verification.
How to fix it: In the Metrics Server Deployment args, either add --kubelet-insecure-tls (dev/non-prod) or mount the proper CA cert and pass --kubelet-certificate-authority (production).

The Incident (What does the error mean?)

You run kubectl top pods -n production and get:

Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)

or the HPA silently stops scaling because it has no CPU/memory signal. Metrics Server logs tell the real story:

E0612 14:32:01.783204       1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.1.10:10250/metrics/resource\": x509: certificate signed by unknown authority" node="worker-node-01"

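Immediate consequence: HPA controllers go blind. kubectl top is dead. Any autoscaling tied to CPU/memory metrics stops functioning. On kubeadm clusters the kubelet generates a self-signed serving certificate by default. Metrics Server expects kubelet serving certs to be signed by a CA it trusts (the cluster CA unless configured otherwise), so verification fails with "certificate signed by unknown authority". The same error appears on clusters with a self-managed CA when that CA bundle is never made available to Metrics Server.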
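To see how widespread the failure is, the scraper errors can be reduced to a list of affected nodes. The following is a minimal sketch, assuming the log lines have the same shape as the sample above; `failed_x509_nodes` is a hypothetical helper name, not part of kubectl or Metrics Server.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads Metrics Server logs on stdin and prints the
# unique node names whose scrape failed with an x509 error.
# Assumes each error line ends with node="<name>", as in the sample log line.
failed_x509_nodes() {
  grep 'x509:' \
    | sed -n 's/.*node="\([^"]*\)".*/\1/p' \
    | sort -u
}
```

Usage, against a live cluster: `kubectl logs -n kube-system deploy/metrics-server | failed_x509_nodes`. If every node is listed, the problem is cluster-wide trust configuration rather than a single misissued certificate.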

The Attack Vector / Blast Radius

This is an operational security misconfiguration, not a direct exploit vector — but the blast radius is significant:

Autoscaler blindness: HPA cannot retrieve metrics.k8s.io API objects. Pods under load will not scale out. You get cascading OOM kills or latency spikes with zero automated remediation.
The insecure bypass risk: The common "just add --kubelet-insecure-tls" fix disables certificate verification entirely between Metrics Server and every kubelet endpoint. In a multi-tenant cluster or one exposed to a compromised internal network, a MITM attacker on the path between Metrics Server and the kubelets could serve fake metrics, causing HPAs to scale down healthy workloads or refuse to scale up under real load. This is a metrics poisoning attack surface.
Compliance failure: The CIS Kubernetes Benchmark includes kubelet certificate controls (cited here as Control 4.2.10; numbering varies by benchmark version), and auditors working to PCI-DSS or SOC2 commonly treat disabled certificate verification as a finding. Expect --kubelet-insecure-tls in production to be raised in an audit.

How to Fix It (The Solution)

Basic Fix (Dev / Non-Prod Only)

Add --kubelet-insecure-tls to the Metrics Server container args. This skips x509 verification against kubelet certs.

# metrics-server Deployment spec.template.spec.containers[0].args
  args:
    - --cert-dir=/tmp
    - --secure-port=4443
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
+   - --kubelet-insecure-tls

If deploying via Helm:

# values.yaml
 defaultArgs:
   - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
+  - --kubelet-insecure-tls

Do not use this in production. Full stop.

Enterprise Best Practice (Production)

The correct fix is to provision valid kubelet serving certificates signed by a CA that Metrics Server trusts, then mount that CA into the Metrics Server pod.

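Step 1: Ensure kubelets are using properly signed serving certs. With kubeadm, enable serving certificate bootstrap so each kubelet requests its serving cert from the cluster CA instead of self-signing it:

# kubelet-config.yaml (on each node)
  rotateCertificates: true
+ serverTLSBootstrap: true

Restart the kubelet on each node, then approve the pending serving-certificate CSRs (review the list first, since this command approves every pending request):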

kubectl get csr | grep Pending | awk '{print $1}' | xargs kubectl certificate approve
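Because the one-liner above approves anything that is pending, a narrower filter may be preferable. The sketch below assumes the default table output of `kubectl get csr`, where the first column is the name and the last is the condition; `pending_kubelet_serving_csrs` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `kubectl get csr` table output on stdin and
# prints only the names of Pending CSRs for the kubelet-serving signer.
# Assumes NAME is the first column and CONDITION is the last column.
pending_kubelet_serving_csrs() {
  awk '
    $NF == "Pending" {
      for (i = 2; i < NF; i++) {
        if ($i == "kubernetes.io/kubelet-serving") { print $1; break }
      }
    }
  '
}
```

Run `kubectl get csr | pending_kubelet_serving_csrs` first to review the names, and only then pipe the result to `xargs kubectl certificate approve`. It is still worth confirming that each requestor is a node you expect before approving.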

Step 2: Mount the cluster CA into Metrics Server and reference it:

# metrics-server Deployment
  spec:
    containers:
    - name: metrics-server
      args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP
-       - --kubelet-insecure-tls
+       - --kubelet-certificate-authority=/etc/kubernetes/pki/ca.crt
+     volumeMounts:
+       - name: cluster-ca
+         mountPath: /etc/kubernetes/pki/ca.crt
+         readOnly: true
+   volumes:
+     - name: cluster-ca
+       hostPath:
+         path: /etc/kubernetes/pki/ca.crt
+         type: File

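Alternatively, use a projected ConfigMap volume containing the CA bundle instead of a hostPath mount; hostPath is itself a security finding in hardened clusters. On a default kubeadm cluster the kubelet-serving CSRs are typically signed by the cluster CA, so the explicit --kubelet-certificate-authority flag matters most when a different CA signs the kubelet certificates.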


Prevention in CI/CD

1. Conftest / OPA policy: block --kubelet-insecure-tls on the metrics-server Deployment in kube-system:

# policy/deny_insecure_metrics_server.rego
package main

deny[msg] {
  input.kind == "Deployment"
  input.metadata.namespace == "kube-system"
  container := input.spec.template.spec.containers[_]
  contains(container.name, "metrics-server")
  arg := container.args[_]
  arg == "--kubelet-insecure-tls"
  msg := "POLICY VIOLATION: metrics-server --kubelet-insecure-tls is forbidden in production. Use --kubelet-certificate-authority instead."
}

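2. Checkov inline scan in your pipeline (these are general workload-hardening checks and do not inspect the Metrics Server flags, so run them alongside the policy in item 1 rather than instead of it):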

checkov -f metrics-server-deployment.yaml --check CKV_K8S_28,CKV_K8S_30

3. Kube-bench scheduled job: Run the kube-bench node checks as a CronJob targeting control 4.2.10 (confirm the number against the benchmark version your kube-bench release ships) to continuously assert kubelet serving cert configuration hasn't drifted.

4. Helm chart pinning: Lock your metrics-server chart version in your GitOps repo (fleet.yaml or helmrelease.yaml) and gate upgrades through a PR policy that requires the OPA policy check to pass in CI before merge.
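Where Conftest is not yet wired into a pipeline, a coarse text check on the rendered manifest can act as a stopgap for item 1. This is a sketch of a different, weaker technique (pattern matching, not policy evaluation), and `has_insecure_kubelet_tls` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads a rendered manifest on stdin and exits 0 if a
# YAML list item sets --kubelet-insecure-tls (bare or =true), 1 otherwise.
# Commented-out lines are ignored because the pattern is anchored to the
# start of the list item.
has_insecure_kubelet_tls() {
  grep -Eq -- '^[[:space:]]*-[[:space:]]+"?--kubelet-insecure-tls(=true)?"?[[:space:]]*$'
}
```

A pipeline step could then fail the build with something like `if has_insecure_kubelet_tls < metrics-server-deployment.yaml; then exit 1; fi`. Unlike the Rego policy, this check does not look at the resource kind, namespace, or container name, so treat it as a safety net only.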