What is a Linkerd identity trust domain and why does it need to match the issuer cert?

The trust domain is the SPIFFE authority component (e.g., `cluster.local`) that scopes all workload identities in the mesh. Every SVID issued by the Linkerd identity controller takes the form `spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>`. The issuer certificate's Subject Alternative Name URI must contain the same trust domain. If they differ, the proxy cannot validate peer SVIDs against the trust bundle: mTLS peer authentication fails entirely, and the injector webhook blocks new pod admission.
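To make the identity format concrete, here is a minimal Python sketch that splits an ID such as `spiffe://cluster.local/ns/default/sa/myapp` into its trust domain, namespace, and service account, then compares the trust domain against an expected value. The function names `parse_spiffe_id` and `same_trust_domain` are illustrative assumptions, not part of Linkerd or any SPIFFE library.

```python
from urllib.parse import urlparse


def parse_spiffe_id(spiffe_id: str) -> dict:
    """Split spiffe://<trust-domain>/ns/<namespace>/sa/<service-account> into parts."""
    parsed = urlparse(spiffe_id)
    if parsed.scheme != "spiffe" or not parsed.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    parts = parsed.path.strip("/").split("/")
    # Expect exactly: ns/<namespace>/sa/<service-account>
    if len(parts) != 4 or parts[0] != "ns" or parts[2] != "sa":
        raise ValueError(f"unexpected workload path: {parsed.path!r}")
    return {
        "trust_domain": parsed.netloc,
        "namespace": parts[1],
        "service_account": parts[3],
    }


def same_trust_domain(spiffe_id: str, expected: str) -> bool:
    """True when the ID's trust domain equals the expected trust domain."""
    return parse_spiffe_id(spiffe_id)["trust_domain"] == expected
```

Calling `same_trust_domain("spiffe://prod.example.com/ns/default/sa/myapp", "cluster.local")` returns `False`, which is the mismatch this article is about.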

Can I fix the trust domain mismatch without rotating the CA certificate?

Only if the cert was generated with the correct trust domain and the Helm `identityTrustDomain` value is simply wrong. In that case, update the Helm value and restart the control plane components. However, if the CA cert itself was generated with the wrong SAN URI (the more common case after a copy-paste error in cert generation), you must rotate the CA — there is no way to retroactively change a cert's SAN without reissuing it.

How do I rotate Linkerd's trust anchor without downtime?

Linkerd supports a dual-trust-anchor bundle during rotation. Generate the new CA, then concatenate both the old and new `ca.crt` PEMs into the `identityTrustAnchorsPEM` Helm value. Upgrade the control plane with both anchors present. Roll all workloads to get SVIDs signed by the new issuer. Once all pods are running new SVIDs (verify with `linkerd edges`), remove the old anchor from the bundle and do a final upgrade. This keeps old and new SVIDs mutually valid during the transition window.
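The concatenation step can be sketched in Python as follows. This is a minimal illustration, assuming each input holds exactly one PEM certificate; `build_trust_bundle` and `count_certs` are hypothetical helper names, and the sketch does not parse or validate the certificates themselves.

```python
BEGIN = "-----BEGIN CERTIFICATE-----"
END = "-----END CERTIFICATE-----"


def build_trust_bundle(old_pem: str, new_pem: str) -> str:
    """Concatenate the old and new trust anchors into one PEM bundle, old first."""
    certs = []
    for pem in (old_pem, new_pem):
        pem = pem.strip()
        # Only a shape check: one BEGIN/END pair per input, no cryptographic validation.
        if pem.count(BEGIN) != 1 or not (pem.startswith(BEGIN) and pem.endswith(END)):
            raise ValueError("each input must contain exactly one PEM certificate")
        certs.append(pem)
    return "\n".join(certs) + "\n"


def count_certs(bundle: str) -> int:
    """Count certificates in a bundle by counting BEGIN markers."""
    return bundle.count(BEGIN)
```

Writing the result to a file (for example `bundle.crt`, a name chosen here for illustration) lets you pass it to Helm with `--set-file identityTrustAnchorsPEM=bundle.crt`. Before the first upgrade, `count_certs` should report 2; after the old anchor is removed, 1.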

Fixing Linkerd 'Proxy Injection Failed' Identity Trust Domain Mismatch in Production

Threat/Impact Level: CRITICAL | Exploitability/Downtime Risk: HIGH | Time to Fix: 15–30 mins

TL;DR

What broke: The identityTrustDomain set in the Linkerd control plane (e.g., cluster.local) does not match the SPIFFE trust domain embedded in the identity issuer certificate, causing the proxy injector webhook to reject all annotated pods.
How to fix it: Re-align the identityTrustDomain Helm value with the SAN URI in your issuer cert, then re-roll affected workloads. If the cert was generated with the wrong domain, you must rotate it.

The Incident (What Does the Error Mean?)

Raw error from kubectl describe pod or the injector webhook logs:

Error from server: error when creating "deploy.yaml":
admission webhook "linkerd-proxy-injector.linkerd.io" denied the request:
proxy injection failed: identity trust domain mismatch:
  issuer has domain "prod.example.com" but control plane expects "cluster.local"

Or from linkerd check:

× issuer cert is signed by the trust anchor
    issuer certificate is not signed by any of the trust anchors
    see https://linkerd.io/2/checks/#l5d-identity-issuer-cert-signed-by-trust-anchor

Immediate consequence: The proxy injector webhook, a mutating admission webhook, rejects pod creation in every namespace annotated with linkerd.io/inject: enabled, so your deployment rolls out zero replicas. In an existing cluster mid-upgrade, running pods can no longer renew their SPIFFE SVIDs, so mTLS sessions start failing as those certificates expire within the SVID TTL window (default: 24h).

The Attack Vector / Blast Radius

This isn't just an ops nuisance — it's a mesh-wide identity collapse.

Why it's dangerous:

mTLS falls back silently in some configurations. If proxy.defaultInboundPolicy is set to all-unauthenticated, workloads that were injected before the mismatch continue running but with no mutual TLS. An attacker with network access to the pod CIDR can intercept east-west traffic in plaintext.
SPIFFE SVID chaining breaks. Linkerd's identity service issues X.509 SVIDs scoped to the trust domain (e.g., spiffe://cluster.local/ns/default/sa/myapp). A mismatched domain means the proxy cannot validate peer certificates against the trust bundle — all peer authentication policies evaluate to DENY or skip, depending on your Server and AuthorizationPolicy resources.
Blast radius on upgrade: This most commonly surfaces during linkerd upgrade when a custom CA was generated with a hardcoded domain and the new Helm chart defaults differ. Every namespace with injection enabled is simultaneously affected. Rollback requires cert rotation, not just a Helm rollback.
Audit gap: Because pods fail at admission, no workload logs are generated — the failure is invisible to application-level alerting. Only webhook audit logs or linkerd check catches it.

How to Fix It

Step 1: Confirm the Mismatch

# Extract the trust domain the control plane expects
# (.data.values holds the Helm values as YAML, so grep for the key)
kubectl -n linkerd get cm linkerd-config -o jsonpath='{.data.values}' | \
  grep '^identityTrustDomain:'

# Extract the trust domain burned into the issuer cert
# (with scheme kubernetes.io/tls the secret key is tls\.crt instead of crt\.pem)
kubectl -n linkerd get secret linkerd-identity-issuer \
  -o jsonpath='{.data.crt\.pem}' | base64 -d | \
  openssl x509 -noout -text | grep -A1 "Subject Alternative Name"

You will see something like:

# ConfigMap says:  cluster.local
# Cert SAN says:   URI:spiffe://prod.example.com

That delta is your outage.

Basic Fix — Align Helm Value to Existing Cert

If the cert was intentionally generated with prod.example.com and the Helm value is wrong:

# linkerd-values-override.yaml (identityTrustDomain is a top-level value, not nested under identity)
 identity:
   issuer:
     scheme: kubernetes.io/tls
-identityTrustDomain: cluster.local
+identityTrustDomain: prod.example.com

helm upgrade linkerd-control-plane linkerd/linkerd-control-plane \
  -n linkerd \
  -f linkerd-values-override.yaml \
  --reuse-values

# Force restart the injector and identity controller
kubectl -n linkerd rollout restart deploy/linkerd-proxy-injector
kubectl -n linkerd rollout restart deploy/linkerd-identity

# Validate
linkerd check

Enterprise Best Practice — Rotate the CA to Match Cluster Convention

If you're standardizing on cluster.local (recommended for portability) and the cert is wrong, rotate the trust anchor. Do not skip the step-cli verification.

# Generate new root CA with correct trust domain
step certificate create root.linkerd.cluster.local ca.crt ca.key \
  --profile root-ca \
  --no-password \
  --insecure \
  --san "root.linkerd.cluster.local"

# Generate issuer cert signed by new root
step certificate create identity.linkerd.cluster.local issuer.crt issuer.key \
  --profile intermediate-ca \
  --not-after 8760h \
  --no-password \
  --insecure \
  --ca ca.crt \
  --ca-key ca.key

# Helm upgrade with explicit cert injection
# (identityTrustDomain and identityTrustAnchorsPEM are top-level values)
 identity:
   issuer:
     scheme: kubernetes.io/tls
+identityTrustDomain: cluster.local
+identityTrustAnchorsPEM: |
+  <contents of ca.crt>

helm upgrade linkerd-control-plane linkerd/linkerd-control-plane \
  -n linkerd \
  --set-file identityTrustAnchorsPEM=ca.crt \
  --set-file identity.issuer.tls.crtPEM=issuer.crt \
  --set-file identity.issuer.tls.keyPEM=issuer.key \
  --set identityTrustDomain=cluster.local \
  --reuse-values

# Re-roll workloads to get new SVIDs (this restarts every Deployment in every namespace)
for ns in $(kubectl get ns -o jsonpath='{.items[*].metadata.name}'); do
  kubectl -n "$ns" rollout restart deploy 2>/dev/null
done

linkerd check --proxy
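The re-roll loop above restarts Deployments in every namespace, meshed or not. If you would rather restart only namespaces that opt in to injection, a small filter over the output of `kubectl get ns -o json` is one option. The sketch below is a hypothetical helper (`injected_namespaces` is not a Linkerd or kubectl feature), and it only looks at namespace-level annotations; workloads annotated individually would be missed.

```python
from typing import List


def injected_namespaces(ns_list: dict) -> List[str]:
    """Return names of namespaces annotated with linkerd.io/inject: enabled.

    `ns_list` is the parsed JSON document from `kubectl get ns -o json`.
    """
    names = []
    for item in ns_list.get("items", []):
        meta = item.get("metadata", {})
        # Namespaces without annotations may omit the key or carry null.
        annotations = meta.get("annotations") or {}
        if annotations.get("linkerd.io/inject") == "enabled":
            names.append(meta["name"])
    return sorted(names)
```

Feed it `json.load(sys.stdin)` from the kubectl command and run `kubectl -n <name> rollout restart deploy` for each returned name.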

⚠️ During the rotation window, pods with old SVIDs (signed by the old CA) and pods with new SVIDs cannot mutually authenticate. Schedule this in a maintenance window, or use Linkerd's trust anchor rotation procedure, which supports a dual-trust-anchor bundle.

Prevention in CI/CD

1. Pre-Flight `linkerd check` in Your Deploy Pipeline

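# .github/workflows/deploy.yaml (or equivalent)
- name: Linkerd pre-flight check
  run: |
    # Use plain `linkerd check`: --pre only validates a cluster before Linkerd is installed.
    # linkerd check exits non-zero on failure; pipefail keeps that status through tee.
    set -o pipefail
    if ! linkerd check 2>&1 | tee /tmp/linkerd-check.log; then
      echo "Linkerd control plane check failed. Blocking deploy."
      exit 1
    fi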

2. OPA/Gatekeeper Policy — Enforce Trust Domain Annotation Consistency

# opa-linkerd-trustdomain.rego
package linkerd.trustdomain

violation[{"msg": msg}] {
  input.review.object.kind == "ConfigMap"
  input.review.object.metadata.name == "linkerd-config"
  input.review.object.metadata.namespace == "linkerd"
  # data.values is a YAML string, so match the YAML key/value form
  config_values := input.review.object.data.values
  not contains(config_values, "identityTrustDomain: cluster.local")
  msg := "linkerd-config identityTrustDomain must be cluster.local"
}

3. cert-manager + Trust Domain Pinning

Use cert-manager with a ClusterIssuer to enforce the correct SAN on every generated cert:

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: linkerd-identity-issuer
  namespace: linkerd
spec:
  secretName: linkerd-identity-issuer
  duration: 8760h
  renewBefore: 720h
  isCA: true
  privateKey:
    algorithm: ECDSA
  dnsNames:
    - identity.linkerd.cluster.local
  uris:
    - spiffe://cluster.local  # <-- THIS must match identityTrustDomain
  issuerRef:
    name: linkerd-trust-anchor
    kind: ClusterIssuer

4. Checkov / Helm Chart Linting

# Render the chart and run Checkov against the rendered manifests
helm template linkerd-control-plane linkerd/linkerd-control-plane \
  -f values.yaml > rendered.yaml

checkov -f rendered.yaml --check CKV2_K8S_6

# Custom script: cross-check rendered trust domain vs. CA cert SAN
python3 scripts/validate_linkerd_trust_domain.py rendered.yaml ca.crt
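The custom script referenced above is not shown in this article. As a rough idea of what its core could look like, here is a hypothetical sketch: it pulls identityTrustDomain out of the rendered manifests with a regular expression and pulls the trust domain out of the text printed by `openssl x509 -noout -text`. All function names are assumptions, and the sketch takes the article's premise that the certificate carries a `spiffe://` SAN URI.

```python
import re
from typing import Optional


def trust_domain_from_values(rendered: str) -> Optional[str]:
    """Find the first `identityTrustDomain: <value>` line in rendered YAML text."""
    match = re.search(
        r"^\s*identityTrustDomain:\s*[\"']?([^\"'\s]+)", rendered, re.MULTILINE
    )
    return match.group(1) if match else None


def trust_domain_from_san(openssl_text: str) -> Optional[str]:
    """Find the trust domain in a `URI:spiffe://...` SAN entry of openssl text output."""
    match = re.search(r"URI:spiffe://([^/,\s]+)", openssl_text)
    return match.group(1) if match else None


def domains_match(rendered: str, openssl_text: str) -> bool:
    """True only when both trust domains are present and identical."""
    expected = trust_domain_from_values(rendered)
    actual = trust_domain_from_san(openssl_text)
    return expected is not None and expected == actual
```

A wrapper script would read `rendered.yaml`, capture the openssl output for `ca.crt` (for example via `subprocess.run`), and exit non-zero when `domains_match` returns `False`, so the pipeline step fails.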

Pin this validation in your Helm pre-upgrade hook and your GitOps reconciliation loop (Flux/ArgoCD pre-sync hook). A 30-second check here prevents a 30-minute outage.