
Fixing 'unable to upgrade connection: 403 Forbidden' in EKS kubectl logs with IRSA

Threat/Impact Level: HIGH | Downtime Risk: HIGH | Time to Fix: 15–30 mins


TL;DR

  • What broke: kubectl logs -f holds open a streaming connection that the EKS API server proxies to the kubelet (exec, attach, and port-forward additionally request a protocol upgrade on the same path). A 403 Forbidden means the IAM principal or the Kubernetes RBAC subject is being denied at one of three checkpoints: the aws-auth ConfigMap mapping, RBAC access to the pods/log subresource, or a node IAM instance profile that lacks AmazonEKSWorkerNodePolicy.
  • How to fix it: Verify that the aws-auth ConfigMap maps your IAM role correctly, confirm that your RBAC ClusterRole grants get on pods/log (and create on pods/exec if you also need exec), and ensure the EC2 node instance profile has the required managed policies attached.

The Incident (What Does the Error Mean?)

$ kubectl logs -f my-pod-7d9f8b-xkz2p
error: unable to upgrade connection: Forbidden (user=arn:aws:iam::123456789012:assumed-role/my-role/session, verb=create, resource=nodes, subresource=proxy)

kubectl logs -f is not a simple one-shot REST GET. It holds open a streaming HTTP response that the EKS API server proxies down to the kubelet's /containerLogs/ endpoint on the node; kubectl exec, attach, and port-forward go further and request a protocol upgrade (SPDY or WebSocket) over the same path. The EKS API server enforces both IAM authentication (via the aws-iam-authenticator webhook) and Kubernetes RBAC authorization on this path. A 403 means the request was authenticated but denied at the authorization layer. Your pod is running and your cluster is up, but you are blind: no logs, no exec, no port-forward.
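
The parenthesised part of the error already names the RBAC tuple that was denied, which tells you which rule to look for. As a minimal sketch (the function name parse_forbidden is hypothetical, and the regular expression assumes the key=value layout of the sample error above), the fields can be extracted like this:

```python
import re

# Matches key=value pairs such as "verb=create" inside the parentheses of a
# kubectl "Forbidden (...)" message. The layout is assumed from the sample error.
_PAIR = re.compile(r"(\w+)=([^,)]+)")


def parse_forbidden(message: str) -> dict[str, str]:
    """Return the user/verb/resource/subresource fields of a Forbidden error."""
    start = message.find("Forbidden (")
    if start == -1:
        return {}
    body = message[start + len("Forbidden ("):]
    return {key: value.strip() for key, value in _PAIR.findall(body)}


if __name__ == "__main__":
    sample = (
        "error: unable to upgrade connection: Forbidden "
        "(user=arn:aws:iam::123456789012:assumed-role/my-role/session, "
        "verb=create, resource=nodes, subresource=proxy)"
    )
    fields = parse_forbidden(sample)
    # RBAC names a subresource as "<resource>/<subresource>".
    print(fields["verb"], f'{fields["resource"]}/{fields["subresource"]}')
```

The verb and the resource/subresource pair are what you would pass to kubectl auth can-i when checking whether a given subject is allowed.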


The Attack Vector / Blast Radius

This is not just an inconvenience. The blast radius breaks down into two failure modes:

1. Operational Blindness During Incidents: If this hits during a production outage, your on-call engineer cannot stream logs or kubectl exec into a crashing container. Every second spent debugging the 403 instead of the actual application failure adds to the MTTR.

2. IRSA Privilege Confusion Leading to Overpermissioning: The dangerous failure pattern is an engineer who hits this 403, panics, and attaches AmazonEKSClusterPolicy or a wildcard eks:* IAM policy to the node role to "just make it work." This is how nodes end up with permissions to call eks:CreateCluster, iam:PassRole, or ec2:RunInstances. A compromised pod on that node can then use the IMDSv1 endpoint (http://169.254.169.254/latest/meta-data/iam/security-credentials/) to retrieve the node's instance profile credentials and escalate to full cluster or account takeover.

The root cause is almost always one of these four things:

#   Cause                                                                      Where It Breaks
1   IAM role not mapped in aws-auth ConfigMap                                  Authentication webhook
2   RBAC missing get on the pods/log subresource                               Kubernetes RBAC
3   Node instance profile missing AmazonEKSWorkerNodePolicy                    Kubelet ↔ API server trust
4   IRSA OIDC trust policy condition mismatch (sts:AssumeRoleWithWebIdentity)  IAM STS

How to Fix It

Step 1 — Confirm Who the API Server Thinks You Are

kubectl auth whoami
# or for older clusters:
kubectl get pods --v=9 2>&1 | grep "Response Status"

If the request is rejected with 401 Unauthorized, your IAM role is most likely not mapped in aws-auth. If kubectl auth whoami returns your mapped username and groups but kubectl logs still returns 403, the problem is RBAC.


Fix A — aws-auth ConfigMap (Basic Fix)

# kubectl edit configmap aws-auth -n kube-system
 apiVersion: v1
 kind: ConfigMap
 metadata:
   name: aws-auth
   namespace: kube-system
 data:
   mapRoles: |
-    - rolearn: arn:aws:iam::123456789012:role/my-dev-role
-      username: my-dev-user
-      groups:
-        - system:masters
+    - rolearn: arn:aws:iam::123456789012:role/my-dev-role
+      username: my-dev-user
+      groups:
+        - eks-log-viewers

Do not map developer roles to system:masters. That grants cluster-admin. Map to a scoped RBAC group.


Fix B — RBAC ClusterRole with pods/log (Enterprise Best Practice)

-apiVersion: rbac.authorization.k8s.io/v1
-kind: ClusterRole
-metadata:
-  name: eks-log-viewers
-rules:
-  - apiGroups: [""]
-    resources: ["pods"]
-    verbs: ["get", "list", "watch"]
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  name: eks-log-viewers
+rules:
+  - apiGroups: [""]
+    resources: ["pods"]
+    verbs: ["get", "list", "watch"]
+  - apiGroups: [""]
+    resources: ["pods/log"]
+    verbs: ["get"]
+  - apiGroups: [""]
+    resources: ["nodes/proxy"]
+    verbs: ["get"]
---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  name: eks-log-viewers-binding
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: ClusterRole
+  name: eks-log-viewers
+subjects:
+  - kind: Group
+    name: eks-log-viewers
+    apiGroup: rbac.authorization.k8s.io
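
To confirm that a role really grants log access before relying on it, the rules can be checked offline. The sketch below is a simplified approximation of RBAC matching, not the API server's implementation: it assumes the input is the JSON from kubectl get clusterrole eks-log-viewers -o json, handles only exact names and the bare "*" wildcard, and ignores resourceNames and forms such as "pods/*". The function names are hypothetical.

```python
import json


def rule_allows(rule: dict, verb: str, resource: str, api_group: str = "") -> bool:
    """True if one RBAC rule grants `verb` on `resource` in `api_group`."""
    groups = rule.get("apiGroups", [])
    resources = rule.get("resources", [])
    verbs = rule.get("verbs", [])
    return (
        (api_group in groups or "*" in groups)
        and (resource in resources or "*" in resources)
        and (verb in verbs or "*" in verbs)
    )


def role_allows(cluster_role: dict, verb: str, resource: str) -> bool:
    """True if any rule in the ClusterRole grants the verb on the resource."""
    return any(rule_allows(r, verb, resource) for r in cluster_role.get("rules", []))


if __name__ == "__main__":
    # The "before" role from the diff above: pods only, no pods/log.
    before = json.loads(
        '{"rules": [{"apiGroups": [""], "resources": ["pods"],'
        ' "verbs": ["get", "list", "watch"]}]}'
    )
    # Granting "pods" does not imply the "pods/log" subresource.
    print(role_allows(before, "get", "pods"))      # True
    print(role_allows(before, "get", "pods/log"))  # False
```

Against a live cluster, kubectl auth can-i get pods/log --as-group=eks-log-viewers --as=my-dev-user remains the authoritative check; the sketch is only useful for linting manifests before they are applied.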

Fix C — Node IAM Instance Profile (Managed Policies)

 # Terraform: aws_iam_role_policy_attachment for node group role
-resource "aws_iam_role_policy_attachment" "node_cni_only" {
-  role       = aws_iam_role.eks_node.name
-  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
-}
+resource "aws_iam_role_policy_attachment" "node_worker" {
+  role       = aws_iam_role.eks_node.name
+  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
+}
+
+resource "aws_iam_role_policy_attachment" "node_cni" {
+  role       = aws_iam_role.eks_node.name
+  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
+}
+
+resource "aws_iam_role_policy_attachment" "node_ecr_readonly" {
+  role       = aws_iam_role.eks_node.name
+  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
+}
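
To see which of these three managed policies a node role is missing, the attachment list can be compared against the expected set. This is a minimal sketch that assumes its input is the JSON printed by aws iam list-attached-role-policies --role-name <node-role>; the function name missing_node_policies is hypothetical, and the required set is simply the three policies from the Terraform above.

```python
import json

# The three managed policies attached in the Terraform snippet above.
REQUIRED_NODE_POLICIES = {
    "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy",
    "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy",
    "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly",
}


def missing_node_policies(list_output: str) -> set[str]:
    """Return the required policy ARNs absent from the CLI's JSON output."""
    attached = {
        policy["PolicyArn"]
        for policy in json.loads(list_output).get("AttachedPolicies", [])
    }
    return REQUIRED_NODE_POLICIES - attached


if __name__ == "__main__":
    # Mirrors the "before" state: only the CNI policy is attached.
    cli_output = json.dumps(
        {
            "AttachedPolicies": [
                {
                    "PolicyName": "AmazonEKS_CNI_Policy",
                    "PolicyArn": "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy",
                }
            ]
        }
    )
    for arn in sorted(missing_node_policies(cli_output)):
        print("missing:", arn)
```

The check only looks at directly attached managed policies; equivalent permissions granted through inline policies would be reported as missing.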

Fix D — IRSA OIDC Trust Policy Condition Mismatch

 {
   "Effect": "Allow",
   "Principal": {
     "Federated": "arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"
   },
   "Action": "sts:AssumeRoleWithWebIdentity",
   "Condition": {
     "StringEquals": {
-      "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE:sub": "system:serviceaccount:default:my-sa"
+      "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE:sub": "system:serviceaccount:my-namespace:my-sa",
+      "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE:aud": "sts.amazonaws.com"
     }
   }
 }

A namespace mismatch in the OIDC sub claim is a common silent IRSA failure. The token is issued, but STS rejects AssumeRoleWithWebIdentity, so the pod either falls back to the node instance profile or fails entirely.
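
Because the sub value is just the string system:serviceaccount:<namespace>:<name>, the mismatch is easy to catch before deploying. The following is a minimal sketch with hypothetical function names; it assumes a single trust policy statement shaped like the one above and only inspects StringEquals conditions, so StringLike wildcards are not evaluated.

```python
def expected_sub(namespace: str, service_account: str) -> str:
    """Build the sub claim Kubernetes issues for a service account token."""
    return f"system:serviceaccount:{namespace}:{service_account}"


def sub_matches(statement: dict, namespace: str, service_account: str) -> bool:
    """True if the statement's StringEquals ':sub' condition allows this account."""
    conditions = statement.get("Condition", {}).get("StringEquals", {})
    want = expected_sub(namespace, service_account)
    for key, value in conditions.items():
        if key.endswith(":sub"):
            # IAM allows either a single string or a list of strings here.
            allowed = value if isinstance(value, list) else [value]
            return want in allowed
    return False


if __name__ == "__main__":
    issuer = "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"
    statement = {
        "Effect": "Allow",
        "Action": "sts:AssumeRoleWithWebIdentity",
        "Condition": {
            "StringEquals": {
                f"{issuer}:sub": "system:serviceaccount:default:my-sa",
                f"{issuer}:aud": "sts.amazonaws.com",
            }
        },
    }
    # The pod runs in my-namespace, but the policy still names "default".
    print(sub_matches(statement, "my-namespace", "my-sa"))  # False
    print(sub_matches(statement, "default", "my-sa"))       # True
```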




Prevention in CI/CD

1. Checkov — Scan Terraform Node Role Policies

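checkov -d ./terraform --check CKV_AWS_58,CKV_AWS_37
# CKV_AWS_58: EKS cluster has secrets encryption enabled
# CKV_AWS_37: EKS control plane logging is enabled for all log types
# Neither built-in check verifies that AmazonEKSWorkerNodePolicy is attached; enforce that with a custom policy.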

2. OPA/Gatekeeper — Block system:masters Mappings in aws-auth

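package k8s.awsauth

# Flag the aws-auth ConfigMap when its mapRoles text references system:masters.
violation[{"msg": msg}] {
  obj := input.review.object
  obj.kind == "ConfigMap"
  obj.metadata.name == "aws-auth"
  obj.metadata.namespace == "kube-system"
  contains(obj.data.mapRoles, "system:masters")
  msg := "Mapping IAM roles to system:masters is prohibited. Use scoped RBAC groups."
}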

3. kubectl auth can-i in Pipeline Smoke Tests

# Add to post-deploy validation step
kubectl auth can-i get pods/log \
  --namespace=production \
  --as=system:serviceaccount:production:my-sa
# Must return: yes

4. Enforce IMDSv2 on All EKS Nodes (Blocks SSRF-based credential theft)

 resource "aws_launch_template" "eks_node" {
+  metadata_options {
+    http_endpoint               = "enabled"
+    http_tokens                 = "required"  # IMDSv2 only
+    http_put_response_hop_limit = 1
+  }
 }

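5. Audit aws-auth Changes with EKS Audit Logs + CloudWatch

# ConfigMap updates are Kubernetes API calls: they land in the EKS audit log in CloudWatch Logs, not in CloudTrail (my-cluster and the metric names are placeholders)
aws logs put-metric-filter --log-group-name /aws/eks/my-cluster/cluster \
  --filter-name detect-aws-auth-mutation \
  --filter-pattern '{ ($.objectRef.resource = "configmaps") && ($.objectRef.namespace = "kube-system") && ($.objectRef.name = "aws-auth") && (($.verb = "update") || ($.verb = "patch")) }' \
  --metric-transformations metricName=AwsAuthMutations,metricNamespace=EKS/Audit,metricValue=1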
