Fixing 'unable to upgrade connection: 403 Forbidden' in EKS kubectl logs with IRSA
Threat/Impact Level: HIGH | Downtime Risk: HIGH | Time to Fix: 15–30 mins
TL;DR
- What broke: kubectl logs -f opens a streaming request that the EKS API server proxies to the kubelet. A 403 Forbidden means the IAM principal or the Kubernetes RBAC subject is being denied at one of these checkpoints: the aws-auth ConfigMap mapping, RBAC access to the pods/log subresource, the node IAM instance profile lacking AmazonEKSWorkerNodePolicy, or a mismatched IRSA trust policy.
- How to fix it: Verify that the aws-auth ConfigMap maps your IAM role correctly, confirm that your RBAC ClusterRole grants the pods/log and pods/exec subresources, and ensure the EC2 node instance profile has the required managed policies attached.
- Shortcut: Use our Client-Side Sandbox below to auto-refactor your aws-auth ConfigMap and RBAC manifests without leaking your ARNs.
The Incident (What Does the Error Mean?)
$ kubectl logs -f my-pod-7d9f8b-xkz2p
error: unable to upgrade connection: Forbidden (user=arn:aws:iam::123456789012:assumed-role/my-role/session, verb=create, resource=nodes, subresource=proxy)
kubectl logs -f is a long-lived streaming request: the EKS API server proxies it to the kubelet's /containerLogs/ endpoint on the node. kubectl exec, attach, and port-forward travel the same API server to kubelet path but also require a protocol upgrade (SPDY or WebSocket), which is where the "unable to upgrade connection" wording comes from. The EKS API server enforces both IAM authentication (via the aws-iam-authenticator webhook) and Kubernetes RBAC authorization on this path. A 403 means the request was authenticated but denied at the authorization layer. Your pod is running and your cluster is up, but you are blind: no logs, no exec, no port-forward.
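The parenthesised part of the error already names the denied subject and action, so it is worth reading before changing any policy. The snippet below is a minimal sketch of pulling those fields out of the message; the function name parse_forbidden is hypothetical, and it assumes the comma-separated key=value layout shown in the sample error above.

```python
import re

# Matches the parenthesised detail that follows "Forbidden" in the error text.
DETAIL_RE = re.compile(r"Forbidden \((?P<fields>[^)]*)\)")


def parse_forbidden(message: str) -> dict:
    """Return the key=value pairs from a 'Forbidden (...)' error, or {} if absent."""
    match = DETAIL_RE.search(message)
    if match is None:
        return {}
    fields = {}
    for item in match.group("fields").split(", "):
        if "=" in item:
            key, value = item.split("=", 1)
            fields[key.strip()] = value.strip()
    return fields


error = (
    "error: unable to upgrade connection: Forbidden "
    "(user=arn:aws:iam::123456789012:assumed-role/my-role/session, "
    "verb=create, resource=nodes, subresource=proxy)"
)
denied = parse_forbidden(error)
print(denied["user"], denied["verb"], denied["resource"], denied["subresource"])
```

The user field tells you which identity to look for in aws-auth, and the verb/resource/subresource triple tells you which RBAC rule is missing.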
The Attack Vector / Blast Radius
This is not just an inconvenience. The blast radius breaks down into two failure modes:
1. Operational Blindness During Incidents
If this hits during a production outage, your on-call engineer cannot stream logs. They cannot kubectl exec into a crashing container. Every second spent debugging the 403 instead of the actual application failure compounds the MTTR.
2. IRSA Privilege Confusion Leading to Overpermissioning
The dangerous failure pattern: an engineer hits this 403, panics, and attaches AmazonEKSClusterPolicy or a wildcard eks:* IAM policy to the node role to "just make it work." This is how nodes end up with permissions to call eks:CreateCluster, iam:PassRole, or ec2:RunInstances. A compromised pod on that node can then use the IMDS v1 endpoint (http://169.254.169.254/latest/meta-data/iam/security-credentials/) to retrieve the node's instance profile credentials and escalate to full cluster or account takeover.
Root cause is almost always one of these four things:
| # | Cause | Where It Breaks |
|---|---|---|
| 1 | IAM role not mapped in aws-auth ConfigMap | Authentication webhook |
| 2 | RBAC missing pods/log subresource verb | Kubernetes RBAC |
| 3 | Node instance profile missing AmazonEKSWorkerNodePolicy | Kubelet ↔ API server trust |
| 4 | IRSA OIDC trust policy condition mismatch (sts:AssumeRoleWithWebIdentity) | IAM STS |
How to Fix It
Step 1 — Confirm Who the API Server Thinks You Are
kubectl auth whoami
# or, where `auth whoami` is unavailable, check the HTTP status of a plain request:
kubectl get pods --v=9 2>&1 | grep "Response Status"
If the returned username is system:anonymous, your IAM role is not in aws-auth. If it returns your role ARN but still 403s, it's RBAC.
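The rule of thumb in the paragraph above can be written down as a tiny decision helper. This is only a sketch of that rule, not an exhaustive diagnosis: the function name triage and its return strings are hypothetical, and the final branch simply points at the remaining two causes from the table.

```python
def triage(username: str, can_get_logs: bool) -> str:
    """Map the Step 1 observations to the most likely root cause."""
    if username == "system:anonymous":
        # The API server did not recognise the IAM identity at all.
        return "aws-auth mapping (Fix A)"
    if not can_get_logs:
        # Identity is known, but no RBAC rule grants access to pods/log.
        return "RBAC (Fix B)"
    # Identity and RBAC look fine: check the node role or the IRSA trust policy.
    return "node instance profile or IRSA trust policy (Fix C / Fix D)"


print(triage("system:anonymous", False))
print(triage("my-dev-user", False))
```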
Fix A — aws-auth ConfigMap (Basic Fix)
# kubectl edit configmap aws-auth -n kube-system
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
-    - rolearn: arn:aws:iam::123456789012:role/my-dev-role
-      username: my-dev-user
-      groups:
-        - system:masters
+    - rolearn: arn:aws:iam::123456789012:role/my-dev-role
+      username: my-dev-user
+      groups:
+        - eks-log-viewers
Do not map developer roles to system:masters. That grants cluster-admin. Map to a scoped RBAC group.
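To find existing offenders, the mapRoles text can be scanned for entries that still list system:masters. The following is a minimal sketch using only the standard library; it assumes each entry starts with a "- rolearn:" line and lists groups one per line, as in the example above. The function name roles_mapped_to and the second sample role are hypothetical.

```python
def roles_mapped_to(map_roles: str, group: str) -> list[str]:
    """Return role ARNs whose mapRoles entry lists the given group."""
    flagged = []
    current_arn = None
    for raw in map_roles.splitlines():
        line = raw.strip()
        if line.startswith("- rolearn:"):
            # A new entry begins; remember its ARN for the lines that follow.
            current_arn = line.split(":", 1)[1].strip()
        elif line == f"- {group}" and current_arn is not None:
            if current_arn not in flagged:
                flagged.append(current_arn)
    return flagged


MAP_ROLES = """\
- rolearn: arn:aws:iam::123456789012:role/my-dev-role
  username: my-dev-user
  groups:
    - system:masters
- rolearn: arn:aws:iam::123456789012:role/my-viewer-role
  username: my-viewer
  groups:
    - eks-log-viewers
"""

print(roles_mapped_to(MAP_ROLES, "system:masters"))
```

A line-based scan keeps the example dependency-free; a real audit would more likely parse the block with a YAML library so that inline list syntax is also handled.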
Fix B — RBAC ClusterRole with pods/log (Enterprise Best Practice)
-apiVersion: rbac.authorization.k8s.io/v1
-kind: ClusterRole
-metadata:
-  name: eks-log-viewers
-rules:
-  - apiGroups: [""]
-    resources: ["pods"]
-    verbs: ["get", "list", "watch"]
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  name: eks-log-viewers
+rules:
+  - apiGroups: [""]
+    resources: ["pods"]
+    verbs: ["get", "list", "watch"]
+  - apiGroups: [""]
+    resources: ["pods/log"]
+    verbs: ["get", "list"]
+  - apiGroups: [""]
+    resources: ["nodes/proxy"]
+    verbs: ["get"]
---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  name: eks-log-viewers-binding
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: ClusterRole
+  name: eks-log-viewers
+subjects:
+  - kind: Group
+    name: eks-log-viewers
+    apiGroup: rbac.authorization.k8s.io
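The reason the original role fails is that RBAC treats a subresource such as pods/log as its own resource name: a rule on pods does not cover it. The sketch below illustrates that with a deliberately simplified matcher (core API group only, "*" wildcards, no "pods/*" patterns or resourceNames); the function names are hypothetical and this is not the API server's actual authorizer.

```python
def rule_allows(rule: dict, verb: str, resource: str) -> bool:
    """Simplified RBAC rule match for the core API group."""
    groups = rule.get("apiGroups", [])
    if "" not in groups and "*" not in groups:
        return False
    verbs = rule.get("verbs", [])
    resources = rule.get("resources", [])
    verb_ok = verb in verbs or "*" in verbs
    resource_ok = resource in resources or "*" in resources
    return verb_ok and resource_ok


def allowed(rules: list, verb: str, resource: str) -> bool:
    """RBAC is additive: access is granted if any rule matches."""
    return any(rule_allows(rule, verb, resource) for rule in rules)


before = [{"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "watch"]}]
after = before + [
    {"apiGroups": [""], "resources": ["pods/log"], "verbs": ["get", "list"]},
    {"apiGroups": [""], "resources": ["nodes/proxy"], "verbs": ["get"]},
]

print(allowed(before, "get", "pods/log"))
print(allowed(after, "get", "pods/log"))
```

Note that the role above still does not grant pods/exec, so kubectl exec stays denied for this group unless a rule for it is added.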
Fix C — Node IAM Instance Profile (Managed Policies)
# Terraform: aws_iam_role_policy_attachment for node group role
-resource "aws_iam_role_policy_attachment" "node_cni_only" {
-  role       = aws_iam_role.eks_node.name
-  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
-}
+resource "aws_iam_role_policy_attachment" "node_worker" {
+  role       = aws_iam_role.eks_node.name
+  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
+}
+
+resource "aws_iam_role_policy_attachment" "node_cni" {
+  role       = aws_iam_role.eks_node.name
+  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
+}
+
+resource "aws_iam_role_policy_attachment" "node_ecr_readonly" {
+  role       = aws_iam_role.eks_node.name
+  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
+}
Fix D — IRSA OIDC Trust Policy Condition Mismatch
{
  "Effect": "Allow",
  "Principal": {
    "Federated": "arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"
  },
  "Action": "sts:AssumeRoleWithWebIdentity",
  "Condition": {
    "StringEquals": {
-      "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE:sub": "system:serviceaccount:default:my-sa"
+      "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE:sub": "system:serviceaccount:my-namespace:my-sa",
+      "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE:aud": "sts.amazonaws.com"
    }
  }
}
Namespace mismatch in the OIDC sub claim is the #1 silent IRSA failure. The token is issued but STS rejects AssumeRoleWithWebIdentity, causing the pod to fall back to the node instance profile or fail entirely.
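Because the sub claim is an exact string match on namespace and service account name, the mismatch is easy to check offline. The following is a minimal sketch that compares a trust policy statement against the service account a pod actually runs as; it assumes plain string values under StringEquals (no lists, no StringLike), and the function names are hypothetical.

```python
ISSUER = "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"


def expected_sub(namespace: str, service_account: str) -> str:
    """Build the sub claim format used for Kubernetes service account tokens."""
    return f"system:serviceaccount:{namespace}:{service_account}"


def trust_matches(statement: dict, issuer: str, namespace: str, service_account: str) -> bool:
    """Check that the sub and aud conditions match the given service account."""
    conditions = statement.get("Condition", {}).get("StringEquals", {})
    sub_ok = conditions.get(f"{issuer}:sub") == expected_sub(namespace, service_account)
    aud_ok = conditions.get(f"{issuer}:aud") == "sts.amazonaws.com"
    return sub_ok and aud_ok


broken = {
    "Condition": {
        "StringEquals": {f"{ISSUER}:sub": "system:serviceaccount:default:my-sa"}
    }
}
fixed = {
    "Condition": {
        "StringEquals": {
            f"{ISSUER}:sub": "system:serviceaccount:my-namespace:my-sa",
            f"{ISSUER}:aud": "sts.amazonaws.com",
        }
    }
}

print(trust_matches(broken, ISSUER, "my-namespace", "my-sa"))
print(trust_matches(fixed, ISSUER, "my-namespace", "my-sa"))
```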
💡 Tired of pasting proprietary configs into ChatGPT? Generic AI tools log your company's ARNs, DB strings, and private keys. StackEngine is a zero-backend, pure Client-Side WASM utility. Drop your failing aws-auth ConfigMap or IRSA trust policy into the sandbox above. We redact your secrets locally in the browser and auto-generate the refactored code using your own API key.
Prevention in CI/CD
1. Checkov — Scan Terraform Node Role Policies
checkov -d ./terraform --check CKV_AWS_58,CKV_AWS_37
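# Per the Checkov policy index: CKV_AWS_58 = EKS cluster has secrets encryption enabled,
# CKV_AWS_37 = EKS control plane logging enabled for all log types.
# Neither inspects node role policy attachments; checking for AmazonEKSWorkerNodePolicy needs a custom policy.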
2. OPA/Gatekeeper — Block system:masters Mappings in aws-auth
package k8s.awsauth

violation[{"msg": msg}] {
  binding := input.review.object.data.mapRoles
  contains(binding, "system:masters")
  msg := "Mapping IAM roles to system:masters is prohibited. Use scoped RBAC groups."
}
3. kubectl auth can-i in Pipeline Smoke Tests
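# Add to post-deploy validation step.
# Use --subresource: "pods/log" would be parsed as TYPE/NAME, i.e. a pod named "log".
kubectl auth can-i get pods --subresource=log \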
--namespace=production \
--as=system:serviceaccount:production:my-sa
# Must return: yes
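In a pipeline it helps to wrap this check so the job fails on anything other than "yes". The sketch below builds the kubectl arguments and interprets the output; the function names are hypothetical, the runner is injectable so the logic can be exercised without a cluster, and it addresses the log subresource through the --subresource flag.

```python
import subprocess


def can_i_argv(verb: str, resource: str, subresource: str, namespace: str, as_user: str) -> list[str]:
    """Build the argument list for `kubectl auth can-i`."""
    argv = ["kubectl", "auth", "can-i", verb, resource]
    if subresource:
        argv.append(f"--subresource={subresource}")
    argv += [f"--namespace={namespace}", f"--as={as_user}"]
    return argv


def can_i(verb, resource, subresource, namespace, as_user, run=subprocess.run) -> bool:
    """Return True only when kubectl answers 'yes'."""
    argv = can_i_argv(verb, resource, subresource, namespace, as_user)
    # check=False: a "no" answer should be reported, not raised as an exception.
    result = run(argv, capture_output=True, text=True, check=False)
    return result.stdout.strip() == "yes"


# Example call for a smoke test (requires kubectl and cluster access):
# ok = can_i("get", "pods", "log", "production",
#            "system:serviceaccount:production:my-sa")
# raise SystemExit(0 if ok else 1)
```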
4. Enforce IMDSv2 on All EKS Nodes (Blocks SSRF-based credential theft)
resource "aws_launch_template" "eks_node" {
+  metadata_options {
+    http_endpoint               = "enabled"
+    http_tokens                 = "required" # IMDSv2 only
+    http_put_response_hop_limit = 1
+  }
}
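5. Audit aws-auth Changes with EKS Audit Logs + CloudWatch
# ConfigMap writes are Kubernetes API calls: they appear in the EKS control plane audit log
# (CloudWatch Logs), not in CloudTrail. Assumes audit logging is on; my-cluster is a placeholder.
aws logs put-metric-filter \
  --log-group-name /aws/eks/my-cluster/cluster \
  --filter-name detect-aws-auth-mutation \
  --filter-pattern '{ ($.objectRef.resource = "configmaps") && ($.objectRef.namespace = "kube-system") && ($.objectRef.name = "aws-auth") && ($.verb = "update" || $.verb = "patch") }' \
  --metric-transformations metricName=AwsAuthMutations,metricNamespace=EKSAudit,metricValue=1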