How to Fix Terraform 'Error: failed to assume role' IAM – Root Cause & Complete Solution

Threat/Impact Level: CRITICAL | Exploitability/Downtime Risk: HIGH | Time to Fix: 10–20 mins

TL;DR

  • What broke: Terraform's AWS provider cannot call sts:AssumeRole — the trust policy on the target role doesn't authorize the caller, the external_id is missing or wrong, or the source identity lacks sts:AssumeRole in its permission policy.
  • How to fix it: Audit the trust relationship on the target IAM role, verify the external_id matches exactly, and confirm the calling principal has an explicit sts:AssumeRole allow in its attached policy.

The Incident (What Does the Error Mean?)

Raw error output from terraform apply:

Error: failed to assume role

  with provider["registry.terraform.io/hashicorp/aws"],
  on main.tf line 3, in provider "aws":
   3:   assume_role {

AWS Error: AccessDenied: User: arn:aws:iam::123456789012:role/ci-runner
is not authorized to assume role: arn:aws:iam::987654321098:role/terraform-deploy-role
        status code: 403, request id: a1b2c3d4-...

Immediate consequence: Every single terraform plan and terraform apply in this workspace is dead. No resources can be created, updated, or destroyed. If this is your CI/CD pipeline, every deployment is blocked. If this is a cross-account setup, all cross-account automation is offline.

This is not a transient error. It will not self-heal. STS returned a hard 403 AccessDenied — the assume-role handshake failed at the IAM control plane level.


The Attack Vector / Blast Radius

This error surfaces in three distinct failure scenarios, each with its own blast radius:

Scenario 1: Trust Policy Gap (Most Common)

The target role's trust policy does not list the calling entity as a principal allowed to call sts:AssumeRole. If you recently replaced the CI runner's IAM role, renamed an instance profile, or migrated from IAM users to IAM roles, the trust relationship is stale. Every cross-account and same-account automation that assumes this role is broken.

Scenario 2: Missing external_id (Third-Party / Multi-Tenant Risk)

If the role's trust policy requires an sts:ExternalId condition (standard for vendor integrations and multi-tenant SaaS), omitting external_id in the Terraform provider block produces the same 403 AccessDenied, and nothing in the message says the external ID is the cause. The condition is a security control, not just a config nuisance: a caller who knows the role ARN but not the external ID cannot assume the role. When your config fails to send it, your automation is locked out, but fixing that carelessly by removing the condition opens a confused deputy attack surface.

Scenario 3: Permission Boundary or SCP Blocking sts:AssumeRole

AWS Organizations Service Control Policies (SCPs) or an IAM permissions boundary on the calling role can block sts:AssumeRole even if the trust policy is correct, either through an explicit Deny or by not allowing the action at all. This is the hardest case to debug because the trust policy looks fine. An explicit Deny in an SCP cannot be overridden by any identity policy.
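To make the three scenarios concrete, here is a minimal Python sketch that inspects a trust policy document offline and reports which scenario most likely applies. The function name diagnose_trust_policy and its messages are illustrative assumptions, not part of any AWS tooling, and the matching is deliberately simplified (exact string comparison, StringEquals conditions only).

```python
from typing import Any, List, Optional


def as_list(value: Any) -> list:
    """IAM policy fields may hold one value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def diagnose_trust_policy(policy: dict, caller_role_arn: str,
                          external_id: Optional[str]) -> List[str]:
    """Return findings that map onto the three scenarios (hypothetical helper)."""
    account_id = caller_role_arn.split(":")[4]
    account_wide = {account_id, f"arn:aws:iam::{account_id}:root"}
    findings: List[str] = []
    principal_trusted = False

    for stmt in as_list(policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        if "sts:AssumeRole" not in as_list(stmt.get("Action")):
            continue
        principal = stmt.get("Principal")
        if not isinstance(principal, dict):
            continue  # e.g. Principal "*", not handled in this sketch
        trusted = as_list(principal.get("AWS"))
        if caller_role_arn not in trusted and not account_wide & set(trusted):
            continue
        principal_trusted = True
        string_equals = stmt.get("Condition", {}).get("StringEquals", {})
        required_ids = as_list(string_equals.get("sts:ExternalId"))
        if required_ids and external_id not in required_ids:
            findings.append("Scenario 2: external_id missing or not matching")

    if not principal_trusted:
        findings.append("Scenario 1: caller is not a trusted principal")
    if not findings:
        findings.append("Scenario 3: trust policy looks fine; check SCPs, "
                        "permission boundaries and the caller's own policy")
    return findings
```

Feeding it the trust policy JSON from the console plus the caller ARN from the error message gives a first guess before you change anything. It cannot see SCPs or permission boundaries, so a "Scenario 3" result only means the trust policy itself is not the obvious culprit.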


How to Fix It (The Solution)

Step 1 — Identify the Exact Failure Point

Run this before touching any Terraform:

# Test assume-role manually with the exact ARN Terraform uses
aws sts assume-role \
  --role-arn "arn:aws:iam::987654321098:role/terraform-deploy-role" \
  --role-session-name "debug-session" \
  --external-id "your-external-id-if-required"

# Check the calling identity
aws sts get-caller-identity

If this returns AccessDenied, the problem is confirmed at the STS layer — not Terraform.
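When comparing the CLI output with the trust policy, note that get-caller-identity reports an assumed-role session ARN (arn:aws:sts::...:assumed-role/...), while trust policies usually name the IAM role ARN. The following Python sketch pulls the caller and target ARNs out of an AccessDenied message and converts a session ARN to its role ARN. Both helper names are hypothetical, and the regular expression is an assumption based on the error text shown earlier plus the common "is not authorized to perform: sts:AssumeRole on resource:" wording.

```python
import re
from typing import Optional, Tuple

# Matches both the wording in the error above and the wording STS commonly uses.
_DENIED = re.compile(
    r"User:\s*(?P<caller>arn:aws[\w-]*:\S+)\s+is not authorized to "
    r"(?:assume role|perform: sts:AssumeRole on resource):\s*"
    r"(?P<target>arn:aws[\w-]*:\S+)"
)

_SESSION_ARN = re.compile(
    r"arn:(aws[\w-]*):sts::(\d{12}):assumed-role/([^/]+)/[^/]+"
)


def parse_access_denied(message: str) -> Optional[Tuple[str, str]]:
    """Return (caller_arn, target_role_arn), or None if the text does not match."""
    match = _DENIED.search(message)
    if match is None:
        return None
    return match.group("caller"), match.group("target")


def assumed_role_to_role_arn(arn: str) -> str:
    """Convert an assumed-role session ARN to the IAM role ARN.

    Caveat: session ARNs omit the role path, so a role created under a
    path such as /ci/ cannot be fully reconstructed from the session ARN.
    """
    match = _SESSION_ARN.fullmatch(arn)
    if match is None:
        return arn  # already a role/user ARN, or an unrecognized format
    partition, account_id, role_name = match.groups()
    return f"arn:{partition}:iam::{account_id}:role/{role_name}"
```

The converted role ARN is the value to look for in the Principal element of the target role's trust policy.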


Basic Fix — Correct the Trust Policy

Navigate to IAM → Roles → terraform-deploy-role → Trust relationships and verify the principal.

# IAM Trust Policy JSON on the TARGET role (terraform-deploy-role)
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
-       "AWS": "arn:aws:iam::123456789012:role/old-ci-runner"
+       "AWS": "arn:aws:iam::123456789012:role/ci-runner"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
-         "sts:ExternalId": "wrong-external-id"
+         "sts:ExternalId": "correct-external-id-matching-provider-block"
        }
      }
    }
  ]
}

Basic Fix — Correct the Terraform Provider Block

# provider.tf
provider "aws" {
  region = "us-east-1"

  assume_role {
-   role_arn     = "arn:aws:iam::987654321098:role/wrong-role-name"
+   role_arn     = "arn:aws:iam::987654321098:role/terraform-deploy-role"
+   session_name = "terraform-ci-session"
+   external_id  = "correct-external-id-matching-trust-policy"
  }
}

Enterprise Best Practice — Least-Privilege Cross-Account Role with Conditions

Never use a bare principal without condition keys. Lock down the trust policy with aws:PrincipalArn, sts:ExternalId, and optionally aws:MultiFactorAuthPresent for human roles.

# Terraform resource: the TARGET role's trust policy (in the target account)
resource "aws_iam_role" "terraform_deploy_role" {
  name = "terraform-deploy-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect    = "Allow"
        Principal = {
-         AWS = "arn:aws:iam::123456789012:root"
+         AWS = "arn:aws:iam::123456789012:role/ci-runner"
        }
        Action = "sts:AssumeRole"
        Condition = {
          StringEquals = {
+           "sts:ExternalId"      = var.external_id
+           "aws:PrincipalArn"    = "arn:aws:iam::123456789012:role/ci-runner"
          }
+         Bool = {
+           "aws:SecureTransport" = "true"
+         }
        }
      }
    ]
  })
}

# Terraform resource: permission policy on the CALLING role (in the source account)
resource "aws_iam_role_policy" "allow_assume_deploy_role" {
  name = "allow-assume-terraform-deploy-role"
  role = aws_iam_role.ci_runner.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
-       Action   = "sts:*"
-       Resource = "*"
+       Action   = "sts:AssumeRole"
+       Resource = "arn:aws:iam::987654321098:role/terraform-deploy-role"
      }
    ]
  })
}

Key hardening decisions:

  • sts:* on * is a critical misconfiguration — scope to exact action and exact role ARN.
  • Using arn:aws:iam::ACCOUNT:root as principal allows any identity in that account to assume the role if they have sts:AssumeRole — always pin to the specific role ARN.
  • external_id stored in var.external_id should be sourced from a secrets manager, not hardcoded.
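The first two hardening points can be checked mechanically. Below is a rough Python sketch that scans a caller permission policy and a target trust policy (both as parsed JSON) for wildcard STS grants and account-wide principals. find_policy_risks is a hypothetical helper written for this article, and it only looks at the patterns named in the list above.

```python
from typing import Any, List


def as_list(value: Any) -> list:
    """IAM policy fields may hold one value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_policy_risks(permission_policy: dict, trust_policy: dict) -> List[str]:
    """Flag wildcard STS grants and account-wide trust principals."""
    risks: List[str] = []

    # Caller side: Allow statements that grant STS actions too broadly.
    for stmt in as_list(permission_policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        actions = [a.lower() for a in as_list(stmt.get("Action"))]
        sts_actions = [a for a in actions if a == "*" or a.startswith("sts:")]
        if not sts_actions:
            continue
        if any(a in ("*", "sts:*") for a in sts_actions):
            risks.append("wildcard STS action: scope to sts:AssumeRole")
        if "*" in as_list(stmt.get("Resource")):
            risks.append("wildcard resource: scope to the exact role ARN")

    # Target side: principals that cover a whole account (or everyone).
    for stmt in as_list(trust_policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        if principal == "*":
            aws_principals = ["*"]
        elif isinstance(principal, dict):
            aws_principals = as_list(principal.get("AWS"))
        else:
            aws_principals = []
        for p in aws_principals:
            if p == "*" or p.endswith(":root") or (p.isdigit() and len(p) == 12):
                risks.append(f"account-wide principal {p}: pin to a role ARN")
    return risks
```

Run against the "before" side of the diffs above it should report the sts:* action, the * resource, and the :root principal; the "after" side should come back empty.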


Prevention in CI/CD

1 — Checkov Static Analysis (Catches Wildcard STS Before Apply)

Add to your CI pipeline:

# .github/workflows/terraform.yml
- name: Run Checkov IAM Scan
  uses: bridgecrewio/checkov-action@master
  with:
    directory: .
    check: CKV_AWS_110,CKV_AWS_111,CKV_AWS_49
    framework: terraform

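CKV_AWS_49 flags IAM policy documents whose Allow statements use "*" as the action; CKV_AWS_110 and CKV_AWS_111 cover privilege escalation and write access without constraints. Check IDs and their scope can change between Checkov releases, so confirm them against the check list for the version you pin.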

2 — OPA / Conftest Policy (Enforce external_id Requirement)

# policies/iam_assume_role.rego
package terraform.iam

deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "aws_iam_role"
  policy := json.unmarshal(resource.change.after.assume_role_policy)
  stmt := policy.Statement[_]
  stmt.Effect == "Allow"
  stmt.Action == "sts:AssumeRole"
  not stmt.Condition.StringEquals["sts:ExternalId"]
  msg := sprintf("Role '%v' trust policy allows sts:AssumeRole without sts:ExternalId condition — confused deputy risk.", [resource.name])
}

Run in CI:

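# --namespace is required because the policy lives in package terraform.iam,
# not in conftest's default "main" namespace
terraform show -json tfplan.binary | conftest test --policy policies/ --namespace terraform.iam -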
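For teams that want to prototype the rule before committing to Rego, the same check can be sketched in Python against the output of terraform show -json. This is an approximation, not a port: roles_missing_external_id is a hypothetical helper, and unlike the Rego rule it also handles Action given as a list and skips statements that only trust service principals (such as ec2.amazonaws.com), which do not use external IDs.

```python
import json
from typing import Any, List


def as_list(value: Any) -> list:
    """IAM policy fields may hold one value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def roles_missing_external_id(plan: dict) -> List[str]:
    """Return names of planned aws_iam_role resources whose trust policy
    allows sts:AssumeRole to an AWS principal without sts:ExternalId."""
    offenders: List[str] = []
    for rc in plan.get("resource_changes", []):
        if rc.get("type") != "aws_iam_role":
            continue
        after = (rc.get("change") or {}).get("after") or {}  # None on destroy
        raw_policy = after.get("assume_role_policy")
        if not raw_policy:
            continue  # unknown until apply, or not set
        policy = json.loads(raw_policy)
        for stmt in as_list(policy.get("Statement")):
            if stmt.get("Effect") != "Allow":
                continue
            if "sts:AssumeRole" not in as_list(stmt.get("Action")):
                continue
            principal = stmt.get("Principal")
            if not (isinstance(principal, dict) and "AWS" in principal):
                continue  # service principals are out of scope here
            string_equals = stmt.get("Condition", {}).get("StringEquals", {})
            if "sts:ExternalId" not in string_equals:
                offenders.append(rc.get("name", "<unnamed>"))
                break  # one finding per role is enough
    return offenders
```

A CI step could load the plan JSON with json.load and fail the job when the returned list is non-empty.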

3 — terraform validate + Provider Auth Pre-Flight

Add an explicit STS identity check as the first CI step — fail fast before terraform plan wastes time:

#!/usr/bin/env bash
# ci-preflight.sh
# Abort on any error, unset variable, or failed pipeline stage
set -euo pipefail
echo "Verifying caller identity..."
aws sts get-caller-identity
echo "Testing role assumption..."
aws sts assume-role \
  --role-arn "${TF_VAR_deploy_role_arn}" \
  --role-session-name "ci-preflight" \
  --external-id "${TF_VAR_external_id}" \
  --query 'AssumedRoleUser.Arn' \
  --output text
echo "Role assumption verified. Proceeding with terraform plan."
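One edge case in the script above: when the role needs no external ID, the variable is empty and the CLI still receives an empty --external-id, which it is likely to reject (the STS API documents a minimum length of 2 for ExternalId). A Python variant of the pre-flight, sketched below, only adds the flag when a value is present. The function names are hypothetical; the environment variable names are the ones used in the shell script.

```python
import os
import subprocess
from typing import List, Mapping, Optional


def build_assume_role_command(role_arn: str, session_name: str,
                              external_id: Optional[str] = None) -> List[str]:
    """Build the aws CLI argv; --external-id is added only when non-empty."""
    cmd = [
        "aws", "sts", "assume-role",
        "--role-arn", role_arn,
        "--role-session-name", session_name,
        "--query", "AssumedRoleUser.Arn",  # print the ARN, not the credentials
        "--output", "text",
    ]
    if external_id:
        cmd += ["--external-id", external_id]
    return cmd


def run_preflight(env: Mapping[str, str] = os.environ) -> None:
    """Fail fast (raises on error) before terraform plan is attempted."""
    role_arn = env["TF_VAR_deploy_role_arn"]  # KeyError if unset
    subprocess.run(["aws", "sts", "get-caller-identity"], check=True)
    subprocess.run(
        build_assume_role_command(role_arn, "ci-preflight",
                                  env.get("TF_VAR_external_id")),
        check=True,
    )
```

Calling run_preflight() as the first CI step raises on a missing variable or a non-zero exit from the CLI, which stops the job before terraform plan runs.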

4 — AWS Config Rule for Trust Policy Drift

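Deploy a Config rule for trust policy changes with SNS alerting. AWS_IAM_ROLE_TRUST_POLICY_CHANGE is best treated as the name of a custom rule you define rather than a managed rule identifier; an EventBridge rule matching the CloudTrail UpdateAssumeRolePolicy event achieves the same result. Any modification to a trust relationship then triggers an alert to your security channel, which catches accidental or malicious trust policy changes before the next deployment fails.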
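As one possible shape for the alerting logic, here is a Python sketch of the filter a Lambda target might apply to incoming events. It assumes CloudTrail management events delivered through EventBridge (payload under "detail") and relies on the IAM UpdateAssumeRolePolicy API call being what changes a trust policy. The function names and the message format are illustrative assumptions.

```python
def _detail(event: dict) -> dict:
    """EventBridge wraps the CloudTrail record in 'detail'; accept both shapes."""
    return event.get("detail", event)


def is_trust_policy_change(event: dict) -> bool:
    """True when the record is an IAM UpdateAssumeRolePolicy call."""
    detail = _detail(event)
    return (detail.get("eventSource") == "iam.amazonaws.com"
            and detail.get("eventName") == "UpdateAssumeRolePolicy")


def summarize_change(event: dict) -> str:
    """Build a one-line alert message; missing fields fall back to '<unknown>'."""
    detail = _detail(event)
    role = (detail.get("requestParameters") or {}).get("roleName", "<unknown>")
    actor = (detail.get("userIdentity") or {}).get("arn", "<unknown>")
    return f"Trust policy of role {role} was changed by {actor}"
```

The summary string is what you would publish to the SNS topic; the publishing call itself is left out because it depends on how your alerting is wired.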
