Fixing IAM Trust Policy Error: Principal Must Be an ARN or Service for Service-Linked Roles
Threat/Impact Level: CRITICAL | Exploitability/Downtime Risk: HIGH | Time to Fix: 5 mins
TL;DR
- What broke: Your IAM trust policy's Principal field contains a bare string, wildcard, or account ID without the required ARN format. AWS rejects this outright when the target is a service-linked role.
- How to fix it: Replace the malformed Principal value with a fully-qualified ARN (arn:aws:iam::123456789012:root) or a valid AWS service principal (elasticloadbalancing.amazonaws.com).
- Fast path: Use our Client-Side Sandbox above to auto-refactor this. Paste your broken policy and get a corrected trust document without leaking your ARNs to any server.
The Incident (What Does the Error Mean?)
Raw error output from AWS CLI / Console:
An error occurred (MalformedPolicyDocument) when calling the CreateRole operation:
role trust policy 'Principal' must be an ARN or service when attaching to service-linked role
Or via Terraform:
Error: error creating IAM Role: MalformedPolicyDocument:
role trust policy 'Principal' must be an ARN or service
when attaching to service-linked role
status code: 400, request id: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
Immediate consequence: Role creation or modification fails entirely. Any dependent infrastructure (ECS task definitions, EKS node groups, RDS enhanced monitoring, Lambda execution roles) cannot attach to the service-linked role and will throw downstream AccessDenied or resource provisioning failures. In a Terraform pipeline, the failure aborts terraform apply partway through: resources created before the error stay in state, and everything after it is missing.
The Attack Vector / Blast Radius
This is not just a syntax annoyance. Here is the actual blast radius:
1. Broken trust = broken service identity. Service-linked roles (e.g., AWSServiceRoleForElasticLoadBalancing, AWSServiceRoleForECS) have stricter validation than standard IAM roles. AWS enforces that the Principal is verifiably resolvable — a bare account ID like "Principal": "123456789012" or a glob like "Principal": "*" is rejected at the API layer before the role is ever written.
2. Wildcard Principal is a privilege escalation vector. If you got here by trying "Principal": "*" to "make it work quickly" — stop. A trust policy with Principal: * and no Condition block means any AWS identity in any account can assume this role. This is a textbook IAM privilege escalation path. Attackers scan for exactly this misconfiguration using tools like Pacu and Prowler.
3. Cascading deployment failures. In an IaC pipeline (Terraform, CDK, CloudFormation), a failed aws_iam_role resource blocks every resource that depends on it, such as aws_iam_role_policy_attachment, aws_ecs_service, and aws_lb. Those resources are never created. Your deployment is dead until this is corrected and re-applied.
4. Silent drift in manually managed roles. If an engineer manually patches a trust policy in the console using a copy-pasted snippet with a bare string Principal, the role silently fails to update. The old trust relationship persists, and the service can no longer assume the role after a rotation — causing a production outage with no obvious error trail.
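The wildcard case in item 2 can be checked mechanically. The sketch below is a hypothetical helper, not part of any AWS SDK. It assumes the trust policy has already been parsed into a Python dict, and it flags Allow statements that trust "*" with no Condition block.

```python
def _as_list(value):
    """Policy fields may hold a single item or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def open_trust_statements(policy):
    """Return the Allow statements that trust '*' and carry no Condition block."""
    flagged = []
    for stmt in _as_list(policy.get("Statement", [])):
        # Statements with any Condition are out of scope for this simple check.
        if stmt.get("Effect") != "Allow" or stmt.get("Condition"):
            continue
        principal = stmt.get("Principal")
        if principal == "*":
            flagged.append(stmt)
        elif isinstance(principal, dict) and "*" in _as_list(principal.get("AWS", [])):
            flagged.append(stmt)
    return flagged
```

A non-empty return value means at least one statement is assumable by any AWS identity. The check does not judge whether an existing Condition is restrictive enough; that still needs human review.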
How to Fix It (The Solution)
Basic Fix
The Principal in a service-linked role trust policy must be one of:
- A fully-qualified IAM ARN: arn:aws:iam::ACCOUNT_ID:root or arn:aws:iam::ACCOUNT_ID:role/RoleName
- An AWS service principal: ec2.amazonaws.com, ecs.amazonaws.com, elasticloadbalancing.amazonaws.com
- A federated principal: cognito-identity.amazonaws.com, arn:aws:iam::ACCOUNT_ID:saml-provider/PROVIDER

Never use a bare account ID, a display name, or * without a restrictive Condition block.
  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {
-         "Service": "123456789012"
+         "Service": "elasticloadbalancing.amazonaws.com"
        },
        "Action": "sts:AssumeRole"
      }
    ]
  }
Another common variant — bare account ID instead of ARN:
  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {
-         "AWS": "123456789012"
+         "AWS": "arn:aws:iam::123456789012:root"
        },
        "Action": "sts:AssumeRole"
      }
    ]
  }
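The bare-account-ID rewrite shown above is mechanical, so it can be scripted. As a rough sketch, assuming the policy is already loaded as a dict (the function name and the partition parameter are illustrative, not from any AWS tooling), the following rewrites 12-digit values under the "AWS" key and leaves everything else alone:

```python
import copy
import re


def _qualify(value, partition):
    """Turn a bare 12-digit account ID into a root ARN; pass other values through."""
    if isinstance(value, str) and re.fullmatch(r"[0-9]{12}", value):
        return f"arn:{partition}:iam::{value}:root"
    return value


def qualify_account_ids(policy, partition="aws"):
    """Return a copy of the policy with bare account IDs under 'AWS' rewritten."""
    fixed = copy.deepcopy(policy)  # never mutate the caller's document
    statements = fixed.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    for stmt in statements:
        principal = stmt.get("Principal")
        if not isinstance(principal, dict) or "AWS" not in principal:
            continue
        value = principal["AWS"]
        if isinstance(value, list):
            principal["AWS"] = [_qualify(v, partition) for v in value]
        else:
            principal["AWS"] = _qualify(value, partition)
    return fixed
```

The helper deliberately skips the "Service" key: a number under "Service" cannot be repaired automatically, because the correct service principal depends on which service the role is for.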
Enterprise Best Practice
For production service-linked roles, lock the trust policy down with Condition keys to prevent lateral movement even if credentials are compromised:
  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {
-         "AWS": "*"
+         "AWS": "arn:aws:iam::123456789012:role/my-ecs-task-role"
        },
        "Action": "sts:AssumeRole",
+       "Condition": {
+         "StringEquals": {
+           "sts:ExternalId": "unique-external-id-per-tenant",
+           "aws:PrincipalOrgID": "o-xxxxxxxxxx"
+         },
+         "Bool": {
+           "aws:MultiFactorAuthPresent": "true"
+         }
+       }
      }
    ]
  }
Key hardening controls:
- aws:PrincipalOrgID: Restricts assumption to identities within your AWS Organization. Blocks cross-account attacks from outside your org entirely.
- sts:ExternalId: Mandatory for any cross-account role. Mitigates the confused deputy problem.
- aws:MultiFactorAuthPresent: Enforces MFA for human-assumed roles. Do not apply to service principals, because services cannot present MFA.
- Avoid arn:aws:iam::ACCOUNT_ID:root unless you explicitly need account-level trust. Prefer a specific role ARN or user ARN.
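Which of the controls above applies depends on who assumes the role, so it can help to generate the trust document rather than hand-edit it. The following is a minimal sketch under stated assumptions: the function name, its parameters, and the decision to refuse account-root trust by default are all illustrative choices, not an AWS API.

```python
def build_trust_policy(principal_arn, org_id, external_id=None,
                       require_mfa=False, allow_account_root=False):
    """Build a trust policy dict for one IAM principal with the hardening keys."""
    if not principal_arn.startswith("arn:aws:iam::"):
        raise ValueError("principal_arn must be a fully-qualified IAM ARN")
    if principal_arn.endswith(":root") and not allow_account_root:
        # Account-level trust must be requested explicitly.
        raise ValueError("account root trust requires allow_account_root=True")

    string_equals = {"aws:PrincipalOrgID": org_id}
    if external_id is not None:
        string_equals["sts:ExternalId"] = external_id

    condition = {"StringEquals": string_equals}
    if require_mfa:
        # Only meaningful for human-assumed roles.
        condition["Bool"] = {"aws:MultiFactorAuthPresent": "true"}

    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": principal_arn},
            "Action": "sts:AssumeRole",
            "Condition": condition,
        }],
    }
```

Passing the result through json.dumps gives a document in the same shape as the hardened example above. Leaving require_mfa off by default keeps the MFA condition away from principals that cannot satisfy it.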
Prevention in CI/CD
This class of error should never reach a terraform apply or a CloudFormation deploy. Gate it earlier:
1. Checkov — static IaC scanning (runs in 30 seconds):
checkov -d . --check CKV_AWS_110,CKV_AWS_111,CKV_AWS_107
# CKV_AWS_110: Ensure IAM policies do not allow privilege escalation
# CKV_AWS_111: Ensure IAM policies do not allow write access without constraint
# CKV_AWS_107: Ensure IAM policies do not allow credentials exposure
2. OPA / Conftest policy for Principal validation:
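# policy/iam_trust_principal.rego
# Evaluates `terraform show -json` output, where planned roles appear under resource_changes.
package main

deny[msg] {
    rc := input.resource_changes[_]
    rc.type == "aws_iam_role"
    policy := json.unmarshal(rc.change.after.assume_role_policy)
    stmt := policy.Statement[_]
    # Principal may be a bare string or an object of strings/arrays;
    # walk() visits every nested value, so each identifier is checked.
    walk(stmt.Principal, [_, principal])
    is_string(principal)
    not startswith(principal, "arn:aws:")
    not endswith(principal, ".amazonaws.com")
    msg := sprintf("IAM role '%v' has a non-ARN, non-service Principal: '%v'", [rc.address, principal])
}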
conftest test terraform_plan.json --policy policy/
3. AWS Config Rule — continuous drift detection post-deploy:
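Enable the managed rule iam-no-inline-policy-check for general IAM hygiene, and add a custom Config rule scoped to AWS::IAM::Role that flags any trust policy where Principal resolves to a wildcard or bare account ID. Managed rules such as IAM_ROLE_MANAGED_POLICY_CHECK evaluate attached policies rather than the trust policy, so the trust check itself has to be custom.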
4. Pre-commit hook with a local validation script:
# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: validate-trust-policy
        name: Validate IAM Trust Policy Principal
        # Invoke the script directly so pre-commit can append the staged filenames as arguments.
        entry: python3 scripts/validate_trust_principals.py
        language: system
        files: \.json$
The validation script should use jsonschema or a regex gate: any Principal value that does not match ^arn:aws: or end in .amazonaws.com or .amazon.com should fail the commit.
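One possible shape for that validation script is sketched below, using only the standard library and the regex gate just described. The file layout and function names are assumptions. It expects the staged JSON file paths as command-line arguments and skips any file that is not a policy document.

```python
#!/usr/bin/env python3
"""Sketch of scripts/validate_trust_principals.py: regex gate for Principal values."""
import json
import re
import sys

# Accept ARNs, AWS service principals, and *.amazon.com identifiers.
VALID = re.compile(r"^arn:aws:|\.amazonaws\.com\Z|\.amazon\.com\Z")


def iter_principals(principal):
    """Yield each identifier from a Principal (string, or dict of strings/lists)."""
    if isinstance(principal, str):
        yield principal
    elif isinstance(principal, dict):
        for value in principal.values():
            yield from (value if isinstance(value, list) else [value])


def invalid_principals(policy):
    """Return every Principal identifier that fails the regex gate."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    bad = []
    for stmt in statements:
        if not isinstance(stmt, dict):
            continue
        for ident in iter_principals(stmt.get("Principal")):
            if not (isinstance(ident, str) and VALID.search(ident)):
                bad.append(ident)
    return bad


def main(paths):
    failed = False
    for path in paths:
        try:
            with open(path, encoding="utf-8") as handle:
                doc = json.load(handle)
        except json.JSONDecodeError:
            continue  # malformed JSON is a job for a different hook
        if not isinstance(doc, dict) or "Statement" not in doc:
            continue  # not an IAM policy document
        for ident in invalid_principals(doc):
            print(f"{path}: invalid Principal {ident!r}")
            failed = True
    return 1 if failed else 0


# Run only when file paths were passed on the command line.
if __name__ == "__main__" and sys.argv[1:]:
    sys.exit(main(sys.argv[1:]))
```

A non-zero exit code fails the commit. The gate is exactly as strict as the rule above, so a federated provider outside those suffixes would need an explicit allow-list added to VALID.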
5. Terraform precondition block (Terraform ≥ 1.2):
resource "aws_iam_role" "service_linked" {
  name               = "AWSServiceRoleForECS"
  assume_role_policy = data.aws_iam_policy_document.trust.json

  lifecycle {
    precondition {
      # Dots are escaped so "." matches only a literal dot in the service suffix.
      condition     = can(regex("^arn:aws:|\\.amazonaws\\.com$", var.principal_identifier))
      error_message = "Principal must be a valid ARN or AWS service endpoint. Bare account IDs and wildcards are not permitted for service-linked roles."
    }
  }
}
This gates the plan phase — the error surfaces before any AWS API call is made.