
Fixing AWS STS GetCallerIdentity 'Invalid Security Token' Error for Temporary Credentials

Threat/Impact Level: HIGH | Exploitability/Downtime Risk: HIGH | Time to Fix: 5–15 mins

TL;DR

  • What broke: sts:GetCallerIdentity rejected your temporary credentials because the session token (sent as X-Amz-Security-Token) is missing, expired, malformed, or revoked, or because the system clock is skewed beyond AWS's 5-minute tolerance.
  • How to fix it: Reissue the temporary credentials via sts:AssumeRole, or let the EC2/ECS/EKS credential provider fetch fresh ones; verify NTP sync; confirm the session token is passed alongside the access key and secret key, not omitted.

The Incident (What Does the Error Mean?)

Raw error output from AWS CLI or SDK:

An error occurred (InvalidClientTokenId) when calling the GetCallerIdentity operation:
The security token included in the request is invalid.

or via Boto3/SDK:

botocore.exceptions.ClientError: An error occurred (InvalidClientTokenId) when calling the GetCallerIdentity operation:
The security token included in the request is invalid.

Immediate consequence: Every downstream call gated on identity verification fails. This includes service-to-service auth handshakes, Vault AWS auth backends, IRSA token validation in EKS, and any CI/CD pipeline step that assumes a role before deploying. The pipeline is dead until credentials are valid.
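Because the next step depends on which error code came back, it can help to branch on the code instead of the message text. The sketch below is a hypothetical helper (the name next_step and the advice strings are illustrative, not part of any SDK); it assumes a dict shaped like the response attribute of botocore's ClientError.

```python
# Hypothetical mapping from common AWS error codes to a first diagnostic step.
ACTIONS = {
    "InvalidClientTokenId": "Check that the access key exists and the session token is passed.",
    "ExpiredToken": "Re-issue the temporary credentials.",
    "SignatureDoesNotMatch": "Check the secret key and the system clock.",
    "RequestExpired": "Check the system clock against an NTP source.",
}


def next_step(error_response: dict) -> str:
    """Return advice for a dict shaped like botocore's ClientError.response."""
    code = error_response.get("Error", {}).get("Code", "")
    return ACTIONS.get(code, f"Unrecognized error code {code!r}; inspect the full response.")
```

In application code this would typically be called as next_step(e.response) inside an except botocore.exceptions.ClientError as e: block.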


The Attack Vector / Blast Radius

This error is not just an annoyance — it signals a credential lifecycle failure that has real security implications:

  • If the token was revoked: Someone with IAM write access may have revoked the role's active sessions (the console action attaches an AWSRevokeOlderSessions inline policy to the role), changed its trust policy, or deleted the role. Treat this as a potential credential compromise event until confirmed otherwise. Check CloudTrail immediately for DeleteRole, UpdateAssumeRolePolicy, or PutRolePolicy events.
  • If the token is expired and your code silently retries with stale credentials: Depending on how your application logs retry errors, the full AWS_SESSION_TOKEN value may end up in error traces, leaving a recently valid token in plaintext logs.
  • Clock skew >5 minutes: AWS SigV4 signing rejects requests outside a ±5-minute window. On EC2 instances with chrony misconfigured, or in air-gapped environments, this silently breaks all IAM auth. An attacker who can manipulate the system clock on a compromised host can deliberately push it outside the window to deny service.
  • Blast radius: Any service using this identity — S3 bucket access, Secrets Manager reads, ECR pulls, DynamoDB operations — is fully blocked. In a microservices mesh where one service assumes a role and vends sub-tokens, the failure cascades to every dependent service.
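One way to reduce the log-exposure risk described above is to mask session tokens before records reach a handler. This is a minimal sketch using Python's standard logging module; the class name RedactSessionToken and the regular expression are assumptions for illustration, and the pattern only covers simple key=value or key: value forms.

```python
import logging
import re

# Matches pairs such as AWS_SESSION_TOKEN=<value> or X-Amz-Security-Token: <value>.
_TOKEN_RE = re.compile(r"(?i)(aws_session_token|x-amz-security-token)(\s*[=:]\s*)(\S+)")


class RedactSessionToken(logging.Filter):
    """Replace session token values in log messages with a placeholder."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so tokens passed as %-style args are also caught.
        message = record.getMessage()
        record.msg = _TOKEN_RE.sub(r"\1\2[REDACTED]", message)
        record.args = None
        return True  # never drop the record, only rewrite it
```

Attach it with logging.getLogger().addFilter(RedactSessionToken()) or on a specific handler. Quoted or JSON-encoded values would need a broader pattern.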

How to Fix It (The Solution)

Root Cause Checklist (run in this order)

  1. Check token expiry: AssumeRole credentials last from 15 minutes up to the role's maximum session duration (at most 12 hours, or 1 hour when role chaining), as set by DurationSeconds at assume-role time. GetSessionToken and GetFederationToken credentials can last up to 36 hours.
  2. Verify clock sync: run timedatectl status or chronyc tracking. Offset must be <5 minutes.
  3. Confirm session token is present: temporary credentials require three components (AccessKeyId, SecretAccessKey, AND SessionToken). Omitting the session token is a common cause of this error.
  4. Check partition/region: A token issued in aws-cn or aws-us-gov is invalid in the standard aws partition.
  5. Check CloudTrail for revocation: look for DeleteRole, UpdateAssumeRolePolicy, DetachRolePolicy, or PutRolePolicy (AWSRevokeOlderSessions) events in the last hour.
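The first three checks can be sketched offline, without calling AWS. The helper names diagnose and clock_skew_seconds below are hypothetical; the sketch relies on the documented key prefixes (ASIA for temporary keys, AKIA for long-term keys) and assumes you can obtain a trusted Date header from any HTTPS response to compare against the local clock.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def diagnose(access_key_id, session_token, expiration, now=None):
    """Return a list of findings for one set of credentials.

    expiration is a timezone-aware datetime, or None when unknown.
    """
    now = now or datetime.now(timezone.utc)
    findings = []
    if access_key_id.startswith("ASIA") and not session_token:
        findings.append("temporary key (ASIA prefix) without a session token")
    if access_key_id.startswith("AKIA") and session_token:
        findings.append("long-term key (AKIA prefix) paired with a session token")
    if expiration is not None and expiration <= now:
        findings.append(f"credentials expired {now - expiration} ago")
    return findings


def clock_skew_seconds(local_now, http_date_header):
    """Absolute difference between the local clock and a server Date header."""
    server_time = parsedate_to_datetime(http_date_header)
    return abs((local_now - server_time).total_seconds())
```

An empty list from diagnose plus a skew well under 300 seconds suggests moving on to the partition and CloudTrail checks.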

Basic Fix — Boto3 (Python)

import boto3

- # WRONG: Omitting session token for temporary credentials
- client = boto3.client(
-     'sts',
-     aws_access_key_id='ASIA....',
-     aws_secret_access_key='wJalrXUtn....',
- )

+ # CORRECT: All three components required for temporary credentials
+ client = boto3.client(
+     'sts',
+     aws_access_key_id='ASIA....',
+     aws_secret_access_key='wJalrXUtn....',
+     aws_session_token='IQoJb3JpZ2lu...',  # Required for STS-issued creds
+ )

response = client.get_caller_identity()

Basic Fix — AWS CLI credential refresh

- # Stale assume-role output hardcoded in ~/.aws/credentials
- [default]
- aws_access_key_id = ASIA...OLD
- aws_secret_access_key = oldSecret
- # aws_session_token line missing entirely

+ # Re-issue via CLI and export fresh credentials
+ export $(aws sts assume-role \
+   --role-arn arn:aws:iam::123456789012:role/MyRole \
+   --role-session-name debug-session \
+   --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
+   --output text | awk '{print "AWS_ACCESS_KEY_ID="$1" AWS_SECRET_ACCESS_KEY="$2" AWS_SESSION_TOKEN="$3}')
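If the awk one-liner feels fragile, the same transformation can be sketched in Python against the JSON that aws sts assume-role prints by default. The function name export_lines is hypothetical; the Credentials keys are the ones the CLI command above already queries.

```python
import json
import shlex


def export_lines(assume_role_json: str) -> list[str]:
    """Turn `aws sts assume-role` JSON output into shell export statements."""
    creds = json.loads(assume_role_json)["Credentials"]
    mapping = {
        "AWS_ACCESS_KEY_ID": creds["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["SecretAccessKey"],
        "AWS_SESSION_TOKEN": creds["SessionToken"],
    }
    # shlex.quote guards against shell metacharacters in the values.
    return [f"export {name}={shlex.quote(value)}" for name, value in mapping.items()]
```

Emitting all three variables together avoids the failure mode where a stale AWS_SESSION_TOKEN from an earlier session is paired with a new key.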

Enterprise Best Practice — Auto-Refresh with SDK Credential Providers

Never hardcode or manually manage temporary credentials in production. Use the SDK's built-in provider chain, which handles refresh automatically.

- # ANTI-PATTERN: Manually managing short-lived credentials in application code
- creds = get_creds_from_vault()  # Fetched once at startup, never refreshed
- client = boto3.client('sts',
-     aws_access_key_id=creds['AccessKeyId'],
-     aws_secret_access_key=creds['SecretAccessKey'],
-     aws_session_token=creds['SessionToken']
- )

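+ # BEST PRACTICE: wrap AssumeRole in botocore's RefreshableCredentials so the SDK re-issues credentials before they expire
+ import boto3
+ from botocore.credentials import RefreshableCredentials
+ from botocore.session import get_session
+
+ def refresh_credentials():
+     # Uses whatever base identity the default provider chain resolves
+     sts = boto3.client('sts')
+     response = sts.assume_role(
+         RoleArn='arn:aws:iam::123456789012:role/MyRole',
+         RoleSessionName='auto-refresh-session',
+         DurationSeconds=3600
+     )
+     c = response['Credentials']
+     return {
+         'access_key': c['AccessKeyId'],
+         'secret_key': c['SecretAccessKey'],
+         'token': c['SessionToken'],
+         'expiry_time': c['Expiration'].isoformat()
+     }
+
+ refreshable = RefreshableCredentials.create_from_metadata(
+     metadata=refresh_credentials(),
+     refresh_using=refresh_credentials,
+     method='sts-assume-role'
+ )
+
+ # Attach the refreshable credentials to a session. Note: _credentials is a
+ # private botocore attribute, so this common pattern may change between releases.
+ botocore_session = get_session()
+ botocore_session._credentials = refreshable
+ session = boto3.Session(botocore_session=botocore_session)
+ client = session.client('sts')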
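The idea behind auto-refresh is simply to re-issue credentials some time before they expire rather than after a call fails. The following is a simplified illustration of that check, not botocore's actual implementation; the 15-minute window and the name should_refresh are assumptions for this sketch.

```python
from datetime import datetime, timedelta, timezone

# Assumed refresh window for this sketch; tune it to your session duration.
ADVISORY_WINDOW = timedelta(minutes=15)


def should_refresh(expiry_time_iso: str, now: datetime,
                   window: timedelta = ADVISORY_WINDOW) -> bool:
    """True when the credentials expire within `window` of `now`.

    expiry_time_iso is an ISO 8601 string with a UTC offset, such as the
    'expiry_time' value produced by datetime.isoformat().
    """
    expiry = datetime.fromisoformat(expiry_time_iso)
    return expiry - now <= window
```

Refreshing ahead of expiry means in-flight requests signed just before the deadline still carry a valid token.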

For EKS/IRSA: Ensure AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_ARN env vars are injected by the pod identity webhook. The SDK handles the rest.

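For EC2/ECS: Do not set AWS_ACCESS_KEY_ID manually. Delete any static AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN env vars and let the SDK fetch auto-rotating credentials from the instance metadata service (IMDSv2) on EC2 or the task role credentials endpoint on ECS.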
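Static credential env vars matter because the SDK's default provider chain consults environment variables before web identity, container, or instance metadata providers, so a stale key silently shadows the role. A small startup audit can catch this; audit_env is a hypothetical helper and the messages are illustrative.

```python
import os

STATIC_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN")
IRSA_VARS = ("AWS_WEB_IDENTITY_TOKEN_FILE", "AWS_ROLE_ARN")


def audit_env(env=None):
    """Return a list of credential-related problems found in the environment."""
    env = os.environ if env is None else env
    problems = []
    shadowing = [name for name in STATIC_VARS if env.get(name)]
    if shadowing:
        problems.append("static credentials shadow role credentials: " + ", ".join(shadowing))
    irsa_set = [name for name in IRSA_VARS if env.get(name)]
    if len(irsa_set) == 1:
        # IRSA needs both variables; one alone suggests a broken webhook injection.
        problems.append(f"IRSA is half-configured: only {irsa_set[0]} is set")
    return problems
```

Calling audit_env() at process start and logging the result gives an early signal before the first AWS call fails.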




Prevention in CI/CD

1. Flag hardcoded credentials with Checkov

# .checkov.yml: run only the checks that flag hardcoded AWS credentials
check:
  - CKV_AWS_45   # Ensure no hardcoded secrets exist in Lambda environment variables
  - CKV_AWS_46   # Ensure no hardcoded secrets exist in EC2 user data
  - CKV_SECRET_2 # AWS Access Key found in source

2. OPA policy — block static credentials in ECS task definitions

package aws.ecs

deny[msg] {
  container := input.containerDefinitions[_]
  env := container.environment[_]
  regex.match(`^AWS_(ACCESS_KEY|SECRET|SESSION_TOKEN)`, env.name)
  msg := sprintf("Container '%v' hardcodes AWS credentials. Use task IAM role instead.", [container.name])
}
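For teams without OPA in the pipeline, the same rule can be approximated in a short script, for example as a pre-commit hook over task definition JSON. This is a sketch mirroring the policy above, not a replacement for it; the function name violations is hypothetical.

```python
import re

# Same pattern as the Rego policy: env var names that look like AWS credentials.
_CRED_NAME = re.compile(r"^AWS_(ACCESS_KEY|SECRET|SESSION_TOKEN)")


def violations(task_definition: dict) -> list[str]:
    """Return one message per container that sets a credential-like env var."""
    messages = []
    for container in task_definition.get("containerDefinitions", []):
        name = container.get("name", "<unnamed>")
        for env in container.get("environment", []):
            if _CRED_NAME.match(env.get("name", "")):
                messages.append(
                    f"Container '{name}' hardcodes AWS credentials. Use task IAM role instead."
                )
                break  # one message per container, like the set semantics of deny
    return messages
```

A non-empty result would fail the check; load the input with json.load from the task definition file.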

3. GitHub Actions — use OIDC, never store STS tokens as secrets

- # WRONG: Storing long-lived or manually-refreshed STS tokens in GitHub Secrets
- env:
-   AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
-   AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
-   AWS_SESSION_TOKEN: ${{ secrets.AWS_SESSION_TOKEN }}  # Expires, breaks pipeline

+ # CORRECT: GitHub OIDC federation — zero stored secrets, auto-rotating
+ permissions:
+   id-token: write
+   contents: read
+ steps:
+   - uses: aws-actions/configure-aws-credentials@v4
+     with:
+       role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsRole
+       aws-region: us-east-1
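The workflow above only works if the role's trust policy accepts GitHub's OIDC provider. The sketch below generates such a trust policy; the function name and the org and repo arguments are placeholders, and it assumes an IAM OIDC provider for token.actions.githubusercontent.com already exists in the account.

```python
import json


def github_oidc_trust_policy(account_id: str, org: str, repo: str) -> dict:
    """Build a trust policy that lets one GitHub repository assume the role."""
    provider = "token.actions.githubusercontent.com"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                # Audience used by aws-actions/configure-aws-credentials by default.
                "StringEquals": {f"{provider}:aud": "sts.amazonaws.com"},
                # Restrict to a single repository; tighten further to a branch if needed.
                "StringLike": {f"{provider}:sub": f"repo:{org}/{repo}:*"},
            },
        }],
    }


if __name__ == "__main__":
    print(json.dumps(github_oidc_trust_policy("123456789012", "my-org", "my-repo"), indent=2))
```

Scoping the sub condition to a repository keeps other GitHub repositories from assuming the role with their own OIDC tokens.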

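4. Instance profile enforcement via AWS Config, plus NTP drift alerting

Enable the ec2-instance-profile-attached AWS Config rule so that every instance receives auto-rotating credentials from an instance profile, and monitor clock drift separately with a CloudWatch metric. The alarm below uses chrony_tracking_system_time_offset, which assumes your agent publishes chrony's system time offset in seconds under that name. Alert if the offset exceeds 240 seconds, before AWS's 300-second hard cutoff.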

# Terraform — CloudWatch alarm for NTP drift
resource "aws_cloudwatch_metric_alarm" "ntp_drift" {
  alarm_name          = "ntp-offset-critical"
  metric_name         = "chrony_tracking_system_time_offset"
  namespace           = "CWAgent"
  statistic           = "Maximum"
  period              = 60
  evaluation_periods  = 2
  threshold           = 240
  comparison_operator = "GreaterThanThreshold"
  alarm_actions       = [aws_sns_topic.ops_alerts.arn]
}
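If no agent publishes that metric yet, the offset can be derived from chronyc tracking output. The sketch below parses the "System time" line and applies the same 240-second threshold; the function names are hypothetical, and the sign convention (positive when the clock is fast) is a choice made for this example.

```python
import re

_SYSTEM_TIME = re.compile(
    r"System time\s*:\s*([0-9.]+)\s+seconds\s+(slow|fast)\s+of NTP time"
)


def system_time_offset(chronyc_tracking_output: str) -> float:
    """Return the offset in seconds: positive if the clock is fast, negative if slow."""
    match = _SYSTEM_TIME.search(chronyc_tracking_output)
    if match is None:
        raise ValueError("no 'System time' line found in chronyc tracking output")
    seconds = float(match.group(1))
    return seconds if match.group(2) == "fast" else -seconds


def exceeds_threshold(offset_seconds: float, threshold: float = 240.0) -> bool:
    """Compare the absolute offset so slow and fast clocks are treated alike."""
    return abs(offset_seconds) > threshold
```

The output of subprocess.run(["chronyc", "tracking"], capture_output=True, text=True).stdout can be passed straight to system_time_offset.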
