What exact IAM actions does Velero need to access an S3 backup storage location?

Velero requires two sets of permissions. On the bucket itself (arn:aws:s3:::BUCKET): s3:GetBucketLocation, s3:ListBucket, s3:ListBucketMultipartUploads. On bucket objects (arn:aws:s3:::BUCKET/*): s3:GetObject, s3:PutObject, s3:DeleteObject, s3:AbortMultipartUpload, s3:ListMultipartUploadParts. A missing s3:GetBucketLocation is a common cause of the 'failed to get backup storage location' 403 error.

Why does Velero show BackupStorageLocation phase 'Unavailable' even after I update the IAM policy?

IAM policy propagation can take 10–30 seconds, and the BSL reconciliation loop may not retry immediately. Force a re-validation by patching validationFrequency on the BSL object to 1m, or restart the Velero deployment with 'kubectl rollout restart deployment/velero -n velero'. Also verify that the correct IAM role is annotated on the velero service account. IRSA settings are injected when a pod is created, so a pod that started before the annotation changed keeps using the old role until it is restarted.

Is it safe to use a static IAM user credentials secret for Velero instead of IRSA?

No. Static credentials (aws_access_key_id / aws_secret_access_key in a Kubernetes Secret) are a significant security risk. Kubernetes Secrets are base64-encoded, not encrypted at rest by default, and accessible to any principal with 'get secret' RBAC in the velero namespace. A compromised node or misconfigured RBAC policy exposes long-lived credentials. Use IRSA on EKS or Workload Identity on GKE. These issue short-lived, automatically rotated tokens scoped to the specific service account.

Fixing Velero 'Failed to Get Backup Storage Location' S3 IAM Permission Error

Threat/Impact Level: HIGH | Downtime Risk: HIGH | Time to Fix: 10–20 mins

TL;DR

What broke: Velero's BackupStorageLocation controller cannot call s3:GetBucketLocation, s3:ListBucket, or related actions because the IAM principal (role or user) attached to the Velero service account is missing required permissions or scoped to the wrong resource ARN.
How to fix it: Attach a least-privilege IAM policy granting the exact S3 actions Velero requires, scoped to your specific bucket ARN — not *.

The Incident (What Does the Error Mean?)

You will see one or more of the following in velero pod logs or in kubectl get backupstoragelocations -n velero:

time="2024-01-15T03:12:44Z" level=error msg="failed to get backup storage location" 
error="rpc error: code = Unknown desc = RequestError: send request failed\ncaused by: AccessDenied: Access Denied\n\tstatus code: 403, request id: A1B2C3D4E5F6"

BackupStorageLocation "default" — Phase: Unavailable

Immediate consequence: Every scheduled and on-demand backup fails, and nothing notifies you unless you are already monitoring for it. Running velero backup get shows Failed, or the BSL shows Unavailable. Your cluster is running without a valid recovery point. If a node failure or namespace wipe happens right now, you have no restore target.

The Attack Vector / Blast Radius

This is a misconfiguration that creates two simultaneous risks:

1. Operational: Zero backup coverage. Velero's BSL validation runs on a reconciliation loop. One 403 poisons the entire location as Unavailable. All backups — including those for stateful workloads like Postgres, Kafka, and Elasticsearch PVCs — are silently skipped. No alert fires unless you have explicit BSL phase monitoring.

2. Security: Overly permissive fixes introduce lateral movement risk. The knee-jerk fix is attaching s3:* on arn:aws:s3:::*. This is catastrophic. A compromised Velero pod (via a malicious container image, a CVE in the Velero binary, or a stolen IRSA token) now has full read/write/delete access to every S3 bucket in the account — including CloudTrail logs, billing exports, and other backup buckets. An attacker can:

Exfiltrate all backup archives (they contain etcd secrets, kubeconfig fragments, PVC data)
Delete all backup objects, destroying your DR capability
Overwrite backups with corrupted archives to poison future restores

Blast radius of s3:* on *: Total account-wide S3 compromise from a single Velero pod escape.

How to Fix It

Basic Fix — Attach the Correct Minimum IAM Policy

The following actions are required by Velero for S3 BSL operation. Scope them to your specific bucket.

- {
-   "Effect": "Allow",
-   "Action": "s3:*",
-   "Resource": "*"
- }

+ {
+   "Version": "2012-10-17",
+   "Statement": [
+     {
+       "Effect": "Allow",
+       "Action": [
+         "s3:GetBucketLocation",
+         "s3:ListBucket",
+         "s3:ListBucketMultipartUploads"
+       ],
+       "Resource": "arn:aws:s3:::YOUR-VELERO-BUCKET"
+     },
+     {
+       "Effect": "Allow",
+       "Action": [
+         "s3:AbortMultipartUpload",
+         "s3:DeleteObject",
+         "s3:GetObject",
+         "s3:ListMultipartUploadParts",
+         "s3:PutObject"
+       ],
+       "Resource": "arn:aws:s3:::YOUR-VELERO-BUCKET/*"
+     }
+   ]
+ }
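
To confirm that a policy document actually matches the least-privilege shape above before attaching it, a small offline check can help. The following is a minimal sketch, not an official Velero or AWS tool: the function name audit_policy and the finding strings are hypothetical, and it compares ARNs and action names by exact string match rather than evaluating IAM wildcard semantics.

```python
import json

# Actions from the policy above, grouped by the resource they must be granted on.
BUCKET_ACTIONS = {"s3:GetBucketLocation", "s3:ListBucket", "s3:ListBucketMultipartUploads"}
OBJECT_ACTIONS = {
    "s3:AbortMultipartUpload", "s3:DeleteObject", "s3:GetObject",
    "s3:ListMultipartUploadParts", "s3:PutObject",
}


def _as_list(value):
    """IAM allows Statement, Action and Resource to be a single value or a list."""
    return list(value) if isinstance(value, list) else [value]


def audit_policy(policy_json, bucket):
    """Return a list of findings; an empty list means the policy passed."""
    policy = json.loads(policy_json)
    bucket_arn = f"arn:aws:s3:::{bucket}"
    object_arn = f"{bucket_arn}/*"
    granted = {bucket_arn: set(), object_arn: set()}
    findings = []

    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action", []))
        for action in actions:
            if "*" in action:
                findings.append(f"wildcard action: {action}")
        for resource in _as_list(stmt.get("Resource", [])):
            if resource in granted:
                granted[resource].update(actions)
            else:
                findings.append(f"unexpected resource: {resource}")

    # Report every required action that was not granted on its expected ARN.
    for arn, required in ((bucket_arn, BUCKET_ACTIONS), (object_arn, OBJECT_ACTIONS)):
        for action in sorted(required - granted[arn]):
            findings.append(f"missing {action} on {arn}")
    return findings
```

Calling audit_policy(open("policy.json").read(), "YOUR-VELERO-BUCKET") on the corrected policy should return an empty list, while the s3:* on * variant reports the wildcard, the unscoped resource, and every action it fails to grant explicitly.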

After applying, force a BSL re-validation:

kubectl patch backupstoragelocation default \
  -n velero \
  --type merge \
  -p '{"spec":{"validationFrequency":"1m"}}'

# Watch until phase flips to Available
kubectl get backupstoragelocation -n velero -w
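
For a CI step or cron check, watching interactively is not practical. The sketch below parses the output of kubectl get backupstoragelocation -n velero -o json and lists every location whose status.phase is not Available. The function names (bsl_phases, unavailable, check_cluster) are hypothetical helpers, and check_cluster assumes kubectl is on PATH with access to the cluster.

```python
import json
import subprocess


def bsl_phases(kubectl_json):
    """Map each BackupStorageLocation name to its status.phase ('' if not yet set)."""
    doc = json.loads(kubectl_json)
    # A list query returns a wrapper with "items"; a single named object does not.
    items = doc.get("items", [doc])
    return {
        item["metadata"]["name"]: (item.get("status") or {}).get("phase", "")
        for item in items
    }


def unavailable(phases):
    """Names of locations whose phase is anything other than 'Available'."""
    return sorted(name for name, phase in phases.items() if phase != "Available")


def check_cluster(namespace="velero"):
    """Query the live cluster and return the names of non-Available locations."""
    result = subprocess.run(
        ["kubectl", "get", "backupstoragelocation", "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return unavailable(bsl_phases(result.stdout))
```

A pipeline step can call check_cluster() and fail the job when the returned list is non-empty.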

Enterprise Best Practice — IRSA with Condition Keys (EKS)

Do not use long-lived IAM user credentials (velero-credentials secret with aws_access_key_id). Use IAM Roles for Service Accounts (IRSA).

# IAM Trust Policy — WRONG: overly broad trust
- {
-   "Effect": "Allow",
-   "Principal": {"Federated": "arn:aws:iam::ACCOUNT:oidc-provider/oidc.eks.REGION.amazonaws.com/id/OIDCID"},
-   "Action": "sts:AssumeRoleWithWebIdentity"
- }

# IAM Trust Policy — CORRECT: locked to Velero service account
+ {
+   "Effect": "Allow",
+   "Principal": {
+     "Federated": "arn:aws:iam::ACCOUNT:oidc-provider/oidc.eks.REGION.amazonaws.com/id/OIDCID"
+   },
+   "Action": "sts:AssumeRoleWithWebIdentity",
+   "Condition": {
+     "StringEquals": {
+       "oidc.eks.REGION.amazonaws.com/id/OIDCID:sub": "system:serviceaccount:velero:velero",
+       "oidc.eks.REGION.amazonaws.com/id/OIDCID:aud": "sts.amazonaws.com"
+     }
+   }
+ }
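
Hand-editing the OIDC provider string in three places is an easy way to introduce a typo that silently breaks role assumption. As a minimal sketch, the helper below generates the locked-down trust policy from its inputs and checks whether an existing one pins both :sub and :aud. The names build_trust_policy and is_pinned are hypothetical, and the oidc_provider argument is assumed to be the issuer without the https:// scheme.

```python
import json


def build_trust_policy(account_id, oidc_provider, namespace="velero", service_account="velero"):
    """Build an IRSA trust policy pinned to a single Kubernetes service account.

    oidc_provider is the issuer host and path without the scheme,
    for example "oidc.eks.REGION.amazonaws.com/id/OIDCID".
    """
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {"StringEquals": {
                f"{oidc_provider}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                f"{oidc_provider}:aud": "sts.amazonaws.com",
            }},
        }],
    }


def is_pinned(trust_policy):
    """True only if every statement restricts both :sub and :aud via StringEquals."""
    statements = trust_policy.get("Statement", [])
    if not statements:
        return False
    for stmt in statements:
        keys = stmt.get("Condition", {}).get("StringEquals", {})
        has_sub = any(key.endswith(":sub") for key in keys)
        has_aud = any(key.endswith(":aud") for key in keys)
        if not (has_sub and has_aud):
            return False
    return True
```

Printing json.dumps(build_trust_policy(...), indent=2) yields a document that can be supplied as the role's trust policy; is_pinned returns False for the overly broad variant shown above because it has no Condition block.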

Annotate the Velero service account:

kubectl annotate serviceaccount velero \
  -n velero \
  eks.amazonaws.com/role-arn=arn:aws:iam::ACCOUNT:role/velero-irsa-role
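
Annotating the service account does not change pods that are already running, because the EKS pod identity webhook injects AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE only when a pod is created. The sketch below compares the annotation against what a pod actually received. It expects the JSON from kubectl get serviceaccount velero -n velero -o json and from kubectl get pod on the Velero pod with -o json; the function names are hypothetical.

```python
import json

ROLE_ANNOTATION = "eks.amazonaws.com/role-arn"


def annotated_role(service_account_json):
    """Role ARN from the service account's IRSA annotation, or None if absent."""
    sa = json.loads(service_account_json)
    annotations = sa.get("metadata", {}).get("annotations") or {}
    return annotations.get(ROLE_ANNOTATION)


def injected_roles(pod_json):
    """AWS_ROLE_ARN values found in the pod spec, keyed by container name."""
    pod = json.loads(pod_json)
    roles = {}
    for container in pod.get("spec", {}).get("containers", []):
        for env in container.get("env") or []:
            if env.get("name") == "AWS_ROLE_ARN":
                roles[container["name"]] = env.get("value")
    return roles


def pod_uses_role(service_account_json, pod_json):
    """True if at least one container got AWS_ROLE_ARN and all match the annotation."""
    expected = annotated_role(service_account_json)
    roles = injected_roles(pod_json)
    return expected is not None and bool(roles) and all(
        role == expected for role in roles.values()
    )
```

If pod_uses_role returns False, restarting the deployment with kubectl rollout restart deployment/velero -n velero recreates the pod so the webhook can inject the current role.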

Add --no-secret to your Velero install so it does not mount static credentials:

velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.9.0 \
  --bucket YOUR-VELERO-BUCKET \
  --backup-location-config region=us-east-1 \
  --no-secret \
  --sa-annotations eks.amazonaws.com/role-arn=arn:aws:iam::ACCOUNT:role/velero-irsa-role

Prevention in CI/CD

1. Checkov — Scan IAM policies before terraform apply:

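checkov -d ./terraform --check CKV_AWS_63,CKV_AWS_355
# CKV_AWS_63: no IAM policy document may allow "*" as a statement's action
# CKV_AWS_355: no IAM policy document may allow "*" as a resource for restrictable actions
# Check IDs can change between Checkov releases; confirm them with `checkov --list`.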

2. OPA/Conftest — Enforce no s3:* in Velero namespace policies:

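package velero.iam

# IAM allows Action and Resource to be a single string or an array; normalize both.
as_array(x) = x { is_array(x) }
as_array(x) = [x] { is_string(x) }

deny[msg] {
  action := as_array(input.Statement[_].Action)[_]
  action == "s3:*"
  msg := "Velero IAM policy must not use wildcard s3:* action"
}

deny[msg] {
  resource := as_array(input.Statement[_].Resource)[_]
  resource == "*"
  msg := "Velero IAM policy must scope Resource to specific bucket ARN"
}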

3. Terraform: build the policy from an aws_iam_policy_document data source instead of an inline wildcard:

- resource "aws_iam_policy" "velero" {
-   policy = jsonencode({
-     Statement = [{ Effect = "Allow", Action = "s3:*", Resource = "*" }]
-   })
- }

+ resource "aws_iam_policy" "velero" {
+   policy = data.aws_iam_policy_document.velero_s3.json
+ }
+
+ data "aws_iam_policy_document" "velero_s3" {
+   statement {
+     actions   = ["s3:GetBucketLocation", "s3:ListBucket", "s3:ListBucketMultipartUploads"]
+     resources = ["arn:aws:s3:::${var.velero_bucket_name}"]
+   }
+   statement {
+     actions   = ["s3:GetObject", "s3:PutObject", "s3:DeleteObject", "s3:AbortMultipartUpload", "s3:ListMultipartUploadParts"]
+     resources = ["arn:aws:s3:::${var.velero_bucket_name}/*"]
+   }
+ }

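4. Prometheus alerting rule (routed through Alertmanager): fire when a BSL stays non-Available for 5 minutes. Confirm that your Velero version exposes this metric on its /metrics endpoint before relying on the rule: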

- alert: VeleroBSLUnavailable
  expr: velero_backup_storage_location_info{phase!="Available"} == 1
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: "Velero BSL {{ $labels.backup_storage_location }} is {{ $labels.phase }}"
    runbook: "https://your-wiki/velero-bsl-iam-fix"

With for: 5m, the alert fires once the BSL has been non-Available for five minutes. For schedules that run less often than that, this gives you time to fix IAM before you lose a recovery point.