
Fixing AccessDenied on glue:CreateDatabase When AWS Lake Formation Is Enabled

Threat/Impact Level: HIGH | Exploitability/Downtime Risk: HIGH | Time to Fix: 10–20 mins


TL;DR

  • What broke: Lake Formation's authorization layer is rejecting glue:CreateDatabase even though your IAM role has Glue permissions. Once LF governs the Data Catalog, an IAM allow alone is not enough; the principal also needs a matching LF grant.
  • How to fix it: Grant the calling principal the CREATE_DATABASE permission in Lake Formation. The grant itself must be issued by a Data Lake Administrator or by a principal allowed to call lakeformation:GrantPermissions; the Glue role does not need either of those.

The Incident (What Does the Error Mean?)

You hit this at 2 AM during a pipeline deploy or a Terraform apply:

An error occurred (AccessDenied) when calling the CreateDatabase operation:
  User: arn:aws:sts::123456789012:assumed-role/MyGlueRole/session
  is not authorized to perform: glue:CreateDatabase
  on resource: arn:aws:glue:us-east-1:123456789012:catalog
  with an explicit deny

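The phrase "with an explicit deny" is the tell. This is not simply a missing IAM Allow; something is actively refusing the call. Once Lake Formation governs the Data Catalog (that is, once the default IAMAllowedPrincipals grants have been removed from the catalog settings), a Glue Data Catalog call has to pass both the IAM check and the Lake Formation check. Your IAM policy is still required but no longer sufficient: LF has its own permission model layered on top, and the principal has zero LF grants. If the error survives the LF grant described below, also look for an explicit Deny in an SCP, permissions boundary, or identity policy, because that is the wording IAM itself uses for those.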

Immediate consequence: Any ETL job, Glue crawler, Terraform Glue resource, or data pipeline that calls CreateDatabase is dead in the water. CI/CD pipelines fail silently or with cryptic errors if your tooling doesn't surface the full AWS error response.
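When tooling hides part of the error, it can help to pull the principal, action, and resource out of the raw message and derive the IAM role ARN that a Lake Formation grant would need to target. The following is a minimal sketch, assuming the message has the shape shown above; the function names `parse_access_denied` and `session_to_role_arn` are illustrative and not part of any AWS SDK.

```python
import re
from typing import Optional

# Pattern for the AccessDenied message shape shown above (assumed shape;
# other services and SDK versions may word it differently).
_DENIED = re.compile(
    r"User:\s*(?P<principal>\S+)\s+is not authorized to perform:\s*(?P<action>\S+)"
    r"\s+on resource:\s*(?P<resource>\S+)"
)
_SESSION = re.compile(r"^arn:(aws[\w-]*):sts::(\d{12}):assumed-role/([^/]+)/.+$")


def session_to_role_arn(arn: str) -> str:
    """Map an assumed-role session ARN to its IAM role ARN.

    The session ARN drops any role path, so a role created under a path
    (for example /service-role/) cannot be fully reconstructed here.
    """
    match = _SESSION.match(arn)
    if match is None:
        return arn  # IAM users and plain role ARNs pass through unchanged
    partition, account, role = match.groups()
    return f"arn:{partition}:iam::{account}:role/{role}"


def parse_access_denied(message: str) -> Optional[dict]:
    """Extract principal, action, and resource from an AccessDenied message."""
    match = _DENIED.search(message)
    if match is None:
        return None
    fields = match.groupdict()
    fields["explicit_deny"] = "explicit deny" in message
    fields["role_arn"] = session_to_role_arn(fields["principal"])
    return fields
```

The `role_arn` field is the value to use as the Lake Formation principal; the session ARN from the error message is not a grantable principal.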


The Attack Vector / Blast Radius

This misconfiguration cuts both ways — it's a security control working as intended that becomes an operational incident when not provisioned correctly.

Why this is dangerous if you work around it wrong:

The instinctive "fix" engineers reach for is re-enabling Use only IAM access control in Lake Formation settings, or granting lakeformation:* to the role. Both are catastrophic:

  • Re-enabling IAM-only mode takes Lake Formation out of the authorization path: new databases and tables fall back to the IAMAllowedPrincipals default, and any resource where that default is restored no longer enforces LF column/row/table grants. Every downstream Athena query, Redshift Spectrum job, and EMR cluster that relied on those grants is affected at once.
  • Granting lakeformation:* to a service role means that role can now grant itself access to any table in your data catalog, exfiltrate schemas, and register arbitrary S3 paths as data lake locations — a full lateral movement path inside your data perimeter.
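One way to keep the second workaround from creeping into a policy is to scan IAM policy documents for Allow statements whose action patterns cover privilege-widening Lake Formation actions. This is a rough sketch only: `RISKY_ACTIONS` is an illustrative subset chosen for this example, `risky_lf_grants` is a hypothetical helper, and the matching ignores NotAction, conditions, and resource scoping.

```python
from fnmatch import fnmatchcase

# Lake Formation actions that let a principal widen its own access.
# Illustrative subset, not an exhaustive list.
RISKY_ACTIONS = (
    "lakeformation:GrantPermissions",
    "lakeformation:PutDataLakeSettings",
    "lakeformation:RegisterResource",
)


def _as_list(value):
    """IAM accepts a single string or a list for Action; normalize to a list."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def risky_lf_grants(policy: dict) -> list:
    """Return (statement_index, pattern, matched_action) for risky Allow statements."""
    findings = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    for index, statement in enumerate(statements):
        if statement.get("Effect") != "Allow":
            continue
        for pattern in _as_list(statement.get("Action")):
            for action in RISKY_ACTIONS:
                # IAM action names are case-insensitive and support * and ? wildcards
                if fnmatchcase(action.lower(), pattern.lower()):
                    findings.append((index, pattern, action))
    return findings
```

A policy that only adds lakeformation:GetDataAccess produces no findings, while lakeformation:* or a bare * is reported once per risky action it covers.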

Blast radius of the root cause:

  • All Glue jobs, crawlers, and Terraform modules using this role fail.
  • If this role is shared across environments (common anti-pattern), prod, staging, and dev pipelines all halt simultaneously.
  • If the role is attached to a Lambda or Step Functions state machine, downstream orchestration silently skips database provisioning steps, causing data quality failures hours later.

How to Fix It (The Solution)

Basic Fix — Grant CREATE DATABASE in Lake Formation Console

  1. Navigate to AWS Lake Formation → Permissions → Data lake permissions → Grant.
  2. Select the IAM principal (role or user) that is being denied.
  3. Under LF-Tags or catalog resources, choose Named data catalog resources.
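  4. Select the catalog (your account ID) and leave Databases empty. CREATE_DATABASE is a catalog-level permission, so there is no database to pick yet.
  5. Check Create database under Catalog permissions. In console versions that do not show catalog permissions in this dialog, the same grant is available under Administrative roles and tasks → Database creators.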
  6. Click Grant.

This takes effect immediately — no IAM propagation delay.


Enterprise Best Practice — Least-Privilege LF Grant via CLI + Terraform

The bad state — IAM policy that looks sufficient but is silently overridden by LF:

- {
-   "Version": "2012-10-17",
-   "Statement": [
-     {
-       "Effect": "Allow",
-       "Action": [
-         "glue:CreateDatabase",
-         "glue:GetDatabase",
-         "glue:DeleteDatabase"
-       ],
-       "Resource": "*"
-     }
-   ]
- }
- # Missing: Zero Lake Formation grants. LF intercepts and denies.
- # Missing: No lakeformation:GetDataAccess on the role.

The correct state — IAM policy + mandatory LF passthrough permission + Terraform LF grant:

+ # Step 1: IAM Policy — add lakeformation:GetDataAccess (required for LF passthrough)
+ {
+   "Version": "2012-10-17",
+   "Statement": [
+     {
+       "Effect": "Allow",
+       "Action": [
+         "glue:CreateDatabase",
+         "glue:GetDatabase",
+         "glue:DeleteDatabase",
+         "lakeformation:GetDataAccess"
+       ],
+       "Resource": "*"
+     }
+   ]
+ }
+
+ # Step 2: Terraform, Lake Formation catalog-level grant (least privilege)
+ data "aws_caller_identity" "current" {}
+
+ resource "aws_lakeformation_permissions" "glue_role_create_db" {
+   principal   = aws_iam_role.glue_role.arn
+   permissions = ["CREATE_DATABASE"]
+
+   # catalog_resource is a boolean argument, not a block; catalog_id is top-level
+   catalog_resource = true
+   catalog_id       = data.aws_caller_identity.current.account_id
+ }
+
+ # Step 3: If creating a SPECIFIC database, scope to that database post-creation
+ resource "aws_lakeformation_permissions" "glue_role_db_permissions" {
+   principal   = aws_iam_role.glue_role.arn
+   permissions = ["ALTER", "DROP", "DESCRIBE"]
+
+   database {
+     catalog_id = data.aws_caller_identity.current.account_id
+     name       = aws_glue_catalog_database.target_db.name
+   }
+ }

CLI equivalent for immediate remediation:

# Grant CREATE_DATABASE on the catalog root
aws lakeformation grant-permissions \
  --principal DataLakePrincipalIdentifier="arn:aws:iam::ACCOUNT_ID:role/MyGlueRole" \
  --resource '{"Catalog": {}}' \
  --permissions "CREATE_DATABASE" \
  --region us-east-1

# Verify the grant landed
aws lakeformation list-permissions \
  --principal DataLakePrincipalIdentifier="arn:aws:iam::ACCOUNT_ID:role/MyGlueRole" \
  --region us-east-1
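To turn the verification step into a yes/no answer, the JSON returned by list-permissions can be checked for a catalog-level entry. The sketch below assumes the documented response shape (a `PrincipalResourcePermissions` list whose entries carry `Principal`, `Resource`, and `Permissions`); `has_catalog_permission` is a hypothetical helper name, and pagination via `NextToken` is not handled.

```python
def has_catalog_permission(response: dict, principal_arn: str,
                           permission: str = "CREATE_DATABASE") -> bool:
    """Return True if the principal holds `permission` on the catalog itself."""
    for entry in response.get("PrincipalResourcePermissions", []):
        principal = entry.get("Principal", {}).get("DataLakePrincipalIdentifier")
        if principal != principal_arn:
            continue
        # A catalog-level grant has a "Catalog" key in Resource;
        # database and table grants use "Database" or "Table" instead.
        if "Catalog" not in entry.get("Resource", {}):
            continue
        if permission in entry.get("Permissions", []):
            return True
    return False
```

Feed it the parsed output of the list-permissions command above, for example `json.load(sys.stdin)` when the CLI output is piped in with `--output json`.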

Critical note on Data Lake Administrators: If your account has designated Data Lake Admins, those principals bypass all LF permission checks. Do not make your Glue service role a DL Admin to solve this — that grants it unrestricted access to your entire catalog and all registered S3 paths.
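A quick guard against that shortcut is to check the data lake settings for the service role. The sketch below assumes the response shape of get-data-lake-settings (a `DataLakeSettings` object with a `DataLakeAdmins` list of `DataLakePrincipalIdentifier` entries); `is_data_lake_admin` is a hypothetical helper, not an AWS API.

```python
def is_data_lake_admin(settings_response: dict, principal_arn: str) -> bool:
    """Return True if principal_arn is listed as a Data Lake Administrator."""
    settings = settings_response.get("DataLakeSettings", {})
    admins = settings.get("DataLakeAdmins", [])
    return any(
        admin.get("DataLakePrincipalIdentifier") == principal_arn
        for admin in admins
    )
```

A CI step could fail the build when this returns True for any Glue or pipeline service role, since those roles should hold narrow grants instead.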




Prevention in CI/CD

1. Checkov — Catch Missing LF Grants in Terraform Plans

Add this to your .checkov.yml or pipeline step:

# .checkov.yml
check:
  - CKV_AWS_94         # Ensure Glue Data Catalog encryption is enabled
  - CKV_CUSTOM_LF_001  # Custom check: every aws_glue_catalog_database needs a
                       # paired aws_lakeformation_permissions resource in the same module
external-checks-dir:
  - checkov/custom_checks

For a custom Checkov policy:

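# checkov/custom_checks/check_lf_grant_paired_with_glue_db.py
# The directory also needs an empty __init__.py so Checkov can import it.
from checkov.common.models.enums import CheckCategories, CheckResult
from checkov.terraform.checks.resource.base_resource_check import BaseResourceCheck


class GlueDBHasLFGrant(BaseResourceCheck):
    def __init__(self):
        name = "Ensure aws_glue_catalog_database has paired LF permissions"
        id = "CKV_CUSTOM_LF_001"
        supported_resources = ["aws_glue_catalog_database"]
        categories = [CheckCategories.IAM]
        super().__init__(name=name, id=id, categories=categories,
                         supported_resources=supported_resources)

    def scan_resource_conf(self, conf):
        # A Python resource check sees one resource at a time, so it cannot
        # look up the paired aws_lakeformation_permissions resource. This check
        # therefore fails every Glue database on purpose, as a manual-review
        # gate. After verifying the grant in the plan, suppress it inline with
        # `#checkov:skip=CKV_CUSTOM_LF_001:<reason>` on that resource.
        return CheckResult.FAILED


# Checkov registers a check only when the module instantiates it.
check = GlueDBHasLFGrant()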

2. OPA / Conftest — Enforce LF Grant Existence in Terraform Plans

# policies/lakeformation_grant_required.rego
package terraform.lakeformation

# Rego v0 rule syntax; on OPA 1.0+ write `deny contains msg if { ... }`.
deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "aws_glue_catalog_database"
  resource.change.actions[_] == "create"

  # Match on the planned database name, not the Terraform address:
  # the LF grant's database block carries the name, never the address.
  db_name := resource.change.after.name
  not lf_grant_exists(db_name)

  msg := sprintf(
    "[LF-001] aws_glue_catalog_database '%v' has no paired aws_lakeformation_permissions grant. CreateDatabase will fail if Lake Formation is enabled.",
    [resource.address]
  )
}

lf_grant_exists(db_name) {
  grant := input.resource_changes[_]
  grant.type == "aws_lakeformation_permissions"
  grant.change.after.database[_].name == db_name
}
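The pairing rule is easy to get subtly wrong, so it can be worth exercising the same logic against a sample plan before trusting it in a gate. Below is a minimal Python sketch of the intended check, assuming the `terraform show -json` plan layout (`resource_changes[].change.after`) and that database names are known at plan time; `unpaired_glue_databases` is a hypothetical helper written for this example.

```python
def unpaired_glue_databases(plan: dict) -> list:
    """Return addresses of Glue databases being created without an LF grant."""
    changes = plan.get("resource_changes", [])

    # Collect every database name targeted by a Lake Formation grant.
    granted = set()
    for rc in changes:
        if rc.get("type") != "aws_lakeformation_permissions":
            continue
        after = (rc.get("change") or {}).get("after") or {}
        for block in after.get("database") or []:
            if block.get("name"):
                granted.add(block["name"])

    # Report created databases whose name no grant refers to.
    missing = []
    for rc in changes:
        if rc.get("type") != "aws_glue_catalog_database":
            continue
        change = rc.get("change") or {}
        if "create" not in change.get("actions", []):
            continue
        name = (change.get("after") or {}).get("name")
        if name not in granted:
            missing.append(rc.get("address"))
    return missing
```

Comparing names rather than addresses matters here: the grant's database block holds the database name, so a comparison against the resource address never matches.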

3. Pipeline Gate — Validate LF Settings Before Apply

#!/bin/bash
# scripts/validate_lakeformation.sh — run before terraform apply in CI

set -euo pipefail

LF_SETTINGS=$(aws lakeformation get-data-lake-settings --region "${AWS_REGION}")
IAM_ALLOWED=$(echo "$LF_SETTINGS" | jq -r '.DataLakeSettings.CreateDatabaseDefaultPermissions | length')

if [ "$IAM_ALLOWED" -gt 0 ]; then
  echo "[WARN] CreateDatabaseDefaultPermissions is non-empty: new databases still default to IAM-only access control (IAMAllowedPrincipals)."
fi

# Resolve the caller to an IAM role ARN. sts returns an assumed-role session ARN
# (arn:aws:sts::ACCOUNT:assumed-role/ROLE/SESSION), but LF grants are keyed on the
# role ARN (arn:aws:iam::ACCOUNT:role/ROLE). Roles with a path need the path re-added by hand.
CALLER_ARN=$(aws sts get-caller-identity --query Arn --output text)
ROLE_ARN=$(echo "$CALLER_ARN" | sed -E 's#^arn:aws:sts::([0-9]+):assumed-role/([^/]+)/.*$#arn:aws:iam::\1:role/\2#')

# Verify the deploying role has CREATE_DATABASE grant
PERMS=$(aws lakeformation list-permissions \
  --principal DataLakePrincipalIdentifier="$ROLE_ARN" \
  --resource '{"Catalog": {}}' \
  --query 'PrincipalResourcePermissions[].Permissions[]' \
  --region "${AWS_REGION}" \
  --output text)

if [[ ! "$PERMS" =~ "CREATE_DATABASE" ]]; then
  echo "[FATAL] Role $ROLE_ARN lacks CREATE_DATABASE on LF catalog. Apply will fail."
  exit 1
fi

echo "[OK] Lake Formation CREATE_DATABASE grant verified for $ROLE_ARN"

Wire this into your GitHub Actions or GitLab CI as a pre-terraform apply step. It catches the missing grant before your pipeline burns 10 minutes on a Terraform run that will hard-fail at the Glue resource.
