What causes the 'Lock ID mismatch' error in Terraform with S3 and DynamoDB?

The error occurs when Terraform's conditional write to DynamoDB (`attribute_not_exists(LockID)`) fails because a lock record already exists for that state path. Common causes: (1) a CI runner crashed or was killed mid-apply and never released the lock, (2) the `key` path in the S3 backend config was changed, creating a mismatch between the stored lock's path and the new lock request, or (3) two concurrent Terraform processes attempted to acquire the same lock simultaneously.

Is it safe to run `terraform force-unlock` or delete the DynamoDB item directly?

It is safe ONLY if you have confirmed with absolute certainty that no active Terraform process currently holds the lock. Check your CI/CD pipelines, running EC2 instances, and local developer machines. If any live process owns the lock and you delete it, that process will continue writing to state without a lock, risking state corruption if another process also starts. Always verify the lock's `Created` timestamp and `Who` field before force-unlocking.

How do I prevent orphaned Terraform state locks from blocking CI/CD pipelines in the future?

Three layers of defense: (1) Set `interruptible: false` in your CI job configuration to prevent mid-apply cancellations. (2) Add an `after_script` or post-job hook that calls `terraform force-unlock -force` with the stored lock ID on job failure. (3) Implement a DynamoDB TTL attribute on your lock table with a scheduled Lambda that stamps `ExpiresAt` on lock creation, ensuring locks auto-expire after a safe maximum window (e.g., 2 hours) even if the unlock call is never made.

How to Fix Terraform S3 Backend 'Failed to Lock State: Lock ID Mismatch' with DynamoDB

Threat/Impact Level: HIGH | Exploitability/Downtime Risk: HIGH | Time to Fix: 10–20 mins

TL;DR

What broke: Terraform attempted to acquire a DynamoDB state lock but found the LockID in the DynamoDB table doesn't match the lock it's trying to create or release — usually a stale lock from a crashed apply, a concurrent run, or a backend reconfiguration that changed the lock key path.
How to fix it: Identify the orphaned lock item in DynamoDB, verify no active Terraform process owns it, then force-unlock using terraform force-unlock <LOCK_ID> or delete the DynamoDB item directly via AWS CLI.

The Incident (What Does the Error Mean?)

Raw error output from a failed terraform apply or terraform plan:

Error: Failed to lock state

Error acquiring the state lock: ConditionalCheckFailedException: The conditional request failed

Lock Info:
  ID:        a3f1c2d4-7e89-4b12-a1d0-5f6e7c8b9012
  Path:      s3://my-tf-state-bucket/prod/terraform.tfstate
  Operation: OperationTypePlan
  Who:       ci-runner@gitlab-runner-prod
  Version:   1.6.3
  Created:   2024-11-14 03:22:11.408742 +0000 UTC
  Info:

Terraform acquires a state lock to protect the state from being written
by multiple users at the same time. Please resolve the issue above and
try again. For most commands, you can disable locking with the
"-lock=false" flag, but this is not recommended.

Immediate consequence: Every subsequent plan, apply, or destroy is blocked. Your CI/CD pipeline is dead. If this is a production environment, no infrastructure changes can be safely deployed until the lock is cleared. The DynamoDB ConditionalCheckFailedException means Terraform's PutItem with a condition expression (attribute_not_exists(LockID)) failed — a lock record already exists for this state path.
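
To make the conditional write concrete, here is a minimal in-memory sketch of the semantics described above. `LockTable` and `ConditionalCheckFailed` are hypothetical stand-ins invented for illustration, not AWS or Terraform APIs; the point is only that a put guarded by "no item exists for this key" fails for every caller until the item is deleted.

```python
class ConditionalCheckFailed(Exception):
    """Stands in for DynamoDB's ConditionalCheckFailedException."""


class LockTable:
    """Toy in-memory table keyed by LockID (hypothetical, not an AWS API)."""

    def __init__(self):
        self._items = {}

    def put_if_absent(self, lock_id, info):
        # Mirrors PutItem guarded by attribute_not_exists(LockID):
        # the write succeeds only when no item exists for this key.
        if lock_id in self._items:
            raise ConditionalCheckFailed(f"lock already held: {self._items[lock_id]}")
        self._items[lock_id] = info

    def delete(self, lock_id):
        # Releasing the lock removes the item; a crashed run never gets here.
        self._items.pop(lock_id, None)


table = LockTable()
path = "my-tf-state-bucket/prod/terraform.tfstate"
table.put_if_absent(path, {"Who": "ci-runner@gitlab-runner-prod"})

try:
    table.put_if_absent(path, {"Who": "second-runner"})
    second_acquired = True
except ConditionalCheckFailed:
    second_acquired = False  # blocked until the first item is deleted
```

Once `table.delete(path)` runs, the next `put_if_absent` succeeds again, which is the effect an unlock has on the real table.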

The Attack Vector / Blast Radius

This is not just an inconvenience — understand the full failure surface:

Scenario 1 — Stale lock from a crashed CI runner (most common): A GitLab/GitHub Actions runner was killed mid-apply (OOM kill, spot instance termination, pipeline timeout). Terraform never ran its deferred unlock. The DynamoDB item persists indefinitely. Every subsequent pipeline run hits ConditionalCheckFailedException.

Scenario 2 — Backend path drift: Someone changed the key parameter in the S3 backend config (e.g., renamed workspace path or environment prefix). Terraform now derives a different LockID composite key (<bucket>/<key>), while the old lock item in DynamoDB still sits under the previous path. Because the table is keyed by that path, a lock taken under the old key is invisible to unlock attempts made with the new configuration, and the lock Terraform looks up no longer matches the item that is actually in the table.

Scenario 3 — Concurrent apply from two runners: Two CI pipelines triggered simultaneously (e.g., a merge and a manual run). One acquired the lock legitimately. The second is now blocked. This is the correct behavior — but if the first runner is stuck or dead, you still end up with an orphaned lock.
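
The path drift in Scenario 2 can be illustrated with a small sketch. It assumes the `<bucket>/<key>` composite format described above for the default workspace; `lock_id_for` is a hypothetical helper, not a Terraform function.

```python
def lock_id_for(bucket: str, key: str) -> str:
    """Build the LockID as described in this article: <bucket>/<key>."""
    return f"{bucket}/{key}"


# Same bucket, but the backend key was renamed between runs.
old_id = lock_id_for("my-tf-state-bucket", "environments/prod/terraform.tfstate")
new_id = lock_id_for("my-tf-state-bucket", "prod/terraform.tfstate")

# Different partition keys: an item stored under old_id is never found
# by a lookup for new_id.
drifted = old_id != new_id
```

When debugging, compute both values and query the table for each, so that a lock left under the old path is not overlooked.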

Blast radius if you use -lock=false carelessly:

Two concurrent apply operations write to the same .tfstate simultaneously → state file corruption.
Corrupted state = Terraform loses track of real infrastructure → duplicate resources, failed destroys, billing blowout.
In a worst case with remote state and no versioning on the S3 bucket, the corrupted state is unrecoverable without manual reconciliation.

DynamoDB table structure — what you're actually fighting:

The lock item stored in DynamoDB looks like this:

{
  "LockID": { "S": "my-tf-state-bucket/prod/terraform.tfstate" },
  "Info": {
    "S": "{\"ID\":\"a3f1c2d4-7e89-4b12-a1d0-5f6e7c8b9012\",\"Operation\":\"OperationTypePlan\",\"Who\":\"ci-runner@gitlab-runner-prod\",\"Version\":\"1.6.3\",\"Created\":\"2024-11-14T03:22:11.408742Z\"}"
  }
}

The LockID attribute is the partition key of the DynamoDB table. Terraform uses a conditional write (attribute_not_exists(LockID)) to acquire it atomically. A mismatch or stale entry breaks this atomic check.
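
Note that Info is a JSON document stored as a string, so it needs a second decode after the DynamoDB item itself is read. The sketch below shows that step on the sample item above; `decode_lock_item` is a hypothetical helper and the field names are taken from that sample.

```python
import json


def decode_lock_item(item: dict) -> dict:
    """Flatten a DynamoDB lock item into the fields shown in the error output."""
    info = json.loads(item["Info"]["S"])  # Info is JSON encoded inside a string
    return {
        "path": item["LockID"]["S"],
        "id": info["ID"],
        "who": info["Who"],
        "operation": info["Operation"],
        "created": info["Created"],
    }


sample_item = {
    "LockID": {"S": "my-tf-state-bucket/prod/terraform.tfstate"},
    "Info": {
        "S": json.dumps({
            "ID": "a3f1c2d4-7e89-4b12-a1d0-5f6e7c8b9012",
            "Operation": "OperationTypePlan",
            "Who": "ci-runner@gitlab-runner-prod",
            "Version": "1.6.3",
            "Created": "2024-11-14T03:22:11.408742Z",
        })
    },
}

lock = decode_lock_item(sample_item)
```

The `id` value is what `terraform force-unlock` expects, while `path` is the key used for direct DynamoDB operations.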

How to Fix It (The Solution)

Step 1 — Confirm the lock is truly orphaned

Before you touch anything, verify no active Terraform process legitimately holds this lock:

# Inspect the current lock item directly
aws dynamodb get-item \
  --table-name terraform-state-locks \
  --key '{"LockID": {"S": "my-tf-state-bucket/prod/terraform.tfstate"}}' \
  --region us-east-1

Check the Created timestamp in the Info field. If it's older than your longest possible apply run time and no pipeline is currently active — it's stale. Kill it.
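
The staleness rule above can be written down as a small check. This is a sketch under assumptions: `parse_created` and `is_stale` are hypothetical helpers, the Created value is assumed to be UTC with a trailing Z as in the sample item, and `pipeline_active` is something you determine yourself from your CI system.

```python
from datetime import datetime, timedelta, timezone


def parse_created(value: str) -> datetime:
    # Drop fractional seconds and the trailing Z, then treat the value as UTC.
    base = value.rstrip("Z").split(".")[0]
    return datetime.strptime(base, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)


def is_stale(created: str, now: datetime, max_apply: timedelta, pipeline_active: bool) -> bool:
    """A lock is stale only if it is older than the longest apply AND nothing is running."""
    if pipeline_active:
        return False
    return now - parse_created(created) > max_apply


checked_at = datetime(2024, 11, 14, 6, 0, 0, tzinfo=timezone.utc)
stale = is_stale("2024-11-14T03:22:11.408742Z", checked_at, timedelta(hours=2), False)
```

Both conditions must hold; an old timestamp alone is not enough while a pipeline is still running.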

Basic Fix — `terraform force-unlock`

Use the Lock ID from the error output (the UUID, not the path):

terraform force-unlock a3f1c2d4-7e89-4b12-a1d0-5f6e7c8b9012

Terraform will prompt for confirmation. This calls DynamoDB DeleteItem on the lock record. Only run this if you are 100% certain no active process owns the lock.

Nuclear Option — Direct DynamoDB DeleteItem

Use this when terraform force-unlock itself fails (e.g., backend config is broken or the workspace is inaccessible):

aws dynamodb delete-item \
  --table-name terraform-state-locks \
  --key '{"LockID": {"S": "my-tf-state-bucket/prod/terraform.tfstate"}}' \
  --region us-east-1
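
The delete above is unconditional: it removes whatever lock sits under that key, including one acquired by a new run after you inspected the table. One possible safeguard, sketched here as an assumption rather than anything Terraform does itself, is to delete only if the Info attribute still equals the value you inspected. `guarded_delete_args` is a hypothetical helper that builds arguments for the low-level boto3 `delete_item` call.

```python
def guarded_delete_args(table: str, lock_path: str, expected_info: str) -> dict:
    """Arguments for DynamoDB DeleteItem that only match the lock you inspected."""
    return {
        "TableName": table,
        "Key": {"LockID": {"S": lock_path}},
        # Fail with ConditionalCheckFailedException if the lock changed meanwhile.
        "ConditionExpression": "#info = :expected",
        "ExpressionAttributeNames": {"#info": "Info"},
        "ExpressionAttributeValues": {":expected": {"S": expected_info}},
    }


args = guarded_delete_args(
    "terraform-state-locks",
    "my-tf-state-bucket/prod/terraform.tfstate",
    '{"ID":"a3f1c2d4-7e89-4b12-a1d0-5f6e7c8b9012"}',  # paste the full Info string here
)
# With boto3 installed: boto3.client("dynamodb").delete_item(**args)
```

If the condition fails, someone else now holds the lock and the delete should not be retried blindly.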

Enterprise Best Practice — Fix the Root Cause (Backend Config Drift)

If the mismatch is caused by a changed key path in your backend config, the fix is to align the backend configuration. Here's the corrected diff:

 terraform {
   backend "s3" {
     bucket         = "my-tf-state-bucket"
-    key            = "environments/prod/terraform.tfstate"
+    key            = "prod/terraform.tfstate"
     region         = "us-east-1"
     dynamodb_table = "terraform-state-locks"
     encrypt        = true
+    kms_key_id     = "arn:aws:kms:us-east-1:123456789012:key/your-key-id"
   }
 }

After correcting the key path, run terraform init -reconfigure to force backend re-initialization:

terraform init -reconfigure

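This will NOT destroy state. The -reconfigure flag discards the backend settings cached in .terraform and re-initializes against the configured key; it does not copy state between keys (use terraform init -migrate-state if the state itself has to move). Then verify the old stale lock is gone and run your plan.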

Enterprise Best Practice — DynamoDB Table with TTL-Based Auto-Expiry

Prevent permanent orphaned locks by adding a TTL attribute to your DynamoDB lock table. Terraform does not natively set TTL, but you can implement a Lambda or add it via IaC:

 resource "aws_dynamodb_table" "terraform_locks" {
   name         = "terraform-state-locks"
   billing_mode = "PAY_PER_REQUEST"
   hash_key     = "LockID"

   attribute {
     name = "LockID"
     type = "S"
   }

+  # Do not declare ExpiresAt in an attribute block: only key and index
+  # attributes may be declared, and the provider rejects unindexed ones.
+  ttl {
+    attribute_name = "ExpiresAt"
+    enabled        = true
+  }
+
   tags = {
     Name        = "terraform-state-locks"
     Environment = "prod"
   }
 }

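Note: You'll need a sidecar Lambda or CI step that writes ExpiresAt = now() + 3600 (a Unix epoch timestamp in seconds, stored as a Number) when creating lock items, since Terraform's S3 backend does not write TTL natively. DynamoDB removes expired items asynchronously, so deletion can lag behind the expiry time; treat TTL as a backstop, not a precise timeout. This is a custom wrapper pattern used in large-scale multi-team Terraform deployments.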
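
As a rough sketch of what that sidecar step might send, the helper below builds arguments for a DynamoDB UpdateItem call that stamps ExpiresAt on an existing lock item. `ttl_update_args` is a hypothetical name, and the `attribute_exists` condition is an added assumption so the update cannot create an item when no lock is present.

```python
import time


def ttl_update_args(table: str, lock_path: str, ttl_seconds: int, now=None) -> dict:
    """Arguments for UpdateItem that set ExpiresAt to now + ttl_seconds (epoch seconds)."""
    now = int(time.time()) if now is None else int(now)
    return {
        "TableName": table,
        "Key": {"LockID": {"S": lock_path}},
        "UpdateExpression": "SET ExpiresAt = :exp",
        # Only stamp a lock that exists; never create an item as a side effect.
        "ConditionExpression": "attribute_exists(LockID)",
        # DynamoDB Number values are passed as strings in the low-level API.
        "ExpressionAttributeValues": {":exp": {"N": str(now + ttl_seconds)}},
    }


args = ttl_update_args(
    "terraform-state-locks",
    "my-tf-state-bucket/prod/terraform.tfstate",
    ttl_seconds=3600,
    now=1_700_000_000,
)
# With boto3 installed: boto3.client("dynamodb").update_item(**args)
```

Pick `ttl_seconds` comfortably above your longest apply, since an expired item can be removed while a slow run still believes it holds the lock.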

Prevention in CI/CD

1. Always wrap Terraform in a lock-aware CI pattern:

# .gitlab-ci.yml — safe Terraform apply pattern
terraform-apply:
  script:
    - terraform init
    - terraform plan -out=tfplan
    - terraform apply tfplan
  after_script:
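    # Terraform does not write the lock ID to disk, so read it from the lock item.
    # Assumes the AWS CLI and jq are available in the job image.
    # WARNING: a job that failed because another run holds the lock would release
    # that run's lock here; compare the Who field before relying on this.
    - |
      LOCK_ID=$(aws dynamodb get-item \
        --table-name terraform-state-locks \
        --key '{"LockID": {"S": "my-tf-state-bucket/prod/terraform.tfstate"}}' \
        --query 'Item.Info.S' --output text 2>/dev/null | jq -r '.ID // empty' 2>/dev/null)
      if [ "$CI_JOB_STATUS" != "success" ] && [ -n "$LOCK_ID" ]; then
        terraform force-unlock -force "$LOCK_ID" || true
      fi
  interruptible: false  # CRITICAL: stops newer pipelines from auto-cancelling this job mid-apply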

2. Enable S3 bucket versioning — non-negotiable:

 resource "aws_s3_bucket_versioning" "tf_state" {
   bucket = aws_s3_bucket.tf_state.id
   versioning_configuration {
-    status = "Disabled"
+    status = "Enabled"
   }
 }

Without versioning, a corrupted state write is permanent. With versioning, you can roll back to the last known-good state.
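
Choosing the version to roll back to can be sketched as follows. The input is assumed to have the shape of the `Versions` list returned by boto3's `list_object_versions` (each entry carrying `VersionId` and a timezone-aware `LastModified`); `last_good_version` and the sample version IDs are hypothetical.

```python
from datetime import datetime, timezone


def last_good_version(versions: list, corrupted_at: datetime):
    """Return the VersionId of the newest version written before the bad write."""
    candidates = [v for v in versions if v["LastModified"] < corrupted_at]
    if not candidates:
        return None
    return max(candidates, key=lambda v: v["LastModified"])["VersionId"]


versions = [
    {"VersionId": "v3", "LastModified": datetime(2024, 11, 14, 3, 30, tzinfo=timezone.utc)},
    {"VersionId": "v2", "LastModified": datetime(2024, 11, 13, 18, 0, tzinfo=timezone.utc)},
    {"VersionId": "v1", "LastModified": datetime(2024, 11, 12, 9, 0, tzinfo=timezone.utc)},
]
restore_id = last_good_version(versions, datetime(2024, 11, 14, 3, 22, tzinfo=timezone.utc))
```

Restoring is then a matter of copying that version back over the current object, after confirming that nothing holds the state lock.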

3. Checkov policy to enforce DynamoDB locking is configured:

checkov -d . --check CKV_TF_3  # Checks: ensure S3 backend uses DynamoDB for locking

4. OPA/Sentinel policy — enforce lock table is always set:

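# opa/terraform_backend_lock.rego
# Assumes `input` exposes the backend block at input.configuration.backend_config;
# adapt the path to however your pipeline extracts backend settings.
# Newer OPA releases expect the `deny contains msg if { ... }` form of this rule.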
package terraform.backend

deny[msg] {
  backend := input.configuration.backend_config
  backend.type == "s3"
  not backend.config.dynamodb_table
  msg := "S3 backend MUST specify a DynamoDB lock table. No exceptions."
}

5. Monitor for stale locks with a CloudWatch alarm:

Set up a scheduled Lambda that scans the DynamoDB lock table for items older than 2 hours and fires a CloudWatch alarm or PagerDuty alert. A lock older than your maximum apply window is definitionally orphaned and should auto-page on-call.
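
A minimal sketch of the scan such a Lambda could run is shown below. It assumes a client with the same `scan` interface as boto3's low-level DynamoDB client, a Created value in UTC with a trailing Z, and that items without an Info attribute (such as the state digest entries the backend also stores in this table) are not locks. `find_stale_locks` is a hypothetical helper, and the alerting step is left out.

```python
import json
from datetime import datetime, timedelta, timezone


def find_stale_locks(client, table: str, now: datetime, max_age=timedelta(hours=2)):
    """Return (LockID, lock UUID) pairs for locks older than max_age."""
    stale, kwargs = [], {"TableName": table}
    while True:
        page = client.scan(**kwargs)
        for item in page.get("Items", []):
            if "Info" not in item:
                continue  # not a lock item
            info = json.loads(item["Info"]["S"])
            created = datetime.strptime(
                info["Created"].rstrip("Z").split(".")[0], "%Y-%m-%dT%H:%M:%S"
            ).replace(tzinfo=timezone.utc)
            if now - created > max_age:
                stale.append((item["LockID"]["S"], info["ID"]))
        # Scan is paginated: keep going until no continuation key is returned.
        if "LastEvaluatedKey" not in page:
            return stale
        kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]


# In a Lambda handler (assuming boto3 is available):
#   stale = find_stale_locks(boto3.client("dynamodb"), "terraform-state-locks",
#                            datetime.now(timezone.utc))
```

The returned pairs carry both the table key and the UUID, so the alert can include a ready-to-run `terraform force-unlock` command for on-call to review.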