Why does terraform plan show destroy even though the resource still exists in AWS?

Terraform is comparing your `.tf` configuration against the state file, not directly against AWS. If the resource address was renamed, moved to a different module, or the state file is pointing at the wrong workspace/backend, Terraform sees 'resource defined in state, not in config = destroy'. The resource being live in AWS is irrelevant until you reconcile state with `terraform import` or `terraform state mv`.
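To make the state-versus-config comparison concrete, here is a minimal sketch that lists addresses present in state but absent from config. The function names `config_addresses` and `orphaned_addresses` are illustrative, not Terraform commands, and the `sed` pattern is an assumption that only covers top-level `resource` blocks in the root module (no `count`, `for_each`, or child modules).

```shell
# Extract "type.name" addresses from HCL read on stdin.
# Rough heuristic: matches lines of the form: resource "type" "name" {
config_addresses() {
  sed -n 's/^resource "\([^"]*\)" "\([^"]*\)".*/\1.\2/p'
}

# Print addresses that are in state but not in config.
# $1: file with one address per line, e.g. from `terraform state list`
# $2: file with one address per line declared in config
orphaned_addresses() {
  # -F fixed strings, -x whole-line match, -v invert, -f patterns from file
  grep -Fxv -f "$2" "$1"
}

# Usage (assumes an initialized working directory):
#   terraform state list > state.txt
#   cat *.tf | config_addresses > config.txt
#   orphaned_addresses state.txt config.txt
```

Every address this prints is one Terraform would plan to destroy, regardless of whether the resource is still live in AWS.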

Is it safe to use `terraform apply -refresh-only` to fix state drift?

`terraform apply -refresh-only` updates the state file to match real infrastructure — it does NOT change actual resources. It is safe for fixing attribute drift (e.g., a tag added manually). However, it will NOT fix a missing resource address (rename/refactor case) — for that you need `terraform state mv` or a `moved {}` block. Always run `terraform plan` after a refresh-only apply to confirm zero destroys before proceeding.

How do I prevent Terraform from ever destroying a production database?

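Add `lifecycle { prevent_destroy = true }` to the resource block. This causes `terraform plan` to hard-error if any configuration change would result in that resource being destroyed. Note that the guard only works while the resource block remains in configuration: deleting the block removes `prevent_destroy` along with it, and the destroy is then planned without an error. For that reason, pair it with an OPA/Sentinel policy in your CI pipeline that blocks any plan JSON containing `"delete"` actions on protected resource types, so even a missing or misconfigured lifecycle block can't slip through.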

How to Fix Terraform Plan Showing Destroy After State Drift (Without Nuking Production)

Threat/Impact Level: CRITICAL | Downtime Risk: HIGH | Time to Fix: 15–45 mins depending on resource type

TL;DR

What broke: Terraform's state file diverged from real infrastructure (manual change, out-of-band automation, or a botched state mv), so terraform plan now shows a -/+ destroy-and-recreate or a flat destroy for a live resource.
How to fix it: Reconcile state via terraform import, terraform state mv, or terraform apply -refresh-only (the replacement for the deprecated terraform refresh), then patch the config to match reality before running apply.

The Incident (What Does the Error Mean?)

You run terraform plan and see this:

# aws_db_instance.primary will be destroyed
# (because aws_db_instance.primary is not in configuration)
  - resource "aws_db_instance" "primary" {
      - identifier        = "prod-postgres-01"
      - instance_class    = "db.r6g.2xlarge"
      - allocated_storage = 500
      ...
    }

Plan: 0 to add, 0 to change, 1 to destroy.

Or the subtler destroy-recreate variant:

# aws_security_group.app_sg must be replaced
-/+ resource "aws_security_group" "app_sg" {
      ~ id   = "sg-0abc123" -> (known after apply)
      ~ name = "app-sg-prod" -> "app-sg-prod-v2" # forces replacement
    }

Immediate consequence: If terraform apply runs — in a pipeline, by a junior engineer, or via auto-apply in Terraform Cloud — that resource is gone or recreated. For stateful resources (RDS, ElastiCache, EBS volumes, security groups with live ingress rules), this is a production outage or a data-loss event.
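Before anyone runs apply, it helps to pull the affected addresses out of the plan text. The following is a minimal sketch, assuming the plan was produced with `-no-color` and uses the header wording shown above; `risky_changes` is a hypothetical helper name. Parsing the JSON plan is more robust, so treat this as a quick triage aid only.

```shell
# Read human-readable `terraform plan -no-color` output on stdin and print
# one line per resource that would be destroyed or replaced,
# formatted as "<action> <address>".
risky_changes() {
  sed -n \
    -e 's/^[ \t]*# \(.*\) will be destroyed$/destroy \1/p' \
    -e 's/^[ \t]*# \(.*\) must be replaced$/replace \1/p'
}

# Usage:
#   terraform plan -no-color | risky_changes
```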

The Attack Vector / Blast Radius

State drift is deceptively dangerous because the destroy is legitimate from Terraform's perspective — it is doing exactly what the math says. The blast radius depends on resource type:

Resource	Destroy Consequence
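`aws_db_instance`	Data loss if `skip_final_snapshot = true` or no usable backup exists; RTO in hours
`aws_security_group`	AWS rejects the delete with a DependencyViolation while ENIs still reference the group, so the apply stalls; a replacement changes the group ID, which can interrupt traffic for anything still referencing the old one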
`aws_iam_role`	Attached workloads lose permissions; cascading auth failures
`aws_s3_bucket`	Bucket deletion may purge objects if `force_destroy = true`
`aws_eks_node_group`	Node pool drained; workloads evicted

Root causes of drift — know which one you're dealing with:

Manual console/CLI change — someone added a tag, resized an instance, or renamed a resource outside Terraform.
State file desync — two engineers ran terraform apply against different state backends, or a remote state lock failed silently.
Resource rename/refactor in .tf — a moved {} block was missing after renaming a resource address.
Workspace or backend misconfiguration — plan is running against the wrong workspace, pointing at a stale state file.
Provider upgrade — new provider version reads an attribute differently, causing a computed diff that forces replacement.

How to Fix It (The Solution)

Step 0: Halt the Pipeline Immediately

If this is in CI/CD, kill the apply job now. Set a manual approval gate. Do not let auto-apply proceed.

Basic Fix — Re-import the Drifted Resource

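If the resource exists in AWS and is still declared in your config, but its entry is missing from state (for example after an accidental terraform state rm, or a plan run against the wrong backend), import it back. A missing state entry shows up in the plan as a create rather than a destroy; importing stops Terraform from trying to create a duplicate: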

# Find the real resource ID from AWS CLI
aws rds describe-db-instances --query 'DBInstances[*].DBInstanceIdentifier'

# Re-import it into the correct state address
terraform import aws_db_instance.primary prod-postgres-01

# Re-run plan — verify zero destroys before apply
terraform plan -out=tfplan

If the resource was renamed in .tf without a moved block:

terraform state mv aws_db_instance.old_name aws_db_instance.primary
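Because `terraform state mv` rewrites state directly, a cautious wrapper is worth considering. This is a minimal sketch, assuming an initialized working directory; `safe_state_mv` and the backup file name are illustrative choices. It snapshots state with `terraform state pull`, previews the move with `-dry-run`, and only then performs it.

```shell
# Back up state, preview the move, then apply it.
# $1: old resource address, $2: new resource address
safe_state_mv() {
  old_addr="$1"
  new_addr="$2"
  backup="state-backup-$(date +%Y%m%d%H%M%S).tfstate"

  # Snapshot the current state so the move can be rolled back by hand.
  terraform state pull > "$backup" || return 1

  # Preview: reports what would move without changing anything.
  terraform state mv -dry-run "$old_addr" "$new_addr" || return 1

  # Perform the actual move.
  terraform state mv "$old_addr" "$new_addr" || return 1

  echo "moved $old_addr -> $new_addr (backup: $backup)"
}

# Usage:
#   safe_state_mv aws_db_instance.old_name aws_db_instance.primary
#   terraform plan   # confirm zero destroys
```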

Enterprise Best Practice — Prevent Drift with `moved` Blocks + Lifecycle Guards

 resource "aws_db_instance" "primary" {
   identifier        = "prod-postgres-01"
   instance_class    = "db.r6g.2xlarge"
   allocated_storage = 500
   engine            = "postgres"
+
+  lifecycle {
+    prevent_destroy = true
+    ignore_changes  = [engine_version, maintenance_window]
+  }
 }

+# If you renamed from aws_db_instance.rds_main — always add this
+moved {
+  from = aws_db_instance.rds_main
+  to   = aws_db_instance.primary
+}

For security group replacement drift (name change forcing destroy):

 resource "aws_security_group" "app_sg" {
-  name        = "app-sg-prod-v2"
+  name        = "app-sg-prod"        # match the real existing name exactly
   description = "Application tier SG"
   vpc_id      = var.vpc_id
+
+  lifecycle {
+    create_before_destroy = true
+    prevent_destroy       = true
+  }
 }

For workspace/backend drift — always explicitly target:

# Confirm you are in the correct workspace before ANY plan
terraform workspace show
terraform workspace select prod

# Refresh state against real infra before planning
terraform apply -refresh-only
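The workspace check above is easy to forget, so it can be turned into a guard that fails fast. A minimal sketch follows; `require_workspace` is a hypothetical helper name, and `prod` is simply the workspace name used in this article.

```shell
# Abort unless the active Terraform workspace matches the expected one.
# $1: expected workspace name
require_workspace() {
  expected="$1"
  current="$(terraform workspace show)" || return 1
  if [ "$current" != "$expected" ]; then
    echo "workspace is '$current', expected '$expected'; refusing to continue" >&2
    return 1
  fi
}

# Usage:
#   require_workspace prod && terraform plan -out=tfplan
```

Chaining with `&&` means a plan is never produced from the wrong workspace, which removes one of the root causes listed earlier.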


Prevention in CI/CD

1. Enforce `prevent_destroy` on Stateful Resources via OPA/Sentinel

# OPA policy: deny any plan containing destroys on protected resource types.
# Uses the rego.v1 keywords (`contains`, `if`) that OPA 1.0 and later require by default.
package terraform.destroy_guard

import rego.v1

protected_types := {"aws_db_instance", "aws_s3_bucket", "aws_elasticache_cluster", "aws_eks_cluster"}

deny contains msg if {
  rc := input.resource_changes[_]
  rc.change.actions[_] == "delete"
  protected_types[rc.type]
  msg := sprintf("BLOCKED: destroy of protected resource %s (%s)", [rc.address, rc.type])
}

2. Run `terraform plan -detailed-exitcode` in CI and Gate on Destroys

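# GitHub Actions example
- name: Terraform Plan
  run: |
    # -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes present.
    # Capture the code so the default `bash -e` shell does not abort on 2.
    PLAN_EXIT=0
    terraform plan -out=tfplan -detailed-exitcode || PLAN_EXIT=$?
    if [ "$PLAN_EXIT" -eq 1 ]; then
      echo "::error::terraform plan failed."
      exit 1
    fi
    # Count resource changes whose action list includes "delete"
    # (this covers both plain destroys and destroy-and-recreate).
    DESTROYS=$(terraform show -json tfplan | jq '[
      .resource_changes[]? | select(any(.change.actions[]; . == "delete"))
    ] | length')
    if [ "$DESTROYS" -gt 0 ]; then
      echo "::error::Plan contains $DESTROYS destroy(s). Manual approval required."
      exit 1
    fi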

3. Drift Detection on a Schedule

# Run refresh-only plan daily; alert if drift detected
terraform plan -refresh-only -detailed-exitcode
# Exit code 2 = diff detected. Wire this to PagerDuty or Slack.
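To wire the exit code into alerting, the scheduled job needs to distinguish drift from a failed plan. Here is a minimal sketch under that assumption; `check_drift` and the `DRIFT_LOG` variable are illustrative names, and the alert call itself is left to whatever notifier you use.

```shell
# Run a refresh-only plan and translate its exit code.
# With -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes detected.
check_drift() {
  log="${DRIFT_LOG:-drift-plan.log}"
  code=0
  terraform plan -refresh-only -detailed-exitcode -no-color -input=false \
    > "$log" 2>&1 || code=$?
  case "$code" in
    0) echo "no drift" ;;
    2) echo "drift detected, see $log"; return 2 ;;
    *) echo "plan failed with exit code $code, see $log"; return 1 ;;
  esac
}

# Usage in a scheduled job:
#   check_drift || echo "send alert here"
```

Returning 2 for drift and 1 for failure keeps the two cases separable, so a broken backend or expired credentials does not page as drift.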

4. Checkov Rule for Missing `prevent_destroy`

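# Note: the built-in CKV_TF_1 checks that module sources are pinned to a commit hash, not prevent_destroy.
# Enforcing prevent_destroy needs a custom policy; the directory and check ID below are hypothetical placeholders.
checkov -d . --external-checks-dir ./checkov_policies --check CKV2_CUSTOM_1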
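As a lightweight complement to a policy scanner, a plain shell scan can flag high-risk resources that lack the guard. This is a rough sketch, not a Checkov feature: `missing_prevent_destroy` is a hypothetical helper, and its brace counting assumes braces never appear inside strings or comments.

```shell
# Print addresses of resources of the given type whose block does not
# contain `prevent_destroy = true`. Reads HCL on stdin.
# $1: resource type, e.g. aws_db_instance
missing_prevent_destroy() {
  awk -v rtype="$1" '
    # A matching top-level resource header starts a tracked block.
    depth == 0 && $1 == "resource" && $2 == ("\"" rtype "\"") {
      name = $3; gsub(/"/, "", name); inblock = 1; guarded = 0
    }
    inblock && /prevent_destroy[ \t]*=[ \t]*true/ { guarded = 1 }
    {
      # Track nesting; gsub returns the number of braces on the line.
      depth += gsub(/[{]/, "{") - gsub(/[}]/, "}")
      if (inblock && depth == 0) {
        if (!guarded) print rtype "." name
        inblock = 0
      }
    }
  '
}

# Usage:
#   cat *.tf | missing_prevent_destroy aws_db_instance
```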

5. Lock State Aggressively

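Use state locking for S3 backends (a DynamoDB lock table, or the backend's native lockfile option on recent Terraform versions); never skip it.
In Terraform Cloud, let runs queue behind the workspace lock; in CLI-driven pipelines, pass -lock-timeout to plan and apply so concurrent jobs wait for the lock instead of failing.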
Never manually edit .tfstate — use terraform state subcommands exclusively.