Initializing Enclave...

How to Fix pg_restore: error: input file does not appear to be a valid archive

Threat/Impact Level: HIGH | Exploitability/Downtime Risk: HIGH | Time to Fix: 10 mins

TL;DR

  • What broke: pg_restore received a plain-text SQL file or a corrupt/truncated binary dump — neither is a valid custom, directory, or tar archive.
  • How to fix it: Re-dump with pg_dump -Fc (custom format) and match the format flag on pg_restore, or pipe the plain SQL dump directly through psql instead.
  • Use our Client-Side Sandbox below to paste your pg_dump/pg_restore commands and auto-refactor them to the correct format flags.

The Incident (What Does the Error Mean?)

pg_restore: error: input file does not appear to be a valid archive

pg_restore reads a magic byte header at the start of the file to identify the archive format (PGDMP for custom/directory/tar). If that header is missing or malformed, it aborts immediately — zero data is restored. Your target database is left empty or in a partial state if you were mid-migration.

Three dominant causes in production:

  1. Plain SQL dump passed to pg_restore. pg_dump without a -F flag defaults to plain text SQL. pg_restore cannot process this — use psql instead.
  2. Truncated dump file. The dump was cut short: disk full during backup, S3 multipart upload failure, scp interrupted, or a pipeline that silently dropped bytes.
  3. Format mismatch. Dump was created with -Ft (tar) but pg_restore is being handed a .dump extension file that's actually custom format, or vice versa.

The Attack Vector / Blast Radius

This is a data availability failure, not a security exploit — but the blast radius in a production migration or DR scenario is severe:

  • Zero-downtime migration window blown. If you're mid-cutover and pg_restore exits with this error, your app is pointing at an empty database.
  • Silent corruption risk. A partial restore (if it somehow progressed before failing) leaves foreign key constraints violated, sequences unsynced, and indexes missing. Queries return wrong results without obvious errors.
  • Backup pipeline trust destroyed. If your nightly pg_dump is silently producing plain-text files but your runbook calls pg_restore, every single backup in that retention window is unrestorable. You discover this during an actual outage.
  • Compressed file misidentification. A .dump file compressed with gzip externally (not via pg_dump's internal compression) will also fail — pg_restore sees gzip headers, not PGDMP.

How to Fix It

Diagnose the File First

Before changing anything, identify what you actually have:

# Check magic bytes
file /var/backups/mydb.dump

# If it says "ASCII text" — it's a plain SQL dump. Use psql, not pg_restore.
# If it says "PostgreSQL custom database dump" — format is correct, file may be corrupt.
# Check if file is complete (compare to expected size or check exit code of original dump)
md5sum /var/backups/mydb.dump

Fix 1: Plain SQL Dump — Use psql Instead

If file reports ASCII text, your dump was created without a format flag.

- pg_restore -d mydb /var/backups/mydb.sql
+ psql -d mydb -f /var/backups/mydb.sql

Fix 2: Re-Dump with Explicit Custom Format

The permanent fix is enforcing -Fc at dump time so pg_restore always works.

- pg_dump mydb > /var/backups/mydb.sql
+ pg_dump -Fc -Z 9 -f /var/backups/mydb.dump mydb

Then restore with:

- pg_restore -d mydb /var/backups/mydb.sql
+ pg_restore -Fc -d mydb -j 4 /var/backups/mydb.dump

-j 4 enables parallel restore across 4 workers — critical for large databases.


Fix 3: Verify Dump Integrity Before Restore (Enterprise Best Practice)

Never trust a backup file without verification. Add this to your backup pipeline:

- pg_dump -Fc -f /var/backups/mydb.dump mydb
- pg_restore -d mydb /var/backups/mydb.dump

+ pg_dump -Fc -Z 9 -f /var/backups/mydb.dump mydb
+ # Verify the archive is readable before any restore attempt
+ pg_restore --list /var/backups/mydb.dump > /dev/null && echo "Archive OK" || echo "CORRUPT — ABORT"
+ pg_restore -Fc -d mydb -j 4 --no-owner --role=myapp_user /var/backups/mydb.dump

pg_restore --list parses the table of contents without restoring data. If this fails, the file is corrupt. Catch this in your backup job, not during the DR drill.


Fix 4: Corrupt/Truncated File — Recover What You Can

If the dump is genuinely corrupt and you have no clean copy:

# Attempt partial restore — pg_restore will skip corrupt objects
pg_restore -Fc --single-transaction -d mydb /var/backups/mydb.dump 2>restore_errors.log

# Review what failed
grep 'error' restore_errors.log

This is a last resort. Fix your backup pipeline immediately after.


💡 Tired of pasting proprietary configs into ChatGPT? Generic AI tools log your company's ARNs, DB strings, and private keys. StackEngine is a zero-backend, pure Client-Side WASM utility. Drop your failing config into the sandbox above. We redact your secrets locally in the browser and auto-generate the refactored code using your own API key.


Prevention in CI/CD

1. Enforce Format in Backup Scripts — Fail Loudly

#!/bin/bash
set -euo pipefail

DUMP_FILE="/var/backups/mydb_$(date +%Y%m%d).dump"

pg_dump -Fc -Z 9 -f "$DUMP_FILE" mydb

# Hard gate: verify before declaring backup success
if ! pg_restore --list "$DUMP_FILE" > /dev/null 2>&1; then
  echo "FATAL: Backup verification failed for $DUMP_FILE" >&2
  exit 1
fi

echo "Backup verified: $DUMP_FILE"

2. Add a Restore Smoke Test in CI

In your GitLab CI / GitHub Actions pipeline, after any schema migration:

# .github/workflows/db-backup-verify.yml
jobs:
  verify-backup:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_PASSWORD: test
    steps:
      - name: Dump and verify
        run: |
          pg_dump -Fc -f test.dump $DATABASE_URL
          pg_restore --list test.dump > /dev/null
          echo "Archive integrity: PASSED"

3. Checkov / Custom Policy Gate

If your backups are managed via Terraform (RDS snapshots, Cloud SQL exports), enforce format validation with a custom Checkov check or OPA policy that asserts export format is never plain and always includes a post-export integrity check Lambda/Cloud Function.

4. Monitor Dump File Size

A suddenly smaller-than-expected dump file is your earliest warning of truncation:

# Alert if dump is >20% smaller than the 7-day average
DUMP_SIZE=$(stat -c%s "$DUMP_FILE")
AVG_SIZE=$(cat /var/backups/.avg_size)
if [ "$DUMP_SIZE" -lt $((AVG_SIZE * 80 / 100)) ]; then
  echo "WARNING: Dump size anomaly detected" | pagerduty-alert
fi

Related Diagnostics

"Part of the Syntax Utility Matrix."

View all 153 Syntax Tools →