Fixing PostgreSQL 'out of shared memory' Error: max_locks_per_transaction Exceeded
Threat/Impact Level: HIGH | Downtime Risk: HIGH | Time to Fix: 5–15 mins (requires PostgreSQL restart or reload)
TL;DR
- What broke: PostgreSQL's shared lock table is full.
max_locks_per_transaction * (max_connections + max_prepared_transactions)defines the total lock table slots. You've exceeded that ceiling — every new transaction requiring a lock is now dead on arrival. - How to fix it: Increase
max_locks_per_transactioninpostgresql.confand reload/restart. The default of64is catastrophically undersized for workloads using partitioned tables, pg_dump, or large batch operations. - Shortcut: Use our Client-Side Sandbox above to auto-refactor your
postgresql.conf— paste your config and get a corrected file back instantly.
The Incident (What Does the Error Mean?)
Raw error output:
ERROR: out of shared memory
DETAIL: max_locks_per_transaction 64 exceeded
PostgreSQL maintains a fixed-size, shared-memory lock table allocated at startup. The total number of lock objects that can exist simultaneously is hard-capped at:
max_locks_per_transaction × (max_connections + max_prepared_transactions)
With defaults (max_locks_per_transaction = 64, max_connections = 100), that's 6,400 total lock slots. Sounds like a lot. It isn't. A single pg_dump on a schema with 200 tables burns through 200+ lock slots in one transaction. A partitioned table with 100 child partitions? Every query touching it locks every partition. You hit the wall. PostgreSQL cannot allocate a new lock object. The transaction dies. If this hits your primary write path, your application is down.
The Attack Vector / Blast Radius
This is not a subtle degradation. It is a hard failure. The blast radius depends on which transaction type hits the ceiling first:
Scenario 1 — Partitioned Tables (most common in modern schemas): PostgreSQL locks every partition when a query touches a partitioned table, even for a single-row lookup in some planner paths. A table with 128 partitions × 2 concurrent transactions = 256 locks. Add 10 concurrent users and you're at 2,560 locks for a single table. This is why teams migrating to declarative partitioning suddenly hit this error with no other config changes.
Scenario 2 — pg_dump / pg_restore in production:
pg_dump opens a single transaction and acquires ACCESS SHARE locks on every table in the target schema simultaneously. A schema with 300 tables will attempt to hold 300+ locks in one transaction. On a default config, this alone will trigger the error and block your backup pipeline silently — or worse, crash it mid-dump, leaving you with a corrupt backup file.
Scenario 3 — Long-running batch transactions:
Any ETL or migration script that opens a single transaction and touches hundreds of tables (e.g., TRUNCATE cascades, foreign key validation passes) will exhaust the lock table. The failure is non-deterministic — it depends on what else is running concurrently — making it intermittent and hellish to reproduce in staging.
Cascading failure path: Lock table exhaustion → new transactions fail with ERROR → application connection pool fills with failed/retrying connections → connection count spikes → max_connections exhausted → full application outage.
How to Fix It (The Solution)
Step 1: Audit Your Current Lock Pressure
Run this on your live instance before changing anything:
-- How many locks are currently held, grouped by transaction
SELECT
pid,
COUNT(*) AS lock_count,
usename,
application_name,
LEFT(query, 80) AS query_snippet
FROM pg_locks
JOIN pg_stat_activity USING (pid)
GROUP BY pid, usename, application_name, query
ORDER BY lock_count DESC
LIMIT 20;
-- Check your current effective lock table ceiling
SHOW max_locks_per_transaction;
SHOW max_connections;
SHOW max_prepared_transactions;
If any single PID is holding > 50 locks, that's your culprit workload.
Basic Fix — Increase max_locks_per_transaction
Edit postgresql.conf. The value must be set before startup — it controls shared memory allocation. A reload (pg_reload_conf()) is not sufficient; a full restart is required.
# postgresql.conf
- max_locks_per_transaction = 64
+ max_locks_per_transaction = 256
Then restart:
# Systemd
sudo systemctl restart postgresql
# pg_ctl
pg_ctl restart -D /var/lib/postgresql/data
# Docker / Kubernetes — update your env var or ConfigMap, then rolling restart
POSTGRES_MAX_LOCKS_PER_TRANSACTION=256
Sizing guidance:
| Workload Type | Recommended Value |
|---|---|
| Default OLTP, no partitioning | 64 (default, acceptable) |
| Partitioned tables (< 50 partitions) | 128 |
| Partitioned tables (50–200 partitions) | 256 |
| Heavy pg_dump, large schemas (200+ tables) | 512 |
| Extreme partitioning or multi-schema ETL | 1024 |
⚠️ Memory cost: Each increment of
max_locks_per_transactioncosts roughly(max_connections + max_prepared_transactions) × ~250 bytes. Increasing from 64 to 256 on a 100-connection instance costs ~4.8 MB of shared memory. This is negligible. Do not let this stop you.
Enterprise Best Practice — Structural Fixes (Don't Just Bump the Number)
Bumping max_locks_per_transaction is the immediate fix. It is not the root cause fix. In production, you must also address the workload pattern.
1. Isolate pg_dump to a dedicated replica:
# pg_dump command
- pg_dump -h primary-db.internal -U postgres mydb > backup.sql
+ pg_dump -h replica-db.internal -U postgres mydb > backup.sql
# --lock-wait-timeout also prevents dump from hanging on lock contention
+ pg_dump -h replica-db.internal --lock-wait-timeout=30s -U postgres mydb > backup.sql
2. For partitioned tables — use LOCK_TIMEOUT and connection-level settings:
# For batch jobs that touch many partitions, set at session level:
- BEGIN;
- -- touches 150 partitions, explodes lock table
- SELECT * FROM events WHERE created_at > NOW() - INTERVAL '1 year';
+ BEGIN;
+ -- Declare intent early, fail fast if lock ceiling is near
+ SET lock_timeout = '5s';
+ SET idle_in_transaction_session_timeout = '30s';
+ SELECT * FROM events WHERE created_at > NOW() - INTERVAL '1 year';
3. PgBouncer transaction pooling — reduces effective concurrent lock holders:
# pgbouncer.ini
- pool_mode = session
+ pool_mode = transaction
# Transaction mode means connections return to the pool between statements,
# dramatically reducing simultaneous lock holders at the PostgreSQL level.
4. Kubernetes / Helm — inject the parameter correctly:
# values.yaml (Bitnami PostgreSQL chart)
postgresql:
postgresqlConfiguration:
- max_locks_per_transaction: "64"
+ max_locks_per_transaction: "256"
+ max_prepared_transactions: "0" # Set to 0 if not using 2PC; reduces lock table size
💡 Tired of pasting proprietary configs into ChatGPT? Generic AI tools log your company's ARNs, DB strings, and private keys. StackEngine is a zero-backend, pure Client-Side WASM utility. Drop your failing config into the sandbox above. We redact your secrets locally in the browser and auto-generate the refactored code using your own API key.
Prevention in CI/CD
This error is 100% preventable with automated config linting. The default of 64 should be treated as a misconfiguration in any environment using partitioning or schemas with > 100 tables.
1. Checkov — IaC Policy for Terraform aws_db_parameter_group
# checkov custom check: check_pg_max_locks.py
from checkov.common.models.enums import CheckResult, CheckCategories
from checkov.terraform.checks.resource.base_resource_check import BaseResourceCheck
class PGMaxLocksCheck(BaseResourceCheck):
def __init__(self):
name = "Ensure max_locks_per_transaction is >= 128 for production RDS"
id = "CKV_CUSTOM_PG_001"
supported_resources = ['aws_db_parameter_group']
categories = [CheckCategories.GENERAL_SECURITY]
super().__init__(name=name, id=id, categories=categories,
supported_resources=supported_resources)
def scan_resource_conf(self, conf):
params = conf.get('parameter', [])
for p in params:
if isinstance(p, list): p = p[0]
if p.get('name') == ['max_locks_per_transaction']:
val = int(p.get('value', [64])[0])
return CheckResult.PASSED if val >= 128 else CheckResult.FAILED
return CheckResult.FAILED # not set = default 64 = fail
2. OPA / Conftest Policy
# policy/pg_locks.rego
package postgresql
deny[msg] {
input.resource_type == "aws_db_parameter_group"
param := input.config.parameter[_]
param.name == "max_locks_per_transaction"
to_number(param.value) < 128
msg := sprintf("max_locks_per_transaction is %v — must be >= 128 for partitioned workloads", [param.value])
}
deny[msg] {
input.resource_type == "aws_db_parameter_group"
not any_locks_param
msg := "max_locks_per_transaction is not explicitly set — defaulting to 64 is unsafe for production"
}
any_locks_param {
param := input.config.parameter[_]
param.name == "max_locks_per_transaction"
}
3. GitHub Actions — Pre-merge Config Validation
# .github/workflows/pg-config-lint.yml
name: PostgreSQL Config Audit
on: [pull_request]
jobs:
pg-lint:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Check max_locks_per_transaction
run: |
VALUE=$(grep 'max_locks_per_transaction' infra/postgresql.conf | awk -F= '{print $2}' | tr -d ' ')
if [ -z "$VALUE" ] || [ "$VALUE" -lt 128 ]; then
echo "::error::max_locks_per_transaction is '$VALUE' — must be >= 128. See runbook: docs/pg-locks-runbook.md"
exit 1
fi
4. Monitoring — Alert Before You Hit the Wall
Don't wait for the ERROR. Alert at 70% lock table utilization:
-- Prometheus / pg_exporter custom query
-- Add to queries.yaml
pg_lock_table_utilization:
query: |
SELECT
COUNT(*) AS total_locks,
current_setting('max_locks_per_transaction')::int *
(current_setting('max_connections')::int +
current_setting('max_prepared_transactions')::int) AS lock_table_ceiling,
ROUND(
COUNT(*) * 100.0 /
(current_setting('max_locks_per_transaction')::int *
(current_setting('max_connections')::int +
current_setting('max_prepared_transactions')::int)), 2
) AS utilization_pct
FROM pg_locks;
metrics:
- utilization_pct:
usage: GAUGE
description: "Lock table utilization percentage — alert at 70"
Set your alerting rule: fire PagerDuty at utilization_pct > 70. By the time you hit 100%, your on-call is already getting ERROR calls from the application layer.