Why does the 'role readonly does not exist' error appear even though I created the role previously?

PostgreSQL roles are cluster-scoped, so a role created while connected to one database is visible from every database in the same cluster. If the error persists, the connection is most likely reaching a different cluster: a read replica or instance restored from a snapshot taken before the role was created, or a different RDS instance entirely. Run `SELECT rolname FROM pg_catalog.pg_roles WHERE rolname = 'readonly';` as a superuser on the exact host and port your connection string targets to confirm the role is present.
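One way to run that check against the server a DSN actually reaches is sketched below. It assumes `psql` is on the PATH and that the DSN belongs to an account allowed to read `pg_roles`; the function names `where_am_i` and `role_exists` and the `DB_ADMIN_DSN` variable are hypothetical, not part of any tool.

```shell
#!/usr/bin/env bash
# Sketch only: where_am_i and role_exists are illustrative helper names.

# Print the address, port, and recovery status of the server the DSN reaches.
# inet_server_addr() is NULL (empty output) for Unix-socket connections.
where_am_i() {
  local dsn="$1"
  psql "$dsn" -XAtq -v ON_ERROR_STOP=1 <<'SQL'
SELECT inet_server_addr(), inet_server_port(), pg_is_in_recovery();
SQL
}

# Return 0 if the role exists on that server, 1 if it does not, 2 if the query failed.
role_exists() {
  local dsn="$1" role="$2" found
  # The query is sent on stdin because psql does not expand :'var' inside -c strings.
  found=$(psql "$dsn" -XAtq -v ON_ERROR_STOP=1 -v role="$role" <<'SQL'
SELECT 1 FROM pg_catalog.pg_roles WHERE rolname = :'role';
SQL
  ) || return 2
  [ "$found" = "1" ]
}

# Usage (DB_ADMIN_DSN is an assumed environment variable):
#   where_am_i "$DB_ADMIN_DSN"
#   role_exists "$DB_ADMIN_DSN" readonly && echo "role present"
```

If `where_am_i` reports an address or recovery status you did not expect, the DSN is pointing at a different server than the one where the role was created.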

What is the difference between CREATE ROLE and CREATE USER in PostgreSQL for a readonly setup?

`CREATE USER` is an alias for `CREATE ROLE` with `LOGIN` as the default. A common practice is to create a `readonly` role with `NOLOGIN` (a permission group), then create a separate login user that is a member of that role. This lets you rotate login credentials, change how a user authenticates, or revoke one user's access without touching the underlying GRANT structure.
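To make the separation concrete, the sketch below rotates a login's password and removes a login from the group while leaving the `readonly` role's grants untouched. The helper names are hypothetical, and it assumes a login such as `readonly_user` already exists as a member of `readonly`.

```shell
#!/usr/bin/env bash
# Sketch only: rotate_login_password and revoke_readonly are illustrative names.

# Change the login role's password; grants held by the group role are unaffected.
rotate_login_password() {
  local dsn="$1" login="$2" new_password="$3"
  # psql quotes :"login" as an identifier and :'pw' as a string literal.
  psql "$dsn" -Xq -v ON_ERROR_STOP=1 -v login="$login" -v pw="$new_password" <<'SQL'
ALTER ROLE :"login" WITH PASSWORD :'pw';
SQL
}

# Remove one login from the group without dropping the group or its grants.
revoke_readonly() {
  local dsn="$1" login="$2"
  psql "$dsn" -Xq -v ON_ERROR_STOP=1 -v login="$login" <<'SQL'
REVOKE readonly FROM :"login";
SQL
}

# Usage (DB_ADMIN_DSN and NEW_PASSWORD are assumed environment variables):
#   rotate_login_password "$DB_ADMIN_DSN" readonly_user "$NEW_PASSWORD"
#   revoke_readonly "$DB_ADMIN_DSN" readonly_user
```

Passing the password as a command-line argument exposes it to the local process list, so in practice you would likely source it from your secrets manager in a way that avoids argv.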

How do I prevent future tables from bypassing the readonly role's SELECT grants?

Run `ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO readonly;`. Default privileges apply only to objects created by the role that ran the statement, so execute it as the role that will create future tables (typically your migration runner role), or name that role with a `FOR ROLE` clause. Without this, any table created after the initial GRANT will be inaccessible to the readonly role until privileges are explicitly re-granted.
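When an admin account, rather than the table-creating role itself, sets the default privileges, the `FOR ROLE` form can be used. The sketch below is one way to wrap it; `grant_future_selects` is a hypothetical helper, and `migrator` in the usage lines is an assumed name for the migration runner role.

```shell
#!/usr/bin/env bash
# Sketch only: grant_future_selects is an illustrative name.

# Make tables that <creator> later creates in schema public readable by <reader>.
# Must be run by a superuser or by a member of <creator>.
grant_future_selects() {
  local dsn="$1" creator="$2" reader="$3"
  psql "$dsn" -Xq -v ON_ERROR_STOP=1 -v creator="$creator" -v reader="$reader" <<'SQL'
ALTER DEFAULT PRIVILEGES FOR ROLE :"creator" IN SCHEMA public
  GRANT SELECT ON TABLES TO :"reader";
SQL
}

# Usage (DB_ADMIN_DSN and the migrator role are assumptions):
#   grant_future_selects "$DB_ADMIN_DSN" migrator readonly
#   psql "$DB_ADMIN_DSN" -c '\ddp'   # list the default privileges now in effect
```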

How to Fix PostgreSQL 'role readonly does not exist' Error: Role Provisioning & Least-Privilege Access Guide

Threat/Impact Level: HIGH | Exploitability/Downtime Risk: HIGH | Time to Fix: 5–15 mins

TL;DR

What broke: The PostgreSQL role readonly was never created in the target database cluster, or the connection is hitting a replica or a different instance where the role doesn't exist.
How to fix it: Create the role with CREATE ROLE readonly NOLOGIN;, grant it SELECT on the required schemas, then create a login user assigned to that role.

The Incident (What Does the Error Mean?)

psql: error: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed:
FATAL:  role "readonly" does not exist

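PostgreSQL first matches the incoming connection against pg_hba.conf, then looks the role up in the system catalog (visible through pg_catalog.pg_roles) during authentication and session startup. With trust or peer authentication, which is typical for the Unix-socket connection shown above, a missing role produces exactly this FATAL message. With password methods such as md5 or scram-sha-256, the client usually sees "password authentication failed" instead, and the missing role is reported only in the server log. Either way, no session is established: your app pool never gets a connection object back, and every request that needs one fails or waits on retries. In a high-concurrency app, this can cascade into full connection pool exhaustion within seconds.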

The immediate consequences:

All application services using the readonly DSN string return 500 or equivalent DB connection errors.
Read replicas are inaccessible, pushing all query load to the primary.
Monitoring dashboards lose their read-path queries.

The Attack Vector / Blast Radius

This is a provisioning gap, not a runtime failure — which makes it more dangerous than it looks.

Scenario 1 — Partial Infrastructure Rollout: A Terraform apply or Ansible playbook ran the compute/RDS provisioning but the SQL role-creation step failed silently (e.g., wrong database target, migration runner lacked CREATEROLE privilege). The role exists in staging but not production. Your deploy pipeline showed green. Production is broken.

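Scenario 2 — Cross-Cluster Role Assumption: Roles in PostgreSQL are cluster-level, while most privileges are granted per database, schema, or object. A DBA created readonly on the cluster that hosts appdb, but your connection string targets analyticsdb on a different cluster or RDS instance. The role doesn't exist there. (If both databases live in the same cluster, the role exists for both and only its grants differ.) This is a common cause in multi-tenant RDS setups.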

Scenario 3 — Security Regression via Role Drop: Someone ran DROP ROLE readonly during a cleanup sweep without auditing active pg_hba.conf entries or application DSNs. PostgreSQL refuses to drop a role that still owns objects or holds granted privileges, but once those are cleared (for example with DROP OWNED BY), nothing checks whether connection strings still depend on the role.

Blast radius: Read traffic fails entirely. If your application doesn't have a write-fallback circuit breaker, write traffic may also degrade as connection pool threads pile up waiting for the failed read connections to time out.
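Before a cleanup like the one in Scenario 3, it can help to list which logins are members of the role and whether they currently hold sessions. The sketch below is one possible pre-drop check; `role_dependents` is a hypothetical helper name, and it only sees live sessions, not DSNs sitting unused in application config.

```shell
#!/usr/bin/env bash
# Sketch only: role_dependents is an illustrative name.

# List roles that are members of <role>, each with its current session count.
# Empty output means no member roles were found on this cluster.
role_dependents() {
  local dsn="$1" role="$2"
  psql "$dsn" -XAtq -v ON_ERROR_STOP=1 -v role="$role" <<'SQL'
SELECT m.rolname, count(a.pid)
FROM pg_catalog.pg_auth_members am
JOIN pg_catalog.pg_roles g ON g.oid = am.roleid
JOIN pg_catalog.pg_roles m ON m.oid = am.member
LEFT JOIN pg_catalog.pg_stat_activity a ON a.usename = m.rolname
WHERE g.rolname = :'role'
GROUP BY m.rolname;
SQL
}

# Usage (DB_ADMIN_DSN is an assumed environment variable):
#   role_dependents "$DB_ADMIN_DSN" readonly
```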

How to Fix It (The Solution)

Basic Fix — Create the Role Directly

Connect as a superuser (postgres or an IAM-authenticated admin on RDS):

- -- Nothing exists. Connection fails.
- psql -U readonly -d appdb

+ -- Step 1: Connect as superuser
+ psql -U postgres -d appdb
+
+ -- Step 2: Create the role (no direct login, principle of least privilege)
+ CREATE ROLE readonly NOLOGIN;
+
+ -- Step 3: Create a login user and assign the role
+ CREATE USER readonly_user WITH PASSWORD 'use-a-vault-managed-secret' IN ROLE readonly;
+
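+ -- Step 4: Grant schema usage and select privileges
+ GRANT USAGE ON SCHEMA public TO readonly;
+ GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly;
+ -- Covers only tables later created by the role running this statement;
+ -- add FOR ROLE <creator> if a different role (e.g. a migration runner) creates tables.
+ ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO readonly;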

⚠️ RDS/Aurora note: RDS and Aurora do not expose a true superuser. Use an account that has been granted rds_superuser, such as the master user. CREATE ROLE requires the CREATEROLE privilege.

Enterprise Best Practice — Terraform + Vault-Managed Credentials

- # No role resource defined. Applied infra without DB provisioning step.
- resource "aws_db_instance" "app" {
-   identifier = "appdb"
-   engine     = "postgres"
- }

+ resource "aws_db_instance" "app" {
+   identifier = "appdb"
+   engine     = "postgres"
+ }
+
+ # Use terraform-provider-postgresql
+ resource "postgresql_role" "readonly" {
+   name            = "readonly"
+   login           = false
+   superuser       = false
+   create_database = false
+   replication     = false
+ }
+
+ resource "postgresql_role" "readonly_user" {
+   name     = "readonly_user"
+   login    = true
+   password = var.readonly_user_password  # injected from Vault/AWS Secrets Manager
+   roles    = [postgresql_role.readonly.name]
+ }
+
+ resource "postgresql_grant" "readonly_tables" {
+   database    = "appdb"
+   role        = postgresql_role.readonly.name
+   schema      = "public"
+   object_type = "table"
+   privileges  = ["SELECT"]
+ }

Key enforcement points:

login = false on the role itself enforces the role/user separation pattern.
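The password stays out of source control: it is injected via var from Vault or AWS Secrets Manager. Terraform still writes it to the state file, so encrypt the state backend and restrict access to it.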
ALTER DEFAULT PRIVILEGES must be run separately or via a postgresql_default_privileges resource to cover future tables.


Prevention in CI/CD

1. Checkov — Scan Terraform for missing role resources:

# .checkov.yml
check:
  - CKV_PG_1  # Enforce encrypted connections
# Add custom check: assert postgresql_role exists before postgresql_grant

2. OPA/Conftest policy — Block deploys if DB role resources are absent:

package terraform.postgresql

deny[msg] {
  resource := input.resource.postgresql_grant[_]
  role := resource.role
  not input.resource.postgresql_role[role]
  msg := sprintf("postgresql_grant references role '%v' with no corresponding postgresql_role resource.", [role])
}

3. Post-deploy smoke test in CI pipeline:

#!/bin/bash
# Run after terraform apply in pipeline
psql "$DB_ADMIN_DSN" -c "\du readonly" | grep -q readonly || {
  echo "FATAL: readonly role missing post-deploy" && exit 1
}
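The check above confirms only that the role exists. A slightly stricter variant, sketched below, also confirms that the login user is a member of the role and that the role can use the schema. `readonly_ready` is a hypothetical function name, and `readonly_user` and `public` are the names used in the fix above.

```shell
#!/usr/bin/env bash
# Sketch only: readonly_ready is an illustrative name.

# Return 0 only if readonly_user is a member of readonly and readonly has
# USAGE on schema public; return 2 if the query fails (e.g. a role is missing).
readonly_ready() {
  local dsn="$1" out
  out=$(psql "$dsn" -XAtq -v ON_ERROR_STOP=1 <<'SQL'
SELECT pg_has_role('readonly_user', 'readonly', 'MEMBER'),
       has_schema_privilege('readonly', 'public', 'USAGE');
SQL
  ) || return 2
  # With -A -t, psql prints booleans as t/f separated by "|".
  [ "$out" = "t|t" ]
}

# Usage in a pipeline step (DB_ADMIN_DSN is an assumed environment variable):
#   readonly_ready "$DB_ADMIN_DSN" || { echo "FATAL: readonly setup incomplete"; exit 1; }
```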

4. Audit trigger — Alert on DROP ROLE:

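-- Requires the pgaudit extension, loaded via shared_preload_libraries
-- A plain SET affects only the current session; persist the setting instead
-- (on RDS, set pgaudit.log in the DB parameter group rather than using ALTER SYSTEM)
ALTER SYSTEM SET pgaudit.log = 'role';
SELECT pg_reload_conf();
-- DROP ROLE events are then written to the PostgreSQL log stream, which you can route to your SIEM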

5. Separate your DB provisioning Terraform from your infra Terraform. Use a dedicated terraform/db-roles module with its own state file and pipeline stage that runs after the RDS instance is confirmed healthy. Avoid bundling role creation in the same apply as instance creation: if the instance is not yet accepting connections when the role resources run, role creation can fail or be skipped.