How to Fix PostgreSQL 'No Such File or Directory' Unix Domain Socket Error on Linux
Threat/Impact Level: HIGH | Downtime Risk: HIGH | Time to Fix: 5–15 mins
TL;DR
- What broke: The PostgreSQL server process is not running, or it started but bound to a different socket directory, so /var/run/postgresql/.s.PGSQL.5432 does not exist.
- How to fix it: Confirm the process is dead, start it, verify socket directory ownership, and align unix_socket_directories in postgresql.conf with what your client expects.
- Shortcut: Use our Client-Side Sandbox below to auto-refactor your postgresql.conf and pg_hba.conf; secrets never leave your browser.
The Incident (What Does the Error Mean?)
Raw error:
psql: error: could not connect to server: No such file or directory
Is the server running locally and accepting connections on
Unix domain socket '/var/run/postgresql/.s.PGSQL.5432'?
This is not a credentials error. The socket file .s.PGSQL.5432 is created by the postgres process at startup inside the directory defined by unix_socket_directories. If that file is absent, the process either never started, crashed on startup, started as a different OS user who wrote the socket elsewhere, or the socket directory itself does not exist or has wrong permissions.
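Immediate consequence: every application, migration runner, and health check that relies on local peer/ident authentication is dead. TCP connections to 127.0.0.1:5432 still work only if the server is actually running and listen_addresses includes localhost; a server that never started is unreachable over TCP as well. psql and most local clients default to the socket when no host is given.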
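To make the failure modes above concrete, here is a minimal sketch that builds the socket path from a directory and port and reports which case applies. The function names pg_socket_path and pg_socket_state are hypothetical helpers, not part of PostgreSQL, and the defaults assume the Debian/Ubuntu layout used in this article.

```shell
# Build the path libpq looks for: <dir>/.s.PGSQL.<port>
pg_socket_path() {
    printf '%s/.s.PGSQL.%s\n' "${1:-/var/run/postgresql}" "${2:-5432}"
}

# Print which situation applies for a given directory and port:
#   no-directory    the socket directory itself is missing
#   no-socket       the directory exists but the server has not created the socket
#   socket-present  a socket file exists (the server may still reject connections)
pg_socket_state() {
    dir="${1:-/var/run/postgresql}"
    port="${2:-5432}"
    if [ ! -d "$dir" ]; then
        echo "no-directory"
    elif [ ! -S "$(pg_socket_path "$dir" "$port")" ]; then
        echo "no-socket"
    else
        echo "socket-present"
    fi
}
```

Calling pg_socket_state with no arguments checks the default path from the error message. A result of no-directory points at Step 2 below, while no-socket points at the server process itself.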
The Attack Vector / Blast Radius
This is a full service outage vector, not a security exploit — but the cascading blast radius is severe:
- App tier: Connection pools (PgBouncer, HikariCP) exhaust retries and start rejecting inbound requests within seconds.
- Migration runners: Flyway/Liquibase fail on deploy, blocking CI/CD pipelines and potentially leaving schema in a half-migrated state if a prior run was interrupted.
- Monitoring blind spot: If your health check probes via the socket (common in Kubernetes liveness probes), the pod restarts in a crash loop, masking the actual root cause in logs.
- Data integrity risk: If PostgreSQL crashed mid-write, the next startup will run WAL recovery. Forcing a hard restart without checking logs can corrupt data if the disk is full or the pg_wal directory is damaged.
Always check logs before restarting blindly.
How to Fix It (The Solution)
Step 1 — Confirm the process state
# Is postgres even running?
systemctl status postgresql
# or for non-systemd / manual installs:
ps aux | grep postgres
# Check the last crash reason BEFORE touching anything
journalctl -u postgresql --since "10 minutes ago"
# or
tail -n 100 /var/log/postgresql/postgresql-*.log
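If the process turns out to be running, the socket may simply live somewhere other than where the client is looking. One way to check, sketched here under the assumption that ss from iproute2 is installed, is to list listening Unix sockets and filter for PostgreSQL's naming pattern. The helper name extract_pg_sockets is hypothetical.

```shell
# Read `ss -xl` style output on stdin and print the unique paths of any
# PostgreSQL sockets found (files named .s.PGSQL.<port>).
extract_pg_sockets() {
    grep -o '/[^[:space:]]*\.s\.PGSQL\.[0-9][0-9]*' | sort -u
}
```

Typical use would be ss -xl | extract_pg_sockets. If that prints a path such as one under /tmp instead of /var/run/postgresql, the server is up and the mismatch is in unix_socket_directories, not in the process state.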
Step 2 — Verify socket directory exists and has correct ownership
ls -ld /var/run/postgresql/
# Expected output:
# drwxrwsr-x 2 postgres postgres 60 ... /var/run/postgresql/
# If missing:
sudo mkdir -p /var/run/postgresql
sudo chown postgres:postgres /var/run/postgresql
sudo chmod 2775 /var/run/postgresql
Note: On systems using tmpfs for /var/run, this directory is wiped on reboot. Your init system (systemd RuntimeDirectory=) must recreate it. Check your service unit file.
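The ownership and mode check from Step 2 could be scripted roughly as follows, so it can run from a cron job or a deploy script instead of by eye. This is a sketch that assumes GNU stat (the -c format option); check_socket_dir is a hypothetical helper name, and the defaults mirror the postgres owner and 2775 mode shown above.

```shell
# Succeed only if <dir> exists, is owned by <owner>, and has octal mode <mode>.
# Usage: check_socket_dir <dir> [owner] [mode]
check_socket_dir() {
    dir="$1"
    want_owner="${2:-postgres}"
    want_mode="${3:-2775}"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    got_owner=$(stat -c '%U' "$dir")   # owning user name
    got_mode=$(stat -c '%a' "$dir")    # octal permissions, including setgid
    if [ "$got_owner" != "$want_owner" ] || [ "$got_mode" != "$want_mode" ]; then
        echo "mismatch: $dir is $got_owner/$got_mode, expected $want_owner/$want_mode"
        return 1
    fi
    echo "ok: $dir"
}
```

Running check_socket_dir /var/run/postgresql returns non-zero on any deviation, which makes it easy to chain with the mkdir, chown, and chmod commands above.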
Basic Fix — Start the service
sudo systemctl start postgresql
sudo systemctl enable postgresql # survive reboots
# Confirm socket appeared:
ls /var/run/postgresql/.s.PGSQL.5432
Enterprise Best Practice — Align socket directory in config
If you run multiple PostgreSQL versions or non-default installs, the socket path in postgresql.conf may not match what libpq defaults to.
# /etc/postgresql/15/main/postgresql.conf
- #unix_socket_directories = '/var/run/postgresql'
+ unix_socket_directories = '/var/run/postgresql'
- listen_addresses = 'localhost'
+ listen_addresses = 'localhost' # keep TCP as fallback for remote tools
# pg_hba.conf — ensure local peer auth is present for the postgres OS user
- # local all postgres ident
+ local all postgres peer
+ local all all md5
After editing:
sudo systemctl restart postgresql
# Peer auth matches the OS user to the database role, so run as the postgres OS user
sudo -u postgres psql -c 'SELECT version();'
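To compare what the server is configured to use against what the client expects, the effective setting can be read straight from the config file. The sketch below is a rough parser, not a full postgresql.conf reader: it assumes the value is single-quoted and ignores include directives. The name socket_dirs_from_conf is hypothetical.

```shell
# Print the value of the last uncommented unix_socket_directories line in a
# postgresql.conf file. When a parameter is repeated, the last entry wins,
# so only the final match is reported. Prints nothing if the setting is unset.
socket_dirs_from_conf() {
    awk -F"'" '
        /^[ \t]*unix_socket_directories[ \t]*=/ { value = $2 }
        END { if (value != "") print value }
    ' "$1"
}
```

For example, socket_dirs_from_conf /etc/postgresql/15/main/postgresql.conf prints the configured directory list. If the output differs from the directory in the client's error message, either change the config or point the client at the real directory with psql -h <dir> or the PGHOST environment variable.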
If PostgreSQL fails to start (disk full / WAL issue)
# Check disk
df -h /var/lib/postgresql
# If pg_wal is bloated:
du -sh /var/lib/postgresql/*/main/pg_wal/
# ONLY if you are certain there is no replication consumer lagging:
# pg_archivecleanup or reduce wal_keep_size in postgresql.conf
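The disk check above can be turned into a guard that refuses to restart PostgreSQL when the data volume is nearly full. This is a sketch only: the 90 percent default is an arbitrary assumption, and disk_use_pct and wal_disk_ok are hypothetical helper names.

```shell
# Print the used-capacity percentage (digits only) of the filesystem holding <path>.
# df -P forces the POSIX one-line-per-filesystem layout, where column 5 is capacity.
disk_use_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed if usage of the filesystem holding <path> is below <threshold> percent.
# Usage: wal_disk_ok <path> [threshold]
wal_disk_ok() {
    [ "$(disk_use_pct "$1")" -lt "${2:-90}" ]
}
```

A restart script could then use wal_disk_ok /var/lib/postgresql || exit 1 before calling systemctl, which forces someone to look at pg_wal before the server attempts recovery on a full disk.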
Prevention in CI/CD
1. Systemd RuntimeDirectory — survive reboots
# /etc/systemd/system/postgresql.service.d/override.conf
[Service]
RuntimeDirectory=postgresql
RuntimeDirectoryMode=2775
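This ensures systemd recreates /run/postgresql (the target of /var/run/postgresql on distributions where /var/run is a symlink to /run) before the process starts, even after a reboot that wipes tmpfs. Run sudo systemctl daemon-reload after creating the override so systemd picks it up.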
2. Docker / Kubernetes — readiness probe on the socket
# Kubernetes deployment snippet
readinessProbe:
  exec:
    command: ["pg_isready", "-U", "postgres", "-h", "/var/run/postgresql"]
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 3
Do not use a TCP probe on 5432 if your app connects via socket — they test different code paths.
3. Checkov / Ansible lint — catch socket misconfig in IaC
# Ansible task: assert socket directory is managed
- name: Assert PostgreSQL socket dir is present
  ansible.builtin.file:
    path: /var/run/postgresql
    state: directory
    owner: postgres
    group: postgres
    mode: '2775'
  notify: restart postgresql
4. CI smoke test — fail the pipeline before it reaches prod
#!/bin/bash
# post-deploy smoke test
pg_isready -h /var/run/postgresql -U postgres || { echo "FATAL: PostgreSQL socket not ready"; exit 1; }
Plug this into your GitHub Actions post-deploy step or ArgoCD post-sync hook. A 2-second check here saves a 2 AM page.
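A single pg_isready call can fail spuriously if it runs while the server is still starting. One option, sketched here with a hypothetical generic helper named retry, is to poll a few times before failing the pipeline. The attempt count and delay are assumptions to tune for your environment.

```shell
# Run a command until it succeeds, up to <attempts> times, sleeping <delay>
# seconds between tries. Returns 0 on the first success, 1 if every try fails.
# Usage: retry <attempts> <delay> <command> [args...]
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}
```

In the smoke test this would look like retry 10 2 pg_isready -h /var/run/postgresql -U postgres, followed by the same FATAL message and exit 1 on failure. pg_isready exits with 0 only when the server is accepting connections, so the loop keeps waiting through the startup phase.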