Why does the socket file /var/run/postgresql/.s.PGSQL.5432 disappear after a reboot?

On most modern Linux distributions, /var/run is a symlink to /run, which is mounted as a tmpfs (RAM-backed filesystem) and is wiped clean on every reboot. The socket file is recreated only when PostgreSQL starts. If nothing recreates the directory at boot (for example, your systemd service unit does not declare RuntimeDirectory=postgresql), the directory may not exist when PostgreSQL tries to start, and the server typically exits at startup with an error in its log about being unable to create the Unix-domain socket. Fix this by adding a systemd override with RuntimeDirectory=postgresql and RuntimeDirectoryMode=2775.

PostgreSQL is running (I can see the process) but the socket file still does not exist — why?

The most likely cause is that the server is bound to a different socket directory than /var/run/postgresql. Check the unix_socket_directories parameter in your postgresql.conf (typically /etc/postgresql/ /main/postgresql.conf). To confirm the path the running server actually uses, connect over TCP instead of the socket: psql -h 127.0.0.1 -U postgres -c 'SHOW unix_socket_directories;'. Then either update unix_socket_directories to include /var/run/postgresql (a restart is required) or update your application's connection string to point at the actual socket directory.

Can this error indicate a security incident or unauthorized process termination?

Rarely, but yes. If PostgreSQL was running and the socket vanished without a planned restart, check: (1) systemctl status postgresql for the unit's exit status, and the kernel log for OOM-killer events (journalctl -k or dmesg; look for 'Out of memory: Killed process'), (2) /var/log/auth.log (or /var/log/secure on RHEL-family systems) for unauthorized sudo or su activity, (3) audit logs if auditd is running. An OOM kill is the most common cause. A deliberate process kill by an attacker who already has local access is possible but would be a symptom of a larger compromise, not the entry point.
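As a rough sketch of the OOM check, the filter below narrows kernel log lines to OOM-killer events. The function name filter_oom is a hypothetical helper, and the patterns are an assumption based on the usual kernel message wording, so adjust them if your kernel logs differently.

```shell
# filter_oom: keep only lines on stdin that look like OOM-killer events.
# (hypothetical helper; the patterns are assumptions about kernel wording)
filter_oom() {
  grep -Ei 'out of memory|oom-kill|killed process'
}

# Usage (needs permission to read the kernel journal):
#   journalctl -k --since "1 hour ago" | filter_oom
```

If the output names a postgres process, memory pressure is the more likely explanation and the auth and audit logs become a secondary check.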

How to Fix PostgreSQL 'No Such File or Directory' Unix Domain Socket Error on Linux

Threat/Impact Level: HIGH | Downtime Risk: HIGH | Time to Fix: 5–15 mins

TL;DR

What broke: The PostgreSQL server process is not running, or it started but bound to a different socket directory — so /var/run/postgresql/.s.PGSQL.5432 does not exist.
How to fix it: Confirm the process is dead, start it, verify socket directory ownership, and align unix_socket_directories in postgresql.conf with what your client expects.

The Incident (What Does the Error Mean?)

Raw error:

psql: error: could not connect to server: No such file or directory
        Is the server running locally and accepting connections on
        Unix domain socket '/var/run/postgresql/.s.PGSQL.5432'?

This is not a credentials error. The socket file .s.PGSQL.5432 is created by the postgres process at startup inside the directory defined by unix_socket_directories. If that file is absent, the process either never started, crashed on startup, started as a different OS user who wrote the socket elsewhere, or the socket directory itself does not exist or has wrong permissions.

Immediate consequence: every application, migration runner, and health check that connects through the local socket (typically with peer authentication) fails. TCP connections to 127.0.0.1:5432 can still work, but only if the server is actually running and listen_addresses includes localhost; psql and most local tooling use the socket when no host is given.
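To make the socket naming concrete, here is a minimal sketch that builds the path a client looks for from a directory and port, then tests whether a socket exists there. Both function names are hypothetical helpers introduced for illustration.

```shell
# pg_socket_path DIR [PORT]: print the socket file path for DIR and PORT.
# The file name pattern .s.PGSQL.<port> matches the error message above.
pg_socket_path() {
  printf '%s/.s.PGSQL.%s\n' "${1%/}" "${2:-5432}"
}

# socket_present DIR [PORT]: succeed only if that path exists and is a socket.
socket_present() {
  [ -S "$(pg_socket_path "$1" "${2:-5432}")" ]
}

# Usage:
#   socket_present /var/run/postgresql || echo "socket missing"
```

Testing with -S rather than -e avoids a false positive when a regular file with the same name is left behind.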

The Attack Vector / Blast Radius

This is a full service outage vector, not a security exploit — but the cascading blast radius is severe:

App tier: Connection pools (PgBouncer, HikariCP) exhaust retries and start rejecting inbound requests within seconds.
Migration runners: Flyway/Liquibase fail on deploy, blocking CI/CD pipelines and potentially leaving schema in a half-migrated state if a prior run was interrupted.
Monitoring blind spot: If your health check probes via the socket (common in Kubernetes liveness probes), the pod restarts in a crash loop, masking the actual root cause in logs.
Data integrity risk: If PostgreSQL crashed mid-write, the next startup will run WAL recovery. Read the logs before forcing a restart: if the disk is full or the pg_wal directory is damaged, recovery will keep failing, and deleting files from pg_wal by hand to free space can leave the cluster unrecoverable.

Always check logs before restarting blindly.

How to Fix It (The Solution)

Step 1 — Confirm the process state

# Is postgres even running?
systemctl status postgresql
# or for non-systemd / manual installs:
ps aux | grep '[p]ostgres'   # the bracket keeps grep from matching its own process

# Check the last crash reason BEFORE touching anything
journalctl -u postgresql --since "10 minutes ago"
# or
tail -n 100 /var/log/postgresql/postgresql-*.log

Step 2 — Verify socket directory exists and has correct ownership

ls -ld /var/run/postgresql/
# Expected output:
# drwxrwsr-x 2 postgres postgres 60 ... /var/run/postgresql/

# If missing:
sudo mkdir -p /var/run/postgresql
sudo chown postgres:postgres /var/run/postgresql
sudo chmod 2775 /var/run/postgresql

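Note: On systems using tmpfs for /var/run, this directory is wiped on reboot, so a manual mkdir only lasts until the next boot. Something must recreate it at startup, usually systemd (RuntimeDirectory= in the service unit, or a tmpfiles.d entry). Check your service unit file.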
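For scripted checks, the mode and ownership from Step 2 can be read with stat instead of parsing ls output. This is a sketch that assumes GNU stat (the -c option); dir_mode_owner is a hypothetical helper name.

```shell
# dir_mode_owner DIR: print "<octal mode> <owner>:<group>" for DIR.
# Fails with a message on stderr if DIR is not a directory.
# Assumes GNU coreutils stat; BSD stat uses different flags.
dir_mode_owner() {
  [ -d "$1" ] || { echo "missing: $1" >&2; return 1; }
  stat -c '%a %U:%G' "$1"
}

# Usage: compare against the values used in Step 2.
#   [ "$(dir_mode_owner /var/run/postgresql)" = "2775 postgres:postgres" ] \
#     || echo "socket directory needs fixing"
```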

Basic Fix — Start the service

sudo systemctl start postgresql
sudo systemctl enable postgresql   # survive reboots

# Confirm socket appeared:
ls /var/run/postgresql/.s.PGSQL.5432
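The socket can take a moment to appear after systemctl start returns, especially if the server is replaying WAL. A minimal polling sketch follows; wait_for_socket is a hypothetical helper and the 30-second default is an arbitrary assumption.

```shell
# wait_for_socket PATH [TIMEOUT_SECONDS]: poll once per second until PATH
# exists as a socket. Returns 0 when it appears, non-zero on timeout.
wait_for_socket() {
  path=$1
  remaining=${2:-30}
  while [ "$remaining" -gt 0 ]; do
    [ -S "$path" ] && return 0
    sleep 1
    remaining=$((remaining - 1))
  done
  [ -S "$path" ]
}

# Usage:
#   sudo systemctl start postgresql
#   wait_for_socket /var/run/postgresql/.s.PGSQL.5432 30 || echo "socket never appeared"
```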

Enterprise Best Practice — Align socket directory in config

If you run multiple PostgreSQL versions or non-default installs, the socket path in postgresql.conf may not match what libpq defaults to.

# /etc/postgresql/15/main/postgresql.conf

- #unix_socket_directories = '/var/run/postgresql'
+ unix_socket_directories = '/var/run/postgresql'

- listen_addresses = 'localhost'
+ listen_addresses = 'localhost'   # keep TCP as fallback for remote tools

# pg_hba.conf — ensure local peer auth is present for the postgres OS user

- # local   all   postgres   ident
+ local     all   postgres   peer
+ local     all   all        scram-sha-256   # use md5 only if stored passwords are still MD5-hashed

After editing:

sudo systemctl restart postgresql
# peer auth maps the OS user to the DB role, so run as the postgres OS user:
sudo -u postgres psql -c 'SELECT version();'
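When the server will not start, SHOW is unavailable, so the configured value has to be read from the file. The sketch below handles only the common single-quoted form and does not follow include directives; conf_socket_dirs is a hypothetical helper name.

```shell
# conf_socket_dirs FILE: print the active unix_socket_directories value in FILE.
# Commented-out lines are ignored; if the parameter appears more than once,
# the last occurrence is printed (the last setting in the file takes effect).
# Prints nothing when the parameter is unset, meaning the built-in default applies.
conf_socket_dirs() {
  sed -n "s/^[[:space:]]*unix_socket_directories[[:space:]]*=[[:space:]]*'\([^']*\)'.*/\1/p" "$1" | tail -n 1
}

# Usage:
#   conf_socket_dirs /etc/postgresql/15/main/postgresql.conf
```

An empty result is worth noting: the compiled-in default differs between upstream builds and distribution packages, so it may not be the directory your client expects.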

If PostgreSQL fails to start (disk full / WAL issue)

# Check disk
df -h /var/lib/postgresql

# If pg_wal is bloated:
du -sh /var/lib/postgresql/*/main/pg_wal/

# ONLY if you are certain there is no replication consumer lagging:
# reduce wal_keep_size in postgresql.conf, or prune the WAL archive directory
# (not pg_wal itself) with pg_archivecleanup. Never delete files in pg_wal by hand.
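The disk check can be turned into a number that a script can compare. This is a sketch using POSIX df output; disk_used_pct is a hypothetical helper and the 90 percent threshold in the usage line is an arbitrary assumption.

```shell
# disk_used_pct PATH: print the use percentage (digits only) of the
# filesystem that holds PATH, taken from POSIX-format df output.
disk_used_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Usage:
#   [ "$(disk_used_pct /var/lib/postgresql)" -lt 90 ] || echo "WARNING: data volume nearly full"
```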

Prevention in CI/CD

1. Systemd `RuntimeDirectory` — survive reboots

# /etc/systemd/system/postgresql.service.d/override.conf
[Service]
RuntimeDirectory=postgresql
RuntimeDirectoryMode=2775

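This ensures /var/run/postgresql is recreated by systemd before the process starts, even after a reboot that wipes tmpfs. Run systemctl daemon-reload after adding the override so systemd picks it up. Note that systemd assigns the directory to the user set in the unit's User=, and that on Debian/Ubuntu postgresql.service is only an umbrella unit while the server itself runs under a postgresql@ template instance, so confirm which unit actually starts your server before overriding it.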
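If you provision hosts with a script rather than by hand, the drop-in can be written like this. It is a sketch: write_override is a hypothetical helper, and the target directory is passed in so the same function works for whichever unit you need to override.

```shell
# write_override DIR: write the RuntimeDirectory drop-in into DIR/override.conf.
write_override() {
  mkdir -p "$1" || return 1
  printf '%s\n' \
    '[Service]' \
    'RuntimeDirectory=postgresql' \
    'RuntimeDirectoryMode=2775' > "$1/override.conf"
}

# Usage (as root), then reload unit files so systemd sees the drop-in:
#   write_override /etc/systemd/system/postgresql.service.d
#   systemctl daemon-reload
#   systemctl restart postgresql
```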

2. Docker / Kubernetes — readiness probe on the socket

# Kubernetes deployment snippet
readinessProbe:
  exec:
    command: ["pg_isready", "-U", "postgres", "-h", "/var/run/postgresql"]
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 3

Do not use a TCP probe on 5432 if your app connects via socket — they test different code paths.

3. Checkov / Ansible lint — catch socket misconfig in IaC

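# Ansible task: ensure the socket directory exists with the expected owner and mode
# (ansible.builtin.file creates or corrects the directory; it does not only assert)
- name: Ensure PostgreSQL socket dir is present
  ansible.builtin.file:
    path: /var/run/postgresql
    state: directory
    owner: postgres
    group: postgres
    mode: '2775'
  notify: restart postgresql   # requires a handler with this exact name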

4. CI smoke test — fail the pipeline before it reaches prod

#!/bin/bash
# post-deploy smoke test
pg_isready -h /var/run/postgresql -U postgres || { echo "FATAL: PostgreSQL socket not ready"; exit 1; }

Plug this into your GitHub Actions post-deploy step or ArgoCD post-sync hook. A 2-second check here saves a 2 AM page.
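Right after a deploy the server may still be starting, so a single pg_isready call can fail spuriously. One possible refinement is a small retry wrapper; retry is a hypothetical helper, and the attempt count and delay in the usage line are assumptions to tune for your pipeline.

```shell
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds or ATTEMPTS runs are used up.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
  return 0
}

# Usage: up to 10 attempts, 2 seconds apart; -q suppresses pg_isready output.
#   retry 10 2 pg_isready -q -h /var/run/postgresql -U postgres \
#     || { echo "FATAL: PostgreSQL socket not ready"; exit 1; }
```

pg_isready exits 0 when the server accepts connections, 1 when it is rejecting them (for example during startup), 2 when there is no response, and 3 when no attempt was made, so logging the exit code on failure helps separate a slow start from a missing socket.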