Why does 'endpoint with name already exists' happen after docker-compose down?

If docker-compose down is interrupted — by a SIGKILL, OOM event, or CI runner timeout — Docker may not complete the network disconnect sequence for all containers. The endpoint record persists in libnetwork's state store even though the container is gone. The next docker-compose up finds the orphaned record and refuses to create a duplicate endpoint name on the same network.

Is 'docker network disconnect --force' safe to run in production?

Yes, but with a caveat. If the container is genuinely running and attached, --force will disconnect a live container from the network, severing its communication. Always verify with 'docker ps' that the container is stopped or missing before using --force. On a confirmed stale/orphaned endpoint, it is the correct and safe remediation command.

How do I prevent this error in Docker Swarm overlay networks?

Swarm overlay networks garbage-collect endpoint records with a delay, which makes this error more common there. The recommended fix is to run 'docker service update --force <service-name>' to trigger task re-scheduling, which re-registers endpoints cleanly. For persistent orphans, remove and recreate the overlay network during a maintenance window, then redeploy every service against the fresh network.
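
As a rough sketch of the first remediation, the helper below force-updates each Swarm service named on the command line. The function name force_reschedule is hypothetical and not part of Docker; the only Docker command it relies on is 'docker service update --force', and it assumes it runs on a manager node.

```shell
# force_reschedule SERVICE...
# Hypothetical helper: force a re-schedule of each named Swarm service so its
# tasks register fresh endpoints. Assumes it runs on a Swarm manager node.
force_reschedule() {
  if [ "$#" -eq 0 ]; then
    echo "usage: force_reschedule SERVICE..." >&2
    return 2
  fi
  rc=0
  for svc in "$@"; do
    echo "Re-scheduling tasks for service: $svc"
    # --force re-creates the tasks even when the service spec is unchanged
    docker service update --force "$svc" >/dev/null || rc=1
  done
  return "$rc"
}
```

Calling force_reschedule my-service would re-schedule a single service. The function returns non-zero if any update fails, so a pipeline step can stop on it.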

How to Fix 'Endpoint With Name Already Exists in Network' in Docker

Threat/Impact Level: HIGH | Exploitability/Downtime Risk: HIGH | Time to Fix: 5–10 mins

TL;DR

What broke: Docker refused to attach a container to a network because a stale or orphaned endpoint with the same name is already registered on that network.
How to fix it: Identify and forcibly remove the stale endpoint by disconnecting the ghost container from the network, then prune orphaned network state.

The Incident (What Does the Error Mean?)

Raw error output:

Error response from daemon: endpoint with name my-service already exists in network my-app-network

Docker's libnetwork maintains an internal endpoint registry per network. When a container is removed uncleanly — OOM kill, docker kill without docker rm, host crash, or a botched docker-compose down — the endpoint record persists in the network's state store even though the container is gone. The next docker run or docker-compose up attempts to register the same endpoint name and hits the existing tombstone. The container fails to start. Your service is dead.
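
A CI step can recognize this failure mechanically instead of relying on someone reading the log. The sketch below assumes the daemon message has exactly the shape of the raw error shown above; parse_stale_endpoint is a hypothetical helper name.

```shell
# parse_stale_endpoint: read daemon output on stdin and, for every line of the
# form "endpoint with name X already exists in network Y", print "X Y".
# Lines that do not match are ignored.
parse_stale_endpoint() {
  sed -n 's/.*endpoint with name \([^ ]*\) already exists in network \([^ ]*\).*/\1 \2/p'
}
```

For example, piping the combined output of docker-compose up -d through parse_stale_endpoint yields the endpoint and network names needed for the disconnect command, or nothing at all when the failure has a different cause.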

This is most common when:

docker-compose down was interrupted mid-execution
A container was killed with SIGKILL before Docker could clean up network state
You're running Docker Swarm or a CI runner that reuses container names across pipeline runs
You're on a Swarm overlay network, where endpoint records are garbage-collected with a known delay

The Attack Vector / Blast Radius

This is not a security exploit vector, but the blast radius in a production deployment pipeline is severe:

Full service outage for the affected container. It cannot attach to the network, so it cannot communicate with any peer service.
Cascading dependency failures. If my-service is a reverse proxy, message broker, or database, every service that depends on it starts throwing connection refused errors within seconds.
Silent CI/CD pipeline failures. In ephemeral runners (GitHub Actions, GitLab CI, Jenkins agents), stale Docker state from a prior job can block the next deployment entirely. The pipeline reports a generic failure with no obvious root cause.
Swarm-specific amplification. On Docker Swarm overlay networks, the stale endpoint can block the entire service replica from scheduling across all nodes that share the network, not just the originating host.

How to Fix It (The Solution)

Step 1: Identify the Stale Endpoint

# List the endpoints Docker has recorded on the network (stale ones included)
docker network inspect my-app-network --format '{{json .Containers}}' | jq .

# Find orphaned/stopped containers by name
docker ps -a --filter name=my-service

If docker network inspect shows a container ID under Containers but docker ps -a shows that container as Exited or missing entirely, you have a stale endpoint.
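
The comparison described above can be automated. This is a minimal sketch, assuming the endpoint names reported by docker network inspect match container names; list_stale_endpoints is a hypothetical helper, not a Docker command.

```shell
# list_stale_endpoints NETWORK
# Print every endpoint name that NETWORK still lists but that no running
# container owns. Prints nothing when the network is clean or does not exist.
list_stale_endpoints() {
  net="$1"
  # Names of containers that are actually running
  running="$(docker ps --format '{{.Names}}')"
  # Endpoint names recorded on the network, one per line
  docker network inspect "$net" \
    --format '{{range .Containers}}{{println .Name}}{{end}}' 2>/dev/null |
  while IFS= read -r name; do
    [ -n "$name" ] || continue
    # Exact, whole-line match so "web" does not match "web-2"
    if ! printf '%s\n' "$running" | grep -qxF -- "$name"; then
      echo "$name"
    fi
  done
}
```

Running list_stale_endpoints my-app-network before a deploy gives a list of candidates for the force-disconnect step. It only reports; it does not change anything.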

Basic Fix: Force-Disconnect the Stale Endpoint

# Disconnect the ghost endpoint from the network
docker network disconnect --force my-app-network my-service

# Remove the dead container record if it exists
docker rm -f my-service 2>/dev/null || true

# Now re-run your container
docker-compose up -d

If docker network disconnect --force fails with "container not found", the endpoint is fully orphaned. Proceed to the nuclear option:

# Prune all unused networks (deletes the networks themselves, and any orphaned endpoint state with them)
docker network prune -f

# If on Swarm overlay: remove and recreate the network
docker network rm my-app-network
docker network create --driver overlay my-app-network

Enterprise Best Practice: Idempotent Compose with Explicit Cleanup

The root cause is non-idempotent container lifecycle management. Fix it at the compose level:

# docker-compose.yml

 services:
   my-service:
     image: my-org/my-service:latest
-    container_name: my-service
+    container_name: my-service-${DEPLOY_ENV:-prod}
     networks:
-      - my-app-network
+      my-app-network:
+        aliases:
+          - my-service
+    restart: unless-stopped

 networks:
   my-app-network:
-    external: true
+    external: false
+    name: my-app-network

And in your deployment script:

- docker-compose up -d
+ docker-compose down --remove-orphans && docker-compose up -d --force-recreate

The --remove-orphans flag is the critical addition. It instructs Compose to remove containers that belong to the project but whose services are no longer defined in the current Compose file. Those leftovers can otherwise keep endpoint names registered on the project's networks, which is one of the conditions that produces stale endpoints.


Prevention in CI/CD

The stale endpoint problem is a CI/CD hygiene failure. Bake these controls in:

1. Always Use `--remove-orphans` in Pipeline Teardown

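# .github/workflows/deploy.yml (example)
- name: Teardown previous deployment
  # --volumes also deletes the project's named volumes; drop it if they hold data you need to keep
  run: docker-compose -f docker-compose.yml down --remove-orphans --volumes

- name: Deploy
  run: docker-compose up -d --force-recreate --build --remove-orphans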

2. Add a Pre-flight Network Cleanup Step

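#!/bin/bash
# pre-deploy-cleanup.sh: run before every docker-compose up in CI
set -euo pipefail
CONTAINER_NAME="${1:?Container name required}"
NETWORK_NAME="${2:?Network name required}"

# Endpoint names the network still lists (empty if the network does not exist)
attached="$(docker network inspect "$NETWORK_NAME" \
  --format '{{range .Containers}}{{println .Name}}{{end}}' 2>/dev/null || true)"
# Names of containers that are actually running
running="$(docker ps --format '{{.Names}}')"

# Force-disconnect only when the endpoint is listed but no running container owns it
if grep -qxF -- "$CONTAINER_NAME" <<<"$attached" && ! grep -qxF -- "$CONTAINER_NAME" <<<"$running"; then
  echo "[WARN] Stale endpoint detected. Forcing disconnect."
  docker network disconnect --force "$NETWORK_NAME" "$CONTAINER_NAME"
fi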

3. Enforce with a Custom OPA Policy

Checkov doesn't natively lint Docker Compose for this, but you can enforce the --remove-orphans pattern via a shell linter rule or a custom OPA policy on your CI pipeline definitions:

# opa/policies/docker_compose_deploy.rego
package docker_compose

deny[msg] {
  # input is assumed to be a single job from the pipeline definition, with a steps array
  cmd := input.steps[_].run
  contains(cmd, "docker-compose up")
  not contains(cmd, "--remove-orphans")
  msg := "docker-compose up must include --remove-orphans to prevent stale endpoint conflicts"
}
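
The shell linter alternative mentioned above could look roughly like this. It is a line-based sketch, not an equivalent of the OPA policy: it checks each line of a file on its own, so a multi-line run block that puts the flag on a different line would still be flagged. The function name lint_compose_up is hypothetical.

```shell
# lint_compose_up FILE...
# Print every line that runs "docker-compose up" without --remove-orphans,
# prefixed with "file:line:". Returns 1 if any such line is found, else 0.
lint_compose_up() {
  status=0
  for f in "$@"; do
    # Passing /dev/null as a second file makes grep print the file name;
    # -n adds the line number. "|| true" keeps an empty result from failing.
    hits="$(grep -n 'docker-compose up' "$f" /dev/null |
      grep -v -e '--remove-orphans' || true)"
    if [ -n "$hits" ]; then
      printf '%s\n' "$hits"
      status=1
    fi
  done
  return "$status"
}
```

A pipeline could call lint_compose_up .github/workflows/*.yml as an early step and fail the job on a non-zero exit status.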

4. Use Named Networks with Explicit Lifecycle Management in Long-lived Environments

For staging environments that are never fully torn down, prefer named networks with explicit lifecycle management over letting Compose auto-create/destroy them. Treat the network as infrastructure, not an application artifact.
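
One way to treat the network as infrastructure is to create it in a separate, repeatable step rather than on every deploy. The sketch below assumes the network is managed outside Compose; ensure_network is a hypothetical helper name, and the bridge default is an assumption that should become overlay on Swarm.

```shell
# ensure_network NAME [DRIVER]
# Create the network only if it does not already exist, so the call is safe
# to repeat on every deploy. DRIVER defaults to "bridge" (an assumption).
ensure_network() {
  name="$1"
  driver="${2:-bridge}"
  if docker network inspect "$name" >/dev/null 2>&1; then
    echo "network $name already exists"
  else
    docker network create --driver "$driver" "$name" >/dev/null &&
      echo "network $name created"
  fi
}
```

Because the function is idempotent, ensure_network my-app-network can run at the start of every pipeline without recreating the network or disturbing the containers attached to it.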