Why does docker cp work for small files but fail with unexpected EOF for large ones?

Small transfers finish before any proxy or NAT timeout fires. Large files take longer, and an intermediary (an nginx TCP proxy, an AWS NLB, an SSH session without keepalives) can close the connection mid-stream if the stream stalls past its idle timeout or the proxy enforces a hard session limit. The daemon then sees a truncated tar stream and reports unexpected EOF. As a rough worst-case rule, expect trouble once file_size_bytes / available_bandwidth_bytes_per_second > proxy_idle_timeout_seconds.
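The inequality above can be turned into a quick pre-flight check. The following is a minimal sketch, not part of any Docker tooling: the function names transfer_seconds and exceeds_timeout are hypothetical, and it assumes you already know your usable bandwidth in bytes per second (a positive integer).

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the rule of thumb:
#   file_size_bytes / bandwidth_bytes_per_second > timeout_seconds

# transfer_seconds <file_size_bytes> <bandwidth_bytes_per_second>
# Prints the estimated transfer time in whole seconds, rounded up.
transfer_seconds() {
  local size=$1 bw=$2
  # Integer ceiling division: (size + bw - 1) / bw
  echo $(( (size + bw - 1) / bw ))
}

# exceeds_timeout <file_size_bytes> <bandwidth_bytes_per_second> <timeout_seconds>
# Returns 0 (true) when the estimated transfer time is longer than the timeout.
exceeds_timeout() {
  local secs
  secs=$(transfer_seconds "$1" "$2")
  [ "$secs" -gt "$3" ]
}
```

For example, transfer_seconds 21474836480 12500000 (a 20 GiB file at 100 Mbit/s) prints 1718, so exceeds_timeout with a 600 second timeout returns true. Treat the result as an estimate only, since real bandwidth fluctuates during a transfer.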

Is DOCKER_HOST=ssh:// slower than tcp:// for large file transfers?

Marginally, due to SSH encryption overhead, but in practice the difference is negligible next to network bandwidth. More importantly, SSH transport runs through your system ssh client, so it honors keepalive settings from ~/.ssh/config and bypasses any application proxy in front of the daemon port, which makes it far more reliable for large transfers. Keep SSH compression off (Compression no in ~/.ssh/config) for already-compressed files so the client does not spend CPU recompressing them.

How do I confirm the file was fully copied and not silently truncated?

Always run a checksum comparison: generate sha256sum locally before the copy, then run docker exec sha256sum -c inside the container after. If the hashes match, the transfer was clean. Never assume a zero exit code from docker cp alone is sufficient — some transport errors are swallowed depending on daemon version and proxy behavior.
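If you would rather compare the hashes on the client instead of copying a checksum file into the container, something like the following sketch works. The function name same_sha256 is hypothetical, and it assumes sha256sum is available both locally and inside the container.

```shell
#!/usr/bin/env bash
# same_sha256 <local_file> <command that prints a sha256sum line for the copy...>
# Returns 0 only when both hashes are non-empty and identical.
same_sha256() {
  local src=$1; shift
  local local_sum remote_sum
  # sha256sum prints "<hash>  <path>"; keep only the hash field.
  local_sum=$(sha256sum -- "$src" | cut -d' ' -f1)
  # Run the caller-supplied command (for example docker exec ... sha256sum ...).
  remote_sum=$("$@" | cut -d' ' -f1)
  # An empty hash (missing file, failed command) counts as a mismatch.
  [ -n "$local_sum" ] && [ "$local_sum" = "$remote_sum" ]
}
```

A typical call would be same_sha256 ./artifact.tar docker exec mycontainer sha256sum /app/artifact.tar, with a non-zero return treated as a failed deploy.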

Fixing 'docker cp unexpected EOF' for Large Files on a Remote Docker Daemon

Threat/Impact Level: HIGH | Downtime Risk: HIGH | Time to Fix: 15–45 mins

TL;DR

What broke: docker cp streams a tar archive to the remote daemon over a single HTTP connection; anything that cuts that connection mid-transfer (an nginx proxy timeout, an SSH timeout, a NAT gateway, or a resource limit on the daemon host) truncates the tar stream and surfaces as unexpected EOF.
How to fix it: Switch transport to SSH (DOCKER_HOST=ssh://user@host), tune proxy/daemon timeouts, or stream via docker exec + tar pipe instead of docker cp.

The Incident (What Does the Error Mean?)

You ran something like:

docker -H tcp://10.0.1.55:2376 cp ./backup-20GB.tar.gz mycontainer:/data/

And got:

Error response from daemon: unexpected EOF
# or
error during connect: Put "http://10.0.1.55:2376/v1.41/containers/mycontainer/archive?path=%2Fdata%2F": EOF

Immediate consequence: The tar archive the daemon was streaming into the container is truncated. The file at the destination is corrupt or zero-byte. If this was a DB restore or artifact deploy, your container is now in an undefined state. The docker cp command exits non-zero but CI pipelines configured with || true will silently continue past this.

The Attack Vector / Blast Radius

This is not a security exploit — it is a silent data corruption vector with serious operational blast radius:

Corrupt restores: A truncated SQL dump or binary artifact loaded into a container will cause application startup failures that look like app bugs, not infra failures. Debugging time: hours.
CI/CD false successes: docker cp returns exit code 1, but many pipeline templates swallow this. Downstream steps (docker exec mycontainer /app/migrate) then run against a broken filesystem state while the pipeline still reports green.
TCP proxy timeout cascade: If the Docker daemon sits behind an nginx TCP proxy or an AWS NLB whose timeout is set to 60s, transfers that run or stall beyond that window fail repeatably, which turns the timeout into a practical ceiling on deployable artifact size.
Plain TCP without TLS: Running tcp:// without TLS leaves the stream unauthenticated and unencrypted. An on-path attacker can modify the stream or terminate it early (for example with an injected FIN packet), and the daemon cannot tell tampering from a network fault. This is both a reliability and an integrity issue.

How to Fix It

Root Cause Checklist (run these first)

# 1. Check daemon reachability and version
docker -H tcp://10.0.1.55:2376 info 2>&1 | grep -E 'Server Version|Operating System|Total Memory'

# 2. Check free memory and disk space on the daemon host (copied data lands under /var/lib/docker by default)
ssh user@10.0.1.55 'free -h && df -h /var/lib/docker'

# 3. Check ulimits on the daemon process (-o -x: oldest process named exactly dockerd)
ssh user@10.0.1.55 'cat /proc/$(pgrep -o -x dockerd)/limits | grep -E "open files|file size"'

Basic Fix 1 — Switch to SSH Transport (Eliminates Proxy Timeout Layer)

- docker -H tcp://10.0.1.55:2376 cp ./backup-20GB.tar.gz mycontainer:/data/
+ DOCKER_HOST=ssh://user@10.0.1.55 docker cp ./backup-20GB.tar.gz mycontainer:/data/

SSH transport tunnels the API through your system ssh client instead of sending a raw TCP stream through whatever proxy sits in front of port 2376. In many setups this alone resolves the EOF.

Enable SSH keepalives to prevent idle disconnection on long transfers:

# ~/.ssh/config on the CLIENT machine
+ Host 10.0.1.55
+   ServerAliveInterval 15
+   ServerAliveCountMax 8
+   Compression no

Compression no is intentional: compressing an already-compressed archive a second time burns CPU for no size benefit and can slow large streams.
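To confirm the client actually picked up these settings, you can inspect the effective configuration with ssh -G, which prints the resolved options in lowercase. The helper below is a small sketch (keepalive_budget is a hypothetical name) that reads that output and prints roughly how many seconds an unresponsive server is tolerated, that is ServerAliveInterval multiplied by ServerAliveCountMax.

```shell
#!/usr/bin/env bash
# keepalive_budget: reads `ssh -G <host>` output on stdin and prints
# serveraliveinterval * serveralivecountmax (0 if keepalives are not set).
keepalive_budget() {
  awk '
    $1 == "serveraliveinterval" { interval = $2 }
    $1 == "serveralivecountmax" { count = $2 }
    END { print interval * count }
  '
}

# Example (needs an ssh client; 10.0.1.55 is the host used in this article):
#   ssh -G 10.0.1.55 | keepalive_budget
```

With the config above (interval 15, count 8) the helper prints 120. A result of 0 means ServerAliveInterval is unset, so the client sends no application-level keepalives at all.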

Basic Fix 2 — Stream via tar pipe instead of docker cp

When docker cp is unreliable, bypass it entirely:

- docker -H tcp://10.0.1.55:2376 cp ./backup-20GB.tar.gz mycontainer:/data/
+ cat ./backup-20GB.tar.gz | \
+   DOCKER_HOST=ssh://user@10.0.1.55 \
+   docker exec -i mycontainer sh -c 'cat > /data/backup-20GB.tar.gz'

For directories, pipe tar through docker exec (tar must exist inside the container). docker cp also moves data as a tar stream, but here you control both ends of the pipe:

tar czf - ./mydir | DOCKER_HOST=ssh://user@10.0.1.55 docker exec -i mycontainer tar xzf - -C /data/
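One caveat with any pipe: by default the shell reports only the exit status of the last command, so a failure in the sending tar can go unnoticed. A hedged sketch of a wrapper that enables pipefail follows. The name pipe_dir is hypothetical, it requires bash, and unlike the command above it packs the contents of the directory rather than the directory itself.

```shell
#!/usr/bin/env bash
# pipe_dir <source_dir> <receiver command...>
# Streams the contents of <source_dir> as a gzip-compressed tar archive into
# the receiver command and fails if either side of the pipe fails.
pipe_dir() {
  local src=$1; shift
  (
    # Subshell keeps pipefail from leaking into the caller's shell options.
    set -o pipefail
    tar czf - -C "$src" . | "$@"
  )
}

# Example (assumes DOCKER_HOST is already exported):
#   pipe_dir ./mydir docker exec -i mycontainer tar xzf - -C /data/
```

Because the receiver is passed as ordinary arguments, the same wrapper can be exercised locally against a plain tar xzf before pointing it at a container.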

Enterprise Best Practice — Tune daemon.json and Proxy Timeouts

If you must use TCP transport behind nginx:

# /etc/nginx/nginx.conf (TCP stream proxy block)
stream {
  server {
    listen 2376;
    proxy_pass 10.0.1.55:2376;
-   # no proxy_timeout configured: the stream module then defaults to 10m between reads/writes
+   proxy_timeout 3600s;
+   proxy_connect_timeout 10s;
  }
}

# /etc/docker/daemon.json on the remote host (note: these two keys limit image pull/push concurrency; they do not set any docker cp timeout)
{
+ "max-concurrent-downloads": 3,
+ "max-concurrent-uploads": 3,
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  }
}

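For AWS NLB: raise the TCP idle timeout to 3600 seconds. It defaults to 350s and is set as a listener attribute (tcp.idle_timeout.seconds, valid range 60 to 6000) rather than on the Target Group. With the default 350s, a 10 Mbps link (1.25 MB/s) moves only about 440 MB before that window elapses, so larger files are at risk.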

Prevention in CI/CD

1. Validate docker cp exit codes explicitly

# .gitlab-ci.yml / GitHub Actions step
- docker cp ./artifact.tar mycontainer:/app/
+ docker cp ./artifact.tar mycontainer:/app/ || { echo "docker cp failed — aborting"; exit 1; }

2. Add a post-copy checksum gate

# Generate checksum locally
sha256sum ./artifact.tar > artifact.tar.sha256

# Copy both
DOCKER_HOST=ssh://deploy@host docker cp ./artifact.tar mycontainer:/app/
DOCKER_HOST=ssh://deploy@host docker cp ./artifact.tar.sha256 mycontainer:/app/

# Verify inside container
DOCKER_HOST=ssh://deploy@host docker exec mycontainer bash -c \
  'cd /app && sha256sum -c artifact.tar.sha256 || exit 1'

3. OPA/Conftest policy — enforce SSH transport in pipeline configs

# policy/docker_transport.rego
package docker.transport

deny[msg] {
  input.env.DOCKER_HOST
  startswith(input.env.DOCKER_HOST, "tcp://")
  msg := "DOCKER_HOST must use ssh:// transport. Raw TCP is prohibited in CI pipelines."
}
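Where OPA or Conftest is not available, the same rule can be approximated with a shell guard at the top of the deploy script. This is a sketch only: reject_tcp_transport is a hypothetical name, and like the policy above it rejects tcp:// values while allowing everything else, including an unset DOCKER_HOST.

```shell
#!/usr/bin/env bash
# reject_tcp_transport <docker_host_value>
# Returns 1 and prints a message when the value starts with tcp://,
# mirroring the deny rule in policy/docker_transport.rego.
reject_tcp_transport() {
  case "${1:-}" in
    tcp://*)
      echo "DOCKER_HOST must use ssh:// transport. Raw TCP is prohibited in CI pipelines." >&2
      return 1
      ;;
    *)
      return 0
      ;;
  esac
}

# Example: abort the job early if the transport is raw TCP.
#   reject_tcp_transport "${DOCKER_HOST:-}" || exit 1
```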

4. Checkov custom check for proxy timeouts in IaC

If you manage daemon.json and the proxy config via Ansible/Terraform, add a Checkov custom check asserting that proxy_timeout is configured whenever a TCP proxy resource exists in the same plan. This prevents silent regressions when infra is rebuilt.

Bottom line: docker cp over raw TCP is fragile at scale. SSH transport + explicit exit code checking + checksum validation is the production-grade pattern. The tar-pipe approach is your escape hatch when nothing else works.