Docker

Every developer has hit this wall: code that runs perfectly on your laptop fails in production. The Python version is different. A system library is missing. An environment variable was never set. Your colleague's machine has a conflicting dependency. This is the "works on my machine" problem, and it has plagued software teams for decades.

Docker solves it by bundling your application and everything it needs — the runtime, libraries, configuration, and filesystem — into a single portable unit called a container. That container runs identically whether it's on your laptop, a CI server, or a cloud VM running in a different country.

This section covers Docker from the ground up: what it is, how it works, how to write good Dockerfiles, and how to avoid the traps that catch most developers.

Containers vs. Virtual Machines

Before Docker, the standard solution to environment consistency was the virtual machine (VM). A VM emulates an entire computer — it boots its own OS kernel, allocates dedicated RAM and CPU, and runs as if it were a separate physical machine.

Containers take a different approach. They share the host machine's OS kernel and only isolate the user-space components (files, processes, network). The result: containers are dramatically lighter.

Containers vs. Virtual Machines

VMs virtualize the entire hardware stack and boot a full guest OS. Containers share the host kernel and package only the application and its user-space dependencies. Because no kernel has to boot, a container typically starts in about a second or less and uses a fraction of the memory a VM requires.

Docker Architecture

Docker uses a client-server architecture. When you type docker run, you are not running the container directly — you are sending an instruction to a background daemon that manages the actual work.

Docker Client (docker): The CLI tool you interact with. Every docker build, docker run, and docker push command is forwarded to the daemon through a REST API, by default over a local Unix socket.

Docker Daemon (dockerd): The background process that does the actual work: building images, starting containers, and managing networks and volumes. By default it runs as a privileged (root) process on the host.

Container Registry: A storage service for Docker images. The default public registry is Docker Hub (hub.docker.com). Managed alternatives include Amazon ECR, Google Artifact Registry (the successor to Google Container Registry, gcr.io), and GitHub Container Registry (ghcr.io). You push images to a registry from your CI pipeline and pull them in production.
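
To see the client-server split in practice, the sketch below asks the daemon for its version in two ways: once through the CLI and once by calling the REST API directly. The function names are hypothetical, and /var/run/docker.sock is the usual Linux default path, which may differ under Docker Desktop or rootless mode.

```shell
# Nothing runs until you call one of these functions on a host with Docker.

# Through the CLI: the client forwards the request to the daemon.
daemon_version_via_cli() {
  docker version --format '{{.Server.Version}}'
}

# Directly over the Unix socket: the same REST API the CLI uses.
# Assumes the default Linux socket path.
daemon_version_via_api() {
  curl --silent --unix-socket /var/run/docker.sock http://localhost/version
}
```

If both calls report the same version, you have confirmed that the CLI is only a thin client in front of the daemon's API.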

Core Concepts: Images, Layers, and Containers

Images

A Docker image is a read-only template that defines what your container will contain. It specifies the OS base, installed packages, application code, and the command that runs at startup. You build an image from a Dockerfile.

Images are layered. Every instruction in a Dockerfile that modifies the filesystem produces a new read-only layer stacked on top of the previous ones. When Docker builds or pulls an image, it only transfers layers it does not already have locally — this is what makes builds fast and registries efficient.

Layer caching is critical for fast builds. When you rebuild an image, Docker reuses cached layers until it reaches an instruction that has changed or, for COPY and ADD, whose source files have changed. It then rebuilds that layer and every layer after it. Instruction order therefore matters enormously: place the things that change least often (base image, system packages) at the top, and the things that change most often (your application code) at the bottom.
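
The ordering rule can be illustrated with a toy model. This is an assumption made for illustration, not Docker's actual cache implementation: each layer key is a checksum of the parent key plus the instruction text, with a version marker such as [v1] standing in for file contents. The helpers layer_keys and count_hits are hypothetical.

```shell
# Toy model: a layer's key depends on its parent's key and its own instruction.
layer_keys() {
  parent=""
  for instruction in "$@"; do
    parent=$(printf '%s|%s' "$parent" "$instruction" | cksum | cut -d ' ' -f 1)
    printf '%s\n' "$parent"
  done
}

# Count leading layers with identical keys; stop at the first mismatch.
count_hits() {
  paste "$1" "$2" | awk '$1 != $2 { exit } { n++ } END { print n + 0 }'
}

tmp=$(mktemp -d)

# Dependencies first, code last: a code change touches only the final layer.
layer_keys "FROM python:3.12-slim" "COPY requirements.txt [v1]" \
  "RUN pip install" "COPY app/ [v1]" > "$tmp/good_old"
layer_keys "FROM python:3.12-slim" "COPY requirements.txt [v1]" \
  "RUN pip install" "COPY app/ [v2]" > "$tmp/good_new"
good_hits=$(count_hits "$tmp/good_old" "$tmp/good_new")

# Code copied first: the same change also invalidates the install layer.
layer_keys "FROM python:3.12-slim" "COPY . [v1]" "RUN pip install" > "$tmp/bad_old"
layer_keys "FROM python:3.12-slim" "COPY . [v2]" "RUN pip install" > "$tmp/bad_new"
bad_hits=$(count_hits "$tmp/bad_old" "$tmp/bad_new")

echo "good order: $good_hits of 4 layers reused"
echo "bad order:  $bad_hits of 3 layers reused"
rm -rf "$tmp"
```

In this model the well-ordered build reuses three of four layers after a code change, while the poorly ordered one reuses only the base image.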

Containers

A container is a running instance of an image. When Docker starts a container, it takes the image's read-only layers and adds a thin writable layer on top. Any files written inside a running container go into this writable layer and are discarded when the container is removed unless you explicitly persist them with a volume. A stopped container still retains its writable layer — the data survives a docker stop and is only permanently lost when you delete the container with docker rm.

You can run many containers from the same image simultaneously — each gets its own isolated writable layer, network namespace, and process space.

Writing a Dockerfile

A Dockerfile is a plain text file with instructions that Docker executes top-to-bottom to build an image. Here is a realistic example for a Python web API:

# 1. Choose a base image
FROM python:3.12-slim

# 2. Set working directory inside the container
WORKDIR /app

# 3. Create a non-root user early, before any files are copied
#    Creating the user here ensures subsequent COPY --chown works correctly
RUN addgroup --system appgroup && adduser --system --ingroup appgroup appuser

# 4. Install dependencies FIRST (before copying application code)
#    This layer is cached as long as requirements.txt doesn't change
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# 5. Copy application code last (changes frequently)
#    --chown transfers ownership to the non-root user at copy time
COPY --chown=appuser:appgroup app/ ./app/

# 6. Document which port the app listens on (informational only)
EXPOSE 8000

# 7. Switch to the non-root user before the process starts
USER appuser

# 8. Define the default startup command
CMD ["python", "-m", "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]

.dockerignore

A .dockerignore file in your project root works like .gitignore: it tells Docker which files and directories to exclude from the build context, which is the set of files sent to the Docker daemon when you run docker build. Without a .dockerignore, Docker can send everything in the current directory to the daemon, including node_modules, .git, build artifacts, and .env files, even if your Dockerfile never copies them.

# .dockerignore
.git
.env
node_modules
dist
__pycache__
*.pyc
*.log

This matters for two reasons: builds are faster because less data is transferred to the daemon, and you avoid accidentally bundling sensitive files like .env or credentials into the image.
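
One way to keep sensitive paths out of the build context is to check the ignore file in a script. The sketch below uses a hypothetical helper, ensure_ignored, that appends any missing patterns; it is a convenience for illustration, not a Docker feature.

```shell
# Hypothetical helper: make sure each given pattern is listed in a
# .dockerignore file, appending any that are missing.
ensure_ignored() {
  ignore_file=$1
  shift
  touch "$ignore_file"
  for entry in "$@"; do
    # -x matches whole lines only, -F treats the entry as a literal string
    grep -qxF -- "$entry" "$ignore_file" || printf '%s\n' "$entry" >> "$ignore_file"
  done
}

# Demo in a throwaway directory so no real project files are touched.
workdir=$(mktemp -d)
printf '%s\n' '.git' 'node_modules' > "$workdir/.dockerignore"
ensure_ignored "$workdir/.dockerignore" '.git' '.env' '*.log'
result=$(cat "$workdir/.dockerignore")
echo "$result"
rm -rf "$workdir"
```

The existing .git entry is left alone, while .env and *.log are appended after node_modules.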

Key Dockerfile Instructions

Instruction	Purpose	Key Detail
FROM	Sets the base image for the build	Use official images from Docker Hub. Prefer slim or Alpine variants to reduce size. Always pin to a specific version tag (e.g., `python:3.12-slim`) rather than `latest` to ensure reproducible builds.
WORKDIR	Sets the working directory inside the container for all subsequent instructions	Always use absolute paths. Prefer WORKDIR over `RUN cd /some/path` — it is clearer and composable across multi-stage builds.
COPY	Copies files from the build context (your local machine) into the image	Prefer COPY over ADD for local files. ADD additionally supports remote URLs and automatic tar extraction — use it only when you specifically need those features. Use `COPY --chown=user:group` to set file ownership at copy time when running as a non-root user.
RUN	Executes a command during the build and commits the result as a new layer	Combine related commands with `&&` to keep them in a single layer. Always clean up package caches in the same RUN instruction that installs them (e.g., `rm -rf /var/lib/apt/lists/*`).
ENV	Sets environment variables that persist into the running container	ENV values are stored in the image metadata and add no filesystem content. Avoid storing secrets in ENV: they are visible in the image's metadata and history.
EXPOSE	Documents that the container listens on a given port	This is documentation only — it does not actually publish the port. You publish ports at runtime with `docker run -p 8080:8000`.
USER	Switches the user for all subsequent RUN, CMD, and ENTRYPOINT instructions	Containers run as root by default. Always switch to a non-root user before CMD to reduce the blast radius of a container escape. Create the user early in the Dockerfile (before COPY) so you can use `COPY --chown` to assign correct file ownership.
CMD	Specifies the default command to run when the container starts	Use the exec form (`CMD ["executable", "arg1"]`), not the shell form (`CMD executable arg1`). The exec form is more predictable and handles signals correctly.
ENTRYPOINT	Sets the fixed executable that always runs when the container starts	Combine with CMD to make the image behave like a binary: `ENTRYPOINT ["myapp"]` with `CMD ["--help"]` runs `myapp --help` by default but lets users override arguments.
HEALTHCHECK	Defines a command Docker runs periodically to verify the container is still functioning correctly	Example: `HEALTHCHECK --interval=30s --timeout=3s CMD curl -f http://localhost:8000/health || exit 1` (this assumes curl is installed in the image). Docker marks the container as `unhealthy` after the command fails several times in a row (three by default). Docker Compose uses this status for `condition: service_healthy` in `depends_on`.

CMD vs. ENTRYPOINT

This is one of the most frequently confused aspects of Dockerfiles.

CMD defines the default command. It is entirely replaceable — docker run myimage bash replaces the CMD with bash.

ENTRYPOINT defines the fixed executable. It always runs unless you override it explicitly with docker run --entrypoint. Arguments passed via docker run replace CMD and are appended to ENTRYPOINT rather than replacing it.

# CMD-only: docker run myimage ls -la  → replaces CMD, runs "ls -la"
CMD ["python", "app.py"]

# ENTRYPOINT + CMD: docker run myimage --port 9000  → runs "gunicorn --port 9000"
ENTRYPOINT ["gunicorn"]
CMD ["--port", "8000"]

The most common pattern for long-running services: use CMD alone with the full startup command. Use ENTRYPOINT + CMD together when you want the container to behave like a single-purpose command-line tool.

Multi-Stage Builds

A compiled language like Go or Java requires a full build toolchain (compiler, build system, headers) to produce a binary — but none of that toolchain is needed at runtime. Without multi-stage builds, your production image carries all that dead weight.

Multi-stage builds let you split the build process into multiple FROM stages. You copy only the final artifact from the build stage into a minimal runtime stage. The build tools never make it into the final image.

# Stage 1: Build
FROM golang:1.24-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /bin/server ./cmd/server

# Stage 2: Runtime (only the binary)
FROM scratch
COPY --from=builder /bin/server /bin/server
EXPOSE 8080
CMD ["/bin/server"]

The golang:1.24-alpine image is roughly 250 MB (the non-Alpine golang:1.24 is closer to 800 MB). The scratch image is 0 bytes — it is completely empty. The resulting production image is just the size of the compiled binary, often 5–15 MB. That is a 95%+ reduction in image size, which translates directly to faster pulls, a smaller attack surface, and lower storage costs.
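
The size figures above can be checked with a few lines of shell arithmetic. The sketch assumes an 8 MB binary, which sits inside the 5–15 MB range quoted here; reduction_pct is a hypothetical helper, and integer division rounds the result down.

```shell
# Percentage saved when moving from a builder image to a final image (sizes in MB).
reduction_pct() {
  builder_mb=$1
  final_mb=$2
  echo $(( (builder_mb - final_mb) * 100 / builder_mb ))
}

alpine_case=$(reduction_pct 250 8)   # golang:1.24-alpine builder vs. 8 MB binary
debian_case=$(reduction_pct 800 8)   # golang:1.24 builder vs. 8 MB binary

echo "from 250 MB: ${alpine_case}% smaller"
echo "from 800 MB: ${debian_case}% smaller"
```

This prints 96% for the Alpine builder and 99% for the Debian-based one, consistent with the 95%+ figure.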

The same pattern applies to Node.js (build with full Node, run with node:alpine), Python (build wheels with full image, install into slim image), and Java (compile with JDK, run with JRE).

Multi-Stage Build: Before and After

Without multi-stage builds, every build tool, compiler, and intermediate artifact ships in the production image. Multi-stage builds solve this by using one stage to compile and a second minimal stage that only receives the finished artifact.

Base Image Selection

Your choice of base image sets the floor for your image size and security posture. The most common options:

Base Image	Typical Size	When to Use
`ubuntu:24.04`	~80 MB	When you need a familiar Debian/Ubuntu environment with apt-get, common tools, and wide library compatibility. Good for development images.
`debian:bookworm-slim`	~75 MB	A smaller Debian image without many optional packages. Good balance of compatibility and size.
`python:3.12-slim`	~130 MB	Official Python image based on Debian Slim. The default choice for Python applications.
`python:3.12-alpine`	~50 MB	Python on Alpine Linux. Smaller but uses musl libc instead of glibc — some Python packages with C extensions require extra build steps.
`alpine:3.21`	~7 MB	A full Linux distribution in roughly 7 MB. Ideal as a runtime base when your application has minimal system dependencies.
`scratch`	0 MB	A completely empty filesystem. Only viable for statically compiled binaries (Go, Rust) that have no runtime dependencies at all.
`gcr.io/distroless/base`	~20 MB	Google's distroless images contain only the runtime (libc, SSL certs) with no shell, package manager, or OS utilities — minimal attack surface.

Rule of thumb: Use the official language image in the -slim or -alpine variant as your starting point. Use multi-stage builds to separate the build environment from the runtime environment. Pin to a specific version tag — never use :latest in production Dockerfiles.

Docker Networking

By default, Docker creates an isolated network for your containers. Understanding networking is essential for connecting services together.

Network Drivers

Driver	How It Works	Common Use
bridge (default)	Docker creates a virtual network on the host. Containers on the same bridge can communicate using their IP addresses. Containers on the default bridge must use IP addresses; containers on a user-defined bridge can use container names as hostnames.	Single-host development and production. User-defined bridge networks are recommended over the default bridge.
host	The container shares the host's network namespace directly — no network isolation. The container's ports are the host's ports with no mapping needed.	Performance-critical workloads where the bridge overhead matters, or tooling that needs direct access to host network interfaces.
overlay	A virtual network that spans multiple Docker hosts. Used in Docker Swarm for cross-node container communication.	Multi-host Docker Swarm deployments.
none	No networking configured. The container is completely network-isolated.	Security-sensitive batch jobs that must not have any network access.

Port Mapping

Containers are isolated — a process listening on port 8000 inside a container is not reachable from outside unless you explicitly publish it:

# Map host port 8080 → container port 8000
docker run -p 8080:8000 myapp

# Now accessible at http://localhost:8080 on the host machine

User-Defined Bridge Networks

The default bridge network is limited: containers can only communicate via IP address. Create a user-defined bridge and containers can resolve each other by name:

docker network create myapp-network

docker run --network myapp-network --name api myapi-image
docker run --network myapp-network --name db postgres:16

# Now the api container can reach the database at hostname "db"
# e.g., DATABASE_URL=postgres://user:pass@db:5432/mydb

This name-based DNS resolution within user-defined networks is one of the primary reasons Docker Compose is so convenient — it creates a user-defined network automatically for all services in a docker-compose.yml.

Data Persistence: Volumes and Bind Mounts

Containers are ephemeral by default. Any files written inside a running container are stored in the container's writable layer and are permanently lost when the container is removed. For anything that needs to outlive a container — database files, uploaded assets, logs — you need external storage.

Docker provides three mechanisms:

Volumes vs. Bind Mounts vs. tmpfs

Volumes are managed by Docker and stored in Docker's own directory on the host. Bind mounts expose a specific host path into the container. tmpfs mounts live only in host RAM — data disappears when the container stops.
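
The three mount types map onto docker run flags roughly as follows. This is a sketch: the image name myapp, the volume name app-data, the container name db-demo, and all paths are placeholders chosen for illustration.

```shell
# Nothing runs until you call one of these functions on a host with Docker.

# Named volume: Docker manages where the data lives on the host.
run_with_volume() {
  docker volume create app-data
  docker run -d --name db-demo \
    -e POSTGRES_PASSWORD=example \
    -v app-data:/var/lib/postgresql/data \
    postgres:16
}

# Bind mount: expose a specific host directory inside the container (read-only here).
run_with_bind_mount() {
  docker run --rm -v "$PWD/config:/etc/myapp:ro" myapp
}

# tmpfs: in-memory scratch space that disappears when the container stops.
run_with_tmpfs() {
  docker run --rm --tmpfs /tmp/scratch myapp
}
```

The named volume outlives the db-demo container, so removing and recreating the container keeps the database files.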

Docker Compose: Multi-Container Applications

Real applications are rarely a single container. A typical web application might involve an API server, a database, a cache, a background worker, and a reverse proxy. Managing each of these with individual docker run commands — with the right ports, networks, volumes, and environment variables — quickly becomes unmanageable.

Docker Compose solves this with a single YAML file (docker-compose.yml or compose.yml) that describes the entire application stack, and a single command to start it.

services:
  api:
    build: .                      # Build image from Dockerfile in current directory
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=postgres://user:pass@db:5432/mydb
      - REDIS_URL=redis://cache:6379
    depends_on:
      - db
      - cache
    volumes:
      - ./app:/app/app            # Bind mount for local development (target must match where the image keeps the code)

  db:
    image: postgres:16-alpine
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=pass
      - POSTGRES_DB=mydb
    volumes:
      - postgres-data:/var/lib/postgresql/data   # Named volume for persistence

  cache:
    image: redis:7-alpine
    volumes:
      - redis-data:/data

volumes:
  postgres-data:
  redis-data:

With this file in place, docker compose up starts all three services, creates a shared network, and connects them by name. The api container reaches the database at the hostname db — no IP addresses, no manual network setup.

One important caveat: depends_on controls startup order, not service readiness. Docker starts the db container before api, but it does not wait for PostgreSQL to finish initializing and be ready to accept connections. If your API connects immediately at startup, it may fail because the database is still booting. The robust solution is to build retry logic into your application startup, or use depends_on with a condition: service_healthy and a healthcheck defined on the db service:

  db:
    image: postgres:16-alpine
    # ... other config ...
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user -d mydb"]
      interval: 5s
      timeout: 3s
      retries: 10

  api:
    depends_on:
      db:
        condition: service_healthy   # Waits until db passes its healthcheck

pg_isready is a small utility bundled with PostgreSQL that exits successfully once the server is ready to accept connections. With retries: 10 and interval: 5s, PostgreSQL gets roughly 50 seconds (10 attempts × 5-second interval), plus the time each check takes to run, to become ready before Docker marks it as unhealthy.
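
The readiness window follows directly from the healthcheck settings. The sketch below assumes each check starts one interval after the previous one finishes, so the upper bound applies only if every check runs all the way to its timeout.

```shell
interval=5   # seconds between checks (interval: 5s)
timeout=3    # longest a single check may run (timeout: 3s)
retries=10   # consecutive failures before the container is "unhealthy"

# Lower bound: checks fail instantly. Upper bound: every check hits its timeout.
min_window=$(( retries * interval ))
max_window=$(( retries * (interval + timeout) ))

echo "unhealthy after roughly ${min_window}-${max_window}s of failed checks"
```

With these values the window is 50 to 80 seconds. Raising retries or interval is the usual way to give a slow database more time.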

Key Docker Compose commands:

docker compose up -d        # Start all services in the background
docker compose up --build   # Rebuild images before starting (picks up code changes)
docker compose down         # Stop and remove containers (volumes preserved)
docker compose down -v      # Stop and remove containers AND volumes (destructive — data is lost)
docker compose logs -f api  # Stream logs from the api service
docker compose exec api sh  # Open a shell inside the running api container
docker compose ps           # Show running services and their status
docker compose build        # Rebuild images without starting containers

Compose is also the standard tool for local development environments that mirror production. The same docker-compose.yml that a new engineer uses to spin up the full stack on their laptop on day one is a form of executable documentation — it encodes exactly what services the application needs and how they connect.

Common Anti-Patterns

Running as Root

By default, processes inside a Docker container run as root (UID 0). If an attacker exploits a vulnerability in your application and breaks out of the container, they have root-level access to the host. Always create and switch to a non-root user, and create that user before you copy application files so you can use --chown to assign proper ownership:

RUN addgroup --system appgroup && adduser --system --ingroup appgroup appuser
COPY --chown=appuser:appgroup app/ ./app/
USER appuser

Secrets Baked into Layers

A common mistake is setting a secret in an early ENV instruction:

# WRONG: the secret is permanently embedded in this layer
ENV API_KEY=supersecretkey123

Even if you unset it in a later layer, the original value remains readable in the image history via docker image history or docker inspect. Never put secrets in ENV, COPY, or RUN instructions that could be inspected post-build. Use runtime environment variables (injected by your orchestrator) or Docker secrets for sensitive values.

Fat Images from Poor Layering

# WRONG: each apt-get is a separate RUN — wasted layers, and apt cache is kept
RUN apt-get update
RUN apt-get install -y curl wget git
RUN rm -rf /var/lib/apt/lists/*   # Too late — previous layers still have the cache

# CORRECT: single RUN, combined, and cache cleaned in the same layer
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl \
    wget \
    git \
  && rm -rf /var/lib/apt/lists/*

Cache-Busting Your Dependencies

# WRONG: copying everything before installing dependencies
# Any code change invalidates the pip install layer
COPY . .
RUN pip install -r requirements.txt

# CORRECT: copy only the dependency manifest first
# pip install is only re-run when requirements.txt changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

Using `latest` Tags

# WRONG: unpredictable — "latest" today may be a different version next week
FROM node:latest

# CORRECT: reproducible builds
FROM node:22-alpine

Anti-Pattern	Risk	Fix
Running as root	Container escape gives attacker root on the host	Add `USER` instruction with a non-root user
Secrets in ENV/RUN layers	Secrets visible in image history and registries	Inject secrets at runtime via orchestrator or Docker secrets
COPY all code before dependency install	Every code change re-runs the full dependency install	Copy dependency manifests first, install, then copy code
Separate RUN for each apt install	Extra layers, apt cache retained, slower builds	Combine into one RUN with `&&` and clean up in the same step
Using `:latest` tag	Non-reproducible builds; unexpected breakage on updates	Pin to specific version tags (e.g., `python:3.12-slim`)
Build tools in production image	Oversized image, larger attack surface	Use multi-stage builds to leave build tools behind
Storing data in container writable layer	Data lost when container is removed	Use named volumes or bind mounts for persistent data

Docker in CI/CD Pipelines

Docker is the lingua franca of modern CI/CD. Every major CI platform — GitHub Actions, GitLab CI, CircleCI, Jenkins — has first-class support for building, testing, and pushing Docker images.

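A typical pipeline: build the image, run the tests inside a container, push the image to a registry, then deploy by pulling that image in the target environment.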

A minimal GitHub Actions workflow:

name: Build and Push

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

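      # Assumes an earlier step has configured AWS credentials,
      # for example with aws-actions/configure-aws-credentials.
      - name: Log in to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2

      - name: Build and push
        env:
          # The login step exposes the registry hostname as an output
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
        run: |
          docker build -t "$ECR_REGISTRY/myapp:${{ github.sha }}" .
          docker push "$ECR_REGISTRY/myapp:${{ github.sha }}"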

Image tagging strategy: Tag images with the Git commit SHA (${{ github.sha }}). This makes every deployment traceable to a specific commit. Avoid reusing mutable tags like latest in production — if a deployment fails, you need to know exactly which code version is running.
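
Outside a workflow file, the same tagging scheme can be sketched in plain shell. The registry hostname and repository name below are placeholders, image_ref is a hypothetical helper, and the fallback SHA is a dummy value used only when the script is not running under GitHub Actions.

```shell
# Build an immutable image reference from a commit SHA.
image_ref() {
  registry=$1
  repo=$2
  sha=$3
  printf '%s/%s:%s\n' "$registry" "$repo" "$sha"
}

# GITHUB_SHA is set by GitHub Actions; fall back to a sample value elsewhere.
commit=${GITHUB_SHA:-0123456789abcdef0123456789abcdef01234567}
ref=$(image_ref "registry.example.com" "myapp" "$commit")
echo "$ref"

# With Docker available, the pipeline would then run:
#   docker build -t "$ref" . && docker push "$ref"
```

Because the tag is derived from the commit, rebuilding an older commit reproduces the same reference instead of silently overwriting a shared tag.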

Layer caching in CI: CI runners start from a clean environment on each run. Without cache configuration, every build downloads base images and reinstalls dependencies from scratch. Use --cache-from or GitHub Actions' cache action to persist layer cache between runs — this can reduce build times from several minutes to under 30 seconds for a warm cache.

What AI Agents Get Wrong with Docker

AI Agents and Dockerfile Anti-Patterns

AI agents can generate functional Dockerfiles quickly, but they consistently reproduce several anti-patterns that cause security vulnerabilities, bloated images, and slow CI pipelines. Knowing these patterns helps you review AI-generated Dockerfiles effectively.
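
A quick first pass over a generated Dockerfile can be automated. The sketch below is a hypothetical helper, lint_dockerfile, that flags three of the anti-patterns from this section using grep heuristics. It will miss cases (for example, an explicit USER root) and is no substitute for a real linter or a human review.

```shell
# Rough Dockerfile review helper: prints warnings and a problem count.
lint_dockerfile() {
  file=$1
  problems=0

  if grep -Eiq '^FROM[[:space:]]+[^[:space:]]+:latest([[:space:]]|$)' "$file"; then
    echo "warn: base image uses the :latest tag"
    problems=$((problems + 1))
  fi

  if ! grep -Eiq '^USER[[:space:]]+' "$file"; then
    echo "warn: no USER instruction, container will run as root"
    problems=$((problems + 1))
  fi

  if grep -Eiq '^ENV[[:space:]]+[A-Za-z0-9_]*(KEY|SECRET|PASSWORD|TOKEN)' "$file"; then
    echo "warn: ENV looks like it stores a secret"
    problems=$((problems + 1))
  fi

  echo "$problems problem(s) found"
  [ "$problems" -eq 0 ]
}

# Example run against a deliberately bad Dockerfile in a temp file.
sample=$(mktemp)
printf '%s\n' 'FROM node:latest' 'ENV API_KEY=supersecretkey123' \
  'COPY . .' 'CMD ["node", "server.js"]' > "$sample"
report=$(lint_dockerfile "$sample") || true
echo "$report"
rm -f "$sample"
```

The sample file trips all three checks, so the report ends with "3 problem(s) found". The function returns a non-zero status when anything is flagged, which makes it usable as a CI gate.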

Summary

Concept	Key Takeaway
Container vs. VM	Containers share the host OS kernel and typically start in about a second or less. VMs boot a full guest OS, which commonly takes tens of seconds. Containers are lighter and more portable; VMs provide stronger isolation.
Image layers	Each Dockerfile instruction that modifies the filesystem produces a cached layer. Instructions that change rarely (base image, dependencies) go first; instructions that change often (application code) go last.
Multi-stage builds	Use one stage to build, a second minimal stage to run. A Go app can shrink from ~800 MB (with compiler) to ~8 MB (binary only). Always use multi-stage builds for compiled languages.
Base image choice	Alpine (~7 MB) for minimal runtime images. `-slim` variants for language images that need system libraries. Pin to specific version tags — never `:latest` in production.
Port mapping	`-p host_port:container_port` publishes a container port to the host. EXPOSE in the Dockerfile is documentation only — it does not publish anything.
Volumes vs. bind mounts	Use named volumes for production data persistence. Use bind mounts during local development to reflect code changes without rebuilding.
User-defined networks	Containers on a user-defined bridge can resolve each other by name. Docker Compose creates one automatically for all services in a `compose.yml`.
Docker Compose	Defines and manages multi-container stacks in a single YAML file. `docker compose up` starts everything; services communicate by service name.
Security baseline	Always run as a non-root user. Never embed secrets in image layers. Pin base image versions. Use multi-stage builds to remove build tools from production images.
CI/CD role	Build image → run tests inside container → push to registry (ECR, Artifact Registry, GHCR) → deploy by pulling the image. Tag with Git SHA for traceability.
