Docker Fundamentals — Images, Containers, Layers, and the Express API Dockerfile

Docker solves the “works on my machine” problem permanently. By packaging an application and all its dependencies — Node.js runtime, npm packages, environment variables, file system structure — into a self-contained image, Docker guarantees that the application behaves identically on every developer’s laptop, in every CI pipeline, and on every production server. For a MEAN Stack application, Docker means the Express API, Angular build, MongoDB, and Redis all run in isolation with defined interfaces between them — regardless of the host operating system or what else is installed.

Docker Core Concepts

| Concept | What It Is | Analogy |
| --- | --- | --- |
| Image | Read-only template containing OS, runtime, code, and dependencies | A class definition — the blueprint |
| Container | A running instance of an image — isolated process with its own filesystem | An instance of a class — the running thing |
| Dockerfile | Recipe for building an image — ordered list of instructions | The source code for an image |
| Registry | Storage and distribution service for images (Docker Hub, AWS ECR, GitHub GHCR) | npm registry for Docker images |
| Volume | Persistent storage mounted into a container — survives container restarts | An external hard drive |
| Network | Virtual network that containers can join — DNS by service name | A private LAN |
| Layer | Each Dockerfile instruction creates an immutable layer — layers are cached and reused | Git commits stacked on each other |

Essential Docker Commands

| Command | Purpose |
| --- | --- |
| docker build -t name:tag . | Build an image from the Dockerfile in the current directory |
| docker run -p 3000:3000 name | Run a container, mapping host port to container port |
| docker ps | List running containers |
| docker ps -a | List all containers (including stopped) |
| docker logs container_name | View container stdout/stderr |
| docker exec -it container_name sh | Open a shell inside a running container |
| docker stop / rm container_name | Stop or remove a container |
| docker images | List locally available images |
| docker rmi image_name | Remove an image |
| docker volume ls / prune | List or clean up volumes |

Note: Docker images are built in layers — each instruction in the Dockerfile (FROM, COPY, RUN) creates a new layer. Layers are cached: if a layer’s instruction has not changed since the last build, Docker reuses the cached layer and skips re-executing it. This makes subsequent builds dramatically faster. The critical implication: put instructions that change frequently (copying source code) after instructions that change rarely (installing dependencies). If COPY package*.json comes before RUN npm ci, then a code change only re-runs from the later COPY . . onward — not npm ci.
Tip: Always use a .dockerignore file to exclude files that should not be in the image. Without it, COPY . . copies node_modules/ (hundreds of megabytes), .git/ (potentially gigabytes), dist/ and test files into the image. A minimal .dockerignore for a Node.js project: node_modules, .git, *.log, dist, coverage, .env. This keeps images small and prevents sensitive local configuration from leaking into images.
Warning: Never bake secrets into Docker images. Setting ENV JWT_SECRET=mysecret in a Dockerfile embeds the secret in every image layer — visible to anyone who pulls the image or runs docker history. Pass secrets at runtime: docker run -e JWT_SECRET=value or via Docker secrets / Kubernetes secrets. In production, use a secrets manager (AWS Secrets Manager, HashiCorp Vault) and inject at runtime — the image itself should contain no secrets.
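On the application side, runtime injection means the app should fail fast at startup if a required secret was not passed in. A minimal sketch — requireEnv is a hypothetical helper for illustration, not an Express or Node API:

```javascript
// config.js — read secrets from the environment at runtime, never from image layers.
// requireEnv is a hypothetical helper: it throws at startup if a variable is absent,
// which surfaces a missing `docker run -e ...` immediately instead of at first use.
function requireEnv(name) {
  const value = process.env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

module.exports = { requireEnv };
```

With `docker run -e JWT_SECRET=... -e MONGO_URI=... myapp`, calls like `requireEnv("JWT_SECRET")` succeed; without them, the container exits with a clear error rather than running half-configured.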

Express API Dockerfile

# ── Express API Dockerfile ────────────────────────────────────────────────
# Multi-stage build: build stage installs all deps, production stage copies only what's needed

# ── Stage 1: dependencies ─────────────────────────────────────────────────
FROM node:20-alpine AS deps
WORKDIR /app

# Copy package files first — layer caching means npm ci only re-runs if these change
COPY package.json package-lock.json ./
RUN npm ci --omit=dev                 # production deps only (--only=production is deprecated in npm 9+)

# ── Stage 2: development (for local dev with hot reload) ──────────────────
FROM node:20-alpine AS development
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci                            # all deps including devDependencies
COPY . .
EXPOSE 3000
CMD ["npm", "run", "dev"]             # nodemon for hot reload

# ── Stage 3: production ───────────────────────────────────────────────────
FROM node:20-alpine AS production
WORKDIR /app

# Create non-root user for security
RUN addgroup -S appgroup && adduser -S appuser -G appgroup

# Copy production dependencies from deps stage
COPY --from=deps /app/node_modules ./node_modules
COPY --chown=appuser:appgroup . .

# Note: a RUN rm -rf of test files here would NOT shrink the image — deleted
# files still exist in the earlier COPY layer. The .dockerignore below is what
# actually keeps __tests__, *.test.js, and .env.example out of the image.

USER appuser                          # run as non-root
EXPOSE 3000

# Healthcheck — Docker marks the container unhealthy; orchestrators restart it
HEALTHCHECK --interval=30s --timeout=10s --start-period=10s --retries=3 \
    CMD wget -qO- http://localhost:3000/api/v1/health || exit 1

CMD ["node", "src/server.js"]

# .dockerignore for Express API
node_modules
.git
.gitignore
*.log
.env
.env.*
!.env.example
dist
coverage
__tests__
*.test.js
*.spec.js
README.md
.eslintrc*
.prettierrc*

How It Works

Step 1 — Multi-Stage Builds Keep Production Images Small

A multi-stage Dockerfile uses multiple FROM instructions. Each stage can copy files from previous stages with COPY --from=stagename. The final image only contains what the last stage has — earlier stages and their layers are discarded. For Node.js, the deps stage installs only production dependencies; the production stage copies them in. Build tools, test frameworks, and source maps never make it into the production image.
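Because stages are named, docker build can stop at any of them with the --target flag, so one Dockerfile serves both local development and production. A sketch, using the stage names from the Dockerfile above:

```shell
# Build the development stage (devDependencies included, hot reload via nodemon)
docker build --target development -t myapp:dev .

# Build the production stage (also the default, since it is the last stage in the file)
docker build --target production -t myapp:prod .
```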

Step 2 — Layer Order Determines Cache Effectiveness

Docker re-executes all layers from the first changed layer onward. Placing COPY package*.json ./ before RUN npm ci means that npm ci only re-runs when package.json or package-lock.json changes. If only application source files changed, Docker reuses the cached npm ci layer — turning a 2-minute install into a sub-second cache hit. The pattern is: copy dependency manifests → install → copy source code.

Step 3 — Alpine Images Reduce Attack Surface and Size

node:20-alpine is based on Alpine Linux — a minimal Linux distribution of about 5MB, versus 80MB for Debian-based images. Fewer packages mean fewer potential security vulnerabilities, faster pulls, and lower storage costs. The trade-off is that some npm packages require native modules that need build tools not present in Alpine. If you encounter gyp errors, install build tools with RUN apk add --no-cache python3 make g++ in the deps stage.
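A sketch of that deps-stage variant, assuming the same stage layout as the Dockerfile above — only needed when npm ci fails with node-gyp errors:

```dockerfile
# deps stage with native-addon build tools (e.g. for bcrypt)
FROM node:20-alpine AS deps
WORKDIR /app
RUN apk add --no-cache python3 make g++   # toolchain node-gyp needs to compile native modules
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
```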

Step 4 — Non-Root User Limits Container Breach Impact

By default, processes inside Docker containers run as root. If the application is compromised, an attacker has root access within the container — and potentially a path to the host. Creating a dedicated user (adduser appuser) and switching with USER appuser means the application process runs with minimal privileges. It cannot write to system directories, install packages, or access other users’ files within the container.

Step 5 — HEALTHCHECK Enables Automatic Recovery

The HEALTHCHECK instruction configures Docker to periodically test whether the container is functioning correctly. Docker itself only marks the container unhealthy; a failing health check causes orchestrators (Docker Swarm, Kubernetes) to restart the container or stop routing traffic to it. The health check uses wget to hit the /api/v1/health endpoint — a lightweight Express route that returns 200 if the server and database connection are healthy.
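The endpoint itself can be a minimal Express-style handler. This is a sketch: isDbConnected stands in for a real readiness check (for Mongoose, something like `mongoose.connection.readyState === 1`):

```javascript
// health.js — minimal health-endpoint handler with the Express (req, res) signature.
// isDbConnected is a placeholder predicate; wire in the real database check.
function healthHandler(isDbConnected) {
  return (req, res) => {
    if (isDbConnected()) {
      res.status(200).json({ status: "ok", uptime: process.uptime() });
    } else {
      // 503 makes wget exit non-zero, so the HEALTHCHECK counts it as a failure
      res.status(503).json({ status: "degraded" });
    }
  };
}

module.exports = { healthHandler };
```

Mounted as, for example, `app.get("/api/v1/health", healthHandler(() => mongoose.connection.readyState === 1))`, the route stays cheap enough to poll every 30 seconds.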

Common Mistakes

Mistake 1 — Copying node_modules into the image

❌ Wrong — node_modules from the host are copied, ignoring the Dockerfile’s npm ci:

COPY . .        # copies node_modules from host if no .dockerignore!
RUN npm ci      # wasted — already have (wrong) node_modules

✅ Correct — .dockerignore node_modules, let Docker install clean:

# .dockerignore — comments must be on their own line, not after a pattern:
# host node_modules excluded; Docker installs clean ones via npm ci
node_modules

Mistake 2 — Copying source code before installing dependencies

❌ Wrong — every source change triggers npm ci:

COPY . .           # copies everything including source — cache invalidated!
RUN npm ci         # re-runs on EVERY code change, even typo fixes

✅ Correct — copy manifests first for cache efficiency:

COPY package*.json ./
RUN npm ci         # cached until package.json changes
COPY . .           # source change only affects this layer onward

Mistake 3 — Baking .env secrets into the image

❌ Wrong — .env is baked into image layers:

COPY .env .        # secret in the image — visible via docker history!

✅ Correct — pass secrets at runtime only:

docker run -e JWT_SECRET=$JWT_SECRET -e MONGO_URI=$MONGO_URI myapp

Quick Reference

| Task | Command / Instruction |
| --- | --- |
| Build image | docker build -t myapp:latest . |
| Run container | docker run -p 3000:3000 --env-file .env myapp:latest |
| Shell into container | docker exec -it container_name sh |
| View logs | docker logs -f container_name |
| Multi-stage base | FROM node:20-alpine AS stagename |
| Copy from stage | COPY --from=stagename /app/node_modules ./node_modules |
| Non-root user | RUN adduser -S appuser, then USER appuser (two separate instructions) |
| Healthcheck | HEALTHCHECK CMD wget -qO- http://localhost:3000/health \|\| exit 1 |

🧠 Test Yourself

A Dockerfile has COPY . . before RUN npm ci. A developer fixes a one-character typo in app.js and rebuilds. What happens and why is it slow?