Schema Evolution — ALTER TABLE and Migration Strategies

A database schema is never finished — business requirements evolve, performance needs change, and data models need refinement. ALTER TABLE lets you modify an existing table’s structure: adding columns, changing types, renaming things, adding or dropping constraints. The challenge in production is doing this safely — some operations acquire table-level locks that block all reads and writes for the duration. Understanding which operations are safe to run on a live database and which require a maintenance window is essential for operating FastAPI applications without downtime.

Safe ALTER TABLE Operations

-- ── Adding a column ───────────────────────────────────────────────────────────
-- Safe (no rewrite required): adding a nullable column with no default
ALTER TABLE posts ADD COLUMN subtitle TEXT;

-- Safe in PostgreSQL 11+: adding a NOT NULL column WITH a constant default
-- Pre-11: rewrote the entire table (locks all reads/writes!)
-- Post-11: stores the default in catalogue, no row rewrite needed
ALTER TABLE posts ADD COLUMN is_featured BOOLEAN NOT NULL DEFAULT FALSE;

-- ── Renaming ──────────────────────────────────────────────────────────────────
-- Fast — only updates catalogue, no row data touched
ALTER TABLE posts RENAME COLUMN subtitle TO sub_heading;
ALTER TABLE posts RENAME TO blog_posts;   -- rename the table itself

-- ── Dropping a column ─────────────────────────────────────────────────────────
-- Fast — marks as dropped in catalogue (data stays until VACUUM FULL)
ALTER TABLE posts DROP COLUMN sub_heading;
ALTER TABLE posts DROP COLUMN sub_heading CASCADE;   -- also drops dependent views

-- ── Adding a constraint ───────────────────────────────────────────────────────
-- NOT VALID: adds constraint without scanning existing rows (fast)
-- Then VALIDATE: scans existing rows without full lock
ALTER TABLE posts
    ADD CONSTRAINT posts_author_fk
    FOREIGN KEY (author_id) REFERENCES users(id)
    NOT VALID;   -- fast — no scan of existing rows

-- Later, validate existing rows (holds a lighter ShareUpdateExclusiveLock)
ALTER TABLE posts VALIDATE CONSTRAINT posts_author_fk;
Note: The most dangerous ALTER TABLE operations in production are those that require rewriting table data: changing a column type (e.g., VARCHAR to TEXT is safe; INTEGER to TEXT requires a rewrite), adding a NOT NULL column with a computed default (before PostgreSQL 11), or adding a CHECK constraint that validates all existing rows. These operations hold an AccessExclusiveLock — no reads or writes are allowed until the operation completes. On a large table (millions of rows), this can take minutes or hours.
Tip: The safe way to add a NOT NULL constraint to an existing column (that currently allows NULL) is a three-step process: (1) Add the column as nullable (ADD COLUMN col type), (2) backfill existing rows (UPDATE table SET col = default WHERE col IS NULL), (3) add the NOT NULL constraint with ALTER TABLE VALIDATE CONSTRAINT (or SET NOT NULL which requires a scan but is faster than a full rewrite). This avoids the long lock because each step is fast.
Warning: Never run long-running migrations directly on a production database without a maintenance window or without verifying the operation is lock-safe. Tools like pg_repack and pglogical can rewrite tables without blocking reads. The database migration tool Alembic (covered in Part 3) runs your migrations but does not automatically make them lock-safe — you must design the SQL in your migration scripts to be safe.

Changing Column Types

-- ── Safe type changes (no rewrite) ───────────────────────────────────────────
ALTER TABLE posts ALTER COLUMN title TYPE TEXT;         -- VARCHAR → TEXT: safe
ALTER TABLE posts ALTER COLUMN id TYPE BIGINT;           -- INT → BIGINT: safe

-- ── Unsafe type changes (require rewrite) ────────────────────────────────────
-- These require full table rewrite (lock all reads/writes):
-- TEXT → INTEGER (needs USING clause + all values must be valid integers)
-- INTEGER → BOOLEAN (needs USING clause)
-- TIMESTAMPTZ → TEXT

-- Safe pattern: add new column, backfill, switch application, drop old
-- Step 1: add new column
ALTER TABLE posts ADD COLUMN view_count_new BIGINT;
-- Step 2: backfill
UPDATE posts SET view_count_new = view_count;
-- Step 3: add constraint
ALTER TABLE posts ALTER COLUMN view_count_new SET NOT NULL;
-- Step 4: application now uses new column
-- Step 5: drop old column
ALTER TABLE posts DROP COLUMN view_count;
-- Step 6: rename new column to original name
ALTER TABLE posts RENAME COLUMN view_count_new TO view_count;

Backward-Compatible Schema Changes

-- ── Zero-downtime deployment strategy ────────────────────────────────────────
-- When deploying new application code alongside a database migration:
-- The database change must be backward-compatible with the OLD code
-- during the deployment window when both old and new code may be running

-- Safe: adding a nullable column (old code ignores it, new code uses it)
ALTER TABLE posts ADD COLUMN published_at TIMESTAMPTZ;

-- Safe: adding a table (old code doesn't use it, new code does)
CREATE TABLE post_reactions (
    post_id    BIGINT REFERENCES posts(id),
    user_id    BIGINT REFERENCES users(id),
    reaction   TEXT,
    PRIMARY KEY (post_id, user_id)
);

-- Safe: adding an index (old and new code still work during index build)
CREATE INDEX CONCURRENTLY idx_posts_published_at ON posts(published_at);
-- CONCURRENTLY: builds index without holding table lock (slower but non-blocking)

-- UNSAFE: renaming a column (old code breaks immediately)
-- Strategy: add a new column + trigger to keep both in sync during migration,
-- update application, then drop old column

Common Mistakes

Mistake 1 — ALTER TABLE without CONCURRENTLY on production index

❌ Wrong — blocks all reads/writes during index build:

CREATE INDEX idx_posts_status ON posts(status);   -- holds lock entire build time!

✅ Correct — non-blocking index build:

CREATE INDEX CONCURRENTLY idx_posts_status ON posts(status);   -- ✓ non-blocking

Mistake 2 — Renaming a column without a compatibility period

❌ Wrong — instant breakage of running application:

ALTER TABLE posts RENAME COLUMN body TO content;   -- old code referencing 'body' breaks!

✅ Correct — gradual migration (add alias column, update app, rename later).

Mistake 3 — Adding NOT NULL without backfilling existing rows

❌ Wrong — fails if any existing row has NULL:

ALTER TABLE posts ALTER COLUMN excerpt SET NOT NULL;
-- ERROR: column "excerpt" contains null values

✅ Correct — backfill first:

UPDATE posts SET excerpt = '' WHERE excerpt IS NULL;
ALTER TABLE posts ALTER COLUMN excerpt SET NOT NULL;   -- ✓ no NULLs to violate

Quick Reference

Operation Lock Level Safe Online?
Add nullable column AccessExclusive (instant) Yes
Add NOT NULL + default (PG11+) AccessExclusive (instant) Yes
Drop column AccessExclusive (instant) Yes
Rename column/table AccessExclusive (instant) Yes (breaks old app code)
Change type (compatible) AccessExclusive Yes for VARCHAR→TEXT
Change type (incompatible) AccessExclusive (full rewrite) No — needs maintenance window
CREATE INDEX ShareLock (blocks writes) Use CONCURRENTLY
CREATE INDEX CONCURRENTLY ShareUpdateExclusive Yes
Add FK NOT VALID + VALIDATE Two light locks Yes

🧠 Test Yourself

Your production posts table has 5 million rows. You need to add an index on status for a new FastAPI query. How do you do this without taking the site offline?