Not all migrations are equal in terms of safety. Some operations — dropping a column, changing a NOT NULL constraint — require table rewrites or exclusive locks that block all reads and writes on a busy production table. Others — adding a nullable column, creating an index with CONCURRENTLY — are safe to run on a live database with no downtime. Understanding which operations are safe online and how to use PostgreSQL’s transactional DDL and non-locking alternatives is the difference between a migration that takes milliseconds and one that causes a 60-second outage.
Safe Online Operations
from alembic import op
import sqlalchemy as sa
# ── Safe: Add nullable column (instant in PostgreSQL 11+) ────────────────────
def upgrade() -> None:
    """Add an optional subtitle to posts.

    A nullable column with no default is a catalogue-only change in
    PostgreSQL — no table rewrite — so it is safe on a live table.
    """
    subtitle = sa.Column("subtitle", sa.Text(), nullable=True)
    op.add_column("posts", subtitle)
def downgrade() -> None:
    """Remove the subtitle column added by upgrade()."""
    op.drop_column("posts", "subtitle")
# ── Safe: Add NOT NULL column WITH server_default (PG11+) ────────────────────
# PostgreSQL 11+: stored as a catalogue default — no table rewrite!
def upgrade() -> None:
    """Add posts.is_featured as NOT NULL with a server-side default.

    On PostgreSQL 11+ the default is recorded in the catalogue, so the
    column appears instantly — existing rows are not rewritten.
    """
    is_featured = sa.Column(
        "is_featured",
        sa.Boolean(),
        nullable=False,
        server_default=sa.text("false"),
    )
    op.add_column("posts", is_featured)
def downgrade() -> None:
    """Remove the is_featured column added by upgrade()."""
    op.drop_column("posts", "is_featured")
# ── Safe: Create index CONCURRENTLY (non-blocking) ────────────────────────────
# CONCURRENTLY cannot run inside a transaction block, and Alembic wraps every
# migration in one by default — escape it with op.get_context().autocommit_block().
def upgrade() -> None:
    """Create a partial index on posts.status without blocking writes.

    CREATE INDEX CONCURRENTLY must run outside a transaction block;
    issuing it via a bare ``op.execute()`` would still be inside
    Alembic's per-migration transaction and PostgreSQL would reject it.
    ``autocommit_block()`` commits the surrounding transaction first and
    runs the enclosed statements in autocommit mode.
    """
    with op.get_context().autocommit_block():
        op.execute(
            "CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_posts_status "
            "ON posts (status) WHERE deleted_at IS NULL"
        )
def downgrade() -> None:
    """Drop the concurrent index — DROP INDEX CONCURRENTLY is also
    non-transactional, so it needs autocommit_block() too."""
    with op.get_context().autocommit_block():
        op.execute("DROP INDEX CONCURRENTLY IF EXISTS idx_posts_status")
CREATE INDEX CONCURRENTLY is an exception — it cannot run inside a transaction block. Alembic wraps each migration in a transaction by default, so to issue a non-transactional statement you must either wrap it in op.get_context().autocommit_block(), which commits the migration's transaction and runs the enclosed statements in autocommit mode, or set transactional_ddl=False for the migration script.

The three-step pattern for NOT NULL finishes with op.alter_column(nullable=False), which uses PostgreSQL's ALTER TABLE ... SET NOT NULL. Because every row is already backfilled by that point, each step is fast and the table is never locked for the duration of the backfill. For extra safety on large tables, the three steps can live in separate migration files.

By contrast, the naive single-step approach takes an AccessExclusiveLock for the duration. On a large production table with millions of rows, this can take minutes and block all reads and writes. The safe alternative is a multi-step migration: add a new column, backfill via a background job, switch the application to use the new column, then drop the old column.

Adding a NOT NULL Constraint Safely
from alembic import op
import sqlalchemy as sa
# ── Step 1: Add column as nullable ───────────────────────────────────────────
def upgrade() -> None:
    """Step 1 of 3: add word_count as a nullable column (instant)."""
    word_count = sa.Column("word_count", sa.Integer(), nullable=True)
    op.add_column("posts", word_count)
# ── Step 2 (separate migration): Backfill existing rows ──────────────────────
def upgrade() -> None:
    """Step 2 of 3: backfill word_count for existing rows, in batches.

    The original single full-table UPDATE contradicted its own comment:
    it held row locks (and generated bloat) for the entire scan. Batching
    keeps each UPDATE short-lived so concurrent writers are never blocked
    for long and autovacuum can keep up. The regex is written as ``\\s+``
    because a bare ``\\s`` in a non-raw Python string is an invalid escape
    sequence (SyntaxWarning on Python 3.12+).

    NOTE(review): assumes ``posts`` has a unique ``id`` primary key —
    confirm against the table definition.
    """
    conn = op.get_bind()
    batch_size = 10_000
    while True:
        result = conn.execute(
            sa.text(
                """
                UPDATE posts
                SET word_count = array_length(
                        regexp_split_to_array(trim(body), '\\s+'), 1)
                WHERE id IN (
                    SELECT id FROM posts
                    WHERE word_count IS NULL
                    LIMIT :batch
                )
                """
            ),
            {"batch": batch_size},
        )
        # Fewer rows touched than the batch size means we have drained
        # every remaining NULL — stop.
        if result.rowcount < batch_size:
            break
# ── Step 3 (separate migration): Apply NOT NULL constraint ────────────────────
def upgrade() -> None:
    """Step 3 of 3: enforce NOT NULL now that every row is backfilled.

    SET NOT NULL validates in place — no table rewrite is needed.
    """
    op.alter_column("posts", "word_count", nullable=False)
def downgrade() -> None:
    """Relax word_count back to nullable."""
    op.alter_column("posts", "word_count", nullable=True)
Data Migrations — Transforming Data
from alembic import op
import sqlalchemy as sa
from sqlalchemy import text
# ── Migrate data as part of a schema change ───────────────────────────────────
# Scenario: split full_name column into first_name + last_name
def upgrade() -> None:
    """Split users.full_name into first_name / last_name, expand-style.

    The new columns are created nullable, populated from full_name in a
    single set-based UPDATE, and first_name is then tightened to NOT
    NULL. The legacy full_name column is deliberately left in place; a
    later migration drops it once the application stops reading it.
    """
    # Add both new columns (nullable for now so the ADD is instant).
    for name in ("first_name", "last_name"):
        op.add_column("users", sa.Column(name, sa.String(100), nullable=True))

    # Copy data across: first token → first_name, second → last_name
    # (an empty second token collapses to NULL via NULLIF).
    bind = op.get_bind()
    bind.execute(text("""
        UPDATE users
        SET
            first_name = split_part(full_name, ' ', 1),
            last_name = NULLIF(split_part(full_name, ' ', 2), '')
        WHERE full_name IS NOT NULL
    """))

    # NOTE(review): safe only if no row has full_name IS NULL — the
    # backfill skips those rows and this ALTER would then fail. Confirm.
    op.alter_column("users", "first_name", nullable=False)
    # Drop the old column in a later migration after app code is updated:
    # op.drop_column("users", "full_name")  ← do this in a separate deploy
def downgrade() -> None:
    """Reverse the split: rebuild full_name, then drop the new columns."""
    bind = op.get_bind()
    # CONCAT_WS skips NULL arguments, so single-name users round-trip
    # without a trailing space.
    bind.execute(text(
        "UPDATE users SET full_name = CONCAT_WS(' ', first_name, last_name)"
    ))
    op.drop_column("users", "last_name")
    op.drop_column("users", "first_name")
# ── Seeding initial data from a migration ─────────────────────────────────────
def upgrade() -> None:
    """Create the roles lookup table and seed its fixed rows.

    Seeding reference data inside the migration keeps the schema and
    its required rows in lock-step across every environment.
    """
    op.create_table(
        "roles",
        sa.Column("id", sa.Integer(), primary_key=True),
        sa.Column("name", sa.String(50), nullable=False, unique=True),
    )
    op.execute(
        "INSERT INTO roles (name) VALUES ('user'), ('editor'), ('admin')"
    )
def downgrade() -> None:
    """Dropping the table removes the seed rows along with it."""
    op.drop_table("roles")
Lock-Safety Summary
| Operation | Lock Type | Safe Online? |
|---|---|---|
| Add nullable column | AccessExclusive (instant) | ✅ Yes |
| Add NOT NULL + server_default (PG11+) | AccessExclusive (instant) | ✅ Yes |
| Drop column | AccessExclusive (instant) | ✅ Usually |
| CREATE INDEX | ShareLock (blocks writes) | ❌ Use CONCURRENTLY |
| CREATE INDEX CONCURRENTLY | ShareUpdateExclusive | ✅ Yes (slower) |
| Add NOT NULL (no default) | AccessExclusive + table scan | ❌ Use 3-step |
| Change column type | AccessExclusive + table rewrite | ❌ Use multi-step |
| Add FK NOT VALID + VALIDATE | Two light locks | ✅ Yes |
Common Mistakes
Mistake 1 — Adding NOT NULL without backfilling first
❌ Wrong — fails if any existing row has NULL:
op.add_column("posts", sa.Column("word_count", sa.Integer(), nullable=False))
# ERROR: column "word_count" contains null values (for existing rows)
✅ Correct — add nullable, backfill, then add NOT NULL constraint.
Mistake 2 — op.create_index() without CONCURRENTLY on large tables
❌ Wrong — blocks all writes during index build:
op.create_index("idx_posts_status", "posts", ["status"])
# Takes minutes on large table — all writes blocked!
✅ Correct — CONCURRENTLY must also run outside Alembic's migration transaction:
with op.get_context().autocommit_block():
    op.execute("CREATE INDEX CONCURRENTLY idx_posts_status ON posts(status)")  # ✓
Mistake 3 — Data migration in Python loop (too slow)
❌ Wrong — UPDATE one row at a time:
for post in conn.execute("SELECT id, body FROM posts"):
wc = len(post.body.split())
conn.execute("UPDATE posts SET word_count = ? WHERE id = ?", wc, post.id)
# 1M rows = 1M round-trips to DB!
✅ Correct — single SQL UPDATE:
conn.execute(text("UPDATE posts SET word_count = array_length(regexp_split_to_array(trim(body), '\\s+'), 1)"))