EXPLAIN shows the query plan PostgreSQL will use without executing the query. EXPLAIN ANALYZE executes the query and shows both the estimated and actual costs — the most powerful tool for diagnosing slow queries. Learning to read query plans reveals why a query is slow, whether an index is being used, where most of the time is spent, and whether the planner’s estimates are accurate. For a FastAPI developer, EXPLAIN ANALYZE is the primary diagnostic tool for database performance problems.
EXPLAIN and EXPLAIN ANALYZE
-- EXPLAIN: show the plan (no execution)
EXPLAIN SELECT * FROM posts WHERE status = 'published' ORDER BY created_at DESC LIMIT 10;

-- Sample output:
-- Limit  (cost=0.43..1.23 rows=10 width=256)
--   ->  Index Scan using idx_posts_status_created on posts
--         (cost=0.43..124.56 rows=1000 width=256)
--       Index Cond: ((status)::text = 'published'::text)
-- EXPLAIN ANALYZE: execute AND show actual timings
EXPLAIN ANALYZE
SELECT * FROM posts WHERE status = 'published' ORDER BY created_at DESC LIMIT 10;
-- Sample output:
-- Limit  (cost=0.43..1.23 rows=10 width=256) (actual time=0.052..0.067 rows=10 loops=1)
--   ->  Index Scan using idx_posts_status_created on posts
--         (cost=0.43..124.56 rows=1000 width=256) (actual time=0.048..0.062 rows=10 loops=1)
--       Index Cond: ((status)::text = 'published'::text)
-- Planning Time: 0.312 ms
-- Execution Time: 0.089 ms
-- EXPLAIN (FORMAT JSON) — machine-readable output for plan-visualization tools
-- (e.g. explain.dalibo.com)
EXPLAIN (ANALYZE, FORMAT JSON) SELECT ...;
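The JSON form is also easy to consume from application code. Below is a minimal Python sketch that pulls the headline numbers out of an EXPLAIN (ANALYZE, FORMAT JSON) document; the sample plan is hand-written to mirror the text output above, not captured from a live server.

```python
import json

# Hand-written sample of what EXPLAIN (ANALYZE, FORMAT JSON) returns:
# a JSON array with one object holding the root "Plan" plus timings.
sample = json.loads("""
[{
  "Plan": {
    "Node Type": "Limit",
    "Plan Rows": 10,
    "Actual Rows": 10,
    "Plans": [{
      "Node Type": "Index Scan",
      "Index Name": "idx_posts_status_created",
      "Plan Rows": 1000,
      "Actual Rows": 10
    }]
  },
  "Planning Time": 0.312,
  "Execution Time": 0.089
}]
""")

def summarize(explain_json):
    """Pull the headline numbers out of EXPLAIN (ANALYZE, FORMAT JSON) output."""
    root = explain_json[0]
    plan = root["Plan"]
    return {
        "node": plan["Node Type"],             # root plan node type
        "estimated_rows": plan["Plan Rows"],   # planner's estimate
        "actual_rows": plan["Actual Rows"],    # what really came back
        "execution_ms": root["Execution Time"]
    }

print(summarize(sample))
```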
The two numbers in (cost=start..total) are the startup cost (the cost incurred before the first row is returned) and the total cost (the cost to return all rows). Costs are in arbitrary planner units, not milliseconds, so always use EXPLAIN ANALYZE (which shows actual milliseconds) when diagnosing performance issues rather than relying on cost estimates alone.

The most important part of EXPLAIN ANALYZE output is the comparison between estimated rows and actual rows. If the planner estimated 10 rows but actually got 100,000 rows, it made its decisions based on wrong assumptions — the wrong join order, wrong index choice, wrong loop strategy. This “row estimate skew” is usually caused by stale statistics. Fix it with VACUUM ANALYZE tablename to update the statistics the planner uses.

EXPLAIN ANALYZE actually executes the query — including any data modifications. Never run EXPLAIN ANALYZE on an INSERT, UPDATE, or DELETE in production without wrapping it in a transaction you plan to roll back: BEGIN; EXPLAIN ANALYZE DELETE FROM posts WHERE ...; ROLLBACK;. For SELECT queries, this warning does not apply since they have no side effects.

Reading Plan Nodes
-- ── Common plan node types ────────────────────────────────────────────────────

-- Seq Scan (Sequential Scan)
--   Reads every row in the table — appropriate for small tables or large % of rows
--   Seq Scan on posts  (cost=0.00..45.20 rows=2000 width=256)
--     Filter: (status = 'published')

-- Index Scan
--   Uses an index to find matching rows, then fetches them from the heap (table)
--   Index Scan using idx_posts_status on posts  (cost=0.43..12.45 rows=50 width=256)
--     Index Cond: (status = 'published'::text)

-- Index Only Scan
--   All needed columns are in the index — never accesses the heap
--   Index Only Scan using idx_posts_list on posts  (cost=0.43..8.21 rows=50 width=64)
--     Heap Fetches: 0   -- zero heap reads! Fastest possible read

-- Bitmap Index Scan + Bitmap Heap Scan
--   Used when multiple index conditions apply — combines results in memory
--   Bitmap Heap Scan on posts
--     Recheck Cond: ((status = 'published') AND (author_id = 5))
--     ->  BitmapAnd
--           ->  Bitmap Index Scan on idx_posts_status
--           ->  Bitmap Index Scan on idx_posts_author_id

-- Hash Join / Nested Loop / Merge Join
--   Three strategies for JOIN execution
--   Nested Loop: for small tables or a highly selective inner side
--   Hash Join:   for larger tables — builds a hash table of the smaller side
--   Merge Join:  when both sides are already sorted on the join key
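Plan nodes nest into a tree: in the JSON format each node's children appear under a "Plans" key. A small sketch of a depth-first walk, useful for spotting, say, a Seq Scan buried under several joins; the bitmap_plan sample is hand-written to mirror the BitmapAnd example above.

```python
def node_types(plan):
    """Depth-first list of node types in an EXPLAIN (FORMAT JSON) plan tree."""
    types = [plan["Node Type"]]
    for child in plan.get("Plans", []):  # children nest under "Plans"
        types.extend(node_types(child))
    return types

# Hand-written tree mirroring the BitmapAnd example in the text.
bitmap_plan = {
    "Node Type": "Bitmap Heap Scan",
    "Plans": [{
        "Node Type": "BitmapAnd",
        "Plans": [
            {"Node Type": "Bitmap Index Scan", "Index Name": "idx_posts_status"},
            {"Node Type": "Bitmap Index Scan", "Index Name": "idx_posts_author_id"},
        ],
    }],
}

print(node_types(bitmap_plan))
# ['Bitmap Heap Scan', 'BitmapAnd', 'Bitmap Index Scan', 'Bitmap Index Scan']
```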
Identifying Problems in Plans
-- ── Problem 1: Sequential scan on large table ─────────────────────────────────
EXPLAIN ANALYZE SELECT * FROM posts WHERE author_id = 5;
-- Seq Scan on posts  (cost=0.00..45000.00 rows=50 width=256)
--   ^^^^^ high total cost: reads the whole 1M-row table to find an estimated 50 matches
-- Fix: CREATE INDEX idx_posts_author_id ON posts(author_id);
-- ── Problem 2: Wrong row estimate (stale statistics) ─────────────────────────
EXPLAIN ANALYZE SELECT * FROM posts WHERE status = 'published';
-- Seq Scan on posts  (cost=0.00..45.00 rows=10 ...)        -- estimated 10 rows
--   (actual time=0.05..89.12 rows=50000 ...)               -- actual 50,000!
-- The planner's statistics are stale: it chose a plan suited to a handful
-- of rows, then had to process 50,000
-- Fix: VACUUM ANALYZE posts;
-- ── Problem 3: Sort without index ────────────────────────────────────────────
EXPLAIN ANALYZE SELECT * FROM posts ORDER BY view_count DESC LIMIT 10;
-- Sort  (cost=120000.00..122500.00 rows=1000000)
--   Sort Key: view_count DESC
--   Sort Method: external merge  Disk: 8192kB   -- spilling to disk!
-- Fix: CREATE INDEX idx_posts_view_count ON posts(view_count DESC);
-- ── Problem 4: Poor join order due to wrong estimates ─────────────────────────
-- If planner chooses Hash Join when Nested Loop would be faster:
-- Check if estimated rows match actual rows on each side of the join
-- If estimates are wrong: VACUUM ANALYZE both joined tables
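Problems 2 and 4 both come down to estimate skew, which is mechanical to check from the JSON output: compare "Plan Rows" against "Actual Rows" times "Actual Loops" (actual rows are reported per loop). A hedged sketch; the threshold and sample node below are illustrative, not a standard.

```python
def estimate_skew(plan):
    """Ratio of actual to estimated rows for one plan node.

    EXPLAIN ANALYZE reports "Actual Rows" per loop, so multiply by
    "Actual Loops" before comparing with the planner's estimate.
    A ratio far from 1 (say, above 10 or below 0.1) usually means
    stale statistics: run VACUUM ANALYZE on the table.
    """
    actual = plan["Actual Rows"] * plan.get("Actual Loops", 1)
    estimated = max(plan["Plan Rows"], 1)  # avoid division by zero
    return actual / estimated

# Problem 2 from above: planner estimated 10 rows, query returned 50,000.
node = {"Node Type": "Seq Scan", "Plan Rows": 10,
        "Actual Rows": 50000, "Actual Loops": 1}
print(estimate_skew(node))  # 5000.0: wildly off, VACUUM ANALYZE needed
```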
EXPLAIN Cheatsheet
-- Most useful EXPLAIN options
EXPLAIN SELECT ...; -- just the plan (fast)
EXPLAIN ANALYZE SELECT ...; -- plan + actual times
EXPLAIN (ANALYZE, BUFFERS) SELECT ...; -- + buffer hit/miss counts
EXPLAIN (ANALYZE, VERBOSE) SELECT ...; -- + output columns, schema
EXPLAIN (ANALYZE, FORMAT JSON) SELECT ...; -- JSON output for tools
-- Always use ANALYZE for real diagnosis
-- Buffers shows if data came from cache (shared hit) or disk (read)
-- High "blocks read" with low "blocks hit" means data is cold (not cached)
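A rough cache-hit ratio can be computed from the BUFFERS counters, which appear as "Shared Hit Blocks" and "Shared Read Blocks" on each node in the JSON format. A small sketch:

```python
def cache_hit_ratio(shared_hit_blocks, shared_read_blocks):
    """Fraction of buffer accesses served from shared_buffers (cache).

    Close to 1.0 means the data is hot in cache; close to 0.0 means
    cold data being read from disk (or the OS page cache).
    """
    total = shared_hit_blocks + shared_read_blocks
    return shared_hit_blocks / total if total else 1.0  # no I/O counts as "all hits"

print(cache_hit_ratio(980, 20))   # 0.98: data is mostly cached
print(cache_hit_ratio(5, 995))    # 0.005: cold, mostly read from disk
```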
Common Mistakes
Mistake 1 — Running EXPLAIN ANALYZE on a destructive query without a transaction
❌ Wrong — actually deletes data:
EXPLAIN ANALYZE DELETE FROM posts WHERE created_at < '2020-01-01';
-- Rows are ACTUALLY deleted!
✅ Correct — wrap in transaction for analysis:
BEGIN;
EXPLAIN ANALYZE DELETE FROM posts WHERE created_at < '2020-01-01';
ROLLBACK; -- ✓ nothing actually deleted
Mistake 2 — Trusting cost estimates without ANALYZE
❌ Wrong — making decisions based on estimated costs alone:
EXPLAIN SELECT ...;
-- cost=100 — but cost is in arbitrary planner units, not milliseconds
✅ Correct — always use ANALYZE for real performance data:
EXPLAIN ANALYZE SELECT ...;
-- actual time=0.052 ms — real milliseconds ✓
Mistake 3 — Ignoring row estimate vs actual row discrepancy
❌ Wrong — plan looks fine, ignoring the mismatch:
-- (cost=... rows=10 ...) (actual ... rows=50000 ...)
-- Planner estimated 10, got 50000 — planner is flying blind!
✅ Correct — run VACUUM ANALYZE when estimates are wildly off:
VACUUM ANALYZE posts; -- ✓ updates statistics so planner makes better decisions
Quick Reference — Plan Node Meanings
| Node | Meaning | Good When |
|---|---|---|
| Seq Scan | Read all rows | Small table or >15–20% of rows needed |
| Index Scan | Index + heap fetch | Selective WHERE condition |
| Index Only Scan | Index only, no heap | All needed columns in the index (or its INCLUDE list) |
| Bitmap Heap Scan | Multiple indexes combined | Two conditions each with their own index |
| Nested Loop | For each outer row, scan inner | Small outer result set |
| Hash Join | Build hash table of smaller side | Large joins without sorted inputs |
| Merge Join | Merge two sorted streams | Both sides already sorted |
| Sort | Sort output (may use disk) | Unavoidable; add index if repeated |