Window functions compute a value for each row based on a group of related rows — the window — without collapsing those rows into a single aggregate result. Unlike GROUP BY which turns N rows into 1, a window function keeps all rows and adds a computed column. This enables powerful analytics that are difficult or impossible with GROUP BY: ranking posts by views within each category, computing running totals, finding the difference from the previous row, and calculating moving averages. In FastAPI, window functions are most useful for dashboard endpoints that return ranked or time-series data.
The OVER() Clause
-- Window functions use OVER() to define the window
-- OVER() with no arguments: the entire result set is the window
-- Rank all posts by view count (no grouping — all rows kept)
SELECT
id,
title,
view_count,
RANK() OVER (ORDER BY view_count DESC) AS rank,
DENSE_RANK() OVER (ORDER BY view_count DESC) AS dense_rank,
ROW_NUMBER() OVER (ORDER BY view_count DESC) AS row_num,
PERCENT_RANK() OVER (ORDER BY view_count DESC) AS percentile
FROM posts
WHERE status = 'published';
-- RANK vs DENSE_RANK vs ROW_NUMBER:
-- If two posts have the same view_count (say 1000):
-- RANK(): 1, 2, 2, 4 (gap after ties)
-- DENSE_RANK(): 1, 2, 2, 3 (no gap after ties)
-- ROW_NUMBER(): 1, 2, 3, 4 (no ties, arbitrary order for equals)
WHERE RANK() OVER (...) <= 3. To filter by a window function result, wrap the query in a subquery or CTE: SELECT * FROM (SELECT ..., RANK() OVER (...) AS r FROM ...) sub WHERE r <= 3. This is one of the most common beginner mistakes with window functions.PARTITION BY inside OVER() divides the window into groups — like GROUP BY but without collapsing rows. Ranking by view count within each author’s posts: RANK() OVER (PARTITION BY author_id ORDER BY view_count DESC) — each author gets their own ranking from 1. Without PARTITION BY, all rows compete in the same global ranking. Think of PARTITION BY as “reset the window for each group.”PARTITION BY — Per-Group Windows
-- PARTITION BY: reset the window for each unique value
-- Rank posts within each author's own post list
SELECT
p.id,
p.title,
p.view_count,
u.name AS author_name,
RANK() OVER (PARTITION BY p.author_id ORDER BY p.view_count DESC) AS author_rank,
COUNT(*) OVER (PARTITION BY p.author_id) AS author_total_posts,
SUM(p.view_count) OVER (PARTITION BY p.author_id) AS author_total_views
FROM posts p
JOIN users u ON p.author_id = u.id
WHERE p.status = 'published'
ORDER BY u.name, author_rank;
-- Get top-1 post per author (most-viewed)
SELECT *
FROM (
SELECT
p.id, p.title, p.view_count, p.author_id,
ROW_NUMBER() OVER (PARTITION BY p.author_id ORDER BY p.view_count DESC) AS rn
FROM posts p WHERE p.status = 'published'
) ranked
WHERE rn = 1 -- only the top post per author
ORDER BY view_count DESC;
Aggregate Window Functions
-- Running total of views (sorted by date)
SELECT
id,
title,
view_count,
created_at,
SUM(view_count) OVER (ORDER BY created_at) AS running_total_views,
AVG(view_count) OVER (ORDER BY created_at
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
AS moving_avg_7_posts,
view_count - LAG(view_count) OVER (ORDER BY created_at) AS change_from_prev,
view_count - LEAD(view_count) OVER (ORDER BY created_at) AS change_to_next
FROM posts
WHERE status = 'published'
ORDER BY created_at;
-- LAG / LEAD: access values from previous/next rows
SELECT
DATE_TRUNC('month', created_at) AS month,
COUNT(*) AS post_count,
LAG(COUNT(*)) OVER (ORDER BY DATE_TRUNC('month', created_at))
AS prev_month_count,
COUNT(*) - LAG(COUNT(*)) OVER (ORDER BY DATE_TRUNC('month', created_at))
AS month_over_month_change
FROM posts
WHERE status = 'published'
GROUP BY DATE_TRUNC('month', created_at)
ORDER BY month;
Window Frame Specification
-- ROWS BETWEEN defines which rows are included in the frame
-- Default frame when ORDER BY present: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
-- Running total from start to current row
SUM(view_count) OVER (ORDER BY created_at ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
-- Moving average of last 7 rows (sliding window)
AVG(view_count) OVER (ORDER BY created_at ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
-- Centered moving average (3 rows before, current, 3 rows after)
AVG(view_count) OVER (ORDER BY created_at ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING)
-- Entire partition (same result as aggregate without GROUP BY)
SUM(view_count) OVER (PARTITION BY author_id ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
-- NTILE: divide rows into N equal buckets
NTILE(4) OVER (ORDER BY view_count DESC) AS quartile
-- 1=top 25%, 2=next 25%, 3=next 25%, 4=bottom 25%
Common Mistakes
Mistake 1 — Using window function in WHERE clause
❌ Wrong — cannot filter on window function result directly:
SELECT * FROM posts WHERE RANK() OVER (ORDER BY view_count DESC) <= 5;
-- ERROR: window functions not allowed in WHERE
✅ Correct — wrap in subquery or CTE:
SELECT * FROM (
SELECT *, RANK() OVER (ORDER BY view_count DESC) AS r FROM posts
) sub WHERE r <= 5; -- ✓
Mistake 2 — Confusing RANK with ROW_NUMBER for top-N queries
❌ Wrong — RANK skips numbers after ties, so <= 3 may return 4+ rows:
WHERE RANK() OVER (PARTITION BY author_id ORDER BY views DESC) <= 1
-- If two posts tie for first: returns 2 rows per author (both have RANK=1)
✅ Correct — use ROW_NUMBER for strictly one row per partition:
WHERE ROW_NUMBER() OVER (PARTITION BY author_id ORDER BY views DESC) = 1 -- ✓
Mistake 3 — LAG/LEAD returning NULL at boundaries
❌ Wrong — first row’s LAG is NULL, causing NULL in change calculation:
view_count - LAG(view_count) OVER (ORDER BY created_at) -- NULL for first row
✅ Correct — use LAG with a default:
view_count - LAG(view_count, 1, 0) OVER (ORDER BY created_at) -- ✓ 0 for first row
Quick Reference
| Function | Description |
|---|---|
ROW_NUMBER() |
Unique row number, no ties |
RANK() |
Rank with gaps after ties |
DENSE_RANK() |
Rank without gaps after ties |
NTILE(n) |
Divide into n equal buckets |
LAG(col, n, default) |
Value from n rows before |
LEAD(col, n, default) |
Value from n rows ahead |
SUM() OVER (...) |
Running/windowed total |
AVG() OVER (...) |
Running/windowed average |
FIRST_VALUE(col) |
First value in window frame |
LAST_VALUE(col) |
Last value in window frame |