Window Functions — ROW_NUMBER, RANK, LEAD and LAG

📋 Table of Contents
  1. Window Functions
  2. Common Mistakes

Window functions perform calculations across a set of rows related to the current row without collapsing them into groups like GROUP BY does. ROW_NUMBER() assigns a unique sequential number to each row. RANK() and DENSE_RANK() handle tied positions differently. LAG() and LEAD() access adjacent rows without a self-join. These functions are the clean solution to many “top N per group” and “compare to previous” reporting queries that are awkward with GROUP BY alone.

Window Functions

-- ── ROW_NUMBER — unique sequential number per partition ────────────────────
SELECT
    p.Id, p.Title, p.CategoryId, p.ViewCount,
    ROW_NUMBER() OVER (
        PARTITION BY p.CategoryId   -- restart numbering per category
        ORDER BY p.ViewCount DESC   -- highest views = 1
    ) AS RankInCategory
FROM dbo.Posts p
WHERE p.IsPublished = 1;

-- Get only the TOP 3 posts per category:
SELECT Id, Title, CategoryId, ViewCount, RankInCategory
FROM (
    SELECT
        p.Id, p.Title, p.CategoryId, p.ViewCount,
        ROW_NUMBER() OVER (
            PARTITION BY p.CategoryId ORDER BY p.ViewCount DESC
        ) AS RankInCategory
    FROM dbo.Posts p WHERE p.IsPublished = 1
) ranked
WHERE RankInCategory <= 3;

-- ── RANK vs DENSE_RANK — tied positions ───────────────────────────────────
-- ViewCount: 1000, 1000, 800, 600
-- ROW_NUMBER:  1,    2,   3,   4   -- unique, no ties
-- RANK:        1,    1,   3,   4   -- ties get same rank, next rank skips
-- DENSE_RANK:  1,    1,   2,   3   -- ties get same rank, next rank consecutive

SELECT
    p.Title, p.ViewCount,
    ROW_NUMBER()  OVER (ORDER BY p.ViewCount DESC) AS RowNum,
    RANK()        OVER (ORDER BY p.ViewCount DESC) AS Rank,
    DENSE_RANK()  OVER (ORDER BY p.ViewCount DESC) AS DenseRank
FROM dbo.Posts p WHERE p.IsPublished = 1;

-- ── LAG and LEAD — compare to adjacent rows ───────────────────────────────
SELECT
    p.Id,
    p.Title,
    p.ViewCount,
    p.PublishedAt,
    LAG(p.ViewCount,  1, 0) OVER (ORDER BY p.PublishedAt) AS PrevPostViews,
    LEAD(p.ViewCount, 1, 0) OVER (ORDER BY p.PublishedAt) AS NextPostViews,
    p.ViewCount - LAG(p.ViewCount, 1, 0) OVER (ORDER BY p.PublishedAt) AS ViewsDiff
FROM dbo.Posts p
WHERE p.IsPublished = 1
ORDER BY p.PublishedAt;

-- ── Running totals — SUM OVER with frame ──────────────────────────────────
SELECT
    CAST(p.PublishedAt AS DATE) AS PublishedDate,
    COUNT(*) AS PostsOnDate,
    SUM(COUNT(*)) OVER (
        ORDER BY CAST(p.PublishedAt AS DATE)
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS RunningTotal
FROM   dbo.Posts p
WHERE  p.IsPublished = 1
GROUP  BY CAST(p.PublishedAt AS DATE)
ORDER  BY PublishedDate;

-- ── NTILE — divide into N equal buckets ──────────────────────────────────
SELECT
    p.Title, p.ViewCount,
    NTILE(4) OVER (ORDER BY p.ViewCount DESC) AS Quartile
    -- 1=top 25%, 2=next 25%, 3=next 25%, 4=bottom 25%
FROM dbo.Posts p WHERE p.IsPublished = 1;
Note: The OVER() clause defines the window (the set of rows) for the calculation. PARTITION BY divides the rows into groups (the function restarts for each partition). ORDER BY within OVER defines the row ordering for functions like ROW_NUMBER and LAG. ROWS BETWEEN defines a sliding frame relative to the current row (used for running totals and moving averages). Not all window functions support all OVER clause options — ROW_NUMBER requires ORDER BY; SUM/AVG/COUNT can use ROWS/RANGE frames.
Tip: Use DENSE_RANK() over RANK() for user-facing leaderboards where gaps in rank numbers (1, 1, 3, 4) would confuse users. DENSE_RANK() gives consecutive ranks even with ties (1, 1, 2, 3), which is more intuitive in displays like "2nd place". Use RANK() for competition scoring where the convention is "if two people tie for 1st, the next rank is 3rd". Use ROW_NUMBER() when you need a unique identifier per row regardless of ties — for example, selecting exactly the first row of each group.
Warning: Window functions cannot be used directly in WHERE clauses — they are evaluated after filtering. To filter on a window function result (e.g., only rows where ROW_NUMBER() = 1), wrap the window function in a subquery or CTE and filter in the outer query. This is why the "top N per group" pattern requires either a subquery or a CTE to apply the ROW_NUMBER() and then filter where rn <= N in the outer query.

Common Mistakes

Mistake 1 — Using window function in WHERE clause (syntax error)

❌ Wrong — WHERE ROW_NUMBER() OVER(...) = 1; window functions not allowed in WHERE.

✅ Correct — wrap in CTE or subquery, then filter: WITH r AS (SELECT ..., ROW_NUMBER() OVER(...) rn ...) SELECT * FROM r WHERE rn = 1.

Mistake 2 — RANK() when unique sequential numbers needed (gaps in numbering)

❌ Wrong — RANK() produces 1, 1, 3, 4 for tied values; downstream code expecting 1, 2, 3, 4 breaks.

✅ Correct — ROW_NUMBER() for unique sequential numbers regardless of ties.

🧠 Test Yourself

Posts have ViewCounts: 5000, 3000, 3000, 1000. DENSE_RANK() OVER (ORDER BY ViewCount DESC) assigns what ranks?