Common Table Expressions — WITH Clause for Readable Recursive Queries

📋 Table of Contents
  1. CTEs and Recursive CTEs
  2. Common Mistakes

Common Table Expressions (CTEs) give a name to a subquery and make complex queries readable. Instead of deeply nested subqueries, you define named intermediate result sets with WITH and reference them by name. Recursive CTEs are particularly powerful — they traverse hierarchical data (category trees, organisation charts) by repeatedly joining a table against itself until no more rows match. SQL Server’s recursive CTE is the standard approach for hierarchy navigation in T-SQL.

CTEs and Recursive CTEs

-- ── Basic CTE — named intermediate result ─────────────────────────────────
WITH PublishedPostStats AS (
    SELECT
        AuthorId,
        COUNT(*)         AS PostCount,
        SUM(ViewCount)   AS TotalViews,
        MAX(ViewCount)   AS MaxViews,
        AVG(ViewCount * 1.0) AS AvgViews
    FROM dbo.Posts
    WHERE IsPublished = 1
    GROUP BY AuthorId
)
SELECT
    u.DisplayName,
    s.PostCount,
    s.TotalViews,
    s.AvgViews
FROM PublishedPostStats s
JOIN dbo.Users u ON u.Id = s.AuthorId
WHERE s.PostCount >= 3
ORDER BY s.TotalViews DESC;

-- ── Multiple CTEs in one statement ────────────────────────────────────────
WITH
TopAuthors AS (
    SELECT AuthorId, SUM(ViewCount) AS TotalViews
    FROM   dbo.Posts WHERE IsPublished = 1
    GROUP  BY AuthorId
    HAVING SUM(ViewCount) > 10000
),
RecentPosts AS (
    SELECT Id, AuthorId, Title, PublishedAt
    FROM   dbo.Posts
    WHERE  IsPublished = 1
      AND  PublishedAt >= DATEADD(MONTH, -3, SYSUTCDATETIME())
)
-- Join the two CTEs
SELECT u.DisplayName, r.Title, r.PublishedAt, ta.TotalViews
FROM   TopAuthors ta
JOIN   dbo.Users u  ON u.Id       = ta.AuthorId
JOIN   RecentPosts r ON r.AuthorId = ta.AuthorId
ORDER  BY ta.TotalViews DESC, r.PublishedAt DESC;

-- ── Recursive CTE — category hierarchy traversal ─────────────────────────
-- Assumes Categories has: Id, Name, Slug, ParentId (nullable)
WITH CategoryHierarchy AS (
    -- Anchor: root categories (no parent)
    SELECT
        Id, Name, Slug, ParentId,
        0         AS Level,
        CAST(Name AS NVARCHAR(MAX)) AS Path
    FROM dbo.Categories
    WHERE ParentId IS NULL

    UNION ALL

    -- Recursive: child categories
    SELECT
        c.Id, c.Name, c.Slug, c.ParentId,
        ch.Level + 1,
        CAST(ch.Path + ' > ' + c.Name AS NVARCHAR(MAX))
    FROM dbo.Categories c
    JOIN CategoryHierarchy ch ON ch.Id = c.ParentId
)
SELECT Id, Name, Level, Path
FROM   CategoryHierarchy
ORDER  BY Path;
-- Result: Technology (Level 0) > Software (Level 1) > Web Development (Level 2)

-- ── Generate a date range with recursive CTE ──────────────────────────────
DECLARE @Start DATE = '2024-01-01';
DECLARE @End   DATE = '2024-12-31';

WITH DateRange AS (
    SELECT @Start AS DateVal
    UNION ALL
    SELECT DATEADD(DAY, 1, DateVal)
    FROM   DateRange
    WHERE  DateVal < @End
)
SELECT dr.DateVal, COUNT(p.Id) AS PostsPublished
FROM DateRange dr
LEFT JOIN dbo.Posts p ON CAST(p.PublishedAt AS DATE) = dr.DateVal
                     AND p.IsPublished = 1
GROUP BY dr.DateVal
ORDER BY dr.DateVal
OPTION (MAXRECURSION 400);   -- default is 100; set to 0 for unlimited
Note: Recursive CTEs have two required parts separated by UNION ALL: the anchor (non-recursive query that seeds the recursion — typically the root nodes) and the recursive member (query that JOINs the CTE to itself to get the next level). SQL Server executes the anchor once, then repeatedly executes the recursive member until no new rows are returned. The default recursion limit is 100 levels — increase with OPTION (MAXRECURSION n) or set to 0 for unlimited (risky for cyclic data).
Tip: CTEs and derived tables (subqueries in FROM) are functionally equivalent for non-recursive cases — SQL Server's query optimizer inlines both into the main query plan. The difference is purely syntactic: CTEs improve readability for complex multi-step logic, while derived tables are inline and harder to read when nested. For simple cases, either works. Use CTEs when the intermediate result is referenced more than once (but note SQL Server may execute it multiple times — use a temp table if the CTE is expensive and referenced multiple times).
Warning: A non-recursive CTE is NOT materialised — it is not executed once and cached. SQL Server inlines the CTE into the main query and optimises the whole thing together. If you reference the same CTE twice in the outer query, SQL Server may execute the CTE logic twice. For expensive CTEs that are referenced multiple times, insert the CTE results into a #temp table first, then query the temp table multiple times. The temp table is materialised once and reused.

Common Mistakes

Mistake 1 — Exceeding default MAXRECURSION limit (query aborted)

❌ Wrong — recursive CTE on deep hierarchy exceeds 100 levels; SQL Server throws "Maximum recursion 100 exceeded".

✅ Correct — add OPTION (MAXRECURSION 1000) for deeper hierarchies; use 0 only for controlled, non-cyclic data.

Mistake 2 — Referencing a CTE multiple times expecting materialisation (executes twice)

❌ Wrong — expensive CTE referenced twice; executed twice because CTEs are not cached by default.

✅ Correct — insert expensive CTE results into a #temp table; query the temp table multiple times.

🧠 Test Yourself

A recursive CTE traverses a category tree. A bug introduced a cycle (Category A's parent is Category B, and Category B's parent is Category A). What happens?