Advanced BlogApp Queries — CTEs, Window Functions and Analytical Patterns

📋 Table of Contents
  1. Advanced BlogApp Queries
  2. Common Mistakes

Combining CTEs, window functions, JOINs, and subqueries unlocks analytical queries that would be impossible or extremely slow with basic SQL. The BlogApp’s analytics — related posts, author leaderboards, trending content, and activity summaries — all require these advanced patterns. These queries are also what EF Core cannot express efficiently with LINQ, making raw SQL (via FromSqlInterpolated or a stored procedure) the right choice for these specific scenarios.

Advanced BlogApp Queries

-- ── 1. Related posts — posts sharing the most tags ─────────────────────────
DECLARE @CurrentPostId INT = 42;

WITH CurrentPostTags AS (
    SELECT TagId FROM dbo.PostTags WHERE PostId = @CurrentPostId
),
RelatedPostScores AS (
    SELECT
        pt.PostId,
        COUNT(*) AS SharedTagCount
    FROM   dbo.PostTags pt
    JOIN   CurrentPostTags cpt ON cpt.TagId = pt.TagId
    WHERE  pt.PostId <> @CurrentPostId
    GROUP  BY pt.PostId
)
SELECT TOP 5
    p.Id, p.Title, p.Slug, p.ViewCount,
    rps.SharedTagCount
FROM RelatedPostScores rps
JOIN dbo.Posts p ON p.Id = rps.PostId
WHERE p.IsPublished = 1
ORDER BY rps.SharedTagCount DESC, p.ViewCount DESC;

-- ── 2. Author leaderboard with DENSE_RANK ─────────────────────────────────
WITH AuthorStats AS (
    SELECT
        u.Id          AS AuthorId,
        u.DisplayName,
        COUNT(p.Id)         AS PostCount,
        SUM(p.ViewCount)    AS TotalViews,
        AVG(p.ViewCount * 1.0) AS AvgViews
    FROM   dbo.Users u
    JOIN   dbo.Posts p ON p.AuthorId = u.Id
    WHERE  p.IsPublished = 1
    GROUP  BY u.Id, u.DisplayName
)
SELECT
    DisplayName,
    PostCount,
    TotalViews,
    AvgViews,
    DENSE_RANK() OVER (ORDER BY TotalViews DESC) AS Rank
FROM AuthorStats
ORDER BY TotalViews DESC;

-- ── 3. Trending posts — views gained in last 7 days vs prior 7 days ────────
-- Requires a PostViewLog table: (PostId INT, ViewedAt DATETIME2)
WITH RecentViews AS (
    SELECT PostId, COUNT(*) AS ViewsLast7Days
    FROM   dbo.PostViewLog
    WHERE  ViewedAt >= DATEADD(DAY, -7, SYSUTCDATETIME())
    GROUP  BY PostId
),
PriorViews AS (
    SELECT PostId, COUNT(*) AS ViewsPrior7Days
    FROM   dbo.PostViewLog
    WHERE  ViewedAt >= DATEADD(DAY, -14, SYSUTCDATETIME())
      AND  ViewedAt  < DATEADD(DAY,  -7, SYSUTCDATETIME())
    GROUP  BY PostId
)
SELECT TOP 10
    p.Id, p.Title, p.Slug,
    ISNULL(rv.ViewsLast7Days,  0) AS ViewsLast7Days,
    ISNULL(pv.ViewsPrior7Days, 0) AS ViewsPrior7Days,
    ISNULL(rv.ViewsLast7Days,  0) - ISNULL(pv.ViewsPrior7Days, 0) AS ViewsGrowth
FROM   dbo.Posts p
LEFT   JOIN RecentViews rv ON rv.PostId = p.Id
LEFT   JOIN PriorViews  pv ON pv.PostId = p.Id
WHERE  p.IsPublished = 1
ORDER  BY ViewsGrowth DESC;

-- ── 4. Monthly post count with running total ──────────────────────────────
WITH MonthlyStats AS (
    SELECT
        DATEPART(YEAR,  PublishedAt) AS Yr,
        DATEPART(MONTH, PublishedAt) AS Mo,
        COUNT(*) AS PostsPublished
    FROM   dbo.Posts
    WHERE  IsPublished = 1
    GROUP  BY DATEPART(YEAR, PublishedAt), DATEPART(MONTH, PublishedAt)
)
SELECT
    Yr, Mo, PostsPublished,
    SUM(PostsPublished) OVER (
        ORDER BY Yr, Mo
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS RunningTotal
FROM MonthlyStats
ORDER BY Yr, Mo;
Note: The related posts query uses a CTE to isolate the current post’s tags, then joins PostTags against those tags to count how many tags each other post shares. This avoids a more complex self-join. The ISNULL(rv.ViewsLast7Days, 0) in the trending query handles posts that have no views in a period — the LEFT JOIN produces NULL which ISNULL converts to 0 for the arithmetic. Always handle NULLs explicitly in arithmetic expressions — NULL - 5 is NULL, not -5.
Tip: For complex analytical queries that are called frequently, consider creating them as views or stored procedures. A view encapsulates the query logic and gives it a name — EF Core can query views via modelBuilder.Entity<PostSummaryView>().ToView("vw_PostSummary"). A stored procedure can accept parameters and return complex result sets. Both options separate query logic from application code, enable the DBA to optimise independently, and make the query reusable across the application.
Warning: The trending posts query joining PostViewLog is only feasible if PostViewLog has indexes on (PostId, ViewedAt). A table tracking every view of every post can grow very large very quickly — even a modest blog with 10,000 daily visitors produces 3.65 million rows per year. Index correctly from the start, implement a data retention policy (archive or delete old view logs), or use a pre-aggregated summary table that is updated by a background job rather than querying the raw log for every trending request.

Common Mistakes

❌ Wrong — rv.ViewsLast7Days - pv.ViewsPrior7Days AS Growth when either can be NULL; result is NULL.

✅ Correct — ISNULL(rv.ViewsLast7Days, 0) - ISNULL(pv.ViewsPrior7Days, 0).

❌ Wrong — trending query on a 10M-row unindexed PostViewLog; full scan takes minutes.

✅ Correct — index: CREATE INDEX IX_PostViewLog_PostId_ViewedAt ON PostViewLog (PostId, ViewedAt DESC).

🧠 Test Yourself

The related posts CTE query references CurrentPostTags twice: once in the CTE definition and once in the main query. Is CurrentPostTags executed once or twice?