Index Fundamentals — Clustered, Non-Clustered and Composite Indexes

📚 DotNet Fullstack + Angular 📂 Chapter 67: Indexes, Execution Plans and Query Optimisation 📄 Lesson 67010 Advanced 🕒 March 17, 2026

An index is a data structure that allows SQL Server to find rows without scanning the entire table. The clustered index defines the physical order of rows in the table — it IS the table. The non-clustered index is a separate B-tree structure that points to rows in the clustered index. Choosing the right index for each query is the single highest-impact database performance skill — the difference between a 10ms query and a 10-second query on a 1-million-row table is almost always an index.

Index Types and Design

-- ── Clustered index — created automatically with PRIMARY KEY ──────────────
-- Posts.Id is the clustered index key — rows are physically ordered by Id
-- Only ONE clustered index per table
CREATE TABLE dbo.Posts (
    Id    INT NOT NULL IDENTITY(1,1),
    ...
    CONSTRAINT PK_Posts PRIMARY KEY CLUSTERED (Id)  -- this IS the clustered index
);

-- ── Non-clustered index — for common query patterns ───────────────────────
-- Query: WHERE IsPublished = 1 ORDER BY PublishedAt DESC
CREATE NONCLUSTERED INDEX IX_Posts_IsPublished_PublishedAt
ON dbo.Posts (IsPublished, PublishedAt DESC)
INCLUDE (Title, Slug, Excerpt, ViewCount, AuthorId, CategoryId);
-- INCLUDE columns are stored in the leaf level — no key lookup needed
-- for queries that need these columns alongside the filter/sort columns

-- ── Unique non-clustered index for slug lookup ────────────────────────────
-- Query: WHERE Slug = @Slug  (very frequent — every page view)
CREATE UNIQUE NONCLUSTERED INDEX UX_Posts_Slug
ON dbo.Posts (Slug)
INCLUDE (Id, Title, IsPublished, AuthorId, CategoryId);

-- ── Filtered index — partial index on a subset of rows ───────────────────
-- Index only published posts (WHERE IsPublished = 1)
-- Much smaller than a full-table index — faster seeks, less storage
CREATE NONCLUSTERED INDEX IX_Posts_Published_Date
ON dbo.Posts (PublishedAt DESC)
INCLUDE (Title, Slug, AuthorId, CategoryId, ViewCount)
WHERE IsPublished = 1;   -- only indexes published posts

-- ── Composite index — the leading column rule ─────────────────────────────
-- Index: (IsPublished, CategoryId, PublishedAt)
-- ✅ Uses index: WHERE IsPublished = 1 AND CategoryId = 3
-- ✅ Uses index: WHERE IsPublished = 1
-- ❌ Cannot use index: WHERE CategoryId = 3 (leading column missing)
-- ❌ Cannot use index: WHERE PublishedAt > '2024-01-01' (first two missing)
CREATE NONCLUSTERED INDEX IX_Posts_Category_Date
ON dbo.Posts (IsPublished, CategoryId, PublishedAt DESC)
INCLUDE (Title, Slug, ViewCount, AuthorId);

-- ── PostTags index for tag-filtered queries ───────────────────────────────
-- Query: WHERE pt.TagId = @TagId AND p.IsPublished = 1
CREATE NONCLUSTERED INDEX IX_PostTags_TagId_PostId
ON dbo.PostTags (TagId, PostId);
-- PostId is the FK to Posts — joins to Posts clustered index efficiently

-- ── View all indexes on a table ───────────────────────────────────────────
SELECT
    i.name          AS IndexName,
    i.type_desc     AS IndexType,
    i.is_unique,
    i.is_primary_key,
    STRING_AGG(c.name, ', ') WITHIN GROUP (ORDER BY ic.key_ordinal) AS KeyColumns
FROM sys.indexes i
JOIN sys.index_columns ic ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns       c  ON c.object_id  = i.object_id AND c.column_id  = ic.column_id
WHERE i.object_id = OBJECT_ID('dbo.Posts')
  AND ic.is_included_column = 0
GROUP BY i.name, i.type_desc, i.is_unique, i.is_primary_key
ORDER BY i.name;

Note: The INCLUDE clause adds columns to the leaf level of a non-clustered index without making them part of the sort key. This creates a covering index — a query can be satisfied entirely from the index without a “key lookup” back to the clustered index (main table). Include columns that appear in SELECT lists but are not needed for filtering or sorting. The trade-off: included columns increase the index’s storage size and the cost of INSERT/UPDATE/DELETE on those columns.

Tip: Use filtered indexes when a large portion of your queries target only a subset of rows. The published posts list only needs to query WHERE IsPublished = 1 — a filtered index on that subset is smaller (fewer rows to store), faster to seek, and cheaper to maintain (only updated when published posts change). For the BlogApp, published posts might be 60% of all posts; a filtered index is 40% smaller than a full-table index and covers the vast majority of read queries.

Warning: Every non-clustered index is a cost on write operations. Every INSERT into Posts must also insert into every non-clustered index. Every UPDATE that changes an indexed column must update that index. Every DELETE removes the row from every index. Over-indexing — adding an index “just in case” — degrades write performance without helping read performance. Design indexes based on actual query patterns, not speculation. Start lean and add indexes when query analysis shows they are needed.

Clustered Index Key Choice

Key Type	Fragmentation	Storage	Recommendation
INT IDENTITY	Minimal — appends only	4 bytes	✅ Best default
BIGINT IDENTITY	Minimal	8 bytes	✅ High-volume tables
GUID (random)	Severe — random inserts	16 bytes	❌ Avoid as clustered key
NEWSEQUENTIALID()	Minimal — sequential GUIDs	16 bytes	⚠️ Only if GUID required
DateTime	Varies	8 bytes	⚠️ Only for time-series data

Common Mistakes

Mistake 1 — GUID primary key as clustered index (severe fragmentation)

❌ Wrong — Id UNIQUEIDENTIFIER DEFAULT NEWID() PRIMARY KEY CLUSTERED; random GUIDs cause page splits on every insert.

✅ Correct — use NEWSEQUENTIALID() for GUIDs as clustered keys, or use INT IDENTITY with a separate GUID for external exposure.

Mistake 2 — Over-indexing write-heavy tables (every index adds write cost)

❌ Wrong — 15 indexes on a PostViewLog table that receives 10,000 inserts/minute; each insert updates all 15 indexes.

✅ Correct — minimal indexes on high-write tables; index only the columns used in the most critical read queries.

🧠 Test Yourself

An index on Posts is defined as (IsPublished, CategoryId, PublishedAt). A query has WHERE CategoryId = 3 ORDER BY PublishedAt DESC. Does this query use the index?

Yes — any column in the index can be used as the entry point
Yes — SQL Server automatically reorders the WHERE clause to match the index
Partially — it uses the CategoryId and PublishedAt portions but skips IsPublished
No — the leading column rule requires the query to reference the first column (IsPublished) for SQL Server to perform an index seek. Without IsPublished in the WHERE clause, SQL Server cannot navigate to the correct starting point in the B-tree and must perform a full index scan or ignore the index entirely. Fix: add IsPublished = 1 to the query, or create a separate index with CategoryId as the leading column

Index Types and Design #

Clustered Index Key Choice #

Common Mistakes #

Mistake 1 — GUID primary key as clustered index (severe fragmentation) #

Mistake 2 — Over-indexing write-heavy tables (every index adds write cost) #

🧠 Test Yourself #

📚 More in this Tutorial Series

Index Types and Design

Clustered Index Key Choice

Common Mistakes

Mistake 1 — GUID primary key as clustered index (severe fragmentation)

Mistake 2 — Over-indexing write-heavy tables (every index adds write cost)

🧠 Test Yourself