An index is a data structure that allows SQL Server to find rows without scanning the entire table. The clustered index defines the physical order of rows in the table — it IS the table. The non-clustered index is a separate B-tree structure that points to rows in the clustered index. Choosing the right index for each query is the single highest-impact database performance skill — the difference between a 10ms query and a 10-second query on a 1-million-row table is almost always an index.
Index Types and Design
-- ── Clustered index — created automatically with PRIMARY KEY ──────────────
-- Posts.Id is the clustered index key — rows are physically ordered by Id
-- Only ONE clustered index per table
CREATE TABLE dbo.Posts (
Id INT NOT NULL IDENTITY(1,1),
...
CONSTRAINT PK_Posts PRIMARY KEY CLUSTERED (Id) -- this IS the clustered index
);
-- ── Non-clustered index — for common query patterns ───────────────────────
-- Query: WHERE IsPublished = 1 ORDER BY PublishedAt DESC
CREATE NONCLUSTERED INDEX IX_Posts_IsPublished_PublishedAt
ON dbo.Posts (IsPublished, PublishedAt DESC)
INCLUDE (Title, Slug, Excerpt, ViewCount, AuthorId, CategoryId);
-- INCLUDE columns are stored in the leaf level — no key lookup needed
-- for queries that need these columns alongside the filter/sort columns
-- ── Unique non-clustered index for slug lookup ────────────────────────────
-- Query: WHERE Slug = @Slug (very frequent — every page view)
CREATE UNIQUE NONCLUSTERED INDEX UX_Posts_Slug
ON dbo.Posts (Slug)
INCLUDE (Id, Title, IsPublished, AuthorId, CategoryId);
-- ── Filtered index — partial index on a subset of rows ───────────────────
-- Index only published posts (WHERE IsPublished = 1)
-- Much smaller than a full-table index — faster seeks, less storage
CREATE NONCLUSTERED INDEX IX_Posts_Published_Date
ON dbo.Posts (PublishedAt DESC)
INCLUDE (Title, Slug, AuthorId, CategoryId, ViewCount)
WHERE IsPublished = 1; -- only indexes published posts
-- ── Composite index — the leading column rule ─────────────────────────────
-- Index: (IsPublished, CategoryId, PublishedAt)
-- ✅ Uses index: WHERE IsPublished = 1 AND CategoryId = 3
-- ✅ Uses index: WHERE IsPublished = 1
-- ❌ Cannot use index: WHERE CategoryId = 3 (leading column missing)
-- ❌ Cannot use index: WHERE PublishedAt > '2024-01-01' (first two missing)
CREATE NONCLUSTERED INDEX IX_Posts_Category_Date
ON dbo.Posts (IsPublished, CategoryId, PublishedAt DESC)
INCLUDE (Title, Slug, ViewCount, AuthorId);
-- ── PostTags index for tag-filtered queries ───────────────────────────────
-- Query: WHERE pt.TagId = @TagId AND p.IsPublished = 1
CREATE NONCLUSTERED INDEX IX_PostTags_TagId_PostId
ON dbo.PostTags (TagId, PostId);
-- PostId is the FK to Posts — joins to Posts clustered index efficiently
-- ── View all indexes on a table ───────────────────────────────────────────
SELECT
i.name AS IndexName,
i.type_desc AS IndexType,
i.is_unique,
i.is_primary_key,
STRING_AGG(c.name, ', ') WITHIN GROUP (ORDER BY ic.key_ordinal) AS KeyColumns
FROM sys.indexes i
JOIN sys.index_columns ic ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns c ON c.object_id = i.object_id AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID('dbo.Posts')
AND ic.is_included_column = 0
GROUP BY i.name, i.type_desc, i.is_unique, i.is_primary_key
ORDER BY i.name;
INCLUDE clause adds columns to the leaf level of a non-clustered index without making them part of the sort key. This creates a covering index — a query can be satisfied entirely from the index without a “key lookup” back to the clustered index (main table). Include columns that appear in SELECT lists but are not needed for filtering or sorting. The trade-off: included columns increase the index’s storage size and the cost of INSERT/UPDATE/DELETE on those columns.WHERE IsPublished = 1 — a filtered index on that subset is smaller (fewer rows to store), faster to seek, and cheaper to maintain (only updated when published posts change). For the BlogApp, published posts might be 60% of all posts; a filtered index is 40% smaller than a full-table index and covers the vast majority of read queries.Posts must also insert into every non-clustered index. Every UPDATE that changes an indexed column must update that index. Every DELETE removes the row from every index. Over-indexing — adding an index “just in case” — degrades write performance without helping read performance. Design indexes based on actual query patterns, not speculation. Start lean and add indexes when query analysis shows they are needed.Clustered Index Key Choice
| Key Type | Fragmentation | Storage | Recommendation |
|---|---|---|---|
| INT IDENTITY | Minimal — appends only | 4 bytes | ✅ Best default |
| BIGINT IDENTITY | Minimal | 8 bytes | ✅ High-volume tables |
| GUID (random) | Severe — random inserts | 16 bytes | ❌ Avoid as clustered key |
| NEWSEQUENTIALID() | Minimal — sequential GUIDs | 16 bytes | ⚠️ Only if GUID required |
| DateTime | Varies | 8 bytes | ⚠️ Only for time-series data |
Common Mistakes
Mistake 1 — GUID primary key as clustered index (severe fragmentation)
❌ Wrong — Id UNIQUEIDENTIFIER DEFAULT NEWID() PRIMARY KEY CLUSTERED; random GUIDs cause page splits on every insert.
✅ Correct — use NEWSEQUENTIALID() for GUIDs as clustered keys, or use INT IDENTITY with a separate GUID for external exposure.
Mistake 2 — Over-indexing write-heavy tables (every index adds write cost)
❌ Wrong — 15 indexes on a PostViewLog table that receives 10,000 inserts/minute; each insert updates all 15 indexes.
✅ Correct — minimal indexes on high-write tables; index only the columns used in the most critical read queries.