Without indexes, every query MongoDB executes performs a collection scan — it reads every single document in the collection to find matches. On a collection of one million tasks, a query like { userId: '...' } reads all one million documents even if only ten match. An index on the userId field lets MongoDB jump directly to the matching documents in milliseconds. Indexes are the single most impactful performance optimisation you can apply to a MongoDB database — more than hardware upgrades, more than query rewrites. Understanding when to create them, what types exist, and how to verify they are being used is a core backend skill.
Index Types
| Index Type | Created With | Supports |
|---|---|---|
| Single Field | { userId: 1 } |
Equality, range, sort on one field |
| Compound | { userId: 1, status: 1, createdAt: -1 } |
Multi-field queries, covered queries |
| Multikey | { tags: 1 } on array field |
Array element queries — one index entry per element |
| Text | { title: 'text', description: 'text' } |
Full-text search with $text and relevance scoring |
| Geospatial (2dsphere) | { location: '2dsphere' } |
Near, within, intersects queries on GeoJSON |
| Hashed | { _id: 'hashed' } |
Equality only — used for sharding |
| Wildcard | { '$**': 1 } |
All fields — flexible schemas |
| Partial | { status: 1 }, { partialFilterExpression: { active: true } } |
Index only a subset of documents |
| Sparse | { phone: 1 }, { sparse: true } |
Only documents where field exists |
| TTL | { expiresAt: 1 }, { expireAfterSeconds: 0 } |
Automatic document expiration |
| Unique | { email: 1 }, { unique: true } |
Enforces uniqueness — like a DB constraint |
The ESR Rule — Compound Index Field Order
| Position | Letter | Field Type | Example |
|---|---|---|---|
| 1st | E — Equality | Fields used with exact match | userId: 1 |
| 2nd | S — Sort | Fields used in .sort() |
createdAt: -1 |
| 3rd | R — Range | Fields used with $gt, $lt, $in |
status: 1 |
{ userId: 1, status: 1, createdAt: -1 } can support queries that filter by userId, or by userId + status, or by userId + status + createdAt. But it cannot support a query that filters only by status without also filtering by userId. MongoDB uses the index from left to right — it can use a prefix of the index but not skip fields. The ESR rule (Equality, Sort, Range) gives you the optimal field ordering for most compound indexes.{ userId: 1 } and project { title: 1, status: 1, _id: 0 }, an index on { userId: 1, title: 1, status: 1 } covers the query completely. Look for covered queries in the Explain Plan output: totalDocsExamined: 0.explain('executionStats'). A collection with 20 indexes will have significantly slower write throughput than one with 5 well-chosen indexes. Audit and drop unused indexes regularly using db.collection.aggregate([{ $indexStats: {} }]) to see which indexes have zero usage.Creating and Managing Indexes
// ── Single field index ────────────────────────────────────────────────────
taskSchema.index({ userId: 1 }); // ascending
taskSchema.index({ createdAt: -1 }); // descending — newest first
// ── Unique index ──────────────────────────────────────────────────────────
userSchema.index({ email: 1 }, { unique: true }); // no duplicate emails
// ── Compound index — ESR rule: Equality, Sort, Range ──────────────────────
// Query: { userId, status } sorted by createdAt descending
taskSchema.index({ userId: 1, createdAt: -1 }); // equality + sort
taskSchema.index({ userId: 1, status: 1, createdAt: -1 }); // equality + equality + sort
// ── Partial index — only index active, non-deleted tasks ──────────────────
taskSchema.index(
{ userId: 1, dueDate: 1 },
{ partialFilterExpression: { status: { $ne: 'completed' }, deletedAt: { $exists: false } } }
);
// Much smaller index — only covers the documents that matter
// ── TTL index — auto-delete expired sessions ──────────────────────────────
sessionSchema.index(
{ expiresAt: 1 },
{ expireAfterSeconds: 0 } // delete when expiresAt date is reached
);
// ── Text index — full-text search ─────────────────────────────────────────
taskSchema.index(
{ title: 'text', description: 'text' },
{ weights: { title: 10, description: 3 }, name: 'task_text_search' }
);
// Using text search:
const tasks = await Task.find(
{ $text: { $search: 'client meeting deadline' } },
{ score: { $meta: 'textScore' } }
).sort({ score: { $meta: 'textScore' } });
// ── Sparse index — only for documents that have the field ─────────────────
userSchema.index({ githubId: 1 }, { sparse: true, unique: true });
// Only indexes users with a githubId — doesn't require all users to have one
// ── Multikey index — array field ──────────────────────────────────────────
taskSchema.index({ tags: 1 });
// Query: { tags: 'urgent' } — uses the multikey index efficiently
// Query: { tags: { $all: ['urgent', 'Q4'] } } — also uses multikey index
// ── Geospatial index ──────────────────────────────────────────────────────
locationSchema.index({ coordinates: '2dsphere' });
// coordinates stored as GeoJSON: { type: 'Point', coordinates: [lng, lat] }
// Find locations within 5km of a point
const nearby = await Location.find({
coordinates: {
$near: {
$geometry: { type: 'Point', coordinates: [-73.97, 40.77] },
$maxDistance: 5000, // metres
},
},
});
Analysing Query Performance with explain()
// ── explain() — understand how MongoDB executes a query ───────────────────
// Mongoose explain
const explanation = await Task
.find({ userId: '64a1f...', status: 'pending' })
.sort({ createdAt: -1 })
.explain('executionStats');
console.log(JSON.stringify(explanation, null, 2));
// Key fields to examine:
// winningPlan.inputStage.stage
// 'COLLSCAN' → full collection scan (BAD — needs an index)
// 'IXSCAN' → index scan (GOOD — using an index)
// 'FETCH' → fetching documents after index scan (normal)
// 'SORT' → in-memory sort (may need index on sort field)
// executionStats:
// nReturned — documents returned
// totalDocsExamined — documents MongoDB had to look at
// totalKeysExamined — index entries examined
// executionTimeMillis — query duration
// A good query: nReturned == totalDocsExamined (ratio 1:1)
// A bad query: nReturned = 10, totalDocsExamined = 100,000 (ratio 1:10000)
// ── Index usage statistics — find unused indexes ──────────────────────────
const indexStats = await Task.aggregate([{ $indexStats: {} }]);
indexStats.forEach(stat => {
console.log(`${stat.name}: ${stat.accesses.ops} operations since server restart`);
});
// Indexes with 0 ops are candidates for removal
// ── Check existing indexes ────────────────────────────────────────────────
const indexes = await Task.collection.getIndexes();
console.log(indexes);
How It Works
Step 1 — B-Tree Structure Enables Fast Lookups
Most MongoDB indexes use a B-tree (balanced tree) data structure. The B-tree stores field values in sorted order. To find { userId: 'abc' }, MongoDB navigates the B-tree from the root, making comparisons at each node until it finds the matching leaf. This takes O(log n) time — for 1 million documents, that is about 20 comparisons instead of 1 million. The tree is kept balanced automatically on every write.
Step 2 — The Query Planner Chooses the Best Index
When multiple indexes could satisfy a query, MongoDB’s query planner runs them in parallel in a trial period, counts how many documents each approach examines, and picks the winner. The winning plan is cached for that query shape. You can see the winning and rejected plans in explain('allPlansExecution'). The planner re-evaluates periodically or after a certain number of writes to adapt to changing data distributions.
Step 3 — Compound Indexes Serve Multiple Query Patterns
A single compound index on { userId: 1, status: 1, createdAt: -1 } can efficiently serve these queries: { userId }, { userId, status }, { userId, status, createdAt }, and sort by createdAt when filtering on userId. This is the index prefix property — MongoDB can use any left-anchored prefix of the index. Planning compound indexes thoughtfully means fewer total indexes while serving more query patterns.
Step 4 — Text Indexes Enable Language-Aware Search
A text index tokenises the indexed fields — splitting text into words, removing stopwords (“the”, “a”, “is”), and applying stemming (reducing “running” and “runs” to “run”). When you query { $text: { $search: 'running tasks' } }, MongoDB searches for stemmed forms of each word. Relevance scores are computed based on term frequency and the field weights you assign. This is simpler but less powerful than a dedicated search engine like Elasticsearch.
Step 5 — TTL Indexes Automate Document Expiry
MongoDB runs a background thread every 60 seconds that checks TTL-indexed fields. Documents where the indexed date field is older than expireAfterSeconds are automatically deleted. This is ideal for session tokens, password reset tokens, rate-limit records, and audit logs. Setting expireAfterSeconds: 0 means “delete when the indexed date is reached” — the field value itself is the expiry timestamp.
Real-World Example: Index Strategy for Task Manager
// models/task.model.js — complete index strategy
const taskSchema = new mongoose.Schema({ /* ... fields ... */ }, { timestamps: true });
// 1. Primary access pattern: all tasks for a user, sorted by date
taskSchema.index({ user: 1, createdAt: -1 });
// 2. Filter by user + status (most common list query)
taskSchema.index({ user: 1, status: 1, createdAt: -1 });
// 3. Filter by user + priority
taskSchema.index({ user: 1, priority: 1, createdAt: -1 });
// 4. Overdue tasks: user + dueDate (for due date queries)
taskSchema.index({ user: 1, dueDate: 1 });
// 5. Partial index: only active tasks (excludes completed — smaller, faster)
taskSchema.index(
{ user: 1, createdAt: -1 },
{
partialFilterExpression: { status: { $ne: 'completed' } },
name: 'active_tasks_by_user',
}
);
// 6. Full-text search
taskSchema.index(
{ title: 'text', description: 'text' },
{ weights: { title: 10, description: 3 } }
);
// 7. Tags multikey (for tag filtering)
taskSchema.index({ tags: 1 });
// models/user.model.js
userSchema.index({ email: 1 }, { unique: true }); // unique email
userSchema.index({ createdAt: -1 }); // list users by date
// models/session.model.js
sessionSchema.index(
{ expiresAt: 1 },
{ expireAfterSeconds: 0, name: 'session_ttl' } // auto-delete expired sessions
);
sessionSchema.index({ userId: 1 }); // find sessions by user
sessionSchema.index({ token: 1 }, { unique: true }); // lookup by token
Common Mistakes
Mistake 1 — Creating indexes on every field “just in case”
❌ Wrong — over-indexed collection has slow writes:
// 15 indexes on a write-heavy collection
taskSchema.index({ title: 1 });
taskSchema.index({ description: 1 });
taskSchema.index({ priority: 1 });
taskSchema.index({ status: 1 });
taskSchema.index({ dueDate: 1 });
// ... 10 more
// Every insert/update must update all 15 B-trees — writes are 3x slower
✅ Correct — only index fields that appear in real, frequent queries:
// Profile real queries first, then create targeted compound indexes
taskSchema.index({ user: 1, status: 1, createdAt: -1 }); // covers the main list query
Mistake 2 — Wrong compound index field order
❌ Wrong — range field first blocks use for equality + sort queries:
taskSchema.index({ createdAt: -1, userId: 1 });
// Query: { userId: 'x' } sorted by createdAt — CANNOT use this index efficiently
// MongoDB must scan all entries sorted by createdAt to find userId matches
✅ Correct — equality field first, sort field last (ESR rule):
taskSchema.index({ userId: 1, createdAt: -1 });
// Query: { userId: 'x' } sorted by createdAt — uses index perfectly
Mistake 3 — Not including all query fields in the index
❌ Wrong — index on userId only, query filters on status too:
taskSchema.index({ userId: 1 });
// Query: { userId: 'x', status: 'pending' }
// MongoDB uses the index to find all tasks for userId, then filters status in memory
// If userId has 10,000 tasks, 10,000 docs fetched just to find 20 pending ones
✅ Correct — include status in the compound index:
taskSchema.index({ userId: 1, status: 1 });
// MongoDB goes directly to userId+status='pending' entries — reads only matching docs
Quick Reference
| Index Type | Schema Definition | Use For |
|---|---|---|
| Single field | schema.index({ field: 1 }) |
Frequent single-field queries |
| Unique | schema.index({ email: 1 }, { unique: true }) |
Enforce uniqueness |
| Compound (ESR) | schema.index({ user: 1, status: 1, date: -1 }) |
Multi-field queries |
| Text | schema.index({ title: 'text', body: 'text' }) |
Full-text search |
| Partial | schema.index({ f: 1 }, { partialFilterExpression: {...} }) |
Subset of documents |
| TTL | schema.index({ expiresAt: 1 }, { expireAfterSeconds: 0 }) |
Auto-delete by date |
| Sparse | schema.index({ field: 1 }, { sparse: true }) |
Optional fields |
| Geospatial | schema.index({ loc: '2dsphere' }) |
Location queries |
| Explain query | Model.find(filter).explain('executionStats') |
Verify index usage |