Indexes — Single Field, Compound, Text, and Geospatial

Without indexes, every query MongoDB executes performs a collection scan — it reads every single document in the collection to find matches. On a collection of one million tasks, a query like { userId: '...' } reads all one million documents even if only ten match. An index on the userId field lets MongoDB jump directly to the matching documents in milliseconds. Indexes are the single most impactful performance optimisation you can apply to a MongoDB database — more than hardware upgrades, more than query rewrites. Understanding when to create them, what types exist, and how to verify they are being used is a core backend skill.

Index Types

Index Type	Created With	Supports
Single Field	`{ userId: 1 }`	Equality, range, sort on one field
Compound	`{ userId: 1, status: 1, createdAt: -1 }`	Multi-field queries, covered queries
Multikey	`{ tags: 1 }` on array field	Array element queries — one index entry per element
Text	`{ title: 'text', description: 'text' }`	Full-text search with `$text` and relevance scoring
Geospatial (2dsphere)	`{ location: '2dsphere' }`	Near, within, intersects queries on GeoJSON
Hashed	`{ _id: 'hashed' }`	Equality only — used for sharding
Wildcard	`{ '$**': 1 }`	All fields — flexible schemas
Partial	`{ status: 1 }, { partialFilterExpression: { active: true } }`	Index only a subset of documents
Sparse	`{ phone: 1 }, { sparse: true }`	Only documents where field exists
TTL	`{ expiresAt: 1 }, { expireAfterSeconds: 0 }`	Automatic document expiration
Unique	`{ email: 1 }, { unique: true }`	Enforces uniqueness — like a DB constraint

The ESR Rule — Compound Index Field Order

Position	Letter	Field Type	Example
1st	E — Equality	Fields used with exact match	`userId: 1`
2nd	S — Sort	Fields used in `.sort()`	`createdAt: -1`
3rd	R — Range	Fields used with `$gt`, `$lt`, `$in`	`status: 1`

Note: The order of fields in a compound index is critical. An index on { userId: 1, status: 1, createdAt: -1 } can support queries that filter by userId, or by userId + status, or by userId + status + createdAt. But it cannot support a query that filters only by status without also filtering by userId. MongoDB uses the index from left to right — it can use a prefix of the index but not skip fields. The ESR rule (Equality, Sort, Range) gives you the optimal field ordering for most compound indexes.

Tip: A covered query is one where MongoDB can answer the query entirely from the index without reading any documents. This is the fastest possible query — no document I/O at all. Create a covered query by including all projected fields in the index: if you query { userId: 1 } and project { title: 1, status: 1, _id: 0 }, an index on { userId: 1, title: 1, status: 1 } covers the query completely. Look for covered queries in the Explain Plan output: totalDocsExamined: 0.

Warning: Every index consumes disk space and slows down writes — MongoDB must update every index on every insert, update, and delete. Do not create indexes speculatively. Create them for queries you can actually measure as slow using explain('executionStats'). A collection with 20 indexes will have significantly slower write throughput than one with 5 well-chosen indexes. Audit and drop unused indexes regularly using db.collection.aggregate([{ $indexStats: {} }]) to see which indexes have zero usage.

Creating and Managing Indexes

// ── Single field index ────────────────────────────────────────────────────
taskSchema.index({ userId: 1 });                     // ascending
taskSchema.index({ createdAt: -1 });                  // descending — newest first

// ── Unique index ──────────────────────────────────────────────────────────
userSchema.index({ email: 1 }, { unique: true });    // no duplicate emails

// ── Compound index — ESR rule: Equality, Sort, Range ──────────────────────
// Query: { userId, status } sorted by createdAt descending
taskSchema.index({ userId: 1, createdAt: -1 });       // equality + sort
taskSchema.index({ userId: 1, status: 1, createdAt: -1 }); // equality + equality + sort

// ── Partial index — only index active, non-deleted tasks ──────────────────
taskSchema.index(
    { userId: 1, dueDate: 1 },
    { partialFilterExpression: { status: { $ne: 'completed' }, deletedAt: { $exists: false } } }
);
// Much smaller index — only covers the documents that matter

// ── TTL index — auto-delete expired sessions ──────────────────────────────
sessionSchema.index(
    { expiresAt: 1 },
    { expireAfterSeconds: 0 }   // delete when expiresAt date is reached
);

// ── Text index — full-text search ─────────────────────────────────────────
taskSchema.index(
    { title: 'text', description: 'text' },
    { weights: { title: 10, description: 3 }, name: 'task_text_search' }
);

// Using text search:
const tasks = await Task.find(
    { $text: { $search: 'client meeting deadline' } },
    { score: { $meta: 'textScore' } }
).sort({ score: { $meta: 'textScore' } });

// ── Sparse index — only for documents that have the field ─────────────────
userSchema.index({ githubId: 1 }, { sparse: true, unique: true });
// Only indexes users with a githubId — doesn't require all users to have one

// ── Multikey index — array field ──────────────────────────────────────────
taskSchema.index({ tags: 1 });
// Query: { tags: 'urgent' } — uses the multikey index efficiently
// Query: { tags: { $all: ['urgent', 'Q4'] } } — also uses multikey index

// ── Geospatial index ──────────────────────────────────────────────────────
locationSchema.index({ coordinates: '2dsphere' });
// coordinates stored as GeoJSON: { type: 'Point', coordinates: [lng, lat] }

// Find locations within 5km of a point
const nearby = await Location.find({
    coordinates: {
        $near: {
            $geometry:    { type: 'Point', coordinates: [-73.97, 40.77] },
            $maxDistance: 5000,   // metres
        },
    },
});

Analysing Query Performance with explain()

// ── explain() — understand how MongoDB executes a query ───────────────────

// Mongoose explain
const explanation = await Task
    .find({ userId: '64a1f...', status: 'pending' })
    .sort({ createdAt: -1 })
    .explain('executionStats');

console.log(JSON.stringify(explanation, null, 2));

// Key fields to examine:
// winningPlan.inputStage.stage
//   'COLLSCAN' → full collection scan (BAD — needs an index)
//   'IXSCAN'   → index scan (GOOD — using an index)
//   'FETCH'    → fetching documents after index scan (normal)
//   'SORT'     → in-memory sort (may need index on sort field)

// executionStats:
//   nReturned           — documents returned
//   totalDocsExamined   — documents MongoDB had to look at
//   totalKeysExamined   — index entries examined
//   executionTimeMillis — query duration

// A good query: nReturned == totalDocsExamined (ratio 1:1)
// A bad query:  nReturned = 10, totalDocsExamined = 100,000 (ratio 1:10000)

// ── Index usage statistics — find unused indexes ──────────────────────────
const indexStats = await Task.aggregate([{ $indexStats: {} }]);
indexStats.forEach(stat => {
    console.log(`${stat.name}: ${stat.accesses.ops} operations since server restart`);
});
// Indexes with 0 ops are candidates for removal

// ── Check existing indexes ────────────────────────────────────────────────
const indexes = await Task.collection.getIndexes();
console.log(indexes);

How It Works

Step 1 — B-Tree Structure Enables Fast Lookups

Most MongoDB indexes use a B-tree (balanced tree) data structure. The B-tree stores field values in sorted order. To find { userId: 'abc' }, MongoDB navigates the B-tree from the root, making comparisons at each node until it finds the matching leaf. This takes O(log n) time — for 1 million documents, that is about 20 comparisons instead of 1 million. The tree is kept balanced automatically on every write.

Step 2 — The Query Planner Chooses the Best Index

When multiple indexes could satisfy a query, MongoDB’s query planner runs them in parallel in a trial period, counts how many documents each approach examines, and picks the winner. The winning plan is cached for that query shape. You can see the winning and rejected plans in explain('allPlansExecution'). The planner re-evaluates periodically or after a certain number of writes to adapt to changing data distributions.

Step 3 — Compound Indexes Serve Multiple Query Patterns

A single compound index on { userId: 1, status: 1, createdAt: -1 } can efficiently serve these queries: { userId }, { userId, status }, { userId, status, createdAt }, and sort by createdAt when filtering on userId. This is the index prefix property — MongoDB can use any left-anchored prefix of the index. Planning compound indexes thoughtfully means fewer total indexes while serving more query patterns.

Step 4 — Text Indexes Enable Language-Aware Search

A text index tokenises the indexed fields — splitting text into words, removing stopwords (“the”, “a”, “is”), and applying stemming (reducing “running” and “runs” to “run”). When you query { $text: { $search: 'running tasks' } }, MongoDB searches for stemmed forms of each word. Relevance scores are computed based on term frequency and the field weights you assign. This is simpler but less powerful than a dedicated search engine like Elasticsearch.

Step 5 — TTL Indexes Automate Document Expiry

MongoDB runs a background thread every 60 seconds that checks TTL-indexed fields. Documents where the indexed date field is older than expireAfterSeconds are automatically deleted. This is ideal for session tokens, password reset tokens, rate-limit records, and audit logs. Setting expireAfterSeconds: 0 means “delete when the indexed date is reached” — the field value itself is the expiry timestamp.

Real-World Example: Index Strategy for Task Manager

// models/task.model.js — complete index strategy

const taskSchema = new mongoose.Schema({ /* ... fields ... */ }, { timestamps: true });

// 1. Primary access pattern: all tasks for a user, sorted by date
taskSchema.index({ user: 1, createdAt: -1 });

// 2. Filter by user + status (most common list query)
taskSchema.index({ user: 1, status: 1, createdAt: -1 });

// 3. Filter by user + priority
taskSchema.index({ user: 1, priority: 1, createdAt: -1 });

// 4. Overdue tasks: user + dueDate (for due date queries)
taskSchema.index({ user: 1, dueDate: 1 });

// 5. Partial index: only active tasks (excludes completed — smaller, faster)
taskSchema.index(
    { user: 1, createdAt: -1 },
    {
        partialFilterExpression: { status: { $ne: 'completed' } },
        name: 'active_tasks_by_user',
    }
);

// 6. Full-text search
taskSchema.index(
    { title: 'text', description: 'text' },
    { weights: { title: 10, description: 3 } }
);

// 7. Tags multikey (for tag filtering)
taskSchema.index({ tags: 1 });

// models/user.model.js
userSchema.index({ email: 1 }, { unique: true });       // unique email
userSchema.index({ createdAt: -1 });                     // list users by date

// models/session.model.js
sessionSchema.index(
    { expiresAt: 1 },
    { expireAfterSeconds: 0, name: 'session_ttl' }       // auto-delete expired sessions
);
sessionSchema.index({ userId: 1 });                      // find sessions by user
sessionSchema.index({ token: 1 }, { unique: true });     // lookup by token

Common Mistakes

Mistake 1 — Creating indexes on every field “just in case”

❌ Wrong — over-indexed collection has slow writes:

// 15 indexes on a write-heavy collection
taskSchema.index({ title: 1 });
taskSchema.index({ description: 1 });
taskSchema.index({ priority: 1 });
taskSchema.index({ status: 1 });
taskSchema.index({ dueDate: 1 });
// ... 10 more
// Every insert/update must update all 15 B-trees — writes are 3x slower

✅ Correct — only index fields that appear in real, frequent queries:

// Profile real queries first, then create targeted compound indexes
taskSchema.index({ user: 1, status: 1, createdAt: -1 });  // covers the main list query

Mistake 2 — Wrong compound index field order

❌ Wrong — range field first blocks use for equality + sort queries:

taskSchema.index({ createdAt: -1, userId: 1 });
// Query: { userId: 'x' } sorted by createdAt — CANNOT use this index efficiently
// MongoDB must scan all entries sorted by createdAt to find userId matches

✅ Correct — equality field first, sort field last (ESR rule):

taskSchema.index({ userId: 1, createdAt: -1 });
// Query: { userId: 'x' } sorted by createdAt — uses index perfectly

Mistake 3 — Not including all query fields in the index

❌ Wrong — index on userId only, query filters on status too:

taskSchema.index({ userId: 1 });
// Query: { userId: 'x', status: 'pending' }
// MongoDB uses the index to find all tasks for userId, then filters status in memory
// If userId has 10,000 tasks, 10,000 docs fetched just to find 20 pending ones

✅ Correct — include status in the compound index:

taskSchema.index({ userId: 1, status: 1 });
// MongoDB goes directly to userId+status='pending' entries — reads only matching docs

Quick Reference

Index Type	Schema Definition	Use For
Single field	`schema.index({ field: 1 })`	Frequent single-field queries
Unique	`schema.index({ email: 1 }, { unique: true })`	Enforce uniqueness
Compound (ESR)	`schema.index({ user: 1, status: 1, date: -1 })`	Multi-field queries
Text	`schema.index({ title: 'text', body: 'text' })`	Full-text search
Partial	`schema.index({ f: 1 }, { partialFilterExpression: {...} })`	Subset of documents
TTL	`schema.index({ expiresAt: 1 }, { expireAfterSeconds: 0 })`	Auto-delete by date
Sparse	`schema.index({ field: 1 }, { sparse: true })`	Optional fields
Geospatial	`schema.index({ loc: '2dsphere' })`	Location queries
Explain query	`Model.find(filter).explain('executionStats')`	Verify index usage

Index Types #

The ESR Rule — Compound Index Field Order #

Creating and Managing Indexes #

Analysing Query Performance with explain() #

How It Works #

Step 1 — B-Tree Structure Enables Fast Lookups #

Step 2 — The Query Planner Chooses the Best Index #

Step 3 — Compound Indexes Serve Multiple Query Patterns #

Step 4 — Text Indexes Enable Language-Aware Search #

Step 5 — TTL Indexes Automate Document Expiry #

Real-World Example: Index Strategy for Task Manager #

Common Mistakes #

Mistake 1 — Creating indexes on every field “just in case” #

Mistake 2 — Wrong compound index field order #

Mistake 3 — Not including all query fields in the index #

Quick Reference #

🧠 Test Yourself #

📚 More in this Tutorial Series