Understanding MongoDB — Documents, Collections and JSON Data

📚 MERN Stack 📂 Chapter 1: Introduction to the MERN Stack 📄 Lesson 1020 Beginner 🕒 January 8, 2025

MongoDB is the “M” in MERN and the foundation that stores all of your application’s data. Unlike the relational databases you may have encountered (MySQL, PostgreSQL), MongoDB does not use tables, rows, or columns. Instead it stores data as documents — flexible, JSON-like objects — grouped into collections. Understanding how MongoDB organises and represents data is essential before you write a single line of Mongoose code, because every schema decision you make later flows from how MongoDB works at its core.

Relational vs Document Databases

Concept	Relational (MySQL)	MongoDB
Storage unit	Row in a table	Document in a collection
Schema	Fixed — defined upfront, all rows match	Flexible — documents in the same collection can differ
Data format	Columns with specific types	JSON-like BSON objects with nested fields and arrays
Relationships	Foreign keys and JOINs	Embedded documents or `$lookup` aggregation
Query language	SQL	MongoDB Query Language (MQL) — JSON-based
Scaling	Typically vertical (bigger server); horizontal scaling usually needs extra tooling	Horizontal scaling built in (more servers via sharding)

Note: MongoDB stores and transmits data as BSON (Binary JSON), a binary format that adds types JSON lacks, such as Date and ObjectId. In your Node.js code you work with ordinary JavaScript objects; the MongoDB driver that Mongoose is built on converts to and from BSON for you, so you rarely need to think about it directly.

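Tip: Every MongoDB document automatically gets a unique _id field of type ObjectId if you do not supply one. An ObjectId is a 12-byte value made up of a 4-byte timestamp, a 5-byte random value, and a 3-byte counter, which makes collisions extremely unlikely and means IDs sort roughly by creation time. It is usually displayed as a 24-character hex string. In Mongoose you can call .toString() on an ObjectId when you need that string, and Mongoose casts valid ID strings back to ObjectId in queries.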
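The timestamp inside an ObjectId can be read back with plain JavaScript. The sketch below assumes the standard layout, in which the first 4 bytes (8 hex characters) hold the seconds since the Unix epoch. `objectIdToDate` is a hypothetical helper name used only for illustration; with the real driver or Mongoose you would normally call `getTimestamp()` on an ObjectId instance instead.

```javascript
// Hypothetical helper: extract the creation time from an ObjectId's hex string.
// Assumes the standard layout: first 4 bytes = seconds since the Unix epoch.
function objectIdToDate(hexId) {
  if (!/^[0-9a-fA-F]{24}$/.test(hexId)) {
    throw new TypeError('Expected a 24-character hex string');
  }
  const seconds = parseInt(hexId.slice(0, 8), 16);
  return new Date(seconds * 1000); // Date expects milliseconds
}

// The sample _id used elsewhere in this lesson
const createdAt = objectIdToDate('64a1f2b3c8e4d5f6a7b8c9d0');
console.log(createdAt.toISOString());
```

Because the timestamp comes first, sorting by _id gives approximately the same order as sorting by creation time.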

Warning: MongoDB’s schema flexibility is powerful but can become a liability if you have no discipline. Two developers writing to the same collection without a Mongoose schema can produce inconsistent documents that break your application at runtime. Always define a Mongoose schema for every collection — it gives you the best of both worlds: MongoDB flexibility with enforced shape and validation.
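To make the warning concrete, here is a minimal sketch in plain JavaScript of two inconsistent documents and a shape check that catches the bad one. `validatePost` is a hypothetical function written for this example. It is not Mongoose's API, only a simplified stand-in for the kind of checking a Mongoose schema does for you before a document is saved.

```javascript
// Two documents written to the same collection by different developers.
// Without a schema, MongoDB would accept both shapes.
const posts = [
  { title: 'Post A', tags: ['mern', 'beginner'] },
  { title: 'Post B', tags: 'mern' }, // tags is a string here, not an array
];

// Hypothetical stand-in for schema validation: returns a list of problems.
function validatePost(doc) {
  const errors = [];
  if (typeof doc.title !== 'string' || doc.title.length === 0) {
    errors.push('title must be a non-empty string');
  }
  if (!Array.isArray(doc.tags)) {
    errors.push('tags must be an array');
  }
  return errors;
}

for (const post of posts) {
  const errors = validatePost(post);
  console.log(post.title, errors.length === 0 ? 'ok' : errors);
}
```

Code that later calls `post.tags.map(...)` would throw on the second document, which is the kind of runtime failure an enforced schema prevents.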

Documents and Collections

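// A MongoDB document (a blog post), shown here as JSON.
// In the database, _id is stored as an ObjectId and the timestamps as Date values.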
{
  "_id": "64a1f2b3c8e4d5f6a7b8c9d0",
  "title": "Getting Started with MERN",
  "slug": "getting-started-with-mern",
  "body": "The MERN stack is a powerful...",
  "author": {
    "name": "Jane Smith",
    "email": "jane@example.com"
  },
  "tags": ["mern", "javascript", "beginner"],
  "published": true,
  "viewCount": 142,
  "createdAt": "2025-01-01T00:00:00.000Z",
  "updatedAt": "2025-01-15T10:30:00.000Z"
}

The document above shows MongoDB’s key strengths: the author field is a nested object (no JOIN needed), tags is an array, and different documents in the same posts collection can have different fields without MongoDB rejecting them. Your application code still has to cope with whatever shapes it reads back.
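Once a document like this reaches your Node.js code it behaves like any JavaScript object. The sketch below uses a trimmed copy of the post above. The `subtitle` field is a hypothetical example of a field that some documents have and this one does not.

```javascript
// A trimmed copy of the blog post document, as a plain JavaScript object
const post = {
  title: 'Getting Started with MERN',
  author: { name: 'Jane Smith', email: 'jane@example.com' },
  tags: ['mern', 'javascript', 'beginner'],
  published: true,
};

// Nested object: reach into it with dot notation, no JOIN involved
const authorName = post.author.name;

// Array field: use normal array methods
const isBeginner = post.tags.includes('beginner');

// Field missing from this document: fall back to a default
const subtitle = post.subtitle ?? '(none)';

console.log(authorName, isBeginner, subtitle);
```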

MongoDB Hierarchy

Level	Name	Analogy (SQL)	Example
1	MongoDB Server	Database Server	localhost:27017
2	Database	Database / Schema	`blogdb`
3	Collection	Table	`posts`, `users`, `comments`
4	Document	Row	One blog post object
5	Field	Column	`title`, `body`, `tags`

BSON Data Types

BSON Type	JavaScript Equivalent	Common Use
String	`string`	Text fields — title, body, slug
Number (Int32 / Double)	`number`	Counts, prices, ratings
Boolean	`boolean`	Flags — published, active, verified
Array	`Array`	Tags, list of IDs, embedded objects
Object	`object`	Nested sub-documents — address, author
ObjectId	`ObjectId` object (shown as a 24-character hex string)	Document `_id`, references to other documents
Date	`Date`	createdAt, updatedAt, dueDate
Null	`null`	Optional fields with no value
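One reason BSON carries more types than JSON is that plain JSON has no date type. The short sketch below uses only built-in JavaScript to show a Date turning into a string after a JSON round trip, which is the loss that BSON's Date type avoids.

```javascript
// A document with a real Date value
const original = {
  title: 'Hello',
  createdAt: new Date('2025-01-01T00:00:00.000Z'),
};

// Round trip through JSON text, as happens when sending data over HTTP
const roundTripped = JSON.parse(JSON.stringify(original));

console.log(original.createdAt instanceof Date); // true
console.log(typeof roundTripped.createdAt);      // "string"
```

The same thing happens when your Express API sends a document to React, so the client usually has to convert date strings back with `new Date(...)`.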

Embedding vs Referencing

One of the most important decisions in MongoDB schema design is whether to embed related data inside a document or reference it by ID.

// Embedding — author data lives inside the post document
// Good when: author data rarely changes, you always need it with the post
{
  "title": "MERN Tutorial",
  "author": { "name": "Jane", "email": "jane@example.com" }
}

// Referencing — post stores only the author's ObjectId
// Good when: author data is shared across many posts, may be updated
{
  "title": "MERN Tutorial",
  "authorId": "64a1f2b3c8e4d5f6a7b8c9d0"
}
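The trade-off is easier to see when the reference is resolved by hand. The sketch below uses in-memory arrays in place of real collections; the second post title and the `withAuthor` helper are made up for illustration. It mimics, in a simplified way, what a `$lookup` stage or Mongoose's `populate()` does for you.

```javascript
// Stand-ins for two collections (illustrative sample data)
const users = [
  { _id: '64a1f2b3c8e4d5f6a7b8c9d0', name: 'Jane', email: 'jane@example.com' },
];
const posts = [
  { title: 'MERN Tutorial', authorId: '64a1f2b3c8e4d5f6a7b8c9d0' },
  { title: 'Mongoose Basics', authorId: '64a1f2b3c8e4d5f6a7b8c9d0' },
];

// Look users up by _id
const usersById = new Map(users.map((user) => [user._id, user]));

// Hypothetical helper: swap the authorId for the referenced user document
function withAuthor(post) {
  const { authorId, ...rest } = post;
  return { ...rest, author: usersById.get(authorId) ?? null };
}

const resolved = posts.map(withAuthor);

// The author lives in one place, so a single update reaches every post
users[0].name = 'Jane Smith';
console.log(resolved.map((post) => `${post.title} by ${post.author.name}`));
```

With embedding, the same rename would have to be applied to every post that carries a copy of the author.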

MongoDB Atlas — The Cloud Option

MongoDB Atlas Free Tier (M0)
════════════════════════════
Storage    : 512 MB
RAM        : Shared
Region     : Choose closest to your users
Connection : mongodb+srv://username:password@cluster.mongodb.net/dbname

Advantages over local MongoDB for learners:
  ✓ No local installation required
  ✓ Accessible from any machine or deployment environment
  ✓ Built-in monitoring and alerts (automated backups are a paid-tier feature)
  ✓ Mirrors the setup you will use in production
  ✓ Free tier is permanent — not a trial
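A common stumbling block with the connection string is a password that contains characters such as @ or /, which must be percent-encoded inside a MongoDB URI. The sketch below builds the string in the format shown above. `buildAtlasUri` and all the sample values are hypothetical; your real host name comes from the Atlas dashboard.

```javascript
// Hypothetical helper: build a connection string in the format shown above.
// Username and password are percent-encoded so special characters are safe.
function buildAtlasUri({ username, password, host, dbName }) {
  const user = encodeURIComponent(username);
  const pass = encodeURIComponent(password);
  return `mongodb+srv://${user}:${pass}@${host}/${dbName}`;
}

// Placeholder values for illustration only
const uri = buildAtlasUri({
  username: 'appUser',
  password: 'p@ss/word',
  host: 'cluster.mongodb.net',
  dbName: 'blogdb',
});

console.log(uri);
```

In a real project, keep the credentials in environment variables rather than in source code.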

Common Mistakes

Mistake 1 — Deeply nesting everything

❌ Wrong — embedding all related data regardless of update patterns:

// Post with deeply nested comments and their authors and their profiles...
// Updating a user's name now requires updating hundreds of post documents
{ "title": "...", "comments": [ { "author": { "name": "...", "profile": {...} } } ] }

✅ Correct — embed data that is read together and rarely updated independently; reference data that is shared or updated frequently.

Mistake 2 — Using MongoDB like a relational database

❌ Wrong — creating a separate collection for every relationship and joining everything with $lookup:

post_tags table → tag_id foreign key → tags table → tag_category_id → tag_categories
// This is SQL thinking — MongoDB does not need this level of normalisation

✅ Correct — store tags as an array inside the post document. MongoDB is optimised for reading complete documents, not reconstructing data from many collections.
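Storing tags as an array keeps querying simple: in MongoDB, a filter such as `{ tags: "mern" }` matches every document whose tags array contains that value. The sketch below reproduces that matching rule over an in-memory array; `findByTag` and the second sample post are made up for illustration.

```javascript
// Stand-in for the posts collection (illustrative sample data)
const posts = [
  { title: 'Getting Started with MERN', tags: ['mern', 'javascript', 'beginner'] },
  { title: 'Advanced Aggregation', tags: ['mongodb', 'advanced'] },
];

// Hypothetical helper: same matching rule as the filter { tags: tag }
function findByTag(docs, tag) {
  return docs.filter((doc) => doc.tags.includes(tag));
}

console.log(findByTag(posts, 'mern').map((post) => post.title));
```

No join table or second collection is involved; the tags travel with the post.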

Mistake 3 — Ignoring indexes

❌ Wrong — querying a collection of 100,000 documents with no index on the query field:

Post.find({ slug: 'getting-started' }) // full collection scan — very slow at scale

✅ Correct — add an index on fields you query frequently:

// In your Mongoose schema
postSchema.index({ slug: 1 }, { unique: true });
postSchema.index({ createdAt: -1 }); // latest posts first
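The difference an index makes can be imitated in plain JavaScript. This sketch is a deliberate simplification: it uses a `Map` as the lookup structure, whereas MongoDB's indexes are B-tree based, and the function names and generated data are hypothetical. The point is only the contrast between examining every document and jumping straight to the match.

```javascript
// 100,000 generated documents standing in for a collection
const posts = Array.from({ length: 100000 }, (_, i) => ({
  slug: `post-${i}`,
  title: `Post ${i}`,
}));

// No index: examine documents one by one until the slug matches
function findBySlugScan(slug) {
  let examined = 0;
  for (const post of posts) {
    examined += 1;
    if (post.slug === slug) return { post, examined };
  }
  return { post: null, examined };
}

// "Index": a lookup structure built once, keyed by slug
const slugIndex = new Map(posts.map((post) => [post.slug, post]));
function findBySlugIndexed(slug) {
  return slugIndex.get(slug) ?? null;
}

const scanResult = findBySlugScan('post-99999');
console.log(`Scan examined ${scanResult.examined} documents`);
console.log(findBySlugIndexed('post-99999').title);
```

Like a real index, the lookup structure costs memory and must be kept up to date on every write, which is why you index only the fields you actually query.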

Quick Reference

Task	MongoDB Shell Command
Show all databases	`show dbs`
Use a database	`use blogdb`
Show collections	`show collections`
Insert a document	`db.posts.insertOne({ title: "Hello" })`
Find all documents	`db.posts.find()`
Find with filter	`db.posts.find({ published: true })`
Count documents	`db.posts.countDocuments()`
Delete a document	`db.posts.deleteOne({ _id: ObjectId("...") })`
