T-SQL Data Types — Choosing the Right Type for Each Column

Choosing the right data type for each column is one of the most consequential schema design decisions — it affects storage size, query performance, and data integrity. SQL Server’s type system is richer than most databases, with specific types for Unicode strings, time zones, and exact decimals. The two most common mistakes in SQL Server schemas: using DATETIME (limited precision, no time zone) instead of DATETIME2, and using NVARCHAR(MAX) everywhere instead of appropriate fixed-length columns.

T-SQL Data Types for the BlogApp

-- ── Numeric types ──────────────────────────────────────────────────────────
-- INT:      -2,147,483,648 to 2,147,483,647     (4 bytes) — standard PK type
-- BIGINT:   -9 quintillion to 9 quintillion      (8 bytes) — high-volume PKs
-- SMALLINT: -32,768 to 32,767                    (2 bytes) — lookup tables
-- TINYINT:  0 to 255                             (1 byte)  — flags, small enums
-- DECIMAL(p,s) / NUMERIC(p,s): exact precision   — money, measurements
-- FLOAT: approximate (do NOT use for money or exact comparisons)

-- ── String types ──────────────────────────────────────────────────────────
-- VARCHAR(n):  non-Unicode (ASCII), up to n bytes  — English-only content
-- NVARCHAR(n): Unicode (UTF-16), up to n characters — ALWAYS use for user content
-- CHAR(n):     fixed-length non-Unicode             — codes, fixed-format strings
-- NCHAR(n):    fixed-length Unicode                 — rarely needed

-- The MAX keyword: VARCHAR(MAX) / NVARCHAR(MAX) = up to 2GB
-- Use MAX only when you genuinely need large text — regular columns prefer fixed limits
-- MAX columns cannot be included in indexes directly

-- ── Date and time types ───────────────────────────────────────────────────
-- DATE:             date only, no time        (3 bytes)  — birthdates
-- TIME(n):          time only, n=0-7 precision (3-5 bytes) — schedules
-- DATETIME:         legacy, 3.33ms precision  (8 bytes)  — AVOID in new code
-- DATETIME2(n):     high precision 0-7 decs   (6-8 bytes) — ALWAYS use this
-- DATETIMEOFFSET(n): DATETIME2 + time zone offset         — for global apps

-- ✅ Use DATETIME2(7) for all timestamps
-- ❌ Never use DATETIME (lower precision, 1753 minimum year)

-- ── Other important types ─────────────────────────────────────────────────
-- UNIQUEIDENTIFIER: GUID (16 bytes) — distributed IDs, but larger than INT
-- BIT:              0/1/NULL boolean (1 byte)
-- VARBINARY(n):     binary data, hashes
-- XML:              structured XML documents
-- JSON:             stored as NVARCHAR; use JSON functions (OPENJSON, JSON_VALUE)

-- ── BlogApp column type decisions ─────────────────────────────────────────
-- Posts.Id:           INT IDENTITY(1,1)   — simple auto-increment PK
-- Users.Id:           NVARCHAR(450)       — ASP.NET Core Identity uses string GUIDs
-- Posts.Title:        NVARCHAR(200)       — Unicode, reasonable max
-- Posts.Body:         NVARCHAR(MAX)       — large text content
-- Posts.Slug:         VARCHAR(200)        -- ASCII-safe slugs
-- Posts.PublishedAt:  DATETIME2(7)        — high precision UTC timestamp
-- Posts.ViewCount:    INT                 — fits well within INT range
-- RefreshTokens.Hash: VARBINARY(64)       — store hash of token, not raw token
-- Users.IsActive:     BIT NOT NULL DEFAULT 1
Note: Always use NVARCHAR for user-provided text content — names, titles, descriptions, comments. VARCHAR only stores ASCII (Latin1) characters; any Unicode character (emoji, accented letters, Chinese/Arabic/Hebrew) in a VARCHAR column is either mangled or causes an error. The BlogApp serves international users — use NVARCHAR for all user-facing text. The overhead is 2x storage vs VARCHAR, which is acceptable for text data.
Tip: Use DATETIME2(7) for all date-time columns in new schemas. Compared to the legacy DATETIME type: DATETIME2 has higher precision (100 nanosecond vs 3.33ms), a wider range (year 0001–9999 vs 1753–9999), and is ISO 8601 compatible. Entity Framework Core maps C#’s DateTime to DATETIME2(7) by default. All new SQL Server development should use DATETIME2DATETIME is a legacy type kept for backward compatibility.
Warning: Avoid NVARCHAR(MAX) as a default for all string columns. MAX columns cannot be included in standard indexes (they require full-text indexes for searching), cannot participate in some query optimisations, and prevent row compression. Define reasonable length limits: NVARCHAR(200) for titles, NVARCHAR(500) for descriptions, VARCHAR(100) for email addresses, VARCHAR(200) for URL slugs. Only use MAX for body text, HTML content, and JSON data that genuinely cannot be bounded.

Type Storage Reference

Type Storage Range / Notes
INT 4 bytes ±2.1 billion
BIGINT 8 bytes ±9.2 quintillion
NVARCHAR(n) 2n bytes Unicode, max 4000 chars
VARCHAR(n) n bytes ASCII, max 8000 chars
DATETIME2(7) 8 bytes 100ns precision, 0001–9999
DATETIME 8 bytes 3.33ms precision — avoid
BIT 1 byte 0, 1, or NULL
UNIQUEIDENTIFIER 16 bytes GUID format

Common Mistakes

Mistake 1 — Using DATETIME instead of DATETIME2 (lower precision, legacy range)

❌ Wrong — CreatedAt DATETIME; milliseconds rounded to nearest 3.33ms; cannot store dates before 1753.

✅ Correct — CreatedAt DATETIME2(7) NOT NULL DEFAULT SYSUTCDATETIME().

Mistake 2 — Using VARCHAR for user names and content (breaks for non-ASCII characters)

❌ Wrong — DisplayName VARCHAR(100); user named “André” or “山田” has their name corrupted.

✅ Correct — DisplayName NVARCHAR(100); handles all Unicode characters.

🧠 Test Yourself

A table stores post view counts. Currently INT (max ~2.1 billion). A viral post gets 3 billion views. What happens?