SELECT and WHERE — Retrieving and Filtering Data

The SELECT statement is the foundation of all data retrieval in SQL. Writing effective SELECT queries — with precise column lists, well-structured WHERE clauses, and proper NULL handling — is the most-used skill in SQL Server development. The most common mistakes: using SELECT * in production code (fragile when the schema changes), forgetting that NULL = NULL evaluates to UNKNOWN (not TRUE), and using pattern matching LIKE with a leading wildcard (LIKE '%pattern') which cannot use an index.

SELECT and WHERE Fundamentals

-- ── Basic SELECT — always name columns explicitly ──────────────────────────
SELECT
    p.Id,
    p.Title,
    p.Slug,
    p.Status,
    p.ViewCount,
    p.PublishedAt,
    p.CreatedAt
FROM dbo.Posts p;

-- ── Column aliases ────────────────────────────────────────────────────────
SELECT
    p.Id          AS PostId,
    p.Title       AS PostTitle,
    u.DisplayName AS AuthorName,
    p.ViewCount   AS Views
FROM dbo.Posts p
JOIN dbo.Users u ON u.Id = p.AuthorId;

-- ── WHERE with comparison operators ──────────────────────────────────────
SELECT Id, Title, Status
FROM   dbo.Posts
WHERE  IsPublished = 1                     -- BIT comparison
  AND  PublishedAt >= '2024-01-01'         -- date comparison
  AND  ViewCount BETWEEN 100 AND 10000     -- inclusive range
  AND  Status IN ('published', 'featured') -- list membership
  AND  Title LIKE 'Getting Started%'       -- starts-with (can use index)
  AND  CategoryId IS NOT NULL;             -- NULL check (NOT: IS NOT NULL)

-- ── NOT and OR ────────────────────────────────────────────────────────────
SELECT Id, Title
FROM   dbo.Posts
WHERE  Status NOT IN ('archived', 'draft')
   OR  IsFeatured = 1;  -- parentheses clarify precedence with mixed AND/OR

-- ── LIKE wildcards ────────────────────────────────────────────────────────
-- %    matches any sequence of characters (0 or more)
-- _    matches exactly one character
-- [abc] matches one character from the set: a, b, or c
-- [^abc] matches one character NOT in the set

SELECT Id, Title FROM dbo.Posts
WHERE Title LIKE 'Getting%';        -- ✅ leading text — can use index on Title
WHERE Title LIKE '%Started%';       -- ⚠️  leading wildcard — full table scan
WHERE Slug  LIKE '__-[a-z]%';       -- exactly 2 chars, dash, then a-z letter

-- ── NULL handling — IS NULL / IS NOT NULL ─────────────────────────────────
-- ❌ WRONG: NULL = NULL evaluates to UNKNOWN (not TRUE in SQL)
SELECT * FROM dbo.Posts WHERE PublishedAt = NULL;   -- returns no rows!

-- ✅ CORRECT: use IS NULL / IS NOT NULL
SELECT * FROM dbo.Posts WHERE PublishedAt IS NULL;  -- unpublished posts
SELECT * FROM dbo.Posts WHERE PublishedAt IS NOT NULL; -- published posts

-- ── DISTINCT — remove duplicate rows ──────────────────────────────────────
-- Get unique AuthorIds who have published at least one post
SELECT DISTINCT AuthorId FROM dbo.Posts WHERE IsPublished = 1;

-- ── TOP — limit result rows ───────────────────────────────────────────────
SELECT TOP 10 Id, Title, ViewCount
FROM   dbo.Posts
WHERE  IsPublished = 1
ORDER  BY ViewCount DESC;  -- TOP without ORDER BY is non-deterministic
Note: SQL uses three-valued logic: TRUE, FALSE, and UNKNOWN. Any comparison involving NULL produces UNKNOWN — not TRUE or FALSE. This is why WHERE column = NULL never returns rows: the comparison evaluates to UNKNOWN, which is treated as FALSE by the WHERE filter. UNKNOWN also propagates through AND and OR in non-obvious ways. Always use IS NULL and IS NOT NULL for NULL comparisons — they are the only predicates that evaluate correctly with NULL values.
Tip: Use LIKE 'prefix%' (trailing wildcard) rather than LIKE '%suffix%' (leading wildcard) for searchable columns. A leading wildcard prevents SQL Server from using a B-tree index — it must scan every row, reading the full column value to check for a match. A trailing wildcard allows an index seek: SQL Server can jump directly to the matching range. For full-text search needs (search anywhere in a document), use SQL Server’s Full-Text Search feature rather than LIKE with leading wildcards.
Warning: Never use SELECT * in production application queries. It creates fragile code that breaks when columns are added or reordered, returns more data than needed (wasting network bandwidth and I/O), and can cause EF Core to map columns incorrectly if the result set order changes. Always name every column you need. The exception: exploratory ad-hoc queries in SSMS or Azure Data Studio where you are investigating data, not writing production code.

Common Mistakes

Mistake 1 — NULL comparison with = instead of IS NULL (no rows returned)

❌ Wrong — WHERE PublishedAt = NULL — returns zero rows; NULL comparisons are always UNKNOWN.

✅ Correct — WHERE PublishedAt IS NULL — correctly identifies rows with NULL PublishedAt.

Mistake 2 — LIKE with leading wildcard for filtering (full table scan)

❌ Wrong — WHERE Title LIKE '%Started%' — scans every row; slow on large tables.

✅ Correct — use trailing wildcard LIKE 'Started%' or SQL Server Full-Text Search for contains-style search.

🧠 Test Yourself

A query has WHERE CategoryId = NULL. The Posts table has 50 rows with NULL CategoryId. How many rows does this query return?