SELECT — Reading Data from Tables

SQL’s SELECT statement is the most fundamental operation in any database-backed application — it is how you read data out of the database. Every FastAPI endpoint that returns data (a list of posts, a single user, search results) ultimately translates to one or more SELECT statements. While SQLAlchemy generates these statements for you, understanding the underlying SQL makes debugging easier, helps you write more efficient ORM queries, and is essential when you need to drop down to raw SQL for complex queries that the ORM cannot express cleanly.

Basic SELECT

-- Select all columns from a table
SELECT * FROM users;

-- Select specific columns (preferred in production — avoids loading unnecessary data)
SELECT id, name, email FROM users;

-- Alias columns for readability
SELECT
    id          AS user_id,
    name        AS full_name,
    email       AS email_address,
    created_at  AS registered_at
FROM users;

-- Select with a literal value or expression
SELECT
    id,
    name,
    UPPER(name)          AS name_upper,
    LENGTH(name)         AS name_length,
    NOW()                AS query_time,
    'blog_user'          AS user_type
FROM users;

-- Select from multiple schema objects
SELECT current_database(), current_user, version();
Note: Avoid SELECT * in production application code. It fetches every column, including large text fields or binary data you may not need, wastes network bandwidth and memory, and breaks silently when columns are added or removed. Always name the columns you need explicitly: SELECT id, title, status FROM posts. The only acceptable uses of SELECT * are ad-hoc exploration in psql and SELECT COUNT(*) (which is idiomatic SQL).
Tip: Column aliases set with AS apply only to the output — you cannot reference an alias in the same query’s WHERE or GROUP BY clause because those are evaluated before SELECT in the logical processing order. The logical order of SQL clause evaluation is: FROM → WHERE → GROUP BY → HAVING → SELECT → DISTINCT → ORDER BY → LIMIT. This ordering explains many “why can’t I use my alias here?” questions.
Warning: SQL string literals use single quotes'hello'. Double quotes are used for identifiers (table names, column names) when they contain spaces or uppercase letters: "User". Mixing them up causes confusing errors. In PostgreSQL, SELECT "name" selects the column named name, while SELECT 'name' returns the literal string name for every row. Never use double quotes for string values.

DISTINCT — Remove Duplicate Rows

-- All roles, including duplicates
SELECT role FROM users;
-- user, admin, user, editor, user, user, admin

-- Only unique roles
SELECT DISTINCT role FROM users;
-- user, admin, editor

-- DISTINCT ON — keep first row per unique value of the specified column
-- (useful for "latest post per user" type queries)
SELECT DISTINCT ON (author_id)
    id, author_id, title, created_at
FROM posts
ORDER BY author_id, created_at DESC;
-- Returns the most recent post for each author

Column Expressions and Type Casting

-- String operations
SELECT
    name,
    LOWER(email)                        AS email_lower,
    TRIM(name)                          AS name_trimmed,
    SUBSTRING(name FROM 1 FOR 5)        AS first_5_chars,
    CONCAT(name, ' <', email, '>')     AS display_name,
    name || ' — ' || role               AS name_role    -- || is string concat
FROM users;

-- Numeric operations
SELECT
    id,
    view_count,
    view_count * 1.0 / 1000             AS views_in_thousands,
    ROUND(view_count / 1000.0, 2)       AS views_rounded
FROM posts;

-- Date operations
SELECT
    id,
    created_at,
    NOW() - created_at                  AS age,
    DATE_PART('year', created_at)       AS year_created,
    TO_CHAR(created_at, 'YYYY-MM-DD')   AS created_date
FROM posts;

-- Type casting
SELECT
    id::TEXT                            AS id_as_string,
    view_count::FLOAT / 100             AS percentage,
    CAST(created_at AS DATE)            AS date_only
FROM posts;

LIMIT and OFFSET — Pagination

-- Get the first 10 rows
SELECT id, title FROM posts LIMIT 10;

-- Pagination: page 1 (rows 1–10)
SELECT id, title, created_at
FROM posts
ORDER BY created_at DESC
LIMIT 10 OFFSET 0;

-- Page 2 (rows 11–20)
SELECT id, title, created_at
FROM posts
ORDER BY created_at DESC
LIMIT 10 OFFSET 10;

-- Page n formula: OFFSET = (page_number - 1) * page_size
-- Page 3 with 20 per page:
SELECT id, title FROM posts
ORDER BY id
LIMIT 20 OFFSET 40;   -- OFFSET = (3 - 1) * 20 = 40

Common Mistakes

Mistake 1 — Using SELECT * in application code

❌ Wrong — fetches all columns including large unused ones:

SELECT * FROM posts;   -- fetches body (may be megabytes) even if you only need title

✅ Correct — select only needed columns:

SELECT id, title, status, created_at FROM posts;   -- ✓

Mistake 2 — Forgetting ORDER BY with LIMIT/OFFSET

❌ Wrong — results in undefined, inconsistent order:

SELECT id, title FROM posts LIMIT 10;   -- different rows each time without ORDER BY!

✅ Correct — always specify ORDER BY when paginating:

SELECT id, title FROM posts ORDER BY created_at DESC LIMIT 10;   -- ✓

Mistake 3 — Using double quotes for string values

❌ Wrong — double quotes mean identifier, not string:

SELECT id FROM users WHERE email = "alice@example.com";
-- ERROR: column "alice@example.com" does not exist

✅ Correct — single quotes for string literals:

SELECT id FROM users WHERE email = 'alice@example.com';   -- ✓

Quick Reference

Pattern SQL
All columns SELECT * FROM table
Named columns SELECT col1, col2 FROM table
Column alias SELECT col AS alias FROM table
Remove duplicates SELECT DISTINCT col FROM table
Limit results SELECT ... LIMIT n OFFSET m
String concat col1 || ' ' || col2
Type cast col::type or CAST(col AS type)

🧠 Test Yourself

You query SELECT id, title FROM posts LIMIT 10 without ORDER BY. On the first request you get posts 1–10. On the second request after a new post is inserted, what might you get?