Aggregate functions compute a single result from a set of rows. Instead of reading individual row values, they summarise: how many rows match a condition, what is the total of a column, what is the average, the maximum, the minimum. In a FastAPI application, aggregates power dashboard endpoints (total posts by status, average rating per post, most active authors), reporting queries (monthly revenue, weekly new user counts), and validation queries (is this username already taken? how many active sessions does this user have?).
COUNT, SUM, AVG, MIN, MAX
-- COUNT(*): count all rows (including NULLs)
SELECT COUNT(*) FROM posts; -- total posts
SELECT COUNT(*) FROM posts WHERE status = 'published'; -- published only
-- COUNT(col): count non-NULL values in a column
SELECT COUNT(published_at) FROM posts; -- posts with a published_at date
-- Note: rows where published_at IS NULL are NOT counted
-- COUNT(DISTINCT col): count unique non-NULL values
SELECT COUNT(DISTINCT author_id) FROM posts; -- number of authors who have posts
-- SUM
SELECT SUM(view_count) FROM posts; -- total views across all posts
SELECT SUM(view_count) FROM posts WHERE author_id = 1; -- total views for author 1
-- AVG (returns NUMERIC/FLOAT)
SELECT AVG(view_count) FROM posts; -- average views per post
SELECT ROUND(AVG(view_count), 2) FROM posts; -- rounded to 2 decimal places
-- MIN and MAX
SELECT MIN(created_at) FROM users; -- earliest registration
SELECT MAX(view_count) FROM posts; -- highest view count ever
SELECT MIN(price), MAX(price), AVG(price) FROM products; -- price stats in one query
-- All aggregates in one summary query
SELECT
COUNT(*) AS total_posts,
COUNT(*) FILTER (WHERE status = 'published') AS published_posts,
COUNT(*) FILTER (WHERE status = 'draft') AS draft_posts,
SUM(view_count) AS total_views,
ROUND(AVG(view_count), 0) AS avg_views,
MAX(view_count) AS max_views
FROM posts;
COUNT(*) and COUNT(column_name) behave differently. COUNT(*) counts all rows including those with NULL values in any column. COUNT(column_name) counts only rows where that specific column is NOT NULL. Use COUNT(*) when you want to count rows; use COUNT(column_name) when you want to count how many rows have a value for that column. COUNT(DISTINCT column) counts unique non-NULL values.FILTER (WHERE condition) clause attached to an aggregate is a PostgreSQL-specific feature that filters which rows are included in the aggregate calculation without affecting other aggregates in the same query. Use it to compute multiple conditional counts in a single query: COUNT(*) FILTER (WHERE status = 'published') AS pub_count, COUNT(*) FILTER (WHERE status = 'draft') AS draft_count. This is far more efficient than running two separate COUNT queries.COUNT(*). AVG(rating) only averages non-NULL ratings. If you have 10 posts and 3 have a NULL rating, AVG calculates the average of the 7 non-NULL values, not of all 10. If you want to treat NULL as 0 in the average, use AVG(COALESCE(rating, 0)). Be explicit about your intention when NULLs are involved in aggregate calculations.FILTER Clause
-- FILTER applies a WHERE condition to a specific aggregate only
SELECT
author_id,
COUNT(*) AS total_posts,
COUNT(*) FILTER (WHERE status = 'published') AS published,
COUNT(*) FILTER (WHERE status = 'draft') AS drafts,
SUM(view_count) FILTER (WHERE status = 'published') AS published_views,
AVG(view_count) FILTER (WHERE view_count > 0) AS avg_views_nonzero
FROM posts
GROUP BY author_id;
-- Compare to CASE WHEN (older pattern, still valid):
SELECT
author_id,
COUNT(*) AS total,
SUM(CASE WHEN status = 'published' THEN 1 ELSE 0 END) AS published
FROM posts
GROUP BY author_id;
-- FILTER is cleaner and more readable than the CASE WHEN pattern
Aggregate Functions for Strings and Arrays
-- STRING_AGG: concatenate values into a comma-separated string
SELECT author_id, STRING_AGG(title, ', ' ORDER BY created_at) AS titles
FROM posts
GROUP BY author_id;
-- author_id=1: "Hello World, FastAPI Guide, PostgreSQL Intro"
-- ARRAY_AGG: collect values into a PostgreSQL array
SELECT author_id, ARRAY_AGG(id ORDER BY created_at) AS post_ids
FROM posts
GROUP BY author_id;
-- author_id=1: {1, 2, 3}
-- ARRAY_AGG with DISTINCT: unique values only
SELECT post_id, ARRAY_AGG(DISTINCT tag_id) AS unique_tag_ids
FROM post_tags
GROUP BY post_id;
-- JSON aggregation: build JSON from rows
SELECT
author_id,
JSON_AGG(
JSON_BUILD_OBJECT('id', id, 'title', title, 'views', view_count)
ORDER BY view_count DESC
) AS posts_json
FROM posts
GROUP BY author_id;
Common Mistakes
Mistake 1 — Confusing COUNT(*) with COUNT(column)
❌ Wrong — expects COUNT(*) to exclude NULL rows:
SELECT COUNT(*) FROM posts; -- counts ALL rows, even those with NULL published_at
✅ Correct — use COUNT(column) to count non-NULL values:
SELECT COUNT(published_at) FROM posts; -- counts only rows where published_at is not NULL ✓
Mistake 2 — AVG on column with NULLs giving misleading result
❌ Wrong — NULLs excluded from average silently:
SELECT AVG(rating) FROM posts; -- only averages non-NULL ratings — is this intended?
✅ Correct — be explicit:
SELECT AVG(COALESCE(rating, 0)) FROM posts; -- treat NULL as 0 ✓
-- or:
SELECT AVG(rating) FILTER (WHERE rating IS NOT NULL) FROM posts; -- explicit ✓
Mistake 3 — Running separate queries for each count instead of using FILTER
❌ Wrong — multiple round-trips for related counts:
SELECT COUNT(*) FROM posts WHERE status = 'published';
SELECT COUNT(*) FROM posts WHERE status = 'draft';
SELECT COUNT(*) FROM posts WHERE status = 'archived';
✅ Correct — single query with FILTER:
SELECT
COUNT(*) FILTER (WHERE status = 'published') AS published,
COUNT(*) FILTER (WHERE status = 'draft') AS drafts,
COUNT(*) FILTER (WHERE status = 'archived') AS archived
FROM posts; -- ✓ one query, one round-trip
Quick Reference
| Function | Description | NULL handling |
|---|---|---|
COUNT(*) |
Count all rows | Includes NULLs |
COUNT(col) |
Count non-NULL values | Excludes NULLs |
COUNT(DISTINCT col) |
Count unique non-NULL values | Excludes NULLs |
SUM(col) |
Total of values | Excludes NULLs |
AVG(col) |
Arithmetic mean | Excludes NULLs |
MIN(col) |
Smallest value | Excludes NULLs |
MAX(col) |
Largest value | Excludes NULLs |
STRING_AGG(col, sep) |
Concatenate as string | Excludes NULLs |
ARRAY_AGG(col) |
Collect into array | Includes NULLs (use FILTER) |
agg FILTER (WHERE cond) |
Conditional aggregate | Only counts matching rows |