PostgreSQL’s built-in full-text search turns your database into a search engine — no Elasticsearch required for moderate search needs. The system works by converting text into a sorted list of lexemes (normalised word forms), then matching against a tsquery. A GIN index on the tsvector makes searches on millions of rows return in milliseconds. For the blog application, full-text search enables searching post titles and bodies, with relevance ranking so the most relevant posts appear first. This is the search endpoint you will build in the FastAPI part of the series.
tsvector and tsquery Basics
-- tsvector: a processed document (sorted lexemes with positions)
-- tsquery: a search query
-- Convert text to tsvector
SELECT to_tsvector('english', 'FastAPI is a modern Python web framework');
-- 'fastapi':1 'framework':7 'modern':4 'python':5 'web':6
-- Stopwords removed (is, a), lexemes normalised (framework not frameworks)
-- Convert search terms to tsquery
SELECT to_tsquery('english', 'python & fastapi'); -- python AND fastapi
SELECT to_tsquery('english', 'python | django'); -- python OR django
SELECT to_tsquery('english', '!javascript'); -- NOT javascript
SELECT plainto_tsquery('english', 'python web framework'); -- implicit AND, no operators
SELECT websearch_to_tsquery('english', 'python "web framework" -javascript');
-- Match a tsvector against a tsquery (@@ operator)
SELECT to_tsvector('english', 'FastAPI is a Python web framework')
@@ to_tsquery('english', 'python & framework'); -- TRUE
-- The @@ operator returns true if the document matches the query
'english') controls the stemming rules and stopword list. For multilingual content, either use a generic dictionary ('simple') or detect the language and store documents in multiple language-specific tsvector columns.websearch_to_tsquery() for user-facing search inputs — it handles the most natural search syntax: quoted phrases, minus for exclusion, implicit AND between words, without requiring users to know Boolean operators. plainto_tsquery() is simpler (all words ANDed, no operators) but does not support phrases. Reserve to_tsquery() for programmatic queries where you control the syntax.to_tsvector() in a WHERE clause without a GIN index — it recomputes the tsvector for every row on every query, resulting in a sequential scan. Always either: (1) store a computed tsvector column with a GIN index, (2) create an expression GIN index on to_tsvector('english', col), or (3) for multi-column search, create an index on the concatenated tsvector. Any of these allows the index to be used for fast lookups.Creating a Full-Text Search Index
-- ── Option A: Expression GIN index (no extra column) ─────────────────────────
CREATE INDEX idx_posts_fts ON posts
USING GIN (to_tsvector('english', title || ' ' || COALESCE(body, '')));
-- Queries that use this index:
SELECT id, title FROM posts
WHERE to_tsvector('english', title || ' ' || COALESCE(body, ''))
@@ websearch_to_tsquery('english', 'fastapi tutorial');
-- ── Option B: Generated tsvector column (more efficient for updates) ──────────
ALTER TABLE posts
ADD COLUMN search_vector TSVECTOR
GENERATED ALWAYS AS (
setweight(to_tsvector('english', COALESCE(title, '')), 'A') ||
setweight(to_tsvector('english', COALESCE(body, '')), 'B')
) STORED;
-- setweight assigns weight to different fields: 'A' (highest) to 'D' (lowest)
-- This means title matches rank higher than body matches
CREATE INDEX idx_posts_search ON posts USING GIN (search_vector);
-- Query using the stored column (faster — no recomputation per query):
SELECT id, title
FROM posts
WHERE search_vector @@ websearch_to_tsquery('english', 'fastapi tutorial');
Relevance Ranking with ts_rank
-- ts_rank: score based on how often query terms appear and their weights
-- ts_rank_cd: score based on term density and cover (distance between terms)
-- Search with relevance ranking
SELECT
p.id,
p.title,
p.status,
ts_rank(
p.search_vector,
websearch_to_tsquery('english', :search_query)
) AS relevance_score,
ts_headline(
'english',
p.body,
websearch_to_tsquery('english', :search_query),
'StartSel=<mark>, StopSel=</mark>, MaxWords=35, MinWords=15'
) AS excerpt_with_highlights
FROM posts p
WHERE p.status = 'published'
AND p.search_vector @@ websearch_to_tsquery('english', :search_query)
ORDER BY relevance_score DESC, p.view_count DESC -- rank by relevance, then popularity
LIMIT 20;
-- ts_headline: returns a fragment of the original text with query terms highlighted
-- The StartSel/StopSel options define the HTML tags wrapping matched terms
Full-Text Search FastAPI Endpoint
-- Complete search query for FastAPI GET /search?q=...&page=1&limit=10
-- Count total matches (for pagination)
SELECT COUNT(*) AS total
FROM posts
WHERE status = 'published'
AND search_vector @@ websearch_to_tsquery('english', :query);
-- Paginated results with highlighting
SELECT
p.id,
p.title,
p.slug,
p.view_count,
p.published_at,
u.name AS author_name,
ts_rank(p.search_vector, q.query) AS score,
ts_headline('english', p.body, q.query,
'MaxWords=40, MinWords=15, StartSel=<em>, StopSel=</em>'
) AS excerpt
FROM posts p
JOIN users u ON p.author_id = u.id
CROSS JOIN websearch_to_tsquery('english', :query) AS q(query)
WHERE p.status = 'published'
AND p.search_vector @@ q.query
ORDER BY score DESC, p.view_count DESC
LIMIT :limit OFFSET :offset;
Common Mistakes
Mistake 1 — Missing GIN index on tsvector (sequential scan)
❌ Wrong — recomputes tsvector for every row:
WHERE to_tsvector('english', title) @@ to_tsquery('python');
-- Sequential scan on all posts!
✅ Correct — use a stored column or expression index with GIN:
CREATE INDEX idx_fts ON posts USING GIN (to_tsvector('english', title));
-- OR use a generated STORED column ✓
Mistake 2 — Using to_tsquery with user input (syntax errors)
❌ Wrong — user input like “C++” breaks to_tsquery syntax:
WHERE search_vector @@ to_tsquery('english', user_input);
-- "C++" → ERROR: syntax error in tsquery
✅ Correct — use websearch_to_tsquery for user input:
WHERE search_vector @@ websearch_to_tsquery('english', user_input); -- ✓ handles special chars
Mistake 3 — Not using setweight to prioritise title over body
❌ Wrong — title and body have equal weight:
to_tsvector('english', title || ' ' || body) -- title match = body match
✅ Correct — weight title higher:
setweight(to_tsvector('english', title), 'A') ||
setweight(to_tsvector('english', body), 'B') -- title matches rank higher ✓
Quick Reference
| Function | Use |
|---|---|
to_tsvector(lang, text) |
Convert text to lexeme document |
websearch_to_tsquery(lang, q) |
Convert user search string to query (safe) |
plainto_tsquery(lang, q) |
Simple AND query from plain text |
to_tsquery(lang, q) |
Explicit Boolean tsquery (requires valid syntax) |
ts_rank(vec, query) |
Relevance score |
ts_headline(lang, text, query) |
Highlighted excerpt |
setweight(vec, weight) |
Assign A/B/C/D weight to tsvector |
| GIN index | CREATE INDEX ... USING GIN (tsvector_col) |