🐘 Expert PostgreSQL Interview Questions
This lesson targets senior DBAs, architects, and lead engineers. Topics include the WAL, replication, logical decoding, pgvector, query planner internals, lock types, deadlocks, connection pooling, extensions, zero-downtime migrations, and PostgreSQL 16/17 features. These questions reveal whether you understand PostgreSQL deeply or just write SQL in it.
Questions & Answers
01 What is the WAL (Write-Ahead Log)? Why is it critical?
Internals The Write-Ahead Log is a sequential log of all changes made to the database. Before any data page is modified, the change is first written to the WAL. This is the foundation of durability and crash recovery.
Why WAL is critical:
- Crash recovery – if PostgreSQL crashes, it replays WAL records from the last checkpoint to restore a consistent state. No committed data is lost.
- Durability (ACID D) – a COMMIT is not acknowledged until the WAL record is flushed to disk (with fsync=on)
- Streaming replication – standby servers receive and replay WAL records to stay in sync with the primary
- Logical replication & CDC – logical decoding reads WAL to produce a logical stream of changes
- Point-in-Time Recovery (PITR) – replaying archived WAL segments allows recovery to any point in time
-- WAL location
SHOW data_directory;   -- /var/lib/postgresql/16/main
-- WAL files live in: /var/lib/postgresql/16/main/pg_wal/
-- WAL configuration
SHOW wal_level;        -- minimal | replica | logical
SHOW archive_mode;     -- on | off
SHOW max_wal_size;     -- max WAL accumulation before a forced checkpoint
-- Current WAL position
SELECT pg_current_wal_lsn();                    -- LSN (Log Sequence Number)
SELECT pg_walfile_name(pg_current_wal_lsn());   -- WAL file name
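The write-ahead principle itself fits in a few lines. Below is a toy Python model (the `ToyWAL` class and its methods are invented for illustration, not PostgreSQL code): every change is appended to a durable log before the in-memory "page" is touched, so a crash can always be repaired by replaying the log.

```python
import json

# Toy model of write-ahead logging (NOT PostgreSQL's implementation).
class ToyWAL:
    def __init__(self):
        self.log = []     # stands in for pg_wal (durable, append-only)
        self.pages = {}   # stands in for data files (may lag behind the log)

    def commit(self, key, value):
        record = json.dumps({"key": key, "value": value})
        self.log.append(record)       # 1. write + flush the WAL record first
        self.pages[key] = value       # 2. only then modify the data page

    def crash_and_recover(self):
        self.pages = {}               # simulate losing unflushed data pages
        for record in self.log:       # replay WAL from the last checkpoint
            change = json.loads(record)
            self.pages[change["key"]] = change["value"]
        return self.pages

db = ToyWAL()
db.commit("balance:alice", 100)
db.commit("balance:alice", 75)
print(db.crash_and_recover())   # {'balance:alice': 75} – committed data survives
```

The ordering in `commit` is the whole point: if the process dies between step 1 and step 2, the log still holds the change and replay restores it.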
02 What is streaming replication in PostgreSQL? How does it work?
Replication Streaming replication sends WAL records from the primary to one or more standby servers in near real-time. Standbys continuously apply WAL, maintaining a hot or warm copy of the database.
# Primary configuration (postgresql.conf)
wal_level = replica          # or logical
max_wal_senders = 10         # max concurrent WAL sender processes
wal_keep_size = 1GB          # keep 1GB of WAL for slow standbys
# pg_hba.conf on primary – allow replication connections
host replication replicator 192.168.1.10/32 scram-sha-256
-- Create the replication user
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'secret';
# Standby – clone the primary with pg_basebackup
pg_basebackup -h primary_host -U replicator -D /var/lib/postgresql/data \
  --wal-method=stream --write-recovery-conf -P
# Standby postgresql.conf
primary_conninfo = 'host=primary_host port=5432 user=replicator password=secret'
-- Monitor replication lag
SELECT
client_addr,
state,
sent_lsn,
write_lsn,
flush_lsn,
replay_lsn,
(sent_lsn - replay_lsn) AS lag_bytes
FROM pg_stat_replication;
-- Synchronous replication (any|first N standbys must confirm before commit acks)
synchronous_standby_names = 'FIRST 1 (standby1, standby2)'
03 What is logical replication and logical decoding in PostgreSQL?
Replication Logical replication replicates data changes at the SQL level (individual row changes per table) rather than at the WAL byte level. It enables selective replication, cross-version replication, and CDC (Change Data Capture).
-- PUBLICATION – defines what to replicate (primary)
CREATE PUBLICATION my_pub FOR TABLE users, orders, products;
CREATE PUBLICATION all_tables FOR ALL TABLES;
CREATE PUBLICATION orders_pub FOR TABLE orders WHERE (status = 'completed');
-- SUBSCRIPTION – connects to a publication (subscriber)
CREATE SUBSCRIPTION my_sub
CONNECTION 'host=primary dbname=mydb user=repuser password=secret'
PUBLICATION my_pub;
-- Monitor
SELECT * FROM pg_publication;
SELECT * FROM pg_subscription;
SELECT * FROM pg_stat_subscription;
-- Logical decoding – read WAL as a logical stream (for CDC, Debezium, etc.)
-- Requires wal_level = logical
SELECT pg_create_logical_replication_slot('my_slot', 'test_decoding');
SELECT * FROM pg_logical_slot_get_changes('my_slot', NULL, NULL);
-- (pgoutput, the plugin used by built-in logical replication, emits a binary
--  protocol – read it with pg_logical_slot_get_binary_changes instead)
SELECT pg_drop_replication_slot('my_slot');
-- Use cases:
-- Selective replication (specific tables/rows to a read replica)
-- Cross-major-version replication (upgrade with zero downtime)
-- CDC: Debezium → Kafka → data warehouse
-- Audit logging: stream all changes to an audit service
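As a sketch of what a CDC consumer does with such a stream, here is a small Python parser for the textual line format emitted by the test_decoding plugin. The `parse_change` helper and the exact regexes are illustrative assumptions, not part of any library:

```python
import re

# test_decoding emits one line per change, shaped like:
#   table public.users: INSERT: id[integer]:1 name[text]:'alice'
LINE = re.compile(r"table (\S+): (INSERT|UPDATE|DELETE): (.*)")
COL = re.compile(r"(\w+)\[[^\]]+\]:('[^']*'|\S+)")

def parse_change(line):
    m = LINE.match(line)
    if not m:
        return None                      # BEGIN/COMMIT markers etc.
    table, op, cols = m.groups()
    values = {name: val.strip("'") for name, val in COL.findall(cols)}
    return {"table": table, "op": op, "values": values}

event = parse_change("table public.users: INSERT: id[integer]:1 name[text]:'alice'")
print(event)
# {'table': 'public.users', 'op': 'INSERT', 'values': {'id': '1', 'name': 'alice'}}
```

A real pipeline (Debezium, wal2json consumers) does the same decoding, then publishes each event downstream with the slot's LSN as a resume cursor.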
04 What lock types does PostgreSQL use? How do you diagnose deadlocks?
Concurrency PostgreSQL has many lock modes on different objects (tables, rows, pages, advisory). The most important for application developers:
- ACCESS SHARE – taken by SELECT. Conflicts only with ACCESS EXCLUSIVE.
- ROW SHARE – taken by SELECT FOR UPDATE/SHARE.
- ROW EXCLUSIVE – taken by INSERT, UPDATE, DELETE.
- SHARE UPDATE EXCLUSIVE – taken by VACUUM, CREATE INDEX CONCURRENTLY.
- SHARE – taken by CREATE INDEX (non-concurrent).
- ACCESS EXCLUSIVE – taken by ALTER TABLE, DROP TABLE, TRUNCATE. Blocks everything including SELECT.
-- Find blocking queries right now
SELECT
blocked.pid AS blocked_pid,
blocked_activity.query AS blocked_query,
blocking.pid AS blocking_pid,
blocking_activity.query AS blocking_query,
now() - blocked_activity.query_start AS blocked_duration
FROM pg_catalog.pg_locks blocked
JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_activity.pid = blocked.pid
JOIN pg_catalog.pg_locks blocking ON blocking.relation = blocked.relation
  AND blocking.pid <> blocked.pid
  AND blocking.granted AND NOT blocked.granted
JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking.pid;
-- Kill a blocking query
SELECT pg_terminate_backend(blocking_pid);
-- Deadlock: TX1 locks row A then tries row B; TX2 locks row B then tries row A
-- PostgreSQL detects and aborts one TX automatically
-- log_lock_waits = on -- logs waits exceeding deadlock_timeout (default 1s)
-- deadlock_timeout = 500ms -- check for deadlock after 500ms of waiting
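The standard application-side prevention is to acquire locks in one global order, so the A→B / B→A cycle can never form. A minimal Python threading sketch of the same idea (the `transfer` helper and lock names are invented for illustration):

```python
import threading

# Two "rows" guarded by locks. TX1 wants A then B, TX2 wants B then A –
# the classic deadlock. Sorting the keys gives every transaction the same
# global acquisition order, so the cycle cannot form.
locks = {"A": threading.Lock(), "B": threading.Lock()}
transferred = []

def transfer(frm, to):
    first, second = sorted((frm, to))        # consistent global lock order
    with locks[first]:
        with locks[second]:
            transferred.append((frm, to))    # both rows safely held

t1 = threading.Thread(target=transfer, args=("A", "B"))
t2 = threading.Thread(target=transfer, args=("B", "A"))
t1.start(); t2.start()
t1.join(); t2.join()
print(sorted(transferred))   # both transactions complete – no deadlock
```

PostgreSQL's own defence is detection (abort one transaction); ordered acquisition is how you stop paying that abort-and-retry cost in the first place.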
05 How does the PostgreSQL query planner work? What are common plan problems?
Planner The PostgreSQL query planner generates a set of candidate execution plans and estimates the cost of each using statistics and cost constants, then picks the cheapest plan.
Planning steps:
- Parse & analyse – parse SQL into a parse tree, then into a query tree
- Rewrite – apply view rewrite rules
- Plan/optimise – generate candidate plans (join order permutations, index choices), estimate costs using pg_statistic data, pick the minimum-cost plan
- Execute – the executor runs the chosen plan
-- Common planner problems and fixes
-- 1. Planner choosing Seq Scan when Index Scan is better
-- Cause: stale statistics, low selectivity, bad cost constants
ANALYZE orders;                -- refresh statistics
SET enable_seqscan = off;      -- force index use (diagnose only, never in production)
-- 2. Bad row estimate (planner thinks 100 rows, gets 10,000)
-- Cause: correlation between columns not captured in single-column statistics
CREATE STATISTICS stat_orders_customer_status (dependencies, ndistinct)
  ON customer_id, status FROM orders;
ANALYZE orders;                -- now captures cross-column correlation
-- 3. Wrong join order for complex queries
SET join_collapse_limit = 1;   -- force planner to use your explicit join order
SET from_collapse_limit = 1;
-- 4. EXPLAIN ANALYZE shows estimated vs actual rows differing wildly
SELECT attname, n_distinct, correlation
FROM pg_stats
WHERE tablename = 'orders' AND attname = 'customer_id';
-- correlation near 1.0: clustered (good for index scans)
-- correlation near 0: random order (planner may prefer seq scan)
-- Parallel query tuning
SET max_parallel_workers_per_gather = 4;  -- use up to 4 workers for this query
SET parallel_tuple_cost = 0.1;
SET parallel_setup_cost = 1000;
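A grossly simplified sketch of the cost arithmetic, using PostgreSQL's default cost constants (`seq_page_cost = 1.0`, `random_page_cost = 4.0`, `cpu_tuple_cost = 0.01`), shows why the planner flips from index scan to seq scan as selectivity grows. The real estimator also models caching, index depth, and correlation; these helper functions are illustrative assumptions only:

```python
# PostgreSQL's default cost constants (postgresql.conf)
SEQ_PAGE_COST = 1.0        # sequential page read
RANDOM_PAGE_COST = 4.0     # random page read (index probes)
CPU_TUPLE_COST = 0.01      # per-row CPU cost

def seq_scan_cost(pages, rows):
    # read every page sequentially, inspect every row
    return pages * SEQ_PAGE_COST + rows * CPU_TUPLE_COST

def index_scan_cost(matching_rows):
    # crude: assume each matching row costs one random heap page fetch
    return matching_rows * (RANDOM_PAGE_COST + CPU_TUPLE_COST)

pages, rows = 10_000, 1_000_000
for selectivity in (0.0001, 0.01, 0.5):
    matching = int(rows * selectivity)
    seq, idx = seq_scan_cost(pages, rows), index_scan_cost(matching)
    winner = "Index Scan" if idx < seq else "Seq Scan"
    print(f"{selectivity:>7.2%} selective: seq={seq:>9.0f} idx={idx:>9.0f} -> {winner}")
```

Even this toy model reproduces the headline behaviour: at 0.01% selectivity the index wins easily, but once a predicate matches a few percent of the table, reading everything sequentially is cheaper than a million random probes.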
06 What is pgvector and how do you use it for AI embeddings?
AI / ML pgvector is a PostgreSQL extension for storing and querying high-dimensional vectors – enabling similarity search for AI embeddings (semantic search, recommendations, RAG pipelines) directly in PostgreSQL.
-- Install extension
CREATE EXTENSION IF NOT EXISTS vector;
-- Create table with vector column
CREATE TABLE documents (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
title TEXT NOT NULL,
content TEXT NOT NULL,
embedding vector(1536) -- OpenAI text-embedding-3-small: 1536 dimensions
);
-- Insert with embedding (generated by your application)
INSERT INTO documents (title, content, embedding)
VALUES ('PostgreSQL Guide', 'PostgreSQL is a powerful...', '[0.023, -0.441, ...]'::vector);
-- Exact nearest-neighbour search (L2 distance)
SELECT title, embedding <-> '[0.015, -0.432, ...]'::vector AS distance
FROM documents
ORDER BY distance LIMIT 5;
-- Other distance operators:
-- <->  L2 (Euclidean) distance
-- <#>  negative inner product (equivalent to cosine ranking for normalised vectors)
-- <=>  cosine distance
-- Approximate nearest neighbour index (IVFFlat – faster, slightly less accurate)
CREATE INDEX ON documents USING ivfflat (embedding vector_l2_ops)
WITH (lists = 100); -- ~sqrt(n_rows) lists is a good starting point
-- HNSW index (pgvector 0.5+, more accurate, better for high recall)
CREATE INDEX ON documents USING hnsw (embedding vector_l2_ops)
WITH (m = 16, ef_construction = 64);
-- Hybrid search: combine semantic + keyword
-- (assumes documents also has a tsvector column "fts", e.g. a generated column)
SELECT d.id, d.title,
       d.embedding <=> q.query_vec AS semantic_dist,
       ts_rank(d.fts, query_tsq) AS keyword_rank
FROM documents d
CROSS JOIN to_tsquery('english', 'postgresql') AS query_tsq
CROSS JOIN (SELECT '[0.023, ...]'::vector AS query_vec) AS q
WHERE d.fts @@ query_tsq
ORDER BY semantic_dist LIMIT 10;
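For reference, here is what the three operators compute, written out in plain Python. pgvector does this in C over its own `vector` type; these helpers are just illustrations of the maths:

```python
import math

def l2_distance(a, b):             # pgvector's  <->
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inner_product_distance(a, b):  # pgvector's  <#>  (negative inner product)
    return -sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):         # pgvector's  <=>  (1 - cosine similarity)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1 - dot / norm

a, b = [1.0, 0.0], [0.0, 1.0]
print(l2_distance(a, b))       # 1.4142135623730951 (sqrt(2))
print(cosine_distance(a, b))   # 1.0 – orthogonal vectors, maximally dissimilar
```

The `<#>` operator is negated so that `ORDER BY embedding <#> query ASC` returns the *largest* inner products first, matching the ascending-order convention of the other two operators.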
07 What is PgBouncer and why is connection pooling critical for PostgreSQL?
Operations Each PostgreSQL connection is a separate OS process consuming ~5-10MB RAM. For web applications with many concurrent requests, thousands of connections would exhaust server memory. PgBouncer is a lightweight connection pooler that maintains a small pool of real PostgreSQL connections and multiplexes many application connections onto them.
# PgBouncer modes:
#   session pooling: one server connection per client session (default)
#   transaction pooling: connection returned to pool after each transaction (most efficient)
#   statement pooling: connection returned after each statement (limited use)

# pgbouncer.ini
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
pool_mode = transaction      # most efficient for web apps
max_client_conn = 10000      # application connects to PgBouncer
default_pool_size = 25       # PgBouncer maintains 25 real PG connections
server_idle_timeout = 600

# Application connects to PgBouncer (port 6432), not PostgreSQL (5432)
# Effective connections: 10,000 clients → 25 PostgreSQL connections
# Limitations of transaction pooling:
# - Cannot use session-level features: SET, prepared statements (without protocol workarounds),
#   advisory locks, NOTIFY/LISTEN, temp tables across transactions
Alternatives: pgpool-II (load balancing + pooling), Supavisor (cloud-native Elixir pooler used by Supabase), Odyssey (Yandex’s pooler).
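The core multiplexing idea can be sketched in a few lines of Python – a toy blocking-queue pool, not PgBouncer's implementation (the `ToyPool` class and connection strings are invented for illustration):

```python
import queue
import threading

# Transaction pooling in miniature: many client threads share a handful of
# "server connections" via a blocking queue.
class ToyPool:
    def __init__(self, size):
        self._conns = queue.Queue()
        for i in range(size):
            self._conns.put(f"server-conn-{i}")   # stand-ins for real sockets

    def run_transaction(self, work):
        conn = self._conns.get()      # block until a server connection is free
        try:
            return work(conn)         # the transaction runs on the pooled conn
        finally:
            self._conns.put(conn)     # returned to the pool at COMMIT/ROLLBACK

pool = ToyPool(size=3)                # 3 "real" connections...
used = set()

def client(_):
    pool.run_transaction(lambda conn: used.add(conn))

threads = [threading.Thread(target=client, args=(i,)) for i in range(100)]
for t in threads: t.start()
for t in threads: t.join()
print(len(used))   # ...served 100 clients with at most 3 server connections
```

Returning the connection in `finally` at transaction end is exactly why session-level state (SET, temp tables, session prepared statements) breaks under transaction pooling: the next transaction may land on a different server connection.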
08 What is Point-in-Time Recovery (PITR) in PostgreSQL?
Backup PITR allows you to restore the database to any point in time – not just the last backup. It works by taking a base backup and then replaying archived WAL segments up to the target time.
-- Step 1: Enable WAL archiving (postgresql.conf)
wal_level = replica
archive_mode = on
archive_command = 'aws s3 cp %p s3://my-wal-archive/%f'   # %p = source path, %f = filename

-- Step 2: Take a base backup
pg_basebackup -h localhost -U postgres -D /backup/base -Ft -z -Xs -P

-- Step 3: After disaster, restore the base backup and configure recovery
-- postgresql.conf (recovery.conf before version 12), plus an empty recovery.signal file:
restore_command = 'aws s3 cp s3://my-wal-archive/%f %p'
recovery_target_time = '2026-04-17 14:30:00 UTC'
recovery_target_action = 'promote'   -- or 'pause'

-- Start PostgreSQL – it replays WAL until the target time, then promotes

-- Inspect WAL content with pg_waldump
pg_waldump --timeline=1 --start=0/15000060 --end=0/16000000 \
  /var/lib/postgresql/pg_wal/000000010000000000000015

-- Continuous archiving + base backup = RPO near zero
-- Combine with streaming replication for HA + PITR
09 How do you perform zero-downtime schema migrations in PostgreSQL?
Operations Many DDL operations take an ACCESS EXCLUSIVE lock, blocking all reads and writes – dangerous on production tables. Zero downtime requires careful, phased migrations.
Adding a column safely:
-- DANGEROUS on PostgreSQL 10 and older – rewrote the whole table under ACCESS EXCLUSIVE
-- (since PG 11 a constant DEFAULT like this is metadata-only and fast)
ALTER TABLE orders ADD COLUMN priority INT DEFAULT 0 NOT NULL;
-- SAFE multi-step approach (still needed for volatile defaults or computed backfills):
-- Step 1: Add nullable column (instant metadata change, no table rewrite)
ALTER TABLE orders ADD COLUMN priority INT;
-- Step 2: Backfill in small batches (non-blocking)
DO $$
DECLARE batch_size INT := 10000; min_id BIGINT; max_id BIGINT;
BEGIN
SELECT MIN(id), MAX(id) INTO min_id, max_id FROM orders WHERE priority IS NULL;
FOR i IN 0..((max_id - min_id) / batch_size) LOOP
UPDATE orders SET priority = 0
WHERE id BETWEEN min_id + i*batch_size AND min_id + (i+1)*batch_size - 1
AND priority IS NULL;
PERFORM pg_sleep(0.05); -- brief pause to avoid lock queuing
END LOOP;
END $$;
-- Step 3: Enforce NOT NULL without a long lock
-- Add a CHECK constraint as NOT VALID first, then validate separately
ALTER TABLE orders ADD CONSTRAINT orders_priority_not_null CHECK (priority IS NOT NULL) NOT VALID;
ALTER TABLE orders VALIDATE CONSTRAINT orders_priority_not_null; -- scans without blocking writes
-- On PG 12+, SET NOT NULL is then instant because the validated CHECK proves it
ALTER TABLE orders ALTER COLUMN priority SET NOT NULL;
-- Adding an index safely (non-blocking)
CREATE INDEX CONCURRENTLY idx_orders_priority ON orders (priority);
-- CONCURRENTLY takes longer but doesn't block reads or writes
10 What are PostgreSQL extensions? Name the most useful ones.
Extensions Extensions add new data types, functions, operators, index methods, and procedural languages to PostgreSQL. They are the primary mechanism for extending database capabilities.
-- List available extensions
SELECT * FROM pg_available_extensions ORDER BY name;
-- List installed extensions
SELECT * FROM pg_extension;
-- Install an extension
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
CREATE EXTENSION IF NOT EXISTS pgcrypto;
CREATE EXTENSION IF NOT EXISTS vector;
Essential extensions:
- pg_stat_statements – query performance statistics (every production database should have this)
- pgcrypto – encryption functions: crypt(), gen_random_uuid(), pgp_sym_encrypt()
- uuid-ossp – UUID generation functions (superseded by built-in gen_random_uuid() in PostgreSQL 13+)
- PostGIS – geospatial types, functions, and indexes for location-based queries
- pg_trgm – trigram similarity for fuzzy text matching, fast LIKE/ILIKE with a GIN index
- hstore – key-value pairs (mostly superseded by JSONB)
- vector (pgvector) – AI embedding similarity search
- timescaledb – time-series data optimisation and compression
- pg_cron – in-database job scheduler (cron syntax)
- pg_partman – automated partition management and maintenance
- pgaudit – detailed audit logging for compliance
11 What is pg_trgm and how does it enable fast fuzzy search?
Search pg_trgm (trigram) breaks text into 3-character sequences (trigrams) and measures similarity. Combined with a GIN or GiST index, it enables fast fuzzy matching, LIKE with a leading wildcard, and ILIKE – operations that normally require a full table scan.
CREATE EXTENSION IF NOT EXISTS pg_trgm;
-- Trigrams of a string
SELECT show_trgm('Hello World');
-- {"  h","  w"," he"," wo","ell","hel","llo","lo ","orl","rld","wor"}
-- Similarity score (0 to 1), based on shared trigrams
SELECT similarity('PostgreSQL', 'Postgresql'); -- 1 (pg_trgm lowercases first)
SELECT similarity('hello', 'helo');            -- ~0.57
-- GIN index enables fast LIKE/ILIKE and similarity queries
CREATE INDEX idx_users_name_trgm ON users USING GIN (name gin_trgm_ops);
CREATE INDEX idx_products_name_trgm ON products USING GIN (LOWER(name) gin_trgm_ops);
-- Fast LIKE with leading wildcard (usually requires seq scan without pg_trgm index)
SELECT * FROM users WHERE name ILIKE '%alice%'; -- uses the GIN index
-- Similarity search (fuzzy matching – find approximate matches)
SELECT name, similarity(name, 'Allice') AS sim
FROM users
WHERE name % 'Allice' -- % operator: similarity > pg_trgm.similarity_threshold
ORDER BY sim DESC LIMIT 10;
-- Word similarity (partial match)
SELECT * FROM products WHERE word_similarity('widget', name) > 0.5;
-- Combined with other conditions
SELECT name FROM users
WHERE name ILIKE '%smith%'
ORDER BY similarity(name, 'John Smith') DESC LIMIT 5;
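The scoring model is easy to reproduce. Here is a Python approximation of pg_trgm's algorithm (lowercase, pad each word with two leading and one trailing space, then score as shared trigrams over total trigrams); the helper names are my own, and edge cases like punctuation handling are glossed over:

```python
# Approximation of pg_trgm scoring: similarity = |A ∩ B| / |A ∪ B|
def trigrams(text):
    grams = set()
    for word in text.lower().split():
        padded = f"  {word} "            # two leading + one trailing space
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams

def similarity(a, b):
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)

print(sorted(trigrams("cat")))                  # ['  c', ' ca', 'at ', 'cat']
print(similarity("PostgreSQL", "Postgresql"))   # 1.0 – identical after lowercasing
print(round(similarity("hello", "helo"), 2))    # 0.57
```

The padding is why prefixes score higher than infixes: the first letter of a word appears in two padded trigrams, so matching word starts contributes more shared trigrams than matching a run in the middle.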
12 What are Postgres range types and exclusion constraints?
Data Types Range types represent a contiguous range of values (integers, dates, timestamps, numerics). Exclusion constraints using ranges with GiST indexes enforce non-overlapping rules – impossible to model as simple UNIQUE constraints.
-- Built-in range types
-- int4range, int8range, numrange, daterange, tsrange, tstzrange
-- Create a hotel room reservation table
-- (btree_gist is needed so the GiST index can handle the = on room_id)
CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE TABLE reservations (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
room_id BIGINT NOT NULL,
guest TEXT NOT NULL,
period daterange NOT NULL, -- e.g., [2026-04-17, 2026-04-20)
-- Exclusion constraint: no two reservations for the same room can overlap
CONSTRAINT no_overlap EXCLUDE USING GIST (
room_id WITH =, -- same room
period WITH && -- overlapping periods
)
);
-- Insert succeeds
INSERT INTO reservations (room_id, guest, period)
VALUES (101, 'Alice', '[2026-04-17, 2026-04-20)');
-- This INSERT fails – it overlaps with Alice's reservation
INSERT INTO reservations (room_id, guest, period)
VALUES (101, 'Bob', '[2026-04-19, 2026-04-22)');
-- ERROR: conflicting key value violates exclusion constraint "no_overlap"
-- Range operators
SELECT '[2026-01-01, 2026-12-31]'::daterange @> '2026-06-15'::date; -- contains
SELECT '[1, 10)'::int4range && '[5, 15)'::int4range; -- overlaps
SELECT '[1, 10)'::int4range << '[15, 20)'::int4range; -- strictly left
SELECT lower('[2026-01-01, 2026-12-31]'::daterange); -- 2026-01-01
13 What is the difference between SELECT FOR UPDATE and SELECT FOR SHARE?
Concurrency These are locking reads that prevent other transactions from modifying (or locking) the selected rows until the current transaction commits.
-- SELECT FOR UPDATE
-- Acquires ROW SHARE lock on the table + an exclusive row lock
-- Blocks other FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, FOR KEY SHARE
-- Use: you intend to UPDATE the selected rows in the same transaction
BEGIN;
SELECT * FROM inventory WHERE product_id = 42 FOR UPDATE;  -- lock this row
UPDATE inventory SET quantity = quantity - 1 WHERE product_id = 42;
COMMIT;

-- SELECT FOR SHARE
-- Blocks other FOR UPDATE but allows other FOR SHARE (multiple readers can share)
-- Use: you need the row to remain stable but won't modify it yourself
SELECT * FROM orders WHERE id = 99 FOR SHARE;

-- SKIP LOCKED – skip rows already locked (queue processing pattern)
SELECT * FROM jobs WHERE status = 'pending'
ORDER BY priority DESC
FOR UPDATE SKIP LOCKED LIMIT 1;
-- Multiple workers can run this simultaneously; each gets a different job

-- NOWAIT – fail immediately instead of waiting if the row is locked
SELECT * FROM accounts WHERE id = 1 FOR UPDATE NOWAIT;
-- Throws: ERROR: could not obtain lock on row in relation "accounts"

-- NO KEY UPDATE / KEY SHARE – weaker variants that allow FK-checking queries
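An in-memory analogue of the SKIP LOCKED queue pattern, sketched with Python threads (the job table and the `claim_next` helper are invented for illustration; a real worker would run the SQL above against the database):

```python
import threading

# Each worker claims the first job nobody else holds, instead of queueing
# behind a "locked row" – the essence of FOR UPDATE SKIP LOCKED.
jobs = [{"id": i, "status": "pending"} for i in range(5)]
table_lock = threading.Lock()
claimed = []

def claim_next(worker):
    with table_lock:                        # stands in for the row-lock check
        for job in jobs:
            if job["status"] == "pending":  # a row locked by another worker
                job["status"] = f"running:{worker}"   # ...would be skipped here
                claimed.append((worker, job["id"]))
                return job
    return None                             # queue drained

workers = [threading.Thread(target=claim_next, args=(w,)) for w in range(5)]
for t in workers: t.start()
for t in workers: t.join()
print(sorted(job_id for _, job_id in claimed))   # [0, 1, 2, 3, 4] – no job taken twice
```

The database version is strictly better than this sketch: the "claim" and the row lock are one atomic step, and a crashed worker's lock vanishes with its transaction, returning the job to the pool automatically.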
14 What are the major new features in PostgreSQL 16 and 17?
New Features
PostgreSQL 16 (September 2023):
- Logical replication from standbys – standbys can now serve as logical replication sources
- More parallel execution – string_agg and array_agg aggregates can be parallelised; FULL and RIGHT hash joins can run in parallel
- SIMD CPU optimisation – accelerated checksums and WAL operations using CPU vector extensions
- Performance – up to 300% improvement reported for concurrent bulk loading with COPY
- pg_stat_io – new view tracking I/O statistics per object type and access method
- SQL/JSON constructors – JSON_OBJECT, JSON_ARRAY, JSON_OBJECTAGG (SQL standard functions)
- Predefined roles – pg_use_reserved_connections and pg_create_subscription for more granular privileges
PostgreSQL 17 (September 2024):
- New streaming I/O API – significantly faster sequential scans on large tables and faster ANALYZE
- Incremental backup – pg_basebackup --incremental: faster, smaller backups
- MERGE improvements – MERGE now supports a RETURNING clause and WHEN NOT MATCHED BY SOURCE
- VACUUM memory improvements – a new internal data structure cuts VACUUM's memory use and lifts the old 1GB dead-tuple limit
- JSON_TABLE – SQL standard function to turn JSON into a relational table
- Partition pruning improvements – faster planning for queries over many partitions
15 What are common PostgreSQL anti-patterns and how do you fix them?
Performance
- SELECT * in production queries – fetches all columns, including large JSONB/TEXT fields. List the needed columns explicitly to cut I/O and network traffic.
- Function on an indexed column in WHERE – WHERE LOWER(email) = 'alice@...' without a functional index causes a full scan. Create an expression index or use a generated column.
- OR instead of UNION across separately indexed columns – WHERE a = 1 OR b = 2 often causes a seq scan. Use SELECT ... WHERE a = 1 UNION SELECT ... WHERE b = 2 so each branch can use its own index.
- NOT IN with a subquery returning NULLs – WHERE id NOT IN (SELECT user_id FROM inactive) returns 0 rows if any user_id is NULL. Use NOT EXISTS instead.
- ORM N+1 queries – loading 100 orders, then fetching each customer separately. Use JOINs or eager loading.
- Missing indexes on foreign keys – PostgreSQL does NOT auto-create indexes on FK columns (unlike MySQL/InnoDB). Index FK columns used in JOINs or lookups.
- Too many indexes – every index slows down writes. Index only columns frequently used in WHERE, JOIN ON, or ORDER BY on large tables.
- Sequences as externally visible IDs – sequential IDs reveal business data volume; use UUIDs (gen_random_uuid()) or ULIDs for externally visible identifiers.
16 What is pg_partman and how does it automate partition management?
Partitioning pg_partman is an extension that automates the creation and maintenance of time-based and serial-based partition sets – creating future partitions in advance and archiving or dropping old ones.
CREATE EXTENSION pg_partman SCHEMA partman;
-- Set up automated time partitioning (daily partitions for events table)
SELECT partman.create_parent(
p_parent_table := 'public.events',
p_control := 'created_at', -- partition key column
p_interval := '1 day', -- partition size
p_premake := 7 -- create 7 future partitions in advance
);
-- Configure retention (auto-drop partitions older than 90 days)
UPDATE partman.part_config
SET infinite_time_partitions = TRUE,
retention = '90 days',
retention_keep_table = FALSE -- actually drop (not just detach)
WHERE parent_table = 'public.events';
-- Run maintenance (usually via pg_cron every hour)
SELECT partman.run_maintenance();
-- Creates upcoming partitions, drops expired ones per retention policy
-- pg_cron integration
CREATE EXTENSION pg_cron;
SELECT cron.schedule('partition-maintenance', '0 * * * *',
'SELECT partman.run_maintenance()');
-- Useful views
SELECT * FROM partman.part_config;
SELECT tablename FROM pg_tables WHERE tablename LIKE 'events_p%' ORDER BY tablename;
17 How do you implement audit logging in PostgreSQL?
Security Audit logging records who changed what and when – critical for compliance (GDPR, SOC 2, HIPAA) and security incident investigation.
-- Option 1: Trigger-based audit log (application-level)
CREATE TABLE audit_log (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
table_name TEXT NOT NULL,
operation TEXT NOT NULL, -- INSERT, UPDATE, DELETE
record_id BIGINT,
old_data JSONB,
new_data JSONB,
changed_by TEXT,
changed_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE OR REPLACE FUNCTION log_audit()
RETURNS TRIGGER LANGUAGE plpgsql AS $$
BEGIN
INSERT INTO audit_log (table_name, operation, record_id, old_data, new_data, changed_by)
VALUES (
TG_TABLE_NAME,
TG_OP,
COALESCE(NEW.id, OLD.id),
CASE WHEN TG_OP = 'INSERT' THEN NULL ELSE to_jsonb(OLD) END,
CASE WHEN TG_OP = 'DELETE' THEN NULL ELSE to_jsonb(NEW) END,
current_setting('app.current_user', TRUE)
);
RETURN COALESCE(NEW, OLD);
END;
$$;
CREATE TRIGGER trg_audit
AFTER INSERT OR UPDATE OR DELETE ON orders
FOR EACH ROW EXECUTE FUNCTION log_audit();
-- Option 2: pgaudit extension (session and object auditing)
-- postgresql.conf: shared_preload_libraries = 'pgaudit'
SET pgaudit.log = 'write, ddl'; -- log all writes and DDL changes
SET pgaudit.log_relation = on; -- log relation name in each entry
-- Temporal tables / history tables using triggers for full change history
-- See: temporal_tables extension or manual implementation with VALID_FROM/VALID_TO
18 What is the pg_upgrade tool and how do you perform a major version upgrade?
Operations pg_upgrade migrates a PostgreSQL cluster to a newer major version (e.g., 15 → 17) in place without a full dump/restore – typically completing in minutes regardless of database size.
# Pre-upgrade checklist
pg_dumpall --globals-only > globals.sql # backup roles, tablespaces
pg_dump -Fc mydb > mydb_backup.dump # full backup
pg_upgrade --check ... # dry-run check (no changes made)
# Install new PostgreSQL version (both old and new must be installed simultaneously)
sudo apt install postgresql-17
# Stop the old cluster
sudo systemctl stop postgresql@15-main
# Run pg_upgrade
pg_upgrade \
--old-datadir=/var/lib/postgresql/15/main \
--new-datadir=/var/lib/postgresql/17/main \
--old-bindir=/usr/lib/postgresql/15/bin \
--new-bindir=/usr/lib/postgresql/17/bin \
--link # hard-link files instead of copy (instant, less disk space)
# without --link: safe but copies entire cluster
# Start new cluster, test
sudo systemctl start postgresql@17-main
psql -p 5433 mydb -c "SELECT version();"
# Update statistics (planner stats are not migrated)
vacuumdb --all --analyze-in-stages -p 5433
# Cleanup old cluster
./delete_old_cluster.sh # generated by pg_upgrade
# Zero-downtime alternative: logical replication upgrade
# 1. Set up logical replication from PG15 to PG17
# 2. Wait for replication lag to near zero
# 3. Switch application connection string to PG17
# 4. Decommission PG15
19 What is Citus and how does it scale PostgreSQL horizontally?
Scaling Citus is an open-source PostgreSQL extension (maintained by Microsoft, also offered as a managed Azure service) that turns PostgreSQL into a distributed database. It shards tables across multiple worker nodes while the coordinator node routes queries transparently.
-- Coordinator node – all queries arrive here and are routed to workers
-- Worker nodes – each holds a subset of the shards
-- Distribute a table by a shard key (hash partitioned across workers)
SELECT create_distributed_table('orders', 'customer_id');
-- orders is now split into 32 shards (default) across workers
-- Queries filtered by customer_id route to ONE worker (fast)
-- Full-table scans fan out to ALL workers (parallel)
-- Reference table – copied to ALL workers (for small lookup tables)
SELECT create_reference_table('products');
-- JOINs between distributed tables and reference tables are local per worker
-- Co-location – distribute related tables by the same key
SELECT create_distributed_table('order_items', 'customer_id',
colocate_with := 'orders');
-- ORDER_ITEMS shards are co-located with ORDERS shards
-- JOIN between orders and order_items stays local per worker (no network hop)
-- Multi-shard queries (full scans) run in parallel across workers
SELECT COUNT(*) FROM orders; -- parallel across all workers
SELECT * FROM orders WHERE customer_id = 42; -- single worker lookup
-- Use cases:
-- Multi-tenant SaaS: shard by tenant_id → each tenant's data on one shard
-- Time-series: shard by device_id or user_id with RANGE on time
-- Analytics: distributed aggregations across billions of rows
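The routing idea can be sketched as hash(distribution column) → shard → worker. A toy Python version follows; Citus uses PostgreSQL's internal hash functions and its shard metadata tables, not this code, and the shard/worker counts are made up for illustration:

```python
import hashlib

N_SHARDS = 32                 # Citus default shard count per distributed table
WORKERS = ["worker-1", "worker-2", "worker-3", "worker-4"]

def shard_for(customer_id):
    # stable hash of the distribution column, mapped onto a shard
    digest = hashlib.sha256(str(customer_id).encode()).hexdigest()
    return int(digest, 16) % N_SHARDS

def worker_for(customer_id):
    # each shard lives on exactly one worker
    return WORKERS[shard_for(customer_id) % len(WORKERS)]

# A query filtered by customer_id goes to exactly one worker:
print(worker_for(42))
# The same key always lands on the same shard – the property that makes
# co-located joins (orders ⋈ order_items on customer_id) worker-local:
print(shard_for(42) == shard_for(42))   # True
```

A query without the distribution column in its WHERE clause cannot use this shortcut, which is why such queries fan out to every worker in parallel instead.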
20 How does PostgreSQL handle foreign data wrappers (FDW)?
FDW Foreign Data Wrappers (SQL/MED standard) allow PostgreSQL to access external data sources as if they were local tables – other PostgreSQL databases, MySQL, Oracle, CSV files, HTTP APIs, S3, Redis, and more.
-- postgres_fdw: access another PostgreSQL database
CREATE EXTENSION postgres_fdw;
CREATE SERVER remote_db
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'remote-host', port '5432', dbname 'analytics');
CREATE USER MAPPING FOR current_user
SERVER remote_db
OPTIONS (user 'fdw_user', password 'secret');
-- Import all tables from a schema on the remote server
IMPORT FOREIGN SCHEMA public
FROM SERVER remote_db INTO fdw_schema;
-- Or create individual foreign tables
CREATE FOREIGN TABLE remote_orders (
  id BIGINT,
  customer_id BIGINT,
  total NUMERIC,
  created_at TIMESTAMPTZ
) SERVER remote_db OPTIONS (schema_name 'public', table_name 'orders');
-- Query as a local table – pushes WHERE/LIMIT down to the remote server
SELECT * FROM remote_orders WHERE created_at >= NOW() - INTERVAL '7 days';
-- Join local and remote data
SELECT l.name, r.total FROM customers l
JOIN remote_orders r ON r.customer_id = l.id
WHERE r.total > 1000;
-- Other FDWs:
-- file_fdw – CSV/text files on the server filesystem
-- oracle_fdw – Oracle Database (third-party)
-- redis_fdw – Redis
-- multicorn – Python-based FDW framework for custom sources
21 How would you design a high-performance multi-tenant PostgreSQL architecture?
Architecture Multi-tenancy isolates customer data within a shared PostgreSQL deployment. Three main approaches with different isolation/cost trade-offs:
Approach 1: Separate database per tenant (highest isolation)
-- Each tenant: myapp_tenant_acme, myapp_tenant_globex
-- Connection string: postgresql://user:pass@host/myapp_tenant_{slug}
-- Pros: complete isolation, easy to dump/delete a tenant
-- Cons: high connection overhead, hard to query across tenants, >100 tenants is impractical
Approach 2: Separate schema per tenant (good isolation)
CREATE SCHEMA tenant_acme;
SET search_path = tenant_acme;
CREATE TABLE orders (...);   -- each tenant has its own orders table
-- Pros: shared server, isolated objects, migrations per tenant
-- Cons: DDL changes must be applied to all schemas; difficult above ~1,000 tenants
Approach 3: Shared tables with tenant_id column (most scalable)
-- All tenants share tables, differentiated by tenant_id
ALTER TABLE orders ADD COLUMN tenant_id BIGINT NOT NULL REFERENCES tenants(id);
-- Compound indexes: tenant_id as leading column
CREATE INDEX idx_orders_tenant ON orders (tenant_id, created_at DESC);
-- Row-Level Security enforces isolation automatically
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON orders
USING (tenant_id = current_setting('app.tenant_id')::BIGINT);
-- PgBouncer: one pool per tenant, sized to tenant tier
-- Partition by tenant_id hash for largest tenants
-- Use Citus for horizontal scale beyond single-node capacity
📝 Knowledge Check
These questions mirror real senior-level PostgreSQL architecture and internals interview scenarios.