Expert SQL Server Interview Questions and Answers

🗄️ Expert SQL Server Interview Questions

This lesson targets senior DBAs, architects, and lead engineers. Topics include the transaction log and WAL, Always On Availability Groups, replication, In-Memory OLTP, Query Store, Extended Events, deadlock analysis, index internals, TDE, row-level security, CLR integration, wait statistics, and zero-downtime maintenance. These questions reveal whether you understand SQL Server deeply or just write T-SQL.

Questions & Answers

01 What is the SQL Server transaction log and how does it work? ►

Internals The transaction log (LDF file) records every change to the database in sequential order before those changes are written to data pages. This Write-Ahead Logging (WAL) pattern is what gives SQL Server its durability and crash recovery capabilities.

How the log works:

Every DML operation writes a log record describing the change (before and after image)
Log records are written to a circular log buffer, then flushed to the LDF file on COMMIT or when the buffer fills
Data pages are written to disk asynchronously (lazy writer, checkpoint) — but the log is always ahead
On crash: SQL Server replays the log from the last checkpoint (REDO pass) and rolls back any incomplete transactions (UNDO pass)
Log records are only reusable (virtual log files recycled) after a log backup or checkpoint (SIMPLE recovery model)

-- Check transaction log usage
DBCC SQLPERF(LOGSPACE);  -- log size and used % per database

SELECT name, log_reuse_wait_desc FROM sys.databases;
-- log_reuse_wait_desc shows WHY the log cannot be reused:
-- NOTHING = no wait (healthy)
-- LOG_BACKUP = waiting for log backup (FULL/BULK_LOGGED recovery)
-- ACTIVE_TRANSACTION = long-running transaction holding the log open
-- REPLICATION = log reader agent hasn't read transactions yet
-- DATABASE_MIRRORING / HADR = secondary hasn't received / hardened log

-- Recovery models
ALTER DATABASE MyDb SET RECOVERY FULL;          -- all operations logged, log backups needed
ALTER DATABASE MyDb SET RECOVERY SIMPLE;        -- log auto-truncated at checkpoint, no log backups
ALTER DATABASE MyDb SET RECOVERY BULK_LOGGED;   -- minimal logging for bulk ops (BULK INSERT, etc.)

02 What is Always On Availability Groups (AG) in SQL Server? ►

HA/DR Always On Availability Groups (SQL Server 2012+) is the premier high availability and disaster recovery solution. An AG contains a set of databases (the availability group) that fail over together. The primary replica serves reads and writes; secondary replicas receive and apply transaction log hardening.

Key concepts:

Primary replica — the read/write instance; only one at a time
Secondary replicas — receive log stream from primary; can be readable for offloading reporting
Synchronous commit — primary waits for secondary to harden the log before committing. Zero data loss on failover. Small latency overhead.
Asynchronous commit — primary does not wait. No latency impact but potential for data loss on failover. Used for geographically distant replicas.
Listener — virtual name/IP that clients connect to; automatically routes to the current primary
Automatic failover — with synchronous commit and WSFC; RTO typically < 30 seconds
Read-scale secondary — route heavy reporting queries to a synchronous secondary to relieve primary load

-- Monitor AG health
SELECT ag.name AS AG_Name, ars.role_desc, ars.operational_state_desc,
       ars.synchronization_health_desc, adc.database_name
FROM sys.availability_groups ag
JOIN sys.availability_replicas ar ON ar.group_id = ag.group_id
JOIN sys.dm_hadr_availability_replica_states ars ON ars.replica_id = ar.replica_id
JOIN sys.dm_hadr_database_replica_states drs ON drs.replica_id = ar.replica_id
JOIN sys.availability_databases_cluster adc ON adc.group_id = ag.group_id;

-- Check replication lag (on primary)
SELECT ar.replica_server_name, drs.synchronization_state_desc,
       drs.log_send_queue_size, drs.redo_queue_size
FROM sys.availability_replicas ar
JOIN sys.dm_hadr_database_replica_states drs ON drs.replica_id = ar.replica_id;

03 What is In-Memory OLTP (Hekaton) in SQL Server? ►

In-Memory In-Memory OLTP (code-named Hekaton, introduced in SQL Server 2014) stores tables entirely in RAM with an optimistic, multi-version concurrency control (MVCC) engine. It eliminates lock/latch contention and uses natively compiled stored procedures — C++ compiled from T-SQL for near-zero execution overhead.

-- Enable memory-optimised filegroup
ALTER DATABASE MyDb ADD FILEGROUP mem_fg CONTAINS MEMORY_OPTIMIZED_DATA;
ALTER DATABASE MyDb ADD FILE (NAME='mem', FILENAME='C:\Data\mem') TO FILEGROUP mem_fg;

-- Create a memory-optimised table
CREATE TABLE dbo.SessionCache (
    SessionId   UNIQUEIDENTIFIER NOT NULL PRIMARY KEY NONCLUSTERED HASH
                WITH (BUCKET_COUNT = 131072),  -- must be power of 2
    UserId      INT              NOT NULL,
    Data        NVARCHAR(4000)   NULL,
    ExpiresAt   DATETIME2        NOT NULL,

    INDEX IX_SessionCache_UserId HASH (UserId)
        WITH (BUCKET_COUNT = 131072)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
-- DURABILITY options:
-- SCHEMA_AND_DATA — durable (survives restart, written to disk checkpoint files)
-- SCHEMA_ONLY     — data lost on restart (session caches, staging tables)

-- Natively compiled stored procedure
CREATE PROCEDURE dbo.usp_GetSession @SessionId UNIQUEIDENTIFIER
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'English')
    SELECT UserId, Data FROM dbo.SessionCache
    WHERE SessionId = @SessionId AND ExpiresAt > SYSDATETIME();
END;

-- Performance gains: 10-30x faster for hot-path OLTP workloads
-- Limitations: no LOB columns, no FK to disk-based tables, no triggers, limited T-SQL support

04 What is the Query Store in SQL Server? ►

Performance Query Store (SQL Server 2016+) automatically captures every query’s execution plans, runtime statistics, and wait statistics over time. It persists this data across restarts — enabling “time travel” to diagnose when and why a query regressed.

-- Enable Query Store
ALTER DATABASE MyDb SET QUERY_STORE = ON
(
    OPERATION_MODE              = READ_WRITE,
    CLEANUP_POLICY              = (STALE_QUERY_THRESHOLD_DAYS = 30),
    DATA_FLUSH_INTERVAL_SECONDS = 900,
    INTERVAL_LENGTH_MINUTES     = 60,
    MAX_STORAGE_SIZE_MB         = 1000,
    QUERY_CAPTURE_MODE          = AUTO   -- or ALL / NONE
);

-- Find top 10 queries by avg CPU over last 24h
SELECT TOP 10
    q.query_id,
    qt.query_sql_text,
    rs.avg_cpu_time / 1000.0            AS avg_cpu_ms,
    rs.avg_duration / 1000.0            AS avg_duration_ms,
    rs.count_executions
FROM sys.query_store_query q
JOIN sys.query_store_query_text  qt ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan        p  ON p.query_id = q.query_id
JOIN sys.query_store_runtime_stats rs ON rs.plan_id = p.plan_id
JOIN sys.query_store_runtime_stats_interval ri ON ri.runtime_stats_interval_id = rs.runtime_stats_interval_id
WHERE ri.start_time >= DATEADD(HOUR, -24, GETUTCDATE())
ORDER BY rs.avg_cpu_time DESC;

-- Force a specific execution plan (plan regression fix)
EXEC sp_query_store_force_plan @query_id = 42, @plan_id = 7;

-- Unforce a plan
EXEC sp_query_store_unforce_plan @query_id = 42, @plan_id = 7;

-- Automatic plan correction (SQL Server 2017+ Enterprise)
ALTER DATABASE MyDb SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON);

05 What are Extended Events (XEvents) in SQL Server? ►

Diagnostics Extended Events (XE) is SQL Server’s lightweight, flexible diagnostic tracing system — the modern replacement for SQL Trace and Profiler. XE has much lower overhead than Profiler and can capture hundreds of different events.

-- Create an XE session to capture slow queries
CREATE EVENT SESSION [SlowQueries] ON SERVER
ADD EVENT sqlserver.sql_statement_completed (
    ACTION (
        sqlserver.sql_text,
        sqlserver.plan_handle,
        sqlserver.username,
        sqlserver.database_name,
        sqlserver.client_app_name
    )
    WHERE duration > 1000000  -- duration is in microseconds (1s = 1,000,000)
),
ADD EVENT sqlserver.rpc_completed (
    WHERE duration > 1000000
)
ADD TARGET package0.ring_buffer (SET max_memory = 51200),  -- in-memory
ADD TARGET package0.event_file  (SET filename = 'C:\XE\SlowQueries.xel',
                                   max_file_size = 100,        -- MB
                                   max_rollover_files = 5);

-- Start/Stop
ALTER EVENT SESSION [SlowQueries] ON SERVER STATE = START;
ALTER EVENT SESSION [SlowQueries] ON SERVER STATE = STOP;

-- Read from the ring buffer
SELECT
    xdr.value('(event/@name)[1]',            'varchar(100)') AS event_name,
    xdr.value('(event/@timestamp)[1]',        'datetime2')    AS event_time,
    xdr.value('(event/data[@name="duration"]/value)[1]', 'bigint') / 1000.0 AS duration_ms,
    xdr.value('(event/action[@name="sql_text"]/value)[1]', 'nvarchar(max)') AS sql_text
FROM (
    SELECT CAST(target_data AS XML) AS target_data
    FROM sys.dm_xe_sessions s
    JOIN sys.dm_xe_session_targets t ON t.event_session_address = s.address
    WHERE s.name = 'SlowQueries' AND t.target_name = 'ring_buffer'
) AS d
CROSS APPLY target_data.nodes('RingBufferTarget/event') AS t(xdr)
ORDER BY duration_ms DESC;

DROP EVENT SESSION [SlowQueries] ON SERVER;

06 How do you analyse and resolve deadlocks in SQL Server? ►

Concurrency A deadlock occurs when two or more sessions are each waiting for the other to release a lock — a circular wait. SQL Server’s deadlock monitor automatically detects and resolves deadlocks by choosing a victim (rolls back one transaction).

-- Enable deadlock trace flag (adds deadlock graph to error log)
DBCC TRACEON(1222, -1);   -- detailed deadlock info
DBCC TRACEON(1204, -1);   -- simpler format (legacy)

-- Capture deadlocks with Extended Events (preferred — zero overhead)
CREATE EVENT SESSION [DeadlockCapture] ON SERVER
ADD EVENT sqlserver.xml_deadlock_report
ADD TARGET package0.event_file (SET filename = 'C:\XE\Deadlocks.xel');
ALTER EVENT SESSION [DeadlockCapture] ON SERVER STATE = START;

-- Read deadlock XML
SELECT
    xdr.value('(event/@timestamp)[1]',  'datetime2') AS deadlock_time,
    xdr.query('(event/data/value/deadlock)[1]')       AS deadlock_graph
FROM (
    SELECT CAST(target_data AS XML) AS target_data
    FROM sys.dm_xe_sessions s
    JOIN sys.dm_xe_session_targets t ON t.event_session_address = s.address
    WHERE s.name = 'DeadlockCapture'
) d CROSS APPLY target_data.nodes('RingBufferTarget/event') t(xdr);

-- Common deadlock causes and fixes:
-- 1. Inconsistent lock order: T1 locks A then B; T2 locks B then A
--    Fix: always access tables in the same order in all transactions
-- 2. Long transactions holding locks
--    Fix: keep transactions short; commit or rollback promptly
-- 3. Missing indexes causing table scans that lock too many rows
--    Fix: add indexes to narrow the lock scope
-- 4. Implicit transactions (IMPLICIT_TRANSACTIONS ON)
--    Fix: use explicit BEGIN TRANSACTION / COMMIT
-- 5. Hot rows (many sessions updating the same row)
--    Fix: RCSI, optimistic concurrency, or redesign access pattern

07 What is Transparent Data Encryption (TDE) in SQL Server? ►

Security TDE encrypts all data at rest — data files (MDF/NDF), log files (LDF), and backups — at the page level using AES-256. Encryption and decryption are transparent to applications. Protects against theft of physical media or data files.

-- 1. Create a master key in master database (if not exists)
USE master;
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'StrongPassword!1';

-- 2. Create a certificate to protect the DEK
CREATE CERTIFICATE TDE_Cert WITH SUBJECT = 'TDE Certificate';

-- 3. Enable TDE on the target database
USE MyDatabase;
CREATE DATABASE ENCRYPTION KEY
    WITH ALGORITHM = AES_256
    ENCRYPTION BY SERVER CERTIFICATE TDE_Cert;

ALTER DATABASE MyDatabase SET ENCRYPTION ON;

-- 4. Verify encryption state
SELECT db_name(database_id) AS DB, encryption_state_desc, percent_complete
FROM sys.dm_database_encryption_keys;
-- encryption_state: 0=no encryption, 1=unencrypted, 2=encrypting,
--                   3=encrypted, 4=key change in progress, 5=decrypting

-- 5. CRITICAL: Back up the certificate immediately!
-- If you lose the certificate, you CANNOT restore the database
BACKUP CERTIFICATE TDE_Cert
    TO FILE = 'C:\Backups\TDE_Cert.cer'
    WITH PRIVATE KEY (
        FILE      = 'C:\Backups\TDE_Cert.pvk',
        ENCRYPTION BY PASSWORD = 'PrivKeyPassword!1'
    );

-- TDE does NOT protect: data in transit (use TLS), data in memory,
-- backups without the certificate, tempdb (though it is auto-encrypted when any DB uses TDE)

08 What is Row-Level Security (RLS) in SQL Server? ►

Security Row-Level Security (SQL Server 2016+) restricts which rows a user can see or modify by automatically appending a filter predicate to every query — enforced at the database level regardless of the application.

-- Use case: multi-tenant SaaS — each user sees only their own data

-- Step 1: Create a security predicate function
CREATE SCHEMA Security;
GO

CREATE FUNCTION Security.fn_TenantFilter (@TenantId INT)
RETURNS TABLE
WITH SCHEMABINDING
AS RETURN
(
    SELECT 1 AS AccessGranted
    WHERE @TenantId = CAST(SESSION_CONTEXT(N'TenantId') AS INT)
        OR IS_MEMBER('db_owner') = 1  -- admins bypass the filter
);
GO

-- Step 2: Create a security policy on the Orders table
CREATE SECURITY POLICY TenantPolicy
ADD FILTER PREDICATE Security.fn_TenantFilter(TenantId) ON dbo.Orders,
ADD BLOCK  PREDICATE Security.fn_TenantFilter(TenantId) ON dbo.Orders AFTER INSERT;
-- FILTER predicate: restricts SELECT
-- BLOCK predicate AFTER INSERT: prevents inserting rows for another tenant

-- Step 3: At connection time, set the tenant context
EXEC sp_set_session_context N'TenantId', 42;  -- called by application after login

-- Now: SELECT * FROM Orders returns ONLY rows where TenantId = 42
-- No code change needed in queries — fully transparent

-- Disable/Enable policy
ALTER SECURITY POLICY TenantPolicy WITH (STATE = OFF);
ALTER SECURITY POLICY TenantPolicy WITH (STATE = ON);

DROP SECURITY POLICY TenantPolicy;

-- Column-level security: use GRANT/DENY on columns
GRANT SELECT ON Orders (Id, Status, CreatedAt) TO ReportingRole;
DENY  SELECT ON Orders (CreditCardToken)       TO ReportingRole;

09 What are wait statistics in SQL Server and how do you use them? ►

Performance When a SQL Server thread cannot run because it is waiting for something (a lock, I/O, memory, CPU), it records a wait. Wait statistics in sys.dm_os_wait_stats reveal the primary bottleneck on your server — the single most useful starting point for performance analysis.

-- Top wait types (cumulative since last restart or reset)
SELECT TOP 20
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    signal_wait_time_ms,                           -- CPU runnable queue time
    wait_time_ms - signal_wait_time_ms AS resource_wait_ms,
    CAST(100.0 * wait_time_ms / SUM(wait_time_ms) OVER() AS DECIMAL(5,2)) AS pct
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN (
    -- Benign background waits to filter out
    'SLEEP_TASK','SQLTRACE_BUFFER_FLUSH','WAITFOR','BROKER_TO_FLUSH',
    'CHECKPOINT_QUEUE','REQUEST_FOR_DEADLOCK_SEARCH','HADR_FILESTREAM_IOMGR_IOCOMPLETION',
    'DISPATCHER_QUEUE_SEMAPHORE','XE_TIMER_EVENT','BROKER_EVENTHANDLER',
    'ONDEMAND_TASK_QUEUE','PREEMPTIVE_OS_AUTHENTICATIONOPS','SNI_HTTP_ACCEPT',
    'SLEEP_DBSTARTUP','SLEEP_DCOMSTARTUP','SLEEP_MASTERDBREADY','SLEEP_MASTERMDREADY',
    'SLEEP_MASTERUPGRADED','SLEEP_MSDBSTARTUP','SLEEP_TEMPDBSTARTUP','SLEEP_SYSTEMTASK',
    'SLEEP_MEMORYPOOL_FLUSH','SLEEP_BUFFERPOOL_HELPLW','SLEEP_TEXTPTR_INITIALIZE',
    'DBMIRROR_EVENTS_QUEUE','SQLTRACE_INCREMENTAL_FLUSH_SLEEP','XE_DISPATCHER_WAIT',
    'HADR_WORK_QUEUE','HADR_LOG_CAPTURE_WAIT','HADR_TRANSPORT_DRAINED'
)
ORDER BY wait_time_ms DESC;

-- Common waits and what they mean:
-- PAGEIOLATCH_SH/EX — disk I/O for data pages (add faster storage/memory)
-- LCK_M_* — locks (long transactions, missing indexes, hot rows)
-- CXPACKET — parallel query exchange (tune max degree of parallelism)
-- SOS_SCHEDULER_YIELD — CPU pressure (upgrade CPU, tune queries)
-- WRITELOG — log write wait (upgrade to SSD, increase log I/O)
-- RESOURCE_SEMAPHORE — query memory grants queuing (too many large parallel queries)

-- Reset wait stats
DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR);

10 What is the Buffer Pool and how does SQL Server manage memory? ►

Internals The Buffer Pool (Buffer Cache) is SQL Server’s primary in-memory cache for data and index pages. It grows dynamically to use available RAM and shrinks when the OS requests memory back (Memory Manager).

-- Memory configuration
SELECT name, value_in_use, description
FROM sys.configurations
WHERE name IN ('max server memory (MB)', 'min server memory (MB)', 'max degree of parallelism',
               'cost threshold for parallelism', 'optimize for ad hoc workloads');

-- Memory breakdown
SELECT
    physical_memory_in_use_kb  / 1024.0 AS process_memory_mb,
    locked_page_allocations_kb / 1024.0 AS locked_pages_mb,
    virtual_address_space_reserved_kb / 1024.0 AS vas_reserved_mb
FROM sys.dm_os_process_memory;

-- Buffer pool usage by database
SELECT
    DB_NAME(database_id) AS Database_Name,
    COUNT(*)             AS Cached_Pages,
    COUNT(*) * 8.0 / 1024 AS Cached_MB
FROM sys.dm_os_buffer_descriptors
WHERE database_id > 4
GROUP BY database_id
ORDER BY Cached_MB DESC;

-- Page life expectancy (PLE) — how long a page stays in cache
-- Rule of thumb: PLE should be > 300s for healthy systems
-- Modern systems with lots of RAM: aim for > 1000s
SELECT
    cntr_value AS PageLifeExpectancy_Seconds
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy'
  AND object_name LIKE '%Buffer Manager%';

-- Recommended max server memory = Total RAM - OS (4GB) - SSRS/SSIS overhead
-- Leaving RAM for the OS prevents paging which kills performance

11 What are SQL Server Agent jobs and how do you use them for maintenance? ►

Operations SQL Server Agent is an automation service that schedules and runs jobs — T-SQL scripts, SSIS packages, PowerShell, or command-line tools — on a defined schedule or triggered by alerts.

-- Create a maintenance job programmatically
USE msdb;
GO

EXEC sp_add_job
    @job_name = N'Weekly_Index_Maintenance',
    @description = N'Rebuild or reorganise fragmented indexes',
    @enabled = 1;

-- Add a step to run T-SQL
EXEC sp_add_jobstep
    @job_name  = N'Weekly_Index_Maintenance',
    @step_name = N'Rebuild Indexes',
    @command   = N'
EXEC dbo.IndexOptimize   -- Ola Hallengren solution (recommended)
    @Databases   = ''MyDatabase'',
    @FragmentationLow    = NULL,       -- don''t touch low fragmentation
    @FragmentationMedium = ''INDEX_REORGANIZE'',
    @FragmentationHigh   = ''INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE'',
    @FragmentationLevel1 = 5,
    @FragmentationLevel2 = 30,
    @UpdateStatistics    = ''ALL'';
    ';

-- Add a weekly schedule
EXEC sp_add_schedule
    @schedule_name = N'Every_Sunday_2AM',
    @freq_type     = 8,         -- weekly
    @freq_interval = 1,         -- Sunday
    @active_start_time = 20000; -- 2:00 AM

EXEC sp_attach_schedule @job_name = N'Weekly_Index_Maintenance',
                         @schedule_name = N'Every_Sunday_2AM';

EXEC sp_add_jobserver @job_name = N'Weekly_Index_Maintenance', @server_name = @@SERVERNAME;

-- Ola Hallengren maintenance solution (industry standard):
-- IndexOptimize   — smart index rebuild/reorganise
-- DatabaseBackup  — backup with retention management
-- DatabaseIntegrityCheck — DBCC CHECKDB with alerting
-- Download from: ola.hallengren.com

12 What is CLR integration in SQL Server and when should you use it? ►

Advanced SQL Server CLR integration allows you to write stored procedures, functions, triggers, types, and aggregates in .NET languages (C#, VB.NET) and deploy them into SQL Server. The CLR code runs inside the SQL Server process.

-- Enable CLR integration
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;

-- For .NET 8 / SQL Server 2022: also enable strict security mode bypass
-- (or sign the assembly with a trusted certificate)
ALTER DATABASE MyDb SET TRUSTWORTHY ON;  -- legacy option
-- OR: use asymmetric key-based trusted assemblies

-- C# example: regex string function
-- (in Visual Studio, create a SQL Server Database Project)
// RegexFunctions.cs
using Microsoft.SqlServer.Server;
using System.Text.RegularExpressions;
using System.Data.SqlTypes;

public static class RegexFunctions
{
    [SqlFunction(IsDeterministic = true, IsPrecise = true)]
    public static SqlBoolean RegexMatch(SqlString input, SqlString pattern)
    {
        if (input.IsNull || pattern.IsNull) return SqlBoolean.Null;
        return Regex.IsMatch(input.Value, pattern.Value);
    }

    [SqlFunction(IsDeterministic = true)]
    public static SqlString RegexReplace(SqlString input, SqlString pattern, SqlString replacement)
        => Regex.Replace(input.Value, pattern.Value, replacement.Value);
}

-- Deploy and register
CREATE ASSEMBLY RegexFunctions FROM 'C:\Assemblies\RegexFunctions.dll'
    WITH PERMISSION_SET = SAFE;

CREATE FUNCTION dbo.RegexMatch(@input NVARCHAR(MAX), @pattern NVARCHAR(500))
RETURNS BIT AS EXTERNAL NAME RegexFunctions.[RegexFunctions].RegexMatch;

-- Use it
SELECT Name FROM Products WHERE dbo.RegexMatch(Name, '^[A-Z]{3}-\d{4}$') = 1;

-- When to use CLR:
-- Complex string manipulation / regex (T-SQL is verbose)
-- Mathematical computations (T-SQL struggles)
-- Custom aggregates (e.g., median, geometric mean)
-- Integration with .NET libraries unavailable in T-SQL
-- NOT recommended for: standard CRUD, simple calculations (T-SQL overhead is zero)

13 What is the difference between online and offline index operations? ►

Operations

OFFLINE index rebuild — takes a schema modification lock (SCH-M) that blocks all access to the table during the entire operation. Safe on any SQL Server edition but causes downtime on production tables.
ONLINE index rebuild — uses row versioning and lock management to allow DML (INSERT/UPDATE/DELETE) to continue during the rebuild. A brief SCH-M lock is taken only at the very start and end. Requires Enterprise or Developer edition. Available in Standard edition from SQL Server 2022 for non-clustered indexes.
RESUMABLE index operations (SQL Server 2017+) — pauses and resumes large index builds/rebuilds. Enables scheduling across multiple maintenance windows.

-- Offline rebuild (blocks all access — avoid on production during business hours)
ALTER INDEX ALL ON Orders REBUILD;

-- Online rebuild (Enterprise — allows concurrent access)
ALTER INDEX IX_Orders_Customer ON Orders REBUILD WITH (ONLINE = ON);

-- Online rebuild with low priority wait management
ALTER INDEX IX_Orders_Customer ON Orders REBUILD WITH (
    ONLINE = ON (
        WAIT_AT_LOW_PRIORITY (
            MAX_DURATION = 5 MINUTES,
            ABORT_AFTER_WAIT = SELF  -- cancel rebuild if blocked for 5min
            -- ABORT_AFTER_WAIT = BLOCKERS  -- kill blockers instead
        )
    )
);

-- Resumable index build (can be paused and resumed)
ALTER INDEX IX_Orders_Customer ON Orders REBUILD WITH (ONLINE = ON, RESUMABLE = ON, MAX_DURATION = 120);  -- pause after 120 minutes
ALTER INDEX IX_Orders_Customer ON Orders PAUSE;   -- pause the operation
ALTER INDEX IX_Orders_Customer ON Orders RESUME;  -- resume from where it stopped
ALTER INDEX IX_Orders_Customer ON Orders ABORT;   -- cancel entirely

-- Check resumable operations in progress
SELECT * FROM sys.index_resumable_operations;

14 What are common SQL Server anti-patterns and how do you fix them? ►

Best Practices

SELECT * in production code — fetches all columns; unnecessary I/O; breaks when schema changes. Fix: always specify needed columns.
Function on indexed column in WHERE — WHERE YEAR(OrderDate) = 2025 defeats the index. Fix: WHERE OrderDate >= '2025-01-01' AND OrderDate < '2026-01-01' (SARGable).
NOLOCK hint everywhere — “dirty reads” return incorrect, inconsistent data. Fix: enable RCSI instead; avoid NOLOCK except for truly approximate counts on static data.
Cursors for row-by-row processing — 100x slower than set-based operations. Fix: replace with set-based UPDATE/INSERT or window functions.
Non-SARGable predicates — any predicate that prevents index seeks: LIKE '%word', ISNULL(col,0) = 5, CONVERT(DATE, col). Fix: restructure to allow index seeks.
Missing indexes on foreign keys — SQL Server does NOT auto-create FK indexes. Fix: index all FK columns used in JOINs.
Implicit data type conversions — comparing NVARCHAR column to VARCHAR parameter causes a CONVERT_IMPLICIT warning and index scan. Fix: match data types exactly in parameters and variables.

-- ❌ Non-SARGable
WHERE CONVERT(DATE, CreatedAt) = '2026-01-15'   -- forces convert on every row
WHERE ISNULL(Status, 'Unknown') = 'Pending'     -- prevents index use
WHERE YEAR(OrderDate) = 2025

-- ✅ SARGable (index can be used)
WHERE CreatedAt >= '2026-01-15' AND CreatedAt < '2026-01-16'
WHERE (Status = 'Pending' OR Status IS NULL)
WHERE OrderDate >= '2025-01-01' AND OrderDate < '2026-01-01'

15 What is Database Sharding and how does it compare to SQL Server partitioning? ►

Architecture

Table Partitioning (SQL Server built-in) — divides a single logical table into physical partitions stored on the same SQL Server instance (potentially different filegroups/disks). Transparent to applications. One server still handles all load. Best for managing data lifecycle (archiving, fast partition switching) and query performance on time-series data.
Sharding — distributes data across multiple SQL Server instances (shards), each holding a subset of the data (e.g., customers A-M on shard 1, N-Z on shard 2). The application (or a routing layer) decides which shard to query. Provides horizontal scale-out. Much higher complexity: cross-shard joins, distributed transactions, resharding.

-- Partitioning: same server, different filegroups, transparent to app
-- Partitioning is suitable when:
-- 1. Table has 100M+ rows and queries filter by the partition column
-- 2. You need to archive old partitions quickly (SWITCH operation = instant)
-- 3. One SQL Server instance has enough capacity

-- Sharding: multiple servers, app-level routing
-- Sharding is suitable when:
-- 1. Single server I/O / memory / CPU is a genuine bottleneck
-- 2. Total data exceeds what one server can handle
-- 3. You can afford the complexity of distributed data management

-- Elastic Database (Azure SQL) — managed sharding for Azure SQL
-- Elastic Query — query across shards from a single connection point
-- Elastic Transactions — distributed transactions across Azure SQL shards

-- Alternative to sharding for read scale: Always On readable secondary
-- Route SELECT queries to the secondary replica; primary handles writes only

16 What is Query Plan Caching and how does it work? ►

Internals

-- SQL Server compiles each query into an execution plan and caches it in the plan cache
-- (part of the buffer pool). On the next identical query, the cached plan is reused.

-- The plan cache key includes the exact query text (case-sensitive in some configs)
-- These THREE queries each get their OWN cache entry:
SELECT * FROM Orders WHERE CustomerId = 1;
Select * From Orders Where CustomerId = 1;   -- different case
SELECT * FROM Orders WHERE CustomerId = 2;   -- different literal value!

-- Ad hoc queries: each unique literal value creates a new plan (plan cache pollution)
-- Solution: use sp_executesql with parameters
EXEC sp_executesql
    N'SELECT * FROM Orders WHERE CustomerId = @cid',
    N'@cid INT',
    @cid = 1;
-- Same plan is reused for @cid = 2, 42, 999 — one plan for all values

-- Optimise for ad hoc workloads (cache only single-use plan stubs, not full plans)
EXEC sp_configure 'optimize for ad hoc workloads', 1;
RECONFIGURE;  -- strongly recommended for OLTP servers with many unique queries

-- Inspect the plan cache
SELECT TOP 20
    qs.execution_count,
    qs.total_worker_time / 1000 AS total_cpu_ms,
    SUBSTRING(st.text, 1, 200)  AS query_text,
    qp.query_plan
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle)   AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
ORDER BY qs.total_worker_time DESC;

-- Clear the plan cache (development/testing only — never on production without reason)
DBCC FREEPROCCACHE;
DBCC FREEPROCCACHE (plan_handle);  -- clear a specific plan

17 What are the SQL Server backup types and how do you build a backup strategy? ►

Operations

Full backup — entire database; self-contained; longest restore but simplest. Taken weekly or daily.
Differential backup — all changes since the last FULL backup. Faster than full; restore requires full + latest differential.
Transaction log backup — all log records since the last log backup (FULL or BULK_LOGGED recovery only). Enables point-in-time recovery. Taken every 5-60 minutes.
Copy-only backup — standalone full or log backup that does not affect the backup chain (used for one-off copies).
File/filegroup backup — backup individual files for very large databases.

-- Example backup strategy for FULL recovery model (mission-critical):
-- Sunday 01:00  — Full backup
-- Mon-Sat 01:00 — Differential backup
-- Every 15 min  — Transaction log backup
-- Retention: 4 weekly full backups

-- Take backups
BACKUP DATABASE MyDb TO DISK = 'C:\Backups\MyDb_Full.bak'
    WITH COMPRESSION, CHECKSUM, STATS = 10;

BACKUP DATABASE MyDb TO DISK = 'C:\Backups\MyDb_Diff.bak'
    WITH DIFFERENTIAL, COMPRESSION, CHECKSUM;

BACKUP LOG MyDb TO DISK = 'C:\Backups\MyDb_Log_20260420_1400.bak'
    WITH COMPRESSION, CHECKSUM;

-- Restore sequence (point-in-time to 2026-04-20 14:23:45)
RESTORE DATABASE MyDb FROM DISK = 'C:\Backups\MyDb_Full.bak'
    WITH NORECOVERY;  -- keep in restoring state for more restores

RESTORE DATABASE MyDb FROM DISK = 'C:\Backups\MyDb_Diff.bak'
    WITH NORECOVERY;

RESTORE LOG MyDb FROM DISK = 'C:\Backups\MyDb_Log_20260420_1400.bak'
    WITH NORECOVERY,
    STOPAT = '2026-04-20 14:23:45';  -- stop at specific point

RESTORE DATABASE MyDb WITH RECOVERY;  -- bring online

-- Verify backup
RESTORE VERIFYONLY FROM DISK = 'C:\Backups\MyDb_Full.bak' WITH CHECKSUM;

18 What are common SQL Server 2022 new features? ►

SQL Server 2022

Azure Synapse Link for SQL Server — real-time analytical offload; replicates changes to Azure Synapse Analytics without ETL
Intelligent Query Processing (IQP) improvements — Parameter Sensitive Plan (PSP) optimisation; multiple plans per query for different parameter ranges (fixes parameter sniffing automatically)
Ledger tables — append-only, tamper-evident tables with cryptographic proof of data integrity; ideal for audit, financial, and compliance records
JSON improvements — JSON_PATH_EXISTS, JSON_OBJECT, JSON_ARRAY construction functions
DATETRUNC — truncate dates to a specified precision (year, month, day, hour, etc.)
TRIM improvements — TRIM(characters FROM string) to remove specific characters
IS [NOT] DISTINCT FROM — NULL-safe comparison operator
GENERATE_SERIES — generates a series of numbers (replaces recursive CTEs for this)
Resumable add constraint — add FK/unique constraints with WITH (ONLINE = ON, RESUMABLE = ON)
Improved online index operations — online index builds for Standard edition (limited)
Buffer pool parallel scan — faster checkpoint and database startup

-- GENERATE_SERIES — create a number table inline
SELECT value FROM GENERATE_SERIES(1, 100) AS s;
SELECT CAST(DATEADD(DAY, value-1, '2026-01-01') AS DATE) AS CalendarDate
FROM GENERATE_SERIES(1, 365);

-- DATETRUNC
SELECT DATETRUNC(MONTH, GETDATE());   -- first day of current month

-- IS DISTINCT FROM (NULL-safe compare)
SELECT * FROM Products p1, Products p2
WHERE p1.Price IS DISTINCT FROM p2.Price;   -- true even when both are NULL

19 How do you implement zero-downtime schema changes in SQL Server? ►

Operations Many DDL operations take a Schema Modification Lock (SCH-M) that blocks all reads and writes. For production tables receiving high traffic, careful phased migrations are required.

-- SAFE: Add a nullable column — instant, no lock
ALTER TABLE Orders ADD Priority INT NULL;  -- 0ms, no blocking

-- DANGEROUS: Add NOT NULL column with DEFAULT on large table
-- Rebuilds entire table in older SQL Server versions
ALTER TABLE Orders ADD IsUrgent BIT NOT NULL DEFAULT 0;
-- SQL Server 2012+: this is safe — default is stored as metadata,
-- not backfilled physically until a row is modified (online metadata change)
-- But large in-place backfill may still occur in some scenarios

-- Safe pattern: 3-phase migration
-- Phase 1: Add nullable column (instant)
ALTER TABLE Orders ADD Priority INT NULL;

-- Phase 2: Backfill in small batches (non-blocking)
DECLARE @batchSize INT = 10000;
WHILE 1 = 1
BEGIN
    UPDATE TOP (@batchSize) Orders SET Priority = 0 WHERE Priority IS NULL;
    IF @@ROWCOUNT = 0 BREAK;
    WAITFOR DELAY '00:00:00.100';  -- brief pause to reduce I/O pressure
END;

-- Phase 3: Add NOT NULL constraint NOT VALID, then validate (non-blocking)
ALTER TABLE Orders ADD CONSTRAINT DF_Orders_Priority DEFAULT 0 FOR Priority;
ALTER TABLE Orders ALTER COLUMN Priority INT NOT NULL;

-- Add index online (no table lock)
CREATE INDEX IX_Orders_Priority ON Orders (Priority) WITH (ONLINE = ON);

-- Add FK constraint without validation initially (instant), then validate in background
ALTER TABLE Orders ADD CONSTRAINT FK_Orders_Customer
    FOREIGN KEY (CustomerId) REFERENCES Customers(Id)
    WITH NOCHECK;   -- don't validate existing rows immediately
-- Later, when load is low:
ALTER TABLE Orders WITH CHECK CHECK CONSTRAINT FK_Orders_Customer;

20 What is SQL Server Replication and what types are available? ►

HA/DR

Snapshot Replication — periodically takes a full snapshot of the published data and delivers it to subscribers. High latency (hours to days). Simple but not real-time. Good for reference/lookup data that changes infrequently.
Transactional Replication — the Log Reader Agent reads committed transactions from the publisher’s transaction log and delivers them to subscribers via the Distribution database. Near-real-time (seconds to minutes). Good for read scale-out and reporting.
Merge Replication — both publisher and subscribers can modify data; changes are merged using a conflict resolution policy. Complex, higher overhead. Used for disconnected scenarios (laptops syncing to a central server).
Peer-to-Peer Transactional Replication — all nodes can read and write; changes propagate to all peers. Requires careful application design to avoid conflicts. Used for multi-site write scale-out.

-- Transactional replication components:
-- Publisher:    source database (tables being replicated)
-- Distributor:  holds the Distribution database (queue of transactions)
-- Subscriber:   destination database receiving changes

-- Monitor replication
SELECT * FROM distribution.dbo.MSdistribution_status;
SELECT * FROM sys.dm_repl_articles;
EXEC sp_replmonitorhelppublisher;   -- replication monitor info

-- Replication vs Always On:
-- Always On: block-level (entire database); automatic failover; built-in HA
-- Replication: row/article level; selective; subscribers can be on older SQL versions;
--              can filter by row/column; can transform data during replication

21 How do you design a highly available SQL Server architecture? ►

Architecture

Recommended production HA stack:

Always On Availability Groups — primary + 2 synchronous secondaries (one in the same datacenter, one in a DR site); automatic failover for local failure; manual for DR
Windows Server Failover Clustering (WSFC) — cluster health monitoring and orchestration for AG failover
AG Listener — virtual name clients connect to; redirects to current primary automatically
Readable secondary — route reporting/analytics queries to the synchronous secondary to offload primary
RCSI enabled — eliminates read/write blocking on the primary
Query Store enabled — persists query performance data, enables automatic plan correction
Backup strategy — full weekly, differential daily, log every 15 minutes; tested restores monthly
Maintenance plan — Ola Hallengren scripts for index maintenance, integrity checks, statistics updates

-- Target SLAs for this architecture:
-- RPO (Recovery Point Objective): ~0 seconds (synchronous commit)
-- RTO (Recovery Time Objective):  <30 seconds (automatic failover)
-- Availability: 99.99% (four nines)

-- Azure equivalent:
-- Azure SQL Database Business Critical tier (built-in AG, multiple replicas)
-- OR: SQL Managed Instance Business Critical (near 100% compatibility)

-- Monitoring checklist:
-- ✅ AG health and replication lag (sys.dm_hadr_database_replica_states)
-- ✅ Backup job success/failure (msdb.dbo.backupset)
-- ✅ Page life expectancy trend (target >= 300s)
-- ✅ Failed login alerts (SQL Server Audit)
-- ✅ Disk space usage (model database growth)
-- ✅ Long-running transactions (sys.dm_exec_sessions)
-- ✅ Wait stats baseline (weekly snapshot and compare)

📝 Knowledge Check

These questions mirror real senior-level SQL Server DBA and architect interview scenarios.

🧠 Quiz Question 1 of 5

🧠 Quiz Question 2 of 5

🧠 Quiz Question 3 of 5

🧠 Quiz Question 4 of 5

🧠 Quiz Question 5 of 5

Tip: Senior SQL Server interviews reward architectural depth. For wait statistics, name the top three waits and what each indicates before jumping to solutions. For HA, explain RPO and RTO requirements before recommending AG vs replication. For index design, explain seek vs scan trade-offs before choosing between clustered, non-clustered, and columnstore. Context, trade-offs, and operational thinking separate expert answers from intermediate ones.