Advanced SQL Server Interview Questions and Answers

๐Ÿ“‹ Table of Contents โ–พ
  1. Questions & Answers
  2. 📝 Knowledge Check

🗄️ Advanced SQL Server Interview Questions

This lesson targets mid-to-senior SQL Server roles. Topics include CTEs, window functions, MERGE, JSON, full-text search, execution plans, index strategies, triggers, transaction isolation levels, partitioning, temporal tables, and performance troubleshooting. These questions separate SQL writers from database architects.

Questions & Answers

01 What are CTEs (Common Table Expressions)? What is a recursive CTE?

SQL

-- Basic CTE โ€” named, reusable result set
WITH HighValueCustomers AS (
    SELECT CustomerId, SUM(Total) AS LifetimeValue
    FROM Orders
    WHERE Status = 'Delivered'
    GROUP BY CustomerId
    HAVING SUM(Total) > 5000
)
SELECT c.Name, h.LifetimeValue
FROM HighValueCustomers h
INNER JOIN Customers c ON c.Id = h.CustomerId
ORDER BY h.LifetimeValue DESC;

-- Multiple CTEs
WITH
MonthlySales AS (
    SELECT CAST(DATETRUNC(MONTH, CreatedAt) AS DATE) AS Month,
           SUM(Total) AS Revenue
    FROM Orders GROUP BY DATETRUNC(MONTH, CreatedAt)
),
AvgMonthly AS (SELECT AVG(Revenue) AS AvgRev FROM MonthlySales)
SELECT m.Month, m.Revenue, a.AvgRev,
       m.Revenue - a.AvgRev AS DiffFromAvg
FROM MonthlySales m CROSS JOIN AvgMonthly a
ORDER BY m.Month;

-- Recursive CTE โ€” traverse hierarchy (org chart, categories, BOM)
WITH OrgTree AS (
    -- Base case: top-level employees (no manager)
    SELECT Id, Name, ManagerId, 0 AS Level,
           CAST(Name AS NVARCHAR(MAX)) AS Path
    FROM Employees
    WHERE ManagerId IS NULL

    UNION ALL

    -- Recursive case: employees whose manager is in the previous level
    SELECT e.Id, e.Name, e.ManagerId, ot.Level + 1,
           ot.Path + N' > ' + e.Name
    FROM Employees e
    INNER JOIN OrgTree ot ON ot.Id = e.ManagerId
)
SELECT Level, REPLICATE('  ', Level) + Name AS IndentedName, Path
FROM OrgTree
ORDER BY Path
OPTION (MAXRECURSION 100);  -- safety limit (default 100, 0 = unlimited)

02 What are window functions in SQL Server? Give practical examples.

SQL

SELECT
    Name, Department, Salary,

    -- Ranking
    ROW_NUMBER() OVER (PARTITION BY Department ORDER BY Salary DESC) AS RowNum,
    RANK()       OVER (PARTITION BY Department ORDER BY Salary DESC) AS Rank,
    DENSE_RANK() OVER (PARTITION BY Department ORDER BY Salary DESC) AS DenseRank,
    NTILE(4)     OVER (ORDER BY Salary) AS Quartile,
    PERCENT_RANK() OVER (ORDER BY Salary) AS PctRank,

    -- Aggregate as window (no collapse โ€” keeps all rows)
    SUM(Salary)  OVER (PARTITION BY Department)  AS DeptTotal,
    AVG(Salary)  OVER (PARTITION BY Department)  AS DeptAvg,
    COUNT(*)     OVER (PARTITION BY Department)  AS DeptHeadcount,
    MIN(Salary)  OVER (PARTITION BY Department)  AS DeptMin,
    MAX(Salary)  OVER (PARTITION BY Department)  AS DeptMax,

    -- Running total (cumulative sum ordered by HireDate)
    SUM(Salary) OVER (ORDER BY HireDate
                      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal,

    -- 3-row moving average
    AVG(Salary) OVER (ORDER BY HireDate
                      ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS MovingAvg,

    -- Lead / Lag โ€” access adjacent rows
    LAG(Salary,  1, 0) OVER (PARTITION BY Department ORDER BY HireDate) AS PrevSalary,
    LEAD(Salary, 1, 0) OVER (PARTITION BY Department ORDER BY HireDate) AS NextSalary,

    -- First / Last value
    FIRST_VALUE(Name) OVER (PARTITION BY Department ORDER BY Salary DESC) AS TopEarner,
    LAST_VALUE(Name)  OVER (PARTITION BY Department ORDER BY Salary DESC
                            ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS BottomEarner

FROM Employees;

03 What is the MERGE statement in SQL Server?

SQL MERGE (often called upsert) performs INSERT, UPDATE, and DELETE in a single statement by comparing a source with a target. Useful for synchronising a staging table into a production table.

-- Synchronise staging data into the Products table
MERGE dbo.Products AS target
USING dbo.ProductsStaging AS source
    ON target.Sku = source.Sku          -- match condition

WHEN MATCHED AND target.Price <> source.Price THEN
    UPDATE SET
        target.Name       = source.Name,
        target.Price      = source.Price,
        target.UpdatedAt  = SYSDATETIME()

WHEN NOT MATCHED BY TARGET THEN         -- in source but not in target โ†’ INSERT
    INSERT (Sku, Name, Price, CreatedAt)
    VALUES (source.Sku, source.Name, source.Price, SYSDATETIME())

WHEN NOT MATCHED BY SOURCE THEN         -- in target but not in source โ†’ DELETE
    DELETE

OUTPUT $action AS MergeAction,          -- 'INSERT', 'UPDATE', or 'DELETE'
       inserted.Id, deleted.Id;         -- rows affected

-- Important: always add a semicolon before MERGE (parser ambiguity)
-- Important: MERGE can cause deadlocks โ€” consider disabling row-level locking or
-- using explicit transactions with UPDLOCK hints for high-concurrency scenarios

04 What is the execution plan and how do you read it?

Performance

-- Show estimated plan (does not execute)
SET SHOWPLAN_ALL ON;
SELECT * FROM Orders WHERE CustomerId = 42;
SET SHOWPLAN_ALL OFF;

-- Show actual plan (executes the query)
-- In SSMS: Ctrl+M then F5, or Query > Include Actual Execution Plan

-- Text-based plan
SET STATISTICS PROFILE ON;
SELECT * FROM Orders WHERE CustomerId = 42;
SET STATISTICS PROFILE OFF;

-- I/O and time statistics
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

Key operators to recognise:

  • Clustered Index Scan โ€” reads ALL rows of a clustered table. Like a full table scan. Expensive on large tables.
  • Clustered Index Seek โ€” navigates the B-tree to find specific rows. Very fast. What you want.
  • Non-Clustered Index Seek + Key Lookup โ€” finds rows in a non-clustered index, then fetches additional columns from the table. The Key Lookup is the extra cost โ€” eliminate with a covering index.
  • Hash Match (Join) โ€” builds a hash table from one input and probes with the other. Good for large, unsorted inputs.
  • Nested Loops (Join) โ€” for each row in the outer input, search the inner input. Best when outer is small and inner has an index.
  • Merge Join โ€” requires both inputs sorted on the join column. Very efficient for sorted data.
  • Sort โ€” expensive; indicates missing index or ORDER BY without a supporting index.

Look for thick arrows (many rows), high estimated cost %, red warnings (missing statistics, spills), and large differences between estimated vs actual rows.

05 What are transaction isolation levels in SQL Server?

Transactions

  • READ UNCOMMITTED โ€” reads dirty (uncommitted) data. Fastest, no shared locks. Use only for approximate reporting where accuracy is not critical.
  • READ COMMITTED (default) โ€” reads only committed data. Holds shared lock only while reading, releases immediately. Can see different data on repeated reads within a transaction (non-repeatable reads).
  • REPEATABLE READ โ€” holds shared locks for the duration of the transaction. Prevents non-repeatable reads but not phantom rows.
  • SERIALIZABLE โ€” range locks prevent phantom rows. Strictest, most blocking.
  • SNAPSHOT โ€” row versioning; readers never block writers. Each statement sees a snapshot of committed data as of the statement start. Requires ALLOW_SNAPSHOT_ISOLATION ON.
  • READ COMMITTED SNAPSHOT (RCSI) โ€” row versioning applied to READ COMMITTED; the default for Azure SQL. Eliminates most blocking without changing application code.
-- Set for session
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

-- Set for a single query (hint)
SELECT * FROM Orders WITH (NOLOCK);  -- equivalent to READ UNCOMMITTED for this table
-- NOLOCK can return dirty reads, duplicate rows, missing rows โ€” use with care

-- Enable RCSI for a database (eliminates read/write blocking)
ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON;

-- Check current isolation level
SELECT transaction_isolation_level
FROM sys.dm_exec_sessions
WHERE session_id = @@SPID;

06 What are triggers in SQL Server? What are the types?

Objects

  • DML Triggers โ€” fire on INSERT, UPDATE, DELETE. AFTER (most common) or INSTEAD OF. Have access to INSERTED and DELETED pseudo-tables containing the new and old row values.
  • DDL Triggers โ€” fire on CREATE, ALTER, DROP, and other schema-change events. Used for auditing or preventing schema changes.
  • Logon Triggers โ€” fire when a user session is established. Used for connection auditing or restrictions.
-- AFTER trigger: maintain audit log
CREATE OR ALTER TRIGGER trg_Orders_Audit
ON Orders
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- INSERTED holds new rows; DELETED holds old rows
    -- For INSERT:  INSERTED has new rows; DELETED is empty
    -- For DELETE:  DELETED has old rows; INSERTED is empty
    -- For UPDATE:  INSERTED has new; DELETED has old

    INSERT INTO OrdersAudit (OrderId, Action, OldStatus, NewStatus, ChangedAt, ChangedBy)
    SELECT
        COALESCE(i.Id, d.Id),
        CASE WHEN d.Id IS NULL THEN 'INSERT'
             WHEN i.Id IS NULL THEN 'DELETE'
             ELSE 'UPDATE' END,
        d.Status,    -- old status (NULL for INSERT)
        i.Status,    -- new status (NULL for DELETE)
        SYSDATETIME(),
        SUSER_NAME()
    FROM inserted i
    FULL OUTER JOIN deleted d ON d.Id = i.Id;
END;
GO

-- INSTEAD OF trigger: handle modifications to a view
CREATE TRIGGER trg_vw_Product_InsteadOfInsert
ON vw_ProductWithCategory
INSTEAD OF INSERT
AS
BEGIN
    INSERT INTO Products (Name, Price, CategoryId)
    SELECT Name, Price, CategoryId FROM inserted;
END;

07 What is table partitioning in SQL Server?

Performance Partitioning divides a large table into smaller physical segments called partitions, while maintaining a single logical table. Queries against specific date ranges scan only relevant partitions (partition elimination), dramatically improving performance on large tables.

-- Step 1: Create a partition function (defines the boundary values)
CREATE PARTITION FUNCTION pf_OrdersByYear (DATE)
AS RANGE RIGHT FOR VALUES
    ('2023-01-01', '2024-01-01', '2025-01-01', '2026-01-01');
-- Creates 5 partitions:
-- 1: < 2023-01-01
-- 2: 2023-01-01 to 2023-12-31
-- 3: 2024-01-01 to 2024-12-31
-- 4: 2025-01-01 to 2025-12-31
-- 5: >= 2026-01-01

-- Step 2: Create a partition scheme (maps partitions to filegroups)
CREATE PARTITION SCHEME ps_OrdersByYear
AS PARTITION pf_OrdersByYear
ALL TO ([PRIMARY]);   -- all partitions on PRIMARY filegroup
-- Production: spread across multiple filegroups for I/O parallelism

-- Step 3: Create a partitioned table
CREATE TABLE Orders (
    Id          INT           NOT NULL,
    OrderDate   DATE          NOT NULL,
    CustomerId  INT           NOT NULL,
    Total       DECIMAL(12,2) NOT NULL
) ON ps_OrdersByYear (OrderDate);  -- partition by OrderDate

-- Create aligned clustered index
CREATE CLUSTERED INDEX IX_Orders_Date ON Orders (OrderDate, Id)
ON ps_OrdersByYear (OrderDate);

-- Check which partition a row lands in
SELECT $PARTITION.pf_OrdersByYear('2025-06-15');  -- returns 4

-- Partition switching (instant โ€” metadata operation)
ALTER TABLE Orders SWITCH PARTITION 2 TO OrdersArchive;  -- move 2023 partition out
ALTER TABLE Orders2027 SWITCH TO Orders PARTITION 6;     -- slide in new partition

08 What are temporal tables (system-versioned tables) in SQL Server?

SQL 2016+ Temporal tables (SQL Server 2016+) automatically maintain a complete history of all row changes. SQL Server inserts the old row version into a history table with the time range it was valid, enabling point-in-time queries.

-- Create a temporal table
CREATE TABLE Products (
    Id          INT           IDENTITY PRIMARY KEY,
    Name        NVARCHAR(200) NOT NULL,
    Price       DECIMAL(10,2) NOT NULL,

    -- Required columns for system versioning
    ValidFrom   DATETIME2     GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo     DATETIME2     GENERATED ALWAYS AS ROW END   NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.ProductsHistory));

-- Normal DML โ€” history is maintained automatically
UPDATE Products SET Price = 29.99 WHERE Id = 1;  -- old row saved to history
DELETE FROM Products WHERE Id = 2;               -- row moved to history

-- Query current state
SELECT * FROM Products;

-- Query the state at a specific point in time
SELECT * FROM Products
FOR SYSTEM_TIME AS OF '2025-06-01 12:00:00';

-- Query a time range
SELECT * FROM Products
FOR SYSTEM_TIME BETWEEN '2025-01-01' AND '2025-12-31';

-- See full history
SELECT * FROM Products
FOR SYSTEM_TIME ALL
WHERE Id = 1
ORDER BY ValidFrom;

-- Disable versioning (to alter schema or drop)
ALTER TABLE Products SET (SYSTEM_VERSIONING = OFF);
ALTER TABLE Products ADD Column1 INT NULL;
ALTER TABLE Products SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.ProductsHistory));

09 What is a covering index and how does it eliminate key lookups?

Performance

-- Query that causes a Key Lookup:
SELECT OrderDate, Total, Status
FROM Orders
WHERE CustomerId = 42;

-- Index on CustomerId โ€” SQL Server finds matching rows but must fetch
-- OrderDate, Total, Status from the clustered index (KEY LOOKUP per row)
CREATE INDEX IX_Orders_CustomerId ON Orders (CustomerId);

-- EXECUTION PLAN shows:
-- Index Seek (IX_Orders_CustomerId) + Key Lookup (clustered)
-- Key Lookup is expensive for large result sets

-- COVERING INDEX โ€” include extra columns so no Key Lookup is needed
CREATE INDEX IX_Orders_Customer_Covering
ON Orders (CustomerId)
INCLUDE (OrderDate, Total, Status);
-- INCLUDE = columns stored at leaf level only, not part of index key
-- INCLUDE is better than adding columns to the key when they are only in SELECT

-- Now the query: Index Seek only โ€” no Key Lookup!

-- When to add columns to key vs INCLUDE:
-- Key columns: used in WHERE, JOIN, ORDER BY (need sortability)
-- INCLUDE columns: only in SELECT (just need the value, no sorting)

-- Missing index suggestions in SQL Server
SELECT * FROM sys.dm_db_missing_index_details;
SELECT * FROM sys.dm_db_missing_index_groups;
SELECT * FROM sys.dm_db_missing_index_group_stats;
-- Shows estimated impact and suggested index definitions

10 What is the difference between a function and a stored procedure in SQL Server?

Objects

  • Scalar function โ€” returns a single value; can be used in SELECT, WHERE, expressions. Cannot use non-deterministic functions (GETDATE, NEWID) in some contexts. Historically caused row-by-row execution when used in WHERE (avoid for performance).
  • Table-valued function (TVF) โ€” returns a table; can be used in FROM clause. Inline TVF (single SELECT) is often optimised well by the query optimiser.
  • Stored procedure โ€” can return multiple result sets, use OUTPUT parameters, modify data (INSERT/UPDATE/DELETE), manage transactions, use EXEC, return error codes. Cannot be used inside a SELECT.
-- Scalar function
CREATE OR ALTER FUNCTION dbo.fn_CalculateTax (@Price DECIMAL(10,2), @Rate FLOAT = 0.2)
RETURNS DECIMAL(10,2)
AS BEGIN
    RETURN ROUND(@Price * @Rate, 2);
END;

SELECT Name, Price, dbo.fn_CalculateTax(Price) AS Tax FROM Products;

-- Inline table-valued function (best performance โ€” treated like a view)
CREATE OR ALTER FUNCTION dbo.fn_GetOrdersByCustomer (@CustomerId INT)
RETURNS TABLE AS RETURN (
    SELECT o.Id, o.OrderDate, o.Total
    FROM Orders o
    WHERE o.CustomerId = @CustomerId
);

SELECT * FROM dbo.fn_GetOrdersByCustomer(42);

-- Multi-statement TVF (slower โ€” materialised temp table internally)
CREATE OR ALTER FUNCTION dbo.fn_GetOrderStats (@CustomerId INT)
RETURNS @result TABLE (Metric NVARCHAR(50), Value DECIMAL(12,2))
AS BEGIN
    INSERT INTO @result VALUES ('TotalOrders',  (SELECT COUNT(*) FROM Orders WHERE CustomerId = @CustomerId));
    INSERT INTO @result VALUES ('TotalRevenue', (SELECT SUM(Total) FROM Orders WHERE CustomerId = @CustomerId));
    RETURN;
END;

11 What is PIVOT and UNPIVOT in SQL Server?

SQL PIVOT transforms rows into columns. UNPIVOT transforms columns into rows.

-- Source data:
-- Year | Quarter | Sales
-- 2025 | Q1      | 100
-- 2025 | Q2      | 200
-- 2025 | Q3      | 150
-- 2025 | Q4      | 300

-- PIVOT โ€” turn Quarter rows into columns
SELECT Year, Q1, Q2, Q3, Q4
FROM (
    SELECT Year, Quarter, Sales FROM QuarterlySales
) AS Source
PIVOT (
    SUM(Sales) FOR Quarter IN ([Q1], [Q2], [Q3], [Q4])
) AS PivotTable;

-- Result:
-- Year | Q1  | Q2  | Q3  | Q4
-- 2025 | 100 | 200 | 150 | 300

-- Dynamic PIVOT (when column values are unknown at design time)
DECLARE @cols  NVARCHAR(MAX) = '';
DECLARE @query NVARCHAR(MAX);

SELECT @cols += QUOTENAME(Quarter) + ','
FROM (SELECT DISTINCT Quarter FROM QuarterlySales) q ORDER BY Quarter;
SET @cols = LEFT(@cols, LEN(@cols) - 1);

SET @query = N'
SELECT Year, ' + @cols + '
FROM (SELECT Year, Quarter, Sales FROM QuarterlySales) src
PIVOT (SUM(Sales) FOR Quarter IN (' + @cols + ')) pvt';

EXEC sp_executesql @query;

-- UNPIVOT โ€” turn columns into rows
SELECT Year, Quarter, Sales
FROM PivotedSales
UNPIVOT (Sales FOR Quarter IN ([Q1], [Q2], [Q3], [Q4])) AS unpvt;

12 What are indexes โ€” filtered, columnstore, and full-text?

Indexes

-- FILTERED INDEX โ€” index only rows matching a condition
-- Smaller, faster for selective queries
CREATE INDEX IX_Orders_Pending
ON Orders (CreatedAt, CustomerId)
WHERE Status = 'Pending';
-- Used only when query predicate matches the filter condition

CREATE UNIQUE INDEX UQ_Users_Email_Active
ON Users (Email)
WHERE IsActive = 1;  -- enforce uniqueness only for active users

-- COLUMNSTORE INDEX โ€” columnar storage for analytics/aggregations
-- Excellent for: data warehouses, large aggregations, GROUP BY on millions of rows

-- Non-clustered columnstore (add analytics to OLTP table)
CREATE NONCLUSTERED COLUMNSTORE INDEX IX_Orders_CS
ON Orders (OrderDate, CustomerId, Total, Status);

-- Clustered columnstore (DW table โ€” replaces row store entirely)
CREATE CLUSTERED COLUMNSTORE INDEX IX_FactSales_CS
ON FactSales;  -- entire table is stored column-by-column

-- FULL-TEXT INDEX โ€” text search beyond LIKE
CREATE FULLTEXT CATALOG ft_catalog AS DEFAULT;

CREATE FULLTEXT INDEX ON Articles (Title, Body)
KEY INDEX PK_Articles;

-- Full-text queries
SELECT * FROM Articles WHERE CONTAINS(Body, 'SQL Server AND performance');
SELECT * FROM Articles WHERE FREETEXT(Body, 'fast query execution');
SELECT * FROM Articles WHERE CONTAINSTABLE(Body, '"SQL Server"', 10)  -- top 10 ranked

13 What are common SQL Server performance troubleshooting DMVs?

Performance Dynamic Management Views (DMVs) are system views that expose the SQL Server internal state for real-time diagnostics.

-- Top 10 most expensive cached query plans by total CPU
SELECT TOP 10
    total_worker_time / execution_count  AS avg_cpu_us,
    total_elapsed_time / execution_count AS avg_elapsed_us,
    execution_count,
    total_logical_reads / execution_count AS avg_logical_reads,
    SUBSTRING(st.text, (qs.statement_start_offset/2)+1,
        ((CASE qs.statement_end_offset WHEN -1 THEN DATALENGTH(st.text)
          ELSE qs.statement_end_offset END - qs.statement_start_offset)/2)+1) AS query_text
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
ORDER BY avg_cpu_us DESC;

-- Current blocking chains
SELECT
    blocking_session_id,
    session_id,
    wait_type,
    wait_time / 1000.0 AS wait_secs,
    SUBSTRING(st.text, (r.statement_start_offset/2)+1, 100) AS query
FROM sys.dm_exec_requests r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) st
WHERE blocking_session_id > 0;

-- Missing index recommendations
SELECT TOP 10
    mig.avg_total_user_cost * mig.avg_user_impact * (mig.user_seeks + mig.user_scans) AS improvement_measure,
    mid.statement AS table_name,
    mid.equality_columns, mid.inequality_columns, mid.included_columns
FROM sys.dm_db_missing_index_groups mig
JOIN sys.dm_db_missing_index_group_stats migs ON migs.group_handle = mig.index_group_handle
JOIN sys.dm_db_missing_index_details    mid  ON mid.index_handle  = mig.index_handle
ORDER BY improvement_measure DESC;

-- Index usage (find unused indexes โ€” costly writes, no reads)
SELECT OBJECT_NAME(i.object_id) AS TableName, i.name AS IndexName,
       ius.user_seeks, ius.user_scans, ius.user_lookups, ius.user_updates
FROM sys.indexes i
LEFT JOIN sys.dm_db_index_usage_stats ius ON ius.object_id = i.object_id AND ius.index_id = i.index_id
WHERE i.type > 0  -- skip heaps
ORDER BY ius.user_updates DESC;

14 What is the difference between INNER JOIN and EXISTS in SQL Server?

SQL

-- Find customers who have placed at least one order

-- INNER JOIN (may return duplicates if customer has multiple orders)
SELECT DISTINCT c.Id, c.Name
FROM Customers c
INNER JOIN Orders o ON o.CustomerId = c.Id;
-- DISTINCT needed โ€” customer with 5 orders appears 5 times without it
-- Can be slower due to join + deduplication

-- EXISTS (stops at first match, no duplicates, often faster)
SELECT c.Id, c.Name
FROM Customers c
WHERE EXISTS (SELECT 1 FROM Orders o WHERE o.CustomerId = c.Id);
-- Correlated โ€” checks existence only, stops at first row found
-- No deduplication needed, clearer intent

-- NOT EXISTS (anti-join โ€” customers with NO orders)
SELECT c.Id, c.Name
FROM Customers c
WHERE NOT EXISTS (SELECT 1 FROM Orders o WHERE o.CustomerId = c.Id);

-- NOT IN vs NOT EXISTS โ€” critical difference with NULLs!
-- NOT IN fails silently if the subquery returns any NULL
SELECT Name FROM Customers
WHERE Id NOT IN (SELECT CustomerId FROM Orders);
-- If ANY Orders.CustomerId is NULL, this returns 0 rows!

-- NOT EXISTS handles NULLs correctly โ€” PREFER this over NOT IN
SELECT Name FROM Customers c
WHERE NOT EXISTS (SELECT 1 FROM Orders o WHERE o.CustomerId = c.Id);

15 What is parameter sniffing in SQL Server and how do you deal with it?

Performance Parameter sniffing is the process by which SQL Server compiles a stored procedure’s execution plan based on the parameter values used the FIRST time it runs. The plan is then cached and reused for all subsequent calls โ€” even if those calls have very different parameters that would benefit from a different plan.

-- Problem scenario:
-- usp_GetOrders compiled with @CustomerId = 1 (1 order โ†’ Index Seek plan cached)
-- Next call with @CustomerId = 9999 (10,000 orders) uses the same Index Seek plan
-- Should use Table Scan for 10,000 rows but gets sub-optimal Index Seek plan

-- Solution 1: OPTION (RECOMPILE) โ€” recompile per execution (small overhead, safest)
CREATE PROCEDURE usp_GetOrders @CustomerId INT AS
BEGIN
    SELECT * FROM Orders WHERE CustomerId = @CustomerId
    OPTION (RECOMPILE);  -- fresh plan every time using actual parameter value
END;

-- Solution 2: WITH RECOMPILE on the procedure โ€” recompile every call
CREATE PROCEDURE usp_GetOrders @CustomerId INT WITH RECOMPILE AS ...

-- Solution 3: Optimise for unknown โ€” compile for average statistics (not specific value)
CREATE PROCEDURE usp_GetOrders @CustomerId INT AS
BEGIN
    SELECT * FROM Orders WHERE CustomerId = @CustomerId
    OPTION (OPTIMIZE FOR (@CustomerId UNKNOWN));
END;

-- Solution 4: Local variable (breaks sniffing โ€” no parameter statistics used)
CREATE PROCEDURE usp_GetOrders @CustomerId INT AS
BEGIN
    DECLARE @LocalId INT = @CustomerId;
    SELECT * FROM Orders WHERE CustomerId = @LocalId;
END;

-- Diagnose: check cached plans
SELECT qs.plan_generation_num, qs.execution_count, qp.query_plan
FROM sys.dm_exec_procedure_stats ps
CROSS APPLY sys.dm_exec_query_plan(ps.plan_handle) qp
WHERE OBJECT_NAME(ps.object_id) = 'usp_GetOrders';

16 What is APPLY (CROSS APPLY and OUTER APPLY) in SQL Server?

SQL APPLY invokes a table-valued function (or subquery) for each row of the outer table. It can reference columns from the outer table โ€” similar to a correlated subquery but in the FROM clause.

-- CROSS APPLY โ€” like INNER JOIN; excludes rows where the function returns no rows
-- Get the 3 most recent orders for each customer
SELECT c.Name, recent.OrderDate, recent.Total
FROM Customers c
CROSS APPLY (
    SELECT TOP 3 OrderDate, Total
    FROM Orders o
    WHERE o.CustomerId = c.Id       -- references outer c
    ORDER BY OrderDate DESC
) AS recent;

-- OUTER APPLY โ€” like LEFT JOIN; includes rows even if function returns nothing
SELECT c.Name, recent.OrderDate, recent.Total
FROM Customers c
OUTER APPLY (
    SELECT TOP 3 OrderDate, Total
    FROM Orders o
    WHERE o.CustomerId = c.Id
    ORDER BY OrderDate DESC
) AS recent;
-- Customers with no orders still appear with NULL recent.OrderDate

-- APPLY with a table-valued function
SELECT c.Name, stats.TotalOrders, stats.TotalRevenue
FROM Customers c
CROSS APPLY dbo.fn_GetCustomerStats(c.Id) AS stats;

-- Parse delimited string (each customer has a Tags column like 'VIP,Active,Gold')
SELECT c.Name, t.Tag
FROM Customers c
CROSS APPLY STRING_SPLIT(c.Tags, ',') AS t;  -- STRING_SPLIT is a TVF

17 What is JSON support in SQL Server?

JSON

-- JSON is stored as NVARCHAR โ€” not a native type (unlike PostgreSQL's JSONB)
-- SQL Server 2016+ added JSON functions

-- Read JSON data
DECLARE @json NVARCHAR(MAX) = '{"name":"Alice","age":30,"orders":[{"id":1,"total":99.99}]}';

SELECT JSON_VALUE(@json, '$.name')               -- 'Alice'
SELECT JSON_VALUE(@json, '$.orders[0].total')    -- '99.99'
SELECT JSON_QUERY(@json, '$.orders')             -- '[{"id":1,"total":99.99}]'

-- Parse JSON array into rows
SELECT *
FROM OPENJSON(@json, '$.orders')
WITH (
    Id    INT            '$.id',
    Total DECIMAL(10,2)  '$.total'
);

-- Modify JSON
SET @json = JSON_MODIFY(@json, '$.age', 31);
SET @json = JSON_MODIFY(@json, '$.email', 'alice@example.com');
SET @json = JSON_MODIFY(@json, '$.orders[0].status', 'shipped');

-- Validate JSON
SELECT ISJSON(@json);   -- 1 if valid, 0 if not

-- Convert rows to JSON
SELECT Id, Name, Email
FROM Customers
FOR JSON PATH, ROOT('customers');
-- {"customers":[{"Id":1,"Name":"Alice","Email":"..."},...]}

FOR JSON AUTO;   -- simpler, less control over structure

-- SQL Server 2022 โ€” new JSON functions
SELECT JSON_PATH_EXISTS('{"a":1}', '$.a');     -- 1 (exists)
SELECT JSON_OBJECT('name': 'Alice', 'age': 30);  -- '{"name":"Alice","age":30}'
SELECT JSON_ARRAY(1, 'two', NULL, true);          -- '[1,"two",null,true]'

18 How do you implement pagination in SQL Server?

SQL

-- Modern pagination: OFFSET-FETCH (SQL Server 2012+, ANSI standard)
-- GET /api/products?page=3&size=20

DECLARE @page INT = 3;
DECLARE @size INT = 20;

SELECT Id, Name, Price
FROM Products
ORDER BY Name ASC                          -- ORDER BY is REQUIRED with OFFSET
OFFSET  (@page - 1) * @size ROWS          -- skip first 40 rows
FETCH NEXT @size ROWS ONLY;               -- return next 20

-- With total count (for UI pagination)
SELECT
    Id, Name, Price,
    COUNT(*) OVER () AS TotalCount        -- window function โ€” no extra query
FROM Products
WHERE Category = 'Electronics'
ORDER BY Name
OFFSET  (@page - 1) * @size ROWS
FETCH NEXT @size ROWS ONLY;

-- Legacy pagination with ROW_NUMBER() (pre-2012 or when OFFSET is not available)
WITH NumberedRows AS (
    SELECT *, ROW_NUMBER() OVER (ORDER BY Name) AS RowNum
    FROM Products
)
SELECT * FROM NumberedRows
WHERE RowNum BETWEEN (@page - 1) * @size + 1 AND @page * @size;

-- Performance: pagination on large tables requires an index on the ORDER BY column
-- OFFSET is O(n) โ€” it must skip rows; keyset pagination is O(1):

-- Keyset pagination (cursor-based) โ€” more efficient for large datasets
-- First page:
SELECT TOP 20 Id, Name, Price FROM Products ORDER BY Name, Id;

-- Next page (use last seen Name and Id from previous page):
SELECT TOP 20 Id, Name, Price FROM Products
WHERE (Name > @LastName OR (Name = @LastName AND Id > @LastId))
ORDER BY Name, Id;

19 What is READ_COMMITTED_SNAPSHOT ISOLATION (RCSI) and why is it important?

Concurrency

Under the default READ COMMITTED isolation level, readers take shared locks and writers take exclusive locks โ€” readers and writers block each other. RCSI replaces shared locks with row versioning: writers store old row versions in tempdb, and readers retrieve the committed version without any locking.

-- Enable RCSI (requires brief exclusive access)
ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON;

-- Verify
SELECT name, is_read_committed_snapshot_on
FROM sys.databases
WHERE name = 'MyDatabase';

-- EFFECT: Under RCSI, a SELECT never blocks an INSERT/UPDATE/DELETE
-- and an INSERT/UPDATE/DELETE never blocks a SELECT
-- This eliminates most blocking in typical OLTP workloads

-- Trade-offs:
-- tempdb usage increases (stores row versions)
-- VERSION_STORE in tempdb must be monitored
-- SELECT may read slightly stale data (last committed version)

-- Azure SQL Database uses RCSI by default (and cannot be turned off)

-- Monitor version store size
SELECT *
FROM sys.dm_db_session_space_usage
WHERE session_id = @@SPID;

SELECT
    reserved_MB  = SUM(reserved_page_count) * 8.0 / 1024,
    used_MB      = SUM(used_page_count) * 8.0 / 1024
FROM sys.dm_db_file_space_usage;

20 What is the difference between VARCHAR(MAX) and TEXT in SQL Server?

Data Types

  • TEXT โ€” legacy large object type (pre-SQL Server 2005). Stores up to 2 GB. Cannot be used in string functions, WHERE clauses with LIKE (limited), variable assignments, or expressions without conversion. Stored off-row always.
  • VARCHAR(MAX) โ€” modern replacement (SQL Server 2005+). Also up to 2 GB but behaves like a regular VARCHAR. Works in all string functions, WHERE, assignments, CAST/CONVERT. Stored in-row when โ‰ค 8,000 bytes; spills to off-row LOB pages only when larger.
  • NVARCHAR(MAX) โ€” Unicode equivalent of VARCHAR(MAX). Use for all user content that may contain international characters.
-- โŒ Legacy TEXT (deprecated โ€” avoid completely)
CREATE TABLE Articles (
    Id   INT PRIMARY KEY,
    Body TEXT   -- deprecated since SQL Server 2005
);
-- Cannot use: LEN(Body), CHARINDEX(..., Body), Body = @var (assignment)
-- Must use READTEXT / WRITETEXT / UPDATETEXT

-- โœ… Modern VARCHAR(MAX) / NVARCHAR(MAX)
CREATE TABLE Articles (
    Id   INT PRIMARY KEY,
    Body NVARCHAR(MAX)   -- up to 2 GB Unicode, works with all functions
);

-- Works normally
SELECT LEN(Body) AS Length FROM Articles;
DECLARE @body NVARCHAR(MAX) = (SELECT Body FROM Articles WHERE Id = 1);
UPDATE Articles SET Body = REPLACE(Body, 'oldword', 'newword') WHERE Id = 1;

-- Migrate TEXT to VARCHAR(MAX)
ALTER TABLE Articles ALTER COLUMN Body NVARCHAR(MAX);
-- May require DROP and recreate if TEXT column has dependencies

21 What is the FOR XML clause in SQL Server?

XML

-- FOR XML RAW โ€” each row becomes an element with column attributes
SELECT Id, Name, Price FROM Products FOR XML RAW;
-- <row Id="1" Name="Widget" Price="9.99"/>
-- <row Id="2" Name="Gadget" Price="24.99"/>

-- FOR XML AUTO โ€” table names become elements
SELECT p.Name, c.Name AS Category
FROM Products p
INNER JOIN Categories c ON c.Id = p.CategoryId
FOR XML AUTO;

-- FOR XML PATH โ€” most flexible; use XPath-like column aliases
SELECT
    Id          AS '@id',        -- attribute
    Name        AS 'name',       -- child element
    Price       AS 'price',
    (SELECT Id AS '@id', Name AS 'name'
     FROM Categories WHERE Id = p.CategoryId
     FOR XML PATH('category'), TYPE)
FROM Products p
FOR XML PATH('product'), ROOT('products');

-- Result:
-- <products>
--   <product id="1">
--     <name>Widget</name>
--     <price>9.99</price>
--     <category id="3"><name>Electronics</name></category>
--   </product>
-- </products>

-- Common use: aggregate strings without STRING_AGG (pre-2017)
SELECT
    DepartmentId,
    STUFF(
        (SELECT ', ' + Name FROM Employees e2
         WHERE e2.DepartmentId = e.DepartmentId
         FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'),
        1, 2, ''
    ) AS EmployeeNames
FROM Employees e
GROUP BY DepartmentId;

📝 Knowledge Check

Test your understanding of advanced SQL Server patterns and features.

🧠 Quiz Question 1 of 5

What is the key advantage of a covering index over a non-clustered index that only includes key columns?





🧠 Quiz Question 2 of 5

What is parameter sniffing in SQL Server and why can it cause performance problems?





🧠 Quiz Question 3 of 5

Why should you prefer NOT EXISTS over NOT IN when checking for missing related rows?





🧠 Quiz Question 4 of 5

What does READ_COMMITTED_SNAPSHOT ISOLATION (RCSI) eliminate in SQL Server?





🧠 Quiz Question 5 of 5

What is the difference between CROSS APPLY and OUTER APPLY in SQL Server?