SQL Server Monitoring — DMVs, Extended Events and Performance Baselines

📋 Table of Contents
  1. Essential DMV Queries
  2. Common Mistakes

SQL Server exposes hundreds of Dynamic Management Views (DMVs) — system views that reveal the current state of the server: running queries, wait statistics, index usage, I/O performance, and memory pressure. DMVs require no additional tools — they are queryable with T-SQL. A handful of key DMVs cover 90% of performance troubleshooting needs: what is running now, what is waiting, which indexes are and are not being used, and whether the server has enough memory.

Essential DMV Queries

-- ── Active queries with their SQL text and wait info ─────────────────────
SELECT
    r.session_id,
    r.status,
    r.wait_type,
    r.wait_time / 1000.0      AS WaitSeconds,
    r.cpu_time / 1000.0       AS CpuSeconds,
    r.total_elapsed_time / 1000.0 AS ElapsedSeconds,
    r.logical_reads,
    r.blocking_session_id,
    SUBSTRING(t.text, (r.statement_start_offset/2)+1, 200) AS CurrentSQL,
    p.query_plan                      -- XML execution plan
FROM   sys.dm_exec_requests r
CROSS  APPLY sys.dm_exec_sql_text(r.sql_handle)  t
CROSS  APPLY sys.dm_exec_query_plan(r.plan_handle) p
WHERE  r.session_id > 50            -- exclude system sessions
  AND  r.session_id <> @@SPID       -- exclude this query
ORDER  BY r.total_elapsed_time DESC;

-- ── Wait statistics — what is SQL Server waiting for? ────────────────────
SELECT TOP 20
    wait_type,
    waiting_tasks_count,
    wait_time_ms / 1000.0          AS TotalWaitSec,
    max_wait_time_ms / 1000.0      AS MaxWaitSec,
    wait_time_ms / NULLIF(waiting_tasks_count, 0) AS AvgWaitMs
FROM   sys.dm_os_wait_stats
WHERE  wait_type NOT IN (   -- filter out benign waits
    'SLEEP_TASK', 'BROKER_TO_FLUSH', 'BROKER_TASK_STOP',
    'CLR_AUTO_EVENT', 'CLR_MANUAL_EVENT', 'DISPATCHER_QUEUE_SEMAPHORE',
    'FT_IFTS_SCHEDULER_IDLE_WAIT', 'HADR_FILESTREAM_IOMGR_IOCOMPLETION',
    'HADR_WORK_QUEUE', 'LAZYWRITER_SLEEP', 'LOGMGR_QUEUE',
    'ONDEMAND_TASK_QUEUE', 'REQUEST_FOR_DEADLOCK_SEARCH',
    'RESOURCE_QUEUE', 'SERVER_IDLE_CHECK', 'SLEEP_DBSTARTUP',
    'SLEEP_DCOMSTARTUP', 'SLEEP_MASTERDBREADY', 'SLEEP_MASTERMDREADY',
    'SLEEP_MASTERUPGRADED', 'SLEEP_MSDBSTARTUP', 'SLEEP_SYSTEMTASK',
    'SLEEP_TEMPDBSTARTUP', 'SNI_HTTP_ACCEPT', 'SP_SERVER_DIAGNOSTICS_SLEEP',
    'SQLTRACE_BUFFER_FLUSH', 'WAITFOR', 'XE_DISPATCHER_WAIT',
    'XE_TIMER_EVENT'
)
ORDER  BY wait_time_ms DESC;

-- ── Page Life Expectancy — memory pressure indicator ─────────────────────
-- PLE = seconds a page stays in buffer pool before being evicted
-- Rule of thumb: PLE should be > 300 seconds (5 minutes)
-- Low PLE (<300) means SQL Server is frequently evicting pages — needs more RAM
SELECT
    object_name,
    counter_name,
    cntr_value AS CurrentValue
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy'
  AND object_name LIKE '%Buffer Manager%';

-- ── Buffer pool cache hit ratio ───────────────────────────────────────────
-- > 95% is healthy; lower means excessive disk reads
SELECT
    (a.cntr_value * 1.0 / b.cntr_value) * 100 AS CacheHitRatioPct
FROM sys.dm_os_performance_counters a
JOIN sys.dm_os_performance_counters b
    ON b.counter_name = 'Buffer cache hit ratio base'
WHERE a.counter_name = 'Buffer cache hit ratio'
  AND a.object_name  LIKE '%Buffer Manager%';

-- ── I/O statistics per database file ─────────────────────────────────────
SELECT
    DB_NAME(vfs.database_id)  AS DatabaseName,
    mf.physical_name,
    vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS AvgReadMs,
    vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS AvgWriteMs,
    vfs.num_of_reads, vfs.num_of_writes
FROM sys.dm_io_virtual_file_stats(NULL, NULL) vfs
JOIN sys.master_files mf ON mf.database_id = vfs.database_id
                        AND mf.file_id      = vfs.file_id
ORDER BY (vfs.io_stall_read_ms + vfs.io_stall_write_ms) DESC;
Note: The wait statistics query is the starting point for any SQL Server performance investigation. The wait type tells you exactly what SQL Server is waiting for: PAGEIOLATCH_SH = reading pages from disk (I/O bottleneck or missing index), LCK_M_X = waiting for an exclusive lock (blocking), CXPACKET = parallel query thread coordination (normal for complex queries on multi-CPU systems), ASYNC_NETWORK_IO = client not consuming results fast enough. Match the wait type to the appropriate fix rather than guessing at performance problems.
Tip: Enable Query Store on the BlogApp database to automatically capture query performance history. Query Store tracks every query’s execution plan, duration, CPU, and I/O over time. When a query suddenly becomes slow (plan regression after an index change or statistics update), Query Store shows you both plans side-by-side and lets you force the old plan back. Enable it: ALTER DATABASE BlogApp SET QUERY_STORE = ON (OPERATION_MODE = READ_WRITE, QUERY_CAPTURE_MODE = AUTO).
Warning: DMV data accumulates since SQL Server last restarted — wait statistics and index usage stats are cumulative totals. If SQL Server was restarted last week, the stats represent one week of activity. If it was restarted this morning, they represent only a few hours. Always note the SQL Server start time (SELECT sqlserver_start_time FROM sys.dm_os_sys_info) when interpreting DMV statistics to understand the time window the data covers.

Common Mistakes

Mistake 1 — Ignoring wait statistics (guessing at performance problems)

❌ Wrong — adding indexes randomly when queries are slow; the real problem is LOCK waits (blocking), not missing indexes.

✅ Correct — start with wait statistics; identify the dominant wait type; address the root cause.

Mistake 2 — Not enabling Query Store (no visibility into plan regressions)

❌ Wrong — query suddenly slower after deployment; no history of what changed; cannot identify the regression.

✅ Correct — enable Query Store on all production databases; detect and force good plans when regressions occur.

🧠 Test Yourself

SQL Server’s Page Life Expectancy drops from 800 seconds to 45 seconds after deploying a new query. What does this indicate?