Node.js Diagnostics — perf_hooks, diagnostics_channel, and Heap Snapshots

Node.js’s diagnostic infrastructure provides deep visibility into application behaviour at runtime — without attaching a debugger or modifying production code. The perf_hooks module measures execution time with sub-millisecond precision and hooks into V8 GC events; async_hooks tracks the propagation of async context across all async operations (the foundation for AsyncLocalStorage); diagnostics_channel provides a publish-subscribe bus for structured instrumentation messages; and the --inspect flag opens the Chrome DevTools Protocol for interactive debugging, profiling, and heap analysis. This lesson builds a complete production diagnostic toolkit for the task manager API.

Diagnostic API Comparison

API | Measures | Overhead | Use For
perf_hooks.performance | High-resolution timestamps, user timings, GC events | Minimal | Route timing, operation benchmarking
perf_hooks.PerformanceObserver | GC duration, HTTP request duration, DNS lookup time | Minimal | Automated GC monitoring, Node.js built-in metrics
async_hooks | Async context creation, propagation, destruction | Medium (2–5% CPU) | Distributed tracing, AsyncLocalStorage internals
diagnostics_channel | Custom named channels — publish structured events | Negligible when no subscribers | Library instrumentation, APM integration
V8 Profiler API | CPU samples, heap snapshots programmatically | High during sampling | On-demand profiling triggered by conditions
Note: diagnostics_channel (stable since Node.js 19) is the correct way for libraries and frameworks to publish instrumentation data without coupling to any specific APM tool. Express, Mongoose, and undici (Node.js’s built-in HTTP client) publish events to diagnostic channels. APM tools (Datadog, New Relic, OpenTelemetry) subscribe to these channels to collect traces and metrics. You can subscribe to them directly for custom monitoring without installing an APM agent.
Tip: Use performance.timerify(fn) to wrap a function so that every call creates a PerformanceEntry observable with a PerformanceObserver. This is cleaner than manually calling performance.mark() and performance.measure() around each call site. For Express middleware timing, wrap the route handler: const timedHandler = performance.timerify(originalHandler). The observer receives entries with the function name and duration automatically.
Warning: async_hooks has a non-trivial performance overhead (2–5% CPU in tight async loops) because it fires callbacks on every async resource creation, destruction, and context switch. Do not enable it unconditionally in production. AsyncLocalStorage uses async_hooks internally but is optimised to minimise overhead. For production tracing, use diagnostics_channel (subscribe to existing channels) rather than raw async_hooks — the overhead is negligible when no consumer is subscribed.

Complete Diagnostic Toolkit

const {
    performance, PerformanceObserver,
    createHistogram,
} = require('perf_hooks');
const diagnostics_channel = require('diagnostics_channel');
const { AsyncResource }   = require('async_hooks');

// ── 1. High-resolution timing with performance marks ──────────────────────
async function timedDatabaseQuery(filter) {
    performance.mark('query-start');
    const result = await Task.find(filter).lean();
    performance.mark('query-end');

    const measure = performance.measure('task-query', 'query-start', 'query-end');
    console.log(`Query took ${measure.duration.toFixed(2)}ms`);

    // Clean up marks to avoid memory leak
    performance.clearMarks('query-start');
    performance.clearMarks('query-end');
    performance.clearMeasures('task-query');

    return result;
}

// ── 2. PerformanceObserver — GC monitoring ────────────────────────────────
const gcObserver = new PerformanceObserver(list => {
    for (const entry of list.getEntries()) {
        if (entry.detail?.kind === 2) {  // 2 = NODE_PERFORMANCE_GC_MAJOR (mark-sweep)
            console.warn(`Major GC: ${entry.duration.toFixed(2)}ms — potential pause`);
        }
    }
});

gcObserver.observe({ entryTypes: ['gc'] });

// ── 3. Histogram for percentile tracking ─────────────────────────────────
const requestHistogram = createHistogram();

function trackRequestDuration(durationMs) {
    // Record in nanoseconds — record() requires an integer >= 1
    requestHistogram.record(Math.max(1, Math.round(durationMs * 1e6)));
}

function getLatencyStats() {
    return {
        p50:  requestHistogram.percentile(50)  / 1e6,   // back to ms
        p95:  requestHistogram.percentile(95)  / 1e6,
        p99:  requestHistogram.percentile(99)  / 1e6,
        mean: requestHistogram.mean            / 1e6,
        max:  requestHistogram.max             / 1e6,
        count: requestHistogram.count,
    };
}

// ── 4. diagnostics_channel — custom instrumentation ───────────────────────
// Publisher side (e.g. in task service)
const taskChannel = diagnostics_channel.channel('taskmanager:task');

async function createTask(dto, userId) {
    const task = await Task.create({ ...dto, user: userId });

    // Publish only when someone is subscribed — zero overhead otherwise
    if (taskChannel.hasSubscribers) {
        taskChannel.publish({
            type:      'created',
            taskId:    task._id.toString(),
            userId,
            priority:  task.priority,
            timestamp: Date.now(),
        });
    }

    return task;
}

// Subscriber side (e.g. in metrics or APM module)
diagnostics_channel.subscribe('taskmanager:task', (message) => {
    if (message.type === 'created') {
        tasksCreatedCounter.inc({ priority: message.priority });
    }
});

// Subscribe to built-in Node.js channels
diagnostics_channel.subscribe('undici:request:create', ({ request }) => {
    console.log(`Outgoing HTTP: ${request.method} ${request.origin}${request.path}`);
});

// ── 5. On-demand heap snapshot triggered by memory threshold ───────────────
const v8   = require('v8');
const path = require('path');

let lastSnapshotTime = 0;
const SNAPSHOT_COOLDOWN_MS = 5 * 60 * 1000;  // max one snapshot per 5 minutes

function checkMemoryAndSnapshot() {
    const heapUsed  = process.memoryUsage().heapUsed;
    const heapLimit = v8.getHeapStatistics().heap_size_limit;
    const pct       = heapUsed / heapLimit;

    if (pct > 0.85 && Date.now() - lastSnapshotTime > SNAPSHOT_COOLDOWN_MS) {
        lastSnapshotTime = Date.now();
        const filename   = path.join('/tmp', `heap-${Date.now()}.heapsnapshot`);
        const snapshot   = v8.writeHeapSnapshot(filename);
        console.warn(`Heap at ${(pct * 100).toFixed(1)}% — snapshot written: ${snapshot}`);
    }
}

// unref() so this diagnostic timer doesn't keep the process alive on its own
setInterval(checkMemoryAndSnapshot, 30000).unref();

// ── 6. AsyncResource for context propagation ─────────────────────────────
// Propagates async context into callbacks that were created outside
// the current async context (e.g. event emitter listeners)
class TaskEventHandler extends AsyncResource {
    constructor(requestContext) {
        super('TaskEventHandler');
        this.requestContext = requestContext;
    }

    handleEvent(event) {
        // Runs in the async context that created this handler
        // AsyncLocalStorage values from the originating request are available
        this.runInAsyncScope(() => {
            processTaskEvent(event, this.requestContext);
        });
    }
}

How It Works

Step 1 — performance.mark() Has Sub-Millisecond Resolution

performance.now() returns a high-resolution timestamp in milliseconds with sub-millisecond precision (typically microsecond or nanosecond on modern hardware). performance.mark(name) records a named timestamp; performance.measure(name, start, end) computes the duration. The resulting PerformanceMeasure entry has duration in milliseconds. This is more accurate than Date.now() (millisecond resolution) for measuring sub-millisecond operations.

Step 2 — PerformanceObserver Receives Entries Asynchronously

Rather than polling for metrics, PerformanceObserver receives entries as a callback when new entries of the observed types arrive. GC entries (entryTypes: ['gc']) are posted after each garbage collection cycle with the duration. This enables monitoring GC pauses without any polling overhead. A major GC over 100ms will cause a visible event loop pause — logging these in production helps correlate latency spikes with GC activity.

Step 3 — diagnostics_channel Is Zero-Cost Without Subscribers

channel.hasSubscribers is a fast property check. If no one has subscribed to a channel, the entire publish path is a single boolean check — no message serialisation, no function calls. This makes it safe to add diagnostic channel publications in hot code paths in production — they contribute near-zero overhead until an APM agent or custom subscriber attaches.

Step 4 — Heap Snapshots Are On-Demand in V8

v8.writeHeapSnapshot(path) writes a JSON snapshot of the entire V8 heap to a file. This is the same format opened by Chrome DevTools Memory tab. Taking snapshots programmatically (triggered by a memory threshold or an HTTP endpoint) enables capturing heap state at the exact moment of a memory issue — far more useful than a snapshot taken after the fact. The cooldown prevents cascading: the snapshot itself is expensive and should not run when memory is already critical.

Step 5 — AsyncResource Preserves Context for Out-of-Band Callbacks

AsyncLocalStorage context is automatically propagated along the async chain of a request. But callbacks registered on event emitters or custom event buses outside the request’s async chain lose the context. AsyncResource.runInAsyncScope(fn) re-enters the async context that was active when the resource was created, making AsyncLocalStorage values (like requestId) available inside the callback even if it fires in a different async context.

Quick Reference

Task | Code
Time an operation | performance.mark('start'); ...; performance.measure('op', 'start')
Monitor GC | new PerformanceObserver(cb).observe({ entryTypes: ['gc'] })
Percentile histogram | const h = createHistogram(); h.record(ns); h.percentile(95)
Publish diagnostic event | diagnostics_channel.channel('name').publish(data)
Subscribe to channel | diagnostics_channel.subscribe('name', msg => ...)
Zero-cost check | if (channel.hasSubscribers) channel.publish(data)
Heap snapshot | v8.writeHeapSnapshot('/tmp/heap.heapsnapshot')
Preserve async context | new AsyncResource('name').runInAsyncScope(fn)

🧠 Test Yourself

A library publishes events via diagnostics_channel.channel('mylib:op').publish(data) in a tight loop. In production with no subscribers, what is the performance impact?