Caching Best Practices — TTL, Stampedes and Consistency

Production caching requires more than just adding cache calls — it requires thoughtful TTL design, protection against cache stampedes, graceful degradation when the cache is down, and continuous measurement of cache effectiveness. These best practices are what separate a fragile cache that causes intermittent 500 errors from a robust caching layer that reliably delivers 10× performance improvement under load.

Cache Stampede Prevention

// ── Cache stampede: when a hot key expires, many requests hit DB simultaneously ──
// Solution 1: SemaphoreSlim — only one request rebuilds, others wait

public class StampedeProtectedCacheService(
    IMemoryCache cache,
    IPostRepository repo)
{
    private readonly SemaphoreSlim _lock = new(1, 1);

    public async Task<PostDto?> GetByIdAsync(int id, CancellationToken ct)
    {
        var key = $"post:{id}";

        if (cache.TryGetValue(key, out PostDto? cached))
            return cached;

        await _lock.WaitAsync(ct);
        try
        {
            // Double-check after acquiring lock (another thread may have populated)
            if (cache.TryGetValue(key, out cached))
                return cached;

            var post = await repo.GetByIdAsync(id, ct);
            var dto  = post?.ToDto();

            if (dto is not null)
                cache.Set(key, dto, TimeSpan.FromMinutes(10));

            return dto;
        }
        finally
        {
            _lock.Release();
        }
    }
}
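
// A limitation of the single SemaphoreSlim above: it serializes rebuilds for
// every key, so a slow rebuild of "post:1" also blocks "post:2". A common
// refinement (a sketch, not part of the original service — the
// ConcurrentDictionary-based locking here is an illustrative assumption)
// is one semaphore per key:

// Per-key locks; note these are never removed, so in a real service you'd
// bound the dictionary or evict idle locks.
private static readonly ConcurrentDictionary<string, SemaphoreSlim> _locks = new();

private static SemaphoreSlim GetLock(string key) =>
    _locks.GetOrAdd(key, _ => new SemaphoreSlim(1, 1));

// Usage inside GetByIdAsync, replacing the shared _lock:
//   var keyLock = GetLock(key);
//   await keyLock.WaitAsync(ct);
//   try { /* double-check cache, then rebuild */ }
//   finally { keyLock.Release(); }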

// ── Solution 2: Probabilistic early expiration ─────────────────────────────
// Instead of waiting for expiry, probabilistically recompute slightly before
// expiry so refreshes spread out over time. This prevents the sharp cliff
// where all simultaneous users hit a miss at once. Note the refresh happens
// inline on whichever request draws the short straw, not in the background.
// (Assumes the same `cache` field as Solution 1.)
public async Task<T?> GetWithEarlyRefreshAsync<T>(
    string key, Func<Task<T?>> factory,
    TimeSpan ttl, CancellationToken ct)
{
    if (cache.TryGetValue(key, out (T? Value, DateTime ExpiresAt) entry))
    {
        var remaining = (entry.ExpiresAt - DateTime.UtcNow).TotalSeconds;
        var beta      = 1.0;  // tune: higher = refresh earlier
        // Simplified probabilistic test: refresh when the remaining TTL
        // (seconds) drops below -beta * ln(random). The full "XFetch"
        // algorithm also scales this by the measured recompute cost.
        var shouldRefresh = remaining < -beta * Math.Log(Random.Shared.NextDouble());

        if (!shouldRefresh) return entry.Value;
    }

    var freshValue = await factory();
    cache.Set(key, (freshValue, DateTime.UtcNow.Add(ttl)), ttl);
    return freshValue;
}

Note: The double-check after acquiring the semaphore is critical: always re-check the cache before rebuilding. The thread that held the lock rebuilds the entry, so every thread queued behind it finds the cache already populated on its second check and returns immediately. Without that re-check, each queued thread would rebuild the entry in turn, defeating the stampede prevention — the second check is what guarantees only the first thread does the database work.
Tip: Measure cache effectiveness with a hit ratio metric: cacheHits / (cacheHits + cacheMisses). A well-tuned cache should achieve an 80%+ hit ratio for frequently accessed data. If the ratio drops significantly, it usually means one of three things: the TTL is too short (data expires before being reused), cache keys are too granular (so many unique keys that entries are rarely reused), or data access patterns have changed. Expose the hit ratio as a metrics endpoint or push it to Application Insights and alert when it drops below a threshold.
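
The hit ratio from the tip above can be tracked with two atomic counters. This is a minimal sketch — the CacheMetrics class name is a hypothetical helper, not part of the services shown:

public class CacheMetrics
{
    private long _hits, _misses;

    public void RecordHit()  => Interlocked.Increment(ref _hits);
    public void RecordMiss() => Interlocked.Increment(ref _misses);

    // cacheHits / (cacheHits + cacheMisses); 0 until anything is recorded
    public double HitRatio
    {
        get
        {
            var hits  = Interlocked.Read(ref _hits);
            var total = hits + Interlocked.Read(ref _misses);
            return total == 0 ? 0 : (double)hits / total;
        }
    }
}

Call RecordHit in the TryGetValue success branch and RecordMiss when you fall through to the database, then expose HitRatio wherever you publish metrics.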
Warning: Implement a circuit breaker for your cache tier. If Redis has a network partition and every cache call takes 5 seconds before timing out, your API’s latency becomes 5 seconds for every request that tries to read from cache. A circuit breaker opens after N failures and bypasses Redis entirely (falling back to the database) until the circuit closes after a recovery window. Polly provides a ready-made circuit breaker: Policy.Handle<RedisException>().CircuitBreakerAsync(5, TimeSpan.FromSeconds(30)).
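
The warning above might look like the following with Polly. This is a sketch under assumptions: StackExchange.Redis supplies RedisException/RedisTimeoutException, and the _redisCache and _repo fields are hypothetical stand-ins for your cache and repository:

// Open the circuit after 5 consecutive Redis failures; stay open for 30s.
private static readonly AsyncCircuitBreakerPolicy _breaker =
    Policy.Handle<RedisException>()
          .Or<RedisTimeoutException>()
          .CircuitBreakerAsync(5, TimeSpan.FromSeconds(30));

public async Task<PostDto?> GetByIdAsync(int id, CancellationToken ct)
{
    try
    {
        return await _breaker.ExecuteAsync(
            () => _redisCache.GetAsync<PostDto>($"post:{id}", ct));
    }
    catch (Exception ex) when (ex is RedisException or BrokenCircuitException)
    {
        // Redis failed or the circuit is open — skip the cache entirely
        // and fall back to the database.
        var post = await _repo.GetByIdAsync(id, ct);
        return post?.ToDto();
    }
}

While the circuit is open, ExecuteAsync throws BrokenCircuitException immediately instead of waiting on the Redis timeout, which is what keeps API latency flat during the outage.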

Layered Caching Architecture (L1 + L2)

// ── L1 (in-process IMemoryCache) + L2 (Redis IDistributedCache) ────────────
public class LayeredCacheService(IMemoryCache l1, IRedisCacheService l2) : ICacheService
{
    public async Task<T?> GetOrCreateAsync<T>(
        string key, Func<Task<T?>> factory,
        TimeSpan? l1Ttl = null, TimeSpan? l2Ttl = null,
        CancellationToken ct = default)
    {
        // Check L1 (in-process, ~nanoseconds)
        if (l1.TryGetValue(key, out T? l1Value)) return l1Value;

        // Check L2 (Redis, ~milliseconds)
        var l2Value = await l2.GetAsync<T>(key, ct);
        if (l2Value is not null)
        {
            // Populate L1 from L2 (short TTL — L1 is ephemeral)
            l1.Set(key, l2Value, l1Ttl ?? TimeSpan.FromSeconds(30));
            return l2Value;
        }

        // L1 and L2 miss — load from database (~milliseconds to seconds)
        var value = await factory();
        if (value is not null)
        {
            await l2.SetAsync(key, value, l2Ttl ?? TimeSpan.FromMinutes(5), ct: ct);
            l1.Set(key, value, l1Ttl ?? TimeSpan.FromSeconds(30));
        }
        return value;
    }
}
// Typical hit rates: L1 ~40%, L2 ~50%, DB ~10%
// Requests served without hitting DB: 90%

Common Mistakes

Mistake 1 — No circuit breaker on Redis (slow Redis makes entire API slow)

❌ Wrong — Redis timeout of 5s with no breaker; when Redis is unreachable, every request that touches the cache blocks for 5s before failing; API-wide latency spikes.

✅ Correct — use Polly circuit breaker; open circuit after N Redis failures; fall back to database.

Mistake 2 — Same TTL for all data types (hot data expires too early, cold data cached too long)

❌ Wrong — all cache entries have 5-minute TTL; frequently updated user feeds and static category lists have same TTL.

✅ Correct — tune TTL per data type: config (1 hour), published posts (10 min), search results (1 min), user feeds (30 sec).
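
The per-type TTLs from the correct example above can be centralized so they aren't scattered as magic numbers. A sketch — CacheTtl is a hypothetical helper, with values taken from the list above:

public static class CacheTtl
{
    public static readonly TimeSpan Config        = TimeSpan.FromHours(1);
    public static readonly TimeSpan PublishedPost = TimeSpan.FromMinutes(10);
    public static readonly TimeSpan SearchResults = TimeSpan.FromMinutes(1);
    public static readonly TimeSpan UserFeed      = TimeSpan.FromSeconds(30);
}

// Usage: cache.Set($"post:{id}", dto, CacheTtl.PublishedPost);

Keeping TTLs in one place also makes it easy to tune them when the hit-ratio metric shows a particular data type expiring too early.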

🧠 Test Yourself

A layered cache has L1 (30s TTL) and L2 Redis (10min TTL). 100 concurrent users request the same post. The L1 cache is cold (just expired). What happens?