Rate Limiting and API Security — Sliding Window, Redis Lua, and Abuse Detection

Rate limiting and API security are the defences that stand between your Express API and the internet. Rate limiting prevents abuse — brute-force attacks, credential stuffing, scraping, and DoS — while still serving legitimate users. Input sanitisation and validation prevent injection attacks at the application layer. Security headers from Helmet prevent a class of client-side attacks. This lesson builds a complete, layered API security implementation with Redis-backed sliding window rate limiting (more accurate than fixed windows), request fingerprinting, and automated abuse detection.

Rate Limiting Algorithms

| Algorithm | Accuracy | Memory | Best For |
| --- | --- | --- | --- |
| Fixed window | Allows burst at window boundary (2× rate at worst) | O(1) per key | Simple counting — login attempts per hour |
| Sliding window log | Exact — records every request timestamp | O(requests in window) | Exact rate limiting, low-traffic APIs |
| Sliding window counter | Approximate — weighted average of two windows | O(1) per key | High-traffic APIs — Redis-efficient approach |
| Token bucket | Allows controlled bursts up to bucket size | O(1) per key | APIs that legitimately need short bursts |
| Leaky bucket | Smoothed output rate | O(1) per key | Downstream services with strict rate limits |
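The sliding window counter's O(1) approximation is worth seeing concretely: it weights the previous fixed window's count by how much of that window still overlaps the rolling window. A minimal sketch (the function name and signature here are illustrative, not part of the lesson's middleware):

```javascript
// Sliding window counter approximation: weight the previous fixed window's
// count by the fraction of it still inside the rolling window.
// elapsedMs = time elapsed so far inside the current fixed window.
function approximateCount(prevWindowCount, currWindowCount, windowMs, elapsedMs) {
    const overlap = (windowMs - elapsedMs) / windowMs; // fraction of prev window in view
    return prevWindowCount * overlap + currWindowCount;
}

// 15s into the current minute: 75% of the previous window still counts.
// approximateCount(100, 20, 60_000, 15_000) → 100 * 0.75 + 20 = 95
```

Only two counters per key are stored, which is why this variant is the Redis-efficient choice for high-traffic APIs.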
Note: Always rate limit on multiple dimensions simultaneously: global rate limit (100 req/min per IP), route-specific limits (10 req/min on auth endpoints), and user-specific limits after authentication (1000 req/hour per user). These three tiers prevent different attack vectors: global limits stop unauthenticated abuse, route-specific limits protect high-value endpoints, and user limits prevent authenticated API abuse. Store rate limit state in Redis — not in-process memory — so limits are enforced across all server instances.
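One way to picture the three tiers is a helper that decides which limits apply to a given request. This `tiersFor` helper and its field names are hypothetical, a sketch of the note above rather than code from the lesson's middleware:

```javascript
// Hypothetical helper: which rate-limit tiers apply to a request.
// A request must be under the limit on every tier that applies to it.
function tiersFor(req) {
    const tiers = [
        { key: `ip:${req.ip}`, limit: 100, windowMs: 60_000 },       // global, per IP
    ];
    if (req.path.startsWith('/auth')) {
        tiers.push({ key: `auth:${req.ip}`, limit: 10, windowMs: 15 * 60_000 }); // auth routes
    }
    if (req.userId) {
        tiers.push({ key: `user:${req.userId}`, limit: 1000, windowMs: 3_600_000 }); // per user
    }
    return tiers;
}
```

An unauthenticated request to a normal route hits one tier; a login attempt hits two; an authenticated request to an auth route hits all three.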
Tip: Use Redis sorted sets for a true sliding window: ZADD requests timestamp timestamp (score and member both set to timestamp), then ZREMRANGEBYSCORE requests 0 (now-window) to remove old entries, then ZCARD requests to count. This gives exact count over a rolling window. Wrap in a Lua script to make it atomic — a Redis Lua script executes atomically, preventing race conditions between the ZADD, ZREMRANGEBYSCORE, and ZCARD operations.
Warning: Rate limiting by IP address is the minimum baseline but not sufficient alone. Shared NAT (corporate networks, mobile carriers) can put thousands of users behind one IP — a per-IP limit of 100 requests/minute would block the entire company. Use IP as the fallback for unauthenticated requests, but switch to user ID for authenticated requests. Also consider rate limiting by API key, account tier, or a combination of IP + User-Agent fingerprint for better accuracy.

Complete Rate Limiting and Security

// ── Redis sliding window rate limiter ─────────────────────────────────────
// src/middleware/rate-limiter.js
const { getRedisClient } = require('../config/redis');

// Lua script — atomic sliding window check
const slidingWindowScript = `
local key      = KEYS[1]
local now      = tonumber(ARGV[1])
local window   = tonumber(ARGV[2])
local limit    = tonumber(ARGV[3])
local expiry   = tonumber(ARGV[4])

-- Remove timestamps older than the window
redis.call('ZREMRANGEBYSCORE', key, '-inf', now - window)

-- Count requests in the current window
local count = redis.call('ZCARD', key)

if count >= limit then
    -- Return the oldest score (a timestamp) so the caller can compute resetAt;
    -- the member itself is "now-count" and would not parse as a number
    return { 0, count, redis.call('ZRANGE', key, 0, 0, 'WITHSCORES')[2] }
end

-- Add current request (score=now, member=unique id using now+count)
redis.call('ZADD', key, now, now .. '-' .. count)
redis.call('EXPIRE', key, expiry)
return { 1, count + 1, 0 }
`;

async function slidingWindowRateLimit(identifier, windowMs, limit) {
    const redis  = await getRedisClient();
    const now    = Date.now();
    const key    = `ratelimit:${identifier}`;
    const expiry = Math.ceil(windowMs / 1000) + 1;

    const [allowed, count, oldestMs] = await redis.eval(
        slidingWindowScript,
        { keys: [key], arguments: [String(now), String(windowMs), String(limit), String(expiry)] }
    );

    const resetAt   = oldestMs ? Number(oldestMs) + windowMs : now + windowMs;
    const retryAfter = Math.ceil((resetAt - now) / 1000);

    return {
        allowed:    Boolean(allowed),
        count:      Number(count),
        limit,
        retryAfter: allowed ? 0 : retryAfter,
        resetAt:    new Date(resetAt).toISOString(),
    };
}

// ── Rate limit middleware factory ──────────────────────────────────────────
function rateLimitMiddleware({ windowMs, limit, keyFn, message }) {
    return async (req, res, next) => {
        try {
            const identifier = keyFn(req);
            const result     = await slidingWindowRateLimit(identifier, windowMs, limit);

            // Always set informational headers
            res.setHeader('X-RateLimit-Limit',     result.limit);
            res.setHeader('X-RateLimit-Remaining', Math.max(0, result.limit - result.count));
            res.setHeader('X-RateLimit-Reset',     result.resetAt);

            if (!result.allowed) {
                res.setHeader('Retry-After', result.retryAfter);
                return res.status(429).json({
                    message,
                    retryAfter: result.retryAfter,
                    resetAt:    result.resetAt,
                });
            }
            next();
        } catch (err) {
            // Redis unavailable — fail open (allow) to prevent Redis outage = API outage
            console.error('Rate limit check failed:', err.message);
            next();
        }
    };
}

// Pre-configured rate limiters
const globalLimiter = rateLimitMiddleware({
    windowMs: 60_000,
    limit:    100,
    keyFn:    req => `ip:${req.ip}`,
    message:  'Too many requests. Please slow down.',
});

const authLimiter = rateLimitMiddleware({
    windowMs: 15 * 60_000,
    limit:    10,
    keyFn:    req => `auth:${req.ip}`,
    message:  'Too many login attempts. Try again in 15 minutes.',
});

const userLimiter = rateLimitMiddleware({
    windowMs: 60_000,
    limit:    300,
    keyFn:    req => `user:${req.user?.sub || req.ip}`,
    message:  'API rate limit exceeded.',
});

module.exports = { globalLimiter, authLimiter, userLimiter };

// ── Abuse detection — track suspicious patterns ───────────────────────────
// src/middleware/abuse-detection.js
const logger = require('../config/logger'); // assumed shared logger module (path illustrative)

const SUSPICIOUS_PATTERNS = [
    /\b(union|select|insert|update|delete|drop|exec|execute)\b/i,  // SQL injection
    /<script\b[^>]*>/i,           // XSS attempt (opening <script> tag)
    /\.\.[/\\]/,                  // path traversal
    /\${.*}/,                     // template injection
];

function detectAbuse(req, res, next) {
    const toCheck = JSON.stringify({
        query:  req.query,
        body:   req.body,
        params: req.params,
    });

    const suspicious = SUSPICIOUS_PATTERNS.some(pattern => pattern.test(toCheck));

    if (suspicious) {
        // Log for security monitoring (don't reveal detection to attacker)
        logger.warn('Suspicious request pattern detected', {
            ip:     req.ip,
            method: req.method,
            path:   req.path,
            ua:     req.get('User-Agent'),
        });
        // Return generic 400 — don't explain what was suspicious
        return res.status(400).json({ message: 'Invalid request' });
    }

    next();
}

module.exports = { detectAbuse };

How It Works

Step 1 — Lua Scripts Execute Atomically in Redis

Redis is single-threaded — commands execute one at a time. A Lua script runs as a single atomic unit: no other Redis command can interleave between the ZREMRANGEBYSCORE, ZCARD, and ZADD operations. Without atomicity, two concurrent requests could both read a count of 9 (under the limit of 10), both increment to 10, and both be allowed — resulting in 11 requests for a limit of 10. The Lua script prevents this race condition entirely.
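The race can be illustrated without Redis at all. A minimal simulation of two "concurrent" check-then-increment sequences whose reads interleave before either write lands (illustrative code, not from the lesson's middleware):

```javascript
// Non-atomic check-then-increment: both requests read the counter
// before either one writes, so both pass the limit check.
function raceSimulation(startCount, limit) {
    let count = startCount;
    const readA = count;               // request A reads the count
    const readB = count;               // request B reads it too, before A writes
    const allowedA = readA < limit;    // 9 < 10 → allowed
    const allowedB = readB < limit;    // 9 < 10 → allowed
    if (allowedA) count += 1;          // A's increment lands
    if (allowedB) count += 1;          // B's increment lands
    return { allowedA, allowedB, count };
}

// raceSimulation(9, 10) → both allowed, final count 11 — one over the limit
```

The atomic Lua script collapses read, check, and write into one step, so the second request necessarily sees the first request's write.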

Step 2 — Sliding Window Is More Fair Than Fixed Window

A fixed window counter resets at fixed intervals (every minute on the clock). A user can send 100 requests at 00:59 and 100 more at 01:00 — 200 requests in 2 seconds — and both windows allow them. The sliding window counts requests in the rolling 60 seconds before the current moment, so the same burst is caught: at 01:00 the rolling window still contains the 100 requests sent at 00:59.

Step 3 — Rate Limit Headers Communicate State to Clients

X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset follow the GitHub/Stripe convention. Well-behaved API clients read these headers to self-throttle rather than waiting for 429 responses. The Retry-After header on 429 responses tells clients exactly how many seconds to wait, so they can pause for that interval instead of guessing with exponential backoff.
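A client-side self-throttle reading these headers might look like this (a sketch; the header names match the middleware above, lowercased because Node's HTTP client APIs expose response headers in lowercase, and `now` is injectable to keep the function testable):

```javascript
// How long a client should wait before its next request, based on
// the rate limit headers from the previous response.
function throttleDelayMs(headers, now = Date.now()) {
    const remaining = Number(headers['x-ratelimit-remaining']);
    if (remaining > 0) return 0;                    // budget left — send immediately
    const resetAt = Date.parse(headers['x-ratelimit-reset']); // ISO timestamp
    return Math.max(0, resetAt - now);              // wait until the window resets
}
```

A client that sleeps for throttleDelayMs(...) between requests never sees a 429 at all.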

Step 4 — Fail Open Prevents Redis Outage from Killing the API

If Redis is unavailable when the rate limit check runs, the middleware calls next() — allowing the request through. This “fail open” approach trades security (temporarily unenforced rate limits) for availability (API keeps serving during Redis outage). The alternative “fail closed” — returning 503 when Redis is down — would make the entire API dependent on Redis uptime. Choose based on your threat model: high-security APIs should fail closed; general APIs should fail open.
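The two policies differ by a single branch in the error handler, which can be made explicit with a wrapper (a sketch; `checkFn` stands in for slidingWindowRateLimit, and the `degraded` flag is an illustrative addition for monitoring):

```javascript
// Wrap a rate-limit check with an explicit failure mode:
// 'open' allows requests when the backing store errors, 'closed' rejects them.
function withFailureMode(checkFn, mode /* 'open' | 'closed' */) {
    return async (identifier, windowMs, limit) => {
        try {
            return await checkFn(identifier, windowMs, limit);
        } catch (err) {
            // Redis unreachable: the policy decides, not the exception.
            return { allowed: mode === 'open', degraded: true };
        }
    };
}
```

Surfacing `degraded: true` in metrics is useful either way: a fail-open API that is silently unprotected is worse than one whose operators know the limits are off.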

Step 5 — Pattern Detection Catches Injection Attempts

Basic pattern detection in request content catches obvious injection attempts before they reach the database. This is not a replacement for parameterised queries and MongoDB sanitisation — it is an additional layer that logs suspicious patterns for security monitoring and blocks the most obvious attacks early. False positives are possible (a legitimate task titled “Select the best option”) — tune patterns carefully and consider logging without blocking for borderline cases.
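The false-positive risk is easy to demonstrate. A standalone copy of the patterns, applied to sample payloads (self-contained for illustration; the middleware above does the same check against query, body, and params together):

```javascript
// Standalone copy of the suspicious patterns for demonstration.
const SUSPICIOUS = [
    /\b(union|select|insert|update|delete|drop|exec|execute)\b/i, // SQL keywords
    /<script\b[^>]*>/i,                                           // opening <script> tag
    /\.\.[/\\]/,                                                  // path traversal
    /\$\{.*\}/,                                                   // template injection
];

function isSuspicious(payload) {
    const s = JSON.stringify(payload);
    return SUSPICIOUS.some(p => p.test(s));
}

// isSuspicious({ q: '1 UNION SELECT password FROM users' }) → true
// isSuspicious({ title: 'Select the best option' })         → true (false positive!)
// isSuspicious({ title: 'Buy groceries' })                  → false
```

The "Select the best option" case is exactly why logging without blocking is the safer default for the SQL-keyword pattern on free-text fields.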

Quick Reference

| Item | Detail |
| --- | --- |
| Global limiter (all routes) | 100 req/min per IP |
| Auth endpoints | 10 req/15 min per IP |
| Authenticated users | 300 req/min per user ID |
| Sliding window keys | ZADD key now now; ZREMRANGEBYSCORE key 0 (now-window); ZCARD key |
| Atomic execution | Redis Lua script via redis.eval(script, { keys, arguments }) |
| Fail open | catch (err) { next(); } — allow on Redis error |
| Rate limit headers | X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset |
| Retry header | Retry-After: seconds on 429 responses |

🧠 Test Yourself

A fixed window rate limiter allows 100 requests per minute (resets at :00 each minute). An attacker sends 100 requests at 00:59 and 100 requests at 01:00. How many total requests get through, and why does a sliding window prevent this?