A Node.js Express application deployed in production needs more than just node server.js. It needs environment configuration management that prevents secrets from leaking, a structured application lifecycle with clear startup and shutdown phases, zero-downtime deployment patterns that eliminate maintenance windows, and health and readiness endpoints that integrate with load balancers and orchestration platforms. This lesson builds the complete production application architecture for the MEAN Stack task manager API.
Deployment Patterns
| Pattern | Downtime | Complexity | Rollback |
|---|---|---|---|
| Restart (naive) | Brief | Low | Manual |
| PM2 graceful reload | Zero (cluster mode) | Low | Redeploy the previous release, e.g. pm2 deploy production revert 1 |
| Blue-Green (Docker) | Zero | Medium | Switch load balancer back |
| Rolling (Kubernetes) | Zero | High | Automatic on health check failure |
Health Check Types
| Endpoint | Returns | Used By | Checks |
|---|---|---|---|
| /health/live | 200 if process is running | Kubernetes liveness probe | Process alive; if it fails, the container is restarted |
| /health/ready | 200 if ready for traffic | Kubernetes readiness probe + load balancer | DB connected, cache connected, not shutting down |
| /health | 200 with full status JSON | Monitoring dashboards, humans | All dependencies with status details |
Use dotenv-flow instead of plain dotenv for environment configuration. dotenv-flow loads files in order: .env (base), .env.local (local overrides, gitignored), .env.production (production defaults, committed), .env.production.local (production local overrides, gitignored). Variables in later files override earlier ones. This layered approach allows committed base configuration with local and environment-specific overrides without complex CI configuration.
The /health/ready endpoint must return 503 during startup (before dependencies are connected) and during shutdown (from the moment the shutdown signal is handled). If it returns 200 during startup, the load balancer routes traffic before the app is ready and requests fail. If it keeps returning 200 during shutdown, the load balancer keeps sending traffic to a draining server and responses are dropped as connections close. The isReady flag is the most critical state in the application lifecycle.
Complete Production Application Architecture
// src/config/env.js: environment validation and access
const path = require('path');
const dotenv = require('dotenv');

// Load .env files following the dotenv-flow precedence. With override: false the
// first value loaded wins, so the most specific file is loaded first and real
// process environment variables always take priority over file values.
const nodeEnv = process.env.NODE_ENV || 'development';
[`.${nodeEnv}.local`, `.${nodeEnv}`, '.local', ''].forEach(suffix => {
  const envFile = path.resolve(process.cwd(), `.env${suffix}`);
  // dotenv does not throw when a file is missing; it returns { error }
  dotenv.config({ path: envFile, override: false });
});

// Validate required variables
const schema = {
  NODE_ENV:       { required: true, default: 'development' },
  PORT:           { required: false, default: '3000', type: 'number' },
  MONGO_URI:      { required: true },
  JWT_SECRET:     { required: true, minLength: 32 },
  REFRESH_SECRET: { required: true, minLength: 32 },
  REDIS_URL:      { required: false, default: 'redis://localhost:6379' },
  LOG_LEVEL:      { required: false, default: 'info' },
  CORS_ORIGINS:   { required: false, default: 'http://localhost:4200' },
};

const errors = [];
const env = {};
for (const [key, rules] of Object.entries(schema)) {
  const raw = process.env[key] ?? rules.default;
  if (rules.required && !raw) {
    errors.push(`Missing required: ${key}`);
    continue;
  }
  if (rules.minLength && raw && raw.length < rules.minLength) {
    errors.push(`${key} must be at least ${rules.minLength} characters`);
  }
  if (rules.type === 'number') {
    env[key] = parseInt(raw, 10);
    // Reject values such as PORT=abc instead of silently storing NaN
    if (Number.isNaN(env[key])) errors.push(`${key} must be a number`);
  } else {
    env[key] = raw;
  }
}

if (errors.length) {
  console.error('Environment configuration errors:\n' + errors.join('\n'));
  process.exit(1);
}

module.exports = env;
// ── src/server.js: production application lifecycle ──────────────────────
const mongoose = require('mongoose');
const app = require('./app');
const env = require('./config/env');
const { logger } = require('./config/logger');
const { getRedisClient } = require('./config/redis');

// Application state machine
const state = {
  isStarting: true,
  isReady: false,
  isShuttingDown: false,
  startTime: Date.now(),
};
// ── Health endpoints ─────────────────────────────────────────────────────
// Liveness: is the process running?
app.get('/health/live', (req, res) => {
  if (state.isShuttingDown) return res.status(503).json({ status: 'shutting_down' });
  res.json({ status: 'alive', uptime: process.uptime() });
});

// Readiness: is the process ready to serve traffic?
app.get('/health/ready', async (req, res) => {
  if (!state.isReady || state.isShuttingDown) {
    return res.status(503).json({ status: 'not_ready' });
  }
  res.json({ status: 'ready', uptime: process.uptime() });
});

// Full health: dependency status
app.get('/health', async (req, res) => {
  const mongoReady = mongoose.connection.readyState === 1;
  let redisReady = false;
  try {
    const redis = await getRedisClient();
    await redis.ping();
    redisReady = true;
  } catch {
    // redisReady stays false; the response reports Redis as disconnected
  }
  const healthy = mongoReady && redisReady;
  res.status(healthy ? 200 : 503).json({
    status: healthy ? 'healthy' : 'degraded',
    version: process.env.npm_package_version,
    uptime: process.uptime(),
    timestamp: new Date().toISOString(),
    dependencies: {
      mongodb: { status: mongoReady ? 'connected' : 'disconnected' },
      redis: { status: redisReady ? 'connected' : 'disconnected' },
    },
  });
});
// ── Graceful shutdown ────────────────────────────────────────────────────
let httpServer;
const SHUTDOWN_TIMEOUT = 10_000;

async function shutdown(signal) {
  if (state.isShuttingDown) return;
  state.isShuttingDown = true;
  state.isReady = false; // stop readiness probe → LB stops routing
  logger.info(`${signal} received, starting graceful shutdown`);

  // Safety net: force exit if draining takes too long.
  // unref() stops this timer from keeping the process alive by itself.
  setTimeout(() => {
    logger.error('Graceful shutdown timed out, forcing exit');
    process.exit(1);
  }, SHUTDOWN_TIMEOUT).unref();

  // The signal arrived before the HTTP server was created: nothing to drain
  if (!httpServer) return process.exit(0);

  // Phase 1: stop new connections
  httpServer.close(async () => {
    logger.info('HTTP server closed');
    // Phase 2: in-flight requests are done, close dependencies
    try {
      await mongoose.disconnect();
      const redis = await getRedisClient();
      await redis.quit();
      logger.info('All connections closed, exiting cleanly');
      process.exit(0);
    } catch (err) {
      logger.error('Error during shutdown:', { error: err.message });
      process.exit(1);
    }
  });

  // Close idle keep-alive sockets so close() can complete (Node.js 18.2+)
  httpServer.closeIdleConnections?.();
}

process.on('SIGTERM', () => shutdown('SIGTERM'));
process.on('SIGINT', () => shutdown('SIGINT'));
// ── Startup sequence ─────────────────────────────────────────────────────
async function start() {
  logger.info('Starting application...', { env: env.NODE_ENV, pid: process.pid });

  // 1. Connect to MongoDB
  await mongoose.connect(env.MONGO_URI, { maxPoolSize: 10, serverSelectionTimeoutMS: 5000 });
  logger.info('MongoDB connected');

  // 2. Connect to Redis
  await getRedisClient();
  logger.info('Redis connected');

  // 3. Start HTTP server
  httpServer = app.listen(env.PORT, () => {
    state.isStarting = false;
    state.isReady = true; // ← signal readiness probe
    logger.info(`Server ready on port ${env.PORT}`, {
      pid: process.pid,
      port: env.PORT,
    });
    // Signal PM2 that we are ready (for zero-downtime reload)
    process.send?.('ready');
  });

  httpServer.on('error', err => {
    logger.error('HTTP server error', { error: err.message });
    if (err.code === 'EADDRINUSE') process.exit(1);
  });
}

start().catch(err => {
  logger.error('Failed to start', { error: err.message, stack: err.stack });
  process.exit(1);
});
How It Works
Step 1 — Environment Validation at Startup Catches Misconfiguration Early
Validating all required environment variables before starting the server catches misconfiguration at deployment time — not at the moment a route handler first uses a missing value. The validation schema also documents all configuration options in one place and enforces minimum security requirements (JWT secrets must be at least 32 characters). Exit code 1 ensures the deployment pipeline recognises the failure.
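To make the validation rules easier to test in isolation, the same checks can be expressed as a pure function that takes the schema and a source object instead of reading process.env directly. This is only a sketch: the name validateEnv and its return shape are assumptions for illustration, not part of the lesson's code.

```javascript
// Hypothetical helper (not from the lesson): validates `source` against `schema`
// and returns both the parsed values and the list of problems found.
function validateEnv(schema, source) {
  const env = {};
  const errors = [];
  for (const [key, rules] of Object.entries(schema)) {
    const raw = source[key] ?? rules.default;
    if (raw === undefined || raw === '') {
      if (rules.required) errors.push(`Missing required: ${key}`);
      continue;
    }
    if (rules.minLength && raw.length < rules.minLength) {
      errors.push(`${key} must be at least ${rules.minLength} characters`);
      continue;
    }
    if (rules.type === 'number') {
      const parsed = Number(raw);
      if (!Number.isInteger(parsed)) {
        errors.push(`${key} must be an integer`);
        continue;
      }
      env[key] = parsed;
    } else {
      env[key] = raw;
    }
  }
  return { env, errors };
}

// Example: a too-short secret is reported, the PORT default is parsed.
const result = validateEnv(
  {
    PORT: { required: false, default: '3000', type: 'number' },
    JWT_SECRET: { required: true, minLength: 32 },
  },
  { JWT_SECRET: 'too-short' }
);
console.log(result.errors); // [ 'JWT_SECRET must be at least 32 characters' ]
console.log(result.env);    // { PORT: 3000 }
```

Returning the errors rather than exiting lets the caller decide how to fail, which keeps the process.exit(1) call in one place at the module boundary.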
Step 2 — State Machine Controls Health Probe Responses
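The application state (isStarting, isReady, isShuttingDown) drives the health endpoint responses. During startup, readiness returns 503, so the load balancer does not route traffic. Once all dependencies are connected and the server is listening, readiness returns 200. On SIGTERM, the isReady flag is cleared first, so readiness returns 503 before server.close() begins. The load balancer only reacts at its next probe, so a few requests can still arrive between the flag change and that probe; requests already in flight on existing connections finish while the server drains. A short probe interval, or a brief pause before calling server.close(), narrows that window and keeps dropped requests close to zero during deployments.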
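The mapping from lifecycle state to probe response can be reduced to a small pure function. The sketch below is illustrative only: probeStatus and the phases object are hypothetical names, and the status codes simply mirror the /health/live and /health/ready handlers shown earlier.

```javascript
// Hypothetical helper (not part of the lesson code): maps lifecycle state
// to the HTTP status each probe endpoint would return.
function probeStatus(state) {
  return {
    live: state.isShuttingDown ? 503 : 200,
    ready: state.isReady && !state.isShuttingDown ? 200 : 503,
  };
}

// The three lifecycle phases as state snapshots.
const phases = {
  starting:     { isStarting: true,  isReady: false, isShuttingDown: false },
  serving:      { isStarting: false, isReady: true,  isShuttingDown: false },
  shuttingDown: { isStarting: false, isReady: false, isShuttingDown: true },
};

for (const [name, state] of Object.entries(phases)) {
  const { live, ready } = probeStatus(state);
  console.log(`${name}: live=${live} ready=${ready}`);
}
// starting: live=200 ready=503
// serving: live=200 ready=200
// shuttingDown: live=503 ready=503
```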
Step 3 — Ordered Startup Prevents Race Conditions
Starting the HTTP server before establishing database connections means requests can arrive before the application is ready to handle them. The startup sequence ensures: (1) MongoDB connects, (2) Redis connects, (3) HTTP server starts. Only after all dependencies are ready does the server begin listening. Because the port is not bound until then, this ordering protects the application even if the deployment infrastructure does not use readiness probes; the isReady flag adds an explicit signal for infrastructure that does.
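The ordering guarantee comes from awaiting each step before starting the next. A minimal sketch of that idea follows; runStartup is a hypothetical helper, and the three step functions are empty stand-ins for mongoose.connect, getRedisClient and app.listen rather than real connections.

```javascript
// Hypothetical helper: runs [name, asyncFn] startup steps strictly in order.
// A rejected step aborts the sequence, so later steps (such as listen) never run.
async function runStartup(steps) {
  const completed = [];
  for (const [name, step] of steps) {
    await step();
    completed.push(name);
  }
  return completed;
}

// Stand-ins for the real dependency connections (assumptions for this sketch).
const connectMongo = async () => {};
const connectRedis = async () => {};
const listen = async () => {};

runStartup([
  ['mongodb', connectMongo],
  ['redis', connectRedis],
  ['http', listen],
]).then(done => console.log('startup order:', done.join(' -> ')));
```

If any step rejects, the returned promise rejects too, which matches the start().catch(...) pattern of logging the failure and exiting with code 1.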
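Step 4 — process.send('ready') Integrates with PM2
PM2’s wait_ready: true configuration makes PM2 wait for the process to send 'ready' before treating the new instance as online. Sending it in the app.listen() callback means PM2 only marks the new instance as ready after the HTTP server is successfully bound. For zero-downtime reload (pm2 reload app) in cluster mode, PM2 starts the new process, waits for 'ready', routes traffic to it, then signals the old process to stop. PM2 sends SIGINT by default, which is why the server handles both SIGINT and SIGTERM, and it follows up with SIGKILL after kill_timeout (1600 ms by default), so kill_timeout should be longer than the application's own shutdown timeout.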
Step 5 — server.close() Drains Without Dropping
Node.js’s server.close(callback) stops the server from accepting new TCP connections. Existing connections, including idle keep-alive HTTP connections, stay open until they end, and the callback fires only when the last connection has closed. Keep-alive connections can sit idle for a long time (browsers typically keep connections open for 5–120 seconds), so server.closeIdleConnections() (Node.js 18.2+) closes the idle ones while active requests finish; from Node.js 19 onward, server.close() closes idle connections itself. This drains the server gracefully in seconds rather than minutes.
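The drain step can be wrapped in a promise so the caller knows whether it finished cleanly. The following is a sketch under stated assumptions: drain is a hypothetical helper, the 10 second default is arbitrary, and the optional calls to closeIdleConnections() and closeAllConnections() require Node.js 18.2 or later (they are skipped on older versions).

```javascript
// Hypothetical helper: stops accepting connections, closes idle keep-alive
// sockets, and force-closes whatever is still open after timeoutMs.
// Resolves with { forced } so the caller can log how the drain ended.
function drain(server, timeoutMs = 10_000) {
  return new Promise((resolve, reject) => {
    let forced = false;

    // Last resort: destroy remaining sockets so close() can complete.
    const timer = setTimeout(() => {
      forced = true;
      server.closeAllConnections?.();
    }, timeoutMs);

    server.close(err => {
      clearTimeout(timer);
      if (err) return reject(err); // e.g. the server was never listening
      resolve({ forced });
    });

    // Idle keep-alive sockets would otherwise delay the close callback.
    server.closeIdleConnections?.();
  });
}
```

A shutdown handler could await drain(httpServer) and then disconnect MongoDB and Redis, keeping the process-level force-exit timer as a separate safety net.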
Quick Reference
| Task | Code |
|---|---|
| Validate env schema | Check required keys, types, min lengths at startup |
| Liveness probe | GET /health/live → 200 if not shutting down |
| Readiness probe | GET /health/ready → 200 only when isReady && !isShuttingDown |
| Set ready after listen | server.listen(port, () => { state.isReady = true; process.send?.('ready'); }) |
| Stop on SIGTERM | state.isReady = false; server.close(...) |
| Force exit timeout | setTimeout(() => process.exit(1), 10_000) |
| PM2 ready signal | process.send?.('ready') + wait_ready: true in ecosystem.config.js |