Selenium Grid 4 Architecture — Hub, Node, Router and Sessions

Selenium Grid 4, released with Selenium 4, is a complete rewrite of the original Grid. It replaces the monolithic Hub with a modular set of well-defined components: the Router, Distributor, Session Map, Session Queue, Event Bus, and Nodes. Understanding these components helps you troubleshoot connection issues, capacity problems, and session routing failures when running tests at scale.

Grid 4 Components — How Tests Reach Browsers

When your test script calls webdriver.Remote(), a chain of components routes the request to an available browser on an appropriate node.
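
As a rough illustration of what reaches the Router first, the sketch below builds (but does not send) a W3C WebDriver "New Session" request using only the standard library. The helper name build_new_session_request is hypothetical, the URL http://grid:4444 is the example Router address used in this lesson, and a real client such as webdriver.Remote() sends a richer capabilities payload than this minimal one.

```python
import json
import urllib.request


def build_new_session_request(grid_url: str, browser_name: str) -> urllib.request.Request:
    """Build (without sending) a minimal W3C "New Session" request.

    Hypothetical helper for illustration; real clients add more capabilities.
    """
    payload = {"capabilities": {"alwaysMatch": {"browserName": browser_name}}}
    return urllib.request.Request(
        url=grid_url.rstrip("/") + "/session",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


# Assumed Router address, taken from the example URL in this lesson.
request = build_new_session_request("http://grid:4444", "chrome")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Every later command reuses the session ID returned by this first request, which is why the Router needs a way to look up the Node that owns each session.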

# Selenium Grid 4 Architecture

GRID_COMPONENTS = [
    {
        "component": "Router",
        "role": "Entry point — receives all incoming WebDriver requests",
        "details": (
            "The Router is the single URL your tests connect to (e.g. http://grid:4444). "
            "It forwards new session requests to the Distributor and existing session "
            "commands to the correct Node via the Session Map."
        ),
    },
    {
        "component": "Distributor",
        "role": "Matches new session requests to available Node slots",
        "details": (
            "When a test requests a new session with specific capabilities (e.g. Chrome on Linux), "
            "the Distributor finds a Node that matches those capabilities and has an available slot. "
            "If no slot is available, the request goes to the Session Queue."
        ),
    },
    {
        "component": "Session Map",
        "role": "Maps active session IDs to their Node addresses",
        "details": (
            "After a session is created, every subsequent command (find_element, click) "
            "includes the session ID. The Session Map tells the Router which Node owns "
            "that session so commands are forwarded correctly."
        ),
    },
    {
        "component": "Session Queue",
        "role": "Holds pending session requests when all Nodes are busy",
        "details": (
            "If all browser slots are occupied, new requests wait in the queue. "
            "When a slot frees up, the oldest queued request is fulfilled first (FIFO). "
            "Requests time out after a configurable duration (default: 300 seconds)."
        ),
    },
    {
        "component": "Node",
        "role": "Runs browser instances and executes WebDriver commands",
        "details": (
            "Each Node registers with the Distributor, advertising its capabilities "
            "(browser types, max sessions). A Node can run multiple concurrent sessions "
            "(e.g. 5 Chrome instances). Nodes can be on the same machine or distributed "
            "across a network."
        ),
    },
    {
        "component": "Event Bus",
        "role": "Internal messaging between Grid components",
        "details": (
            "Components communicate via an event bus. When a Node starts, it publishes "
            "a registration event. When a session ends, the Node publishes a removal event. "
            "This decoupled design allows components to be deployed independently."
        ),
    },
]

# Request flow: test script → Grid → browser
REQUEST_FLOW = [
    "1. Test calls: driver = webdriver.Remote(grid_url, options)",
    "2. Router receives the new session request",
    "3. Router forwards to Distributor",
    "4. Distributor checks Node capabilities and availability",
    "5.   If slot available → create session on matching Node",
    "6.   If no slot → queue request (wait for slot to free up)",
    "7. Node starts browser, returns session ID",
    "8. Session Map stores: session_id → node_address",
    "9. Test receives session ID — WebDriver is ready",
    "",
    "Subsequent commands (find_element, click, etc.):",
    "10. Test sends command with session ID",
    "11. Router looks up Node in Session Map",
    "12. Router forwards command to the correct Node",
    "13. Node executes command in the browser",
    "14. Response flows back: Node → Router → Test",
]

# Grid deployment modes
DEPLOYMENT_MODES = [
    {
        "mode": "Standalone",
        "components": "All components in one process (Router, Distributor, Session Map, Session Queue, Event Bus, Node)",
        "use": "Local development, quick testing, small teams",
        "command": "java -jar selenium-server-4.x.jar standalone",
    },
    {
        "mode": "Hub and Node",
        "components": "Hub (Router + Distributor + Session Map + Session Queue + Event Bus) + separate Nodes",
        "use": "Distributed testing across multiple machines",
        "command": (
            "Hub:  java -jar selenium-server-4.x.jar hub\n"
            "Node: java -jar selenium-server-4.x.jar node --hub http://hub:4444"
        ),
    },
    {
        "mode": "Fully Distributed",
        "components": "Each component runs as a separate process",
        "use": "Large-scale enterprise deployments with dynamic scaling",
        "command": "Each component started individually with its own config",
    },
    {
        "mode": "Docker (Recommended)",
        "components": "Pre-built Docker images for Hub and Node browsers",
        "use": "CI/CD pipelines, reproducible environments, easy scaling",
        "command": "docker compose up -d (with selenium/hub + selenium/node-chrome)",
    },
]

print("Selenium Grid 4 Architecture")
print("=" * 65)
for comp in GRID_COMPONENTS:
    print(f"\n  {comp['component']}")
    print(f"    Role: {comp['role']}")

print("\n\nRequest Flow:")
for step in REQUEST_FLOW:
    print(f"  {step}")

print("\n\nDeployment Modes:")
for mode in DEPLOYMENT_MODES:
    print(f"\n  {mode['mode']}")
    print(f"    Components: {mode['components']}")
    print(f"    Use: {mode['use']}")
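
The request flow listed above can also be sketched as a small in-memory model. This is a toy under stated assumptions, not how Grid is implemented: the ToyGrid class, its method names, and the node addresses are all hypothetical, and a single object plays the roles that the Router, Distributor, Session Map, and Session Queue perform separately in a real Grid.

```python
from collections import deque
from itertools import count


class ToyGrid:
    """Toy model of the Grid 4 request flow (hypothetical, for illustration)."""

    def __init__(self, node_slots):
        # node_slots: {node_address: {browser_name: free_slot_count}}
        self.node_slots = node_slots
        self.session_map = {}  # session_id -> (node_address, browser)
        self.queue = deque()   # pending browser names, oldest first
        self._ids = count(1)

    def new_session(self, browser):
        """Distributor role: return a session ID, or None if the request is queued."""
        for node, slots in self.node_slots.items():
            if slots.get(browser, 0) > 0:
                slots[browser] -= 1
                session_id = f"session-{next(self._ids)}"
                self.session_map[session_id] = (node, browser)
                return session_id
        self.queue.append(browser)  # no matching free slot: wait in the queue
        return None

    def route_command(self, session_id):
        """Router role: look up the owning Node in the Session Map."""
        node, _browser = self.session_map[session_id]
        return node

    def end_session(self, session_id):
        """Free the slot, then serve the oldest queued request for that browser."""
        node, browser = self.session_map.pop(session_id)
        self.node_slots[node][browser] += 1
        if browser in self.queue:
            self.queue.remove(browser)  # deque.remove drops the oldest match
            return self.new_session(browser)
        return None


# Hypothetical node addresses: two Chrome slots in total.
grid = ToyGrid({
    "http://node-1:5555": {"chrome": 1},
    "http://node-2:5555": {"chrome": 1, "firefox": 1},
})
first = grid.new_session("chrome")
second = grid.new_session("chrome")
third = grid.new_session("chrome")  # all Chrome slots busy, so this one queues
print(first, "->", grid.route_command(first))
print("queued:", list(grid.queue))
promoted = grid.end_session(first)  # frees a slot; the queued request is served
print(promoted, "->", grid.route_command(promoted))
```

The third request gets no session until the first one ends, which mirrors steps 5 and 6 of the flow: a matching free slot creates a session immediately, and anything else waits.
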

Note: Grid 4’s modular architecture lets you scale components independently when you run it in fully distributed mode. If browser capacity is sufficient but session routing is slow, add more Router instances. If sessions are queuing, add more Nodes. This is a significant improvement over Grid 3, where the Hub was a monolithic bottleneck that handled routing, distribution, and session management in a single process.

Tip: For most teams, the Docker deployment mode is the best starting point. Selenium provides official Docker images (selenium/hub, selenium/node-chrome, selenium/node-firefox, selenium/node-edge) that come pre-configured with a browser and its matching driver. A single docker compose up command gives you a complete Grid with a Hub and multiple browser nodes in seconds, with no Java installation, driver management, or manual configuration.

Warning: The Session Queue has a default request timeout of 300 seconds (5 minutes), configurable with the --session-request-timeout option. If a request waits in the queue longer than this because every Node slot stays occupied, it fails with a timeout error. In CI/CD pipelines with many parallel jobs, this can cause cascading failures when the Grid is overloaded. Monitor queue depth and increase Node capacity (more containers or more sessions per Node) before the queue starts timing out regularly.
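
To get a feel for when the queue timeout starts to bite, here is a back-of-the-envelope estimate. It is a deliberately pessimistic model, not a Grid feature: the function names are hypothetical, and it assumes every slot is busy, every session lasts the same number of seconds, and all running sessions have only just started.

```python
import math

DEFAULT_QUEUE_TIMEOUT_S = 300  # default queue timeout quoted in the warning


def worst_case_wait_s(queue_position: int, slots: int, session_s: float) -> float:
    """Rough upper bound on the wait for the request at a 1-based queue position.

    Slots free up in "waves" of `slots` sessions every `session_s` seconds.
    """
    waves = math.ceil(queue_position / slots)
    return waves * session_s


def will_time_out(queue_position, slots, session_s, timeout_s=DEFAULT_QUEUE_TIMEOUT_S):
    """True if the estimated wait exceeds the queue timeout."""
    return worst_case_wait_s(queue_position, slots, session_s) > timeout_s


# Example: 5 browser slots, sessions that each run for about 120 seconds.
for position in (5, 10, 11):
    wait = worst_case_wait_s(position, slots=5, session_s=120)
    print(position, wait, will_time_out(position, slots=5, session_s=120))
```

Under these assumptions the tenth queued request waits about 240 seconds and survives, while the eleventh needs a third wave (360 seconds) and times out, so either more slots or shorter sessions are needed.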

Common Mistakes

Mistake 1 — Running all Grid components on a single machine without resource limits

❌ Wrong: Starting a Hub and 20 Node containers on a laptop with 8 GB RAM — each Chrome instance consumes 300+ MB, quickly exhausting memory.

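✅ Correct: Giving each Chrome container enough shared memory (--shm-size=2g) so the browser does not crash, capping each container's RAM (for example with Docker's --memory option), and limiting concurrent sessions to what the machine can sustain (typically 4-6 Chrome instances per 8 GB RAM).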

Mistake 2 — Not monitoring Grid health in CI/CD

❌ Wrong: Deploying Grid to CI with no monitoring, so sessions silently queue and time out, causing intermittent test failures that get reported as “flaky.”

✅ Correct: Accessing the Grid console at http://grid:4444/ui to monitor active sessions, available slots, and queue depth. Adding alerts when queue depth exceeds zero for more than 60 seconds.
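
The alert rule described above (queue depth above zero for more than 60 seconds) could be sketched as follows. The QueueAlert class and fetch_grid_summary helper are hypothetical names; the fetch function assumes the Grid exposes a GraphQL endpoint at /graphql with the sessionQueueSize, sessionCount, and maxSession fields, so check the query against your Grid version before relying on it. The demonstration feeds in simulated samples rather than calling a live Grid.

```python
import json
import urllib.request

# Assumed GraphQL query; verify the field names against your Grid version.
GRAPHQL_QUERY = "{ grid { sessionQueueSize sessionCount maxSession } }"


def fetch_grid_summary(grid_url):
    """POST the query to <grid_url>/graphql and return the "grid" object."""
    body = json.dumps({"query": GRAPHQL_QUERY}).encode("utf-8")
    request = urllib.request.Request(
        grid_url.rstrip("/") + "/graphql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["data"]["grid"]


class QueueAlert:
    """Fires once the queue has been non-empty for longer than threshold_s."""

    def __init__(self, threshold_s=60):
        self.threshold_s = threshold_s
        self._nonzero_since = None

    def update(self, queue_size, now_s):
        if queue_size == 0:
            self._nonzero_since = None  # queue drained: reset the timer
            return False
        if self._nonzero_since is None:
            self._nonzero_since = now_s  # first sample with a non-empty queue
        return now_s - self._nonzero_since > self.threshold_s


# Simulated (timestamp_seconds, queue_size) samples instead of a live Grid.
alert = QueueAlert(threshold_s=60)
samples = [(0, 0), (30, 2), (60, 3), (95, 1), (120, 0)]
results = [alert.update(size, now) for now, size in samples]
print(results)
```

In a real pipeline you would call fetch_grid_summary on a schedule, pass its sessionQueueSize value and the current time to update, and notify your team whenever it returns True.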

🧠 Test Yourself

In Selenium Grid 4, what happens when a test requests a new Chrome session but all Chrome Node slots are occupied?