Cypress Cloud Dashboard — Test Analytics, Flake Detection and Trends

Running tests and collecting pass/fail results is necessary but not sufficient. You need to know which tests fail, how often they fail, which ones are flaky, and whether quality is trending up or down. Cypress Cloud’s dashboard provides this analytics layer — tracking every test run, identifying flaky tests automatically, showing failure screenshots and videos, and visualising trends over time. Even if you do not use Cypress Cloud for parallelisation, the dashboard alone provides diagnostic value that transforms how you manage test suite health.

The dashboard organises data into runs, specs, and individual tests — each with detailed diagnostics.

// ── Dashboard data hierarchy ──

/*
  Organisation
    └── Project (one per repo/app)
        └── Runs (one per CI pipeline execution)
            ├── Run metadata: branch, commit, CI provider, duration
            ├── Pass/fail summary: 195 passed, 3 failed, 2 flaky
            └── Specs (individual .cy.ts files)
                ├── Spec timing and status
                └── Tests (individual it() blocks)
                    ├── Status: passed / failed / flaky / pending
                    ├── Duration
                    ├── Screenshots (on failure)
                    ├── Video recording
                    ├── Error message and stack trace
                    └── Command log (full Cypress command history)
*/
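The hierarchy above maps naturally onto nested data. The sketch below models a run as plain objects so the structure is concrete; the field names are illustrative, not Cypress Cloud's actual API schema.

```javascript
// Illustrative data model of the dashboard hierarchy.
// Field names are hypothetical, not Cypress Cloud's real API shape.
const exampleRun = {
  project: 'web-shop',                      // one project per repo/app
  branch: 'feature/checkout-discount',
  commit: 'a1b2c3d',
  ciProvider: 'GitHub Actions',
  durationSeconds: 742,
  summary: { passed: 195, failed: 3, flaky: 2 },
  specs: [
    {
      file: 'checkout.cy.ts',
      status: 'failed',
      durationSeconds: 94,
      tests: [
        {
          title: 'applies discount code',
          status: 'flaky',                  // failed first attempt, passed on retry
          durationMs: 4200,
          screenshots: ['checkout -- applies discount code (failed).png'],
          video: 'checkout.cy.ts.mp4',
        },
      ],
    },
  ],
};

// Roll up flaky tests across all specs in the run
const flakyTests = exampleRun.specs
  .flatMap(spec => spec.tests)
  .filter(test => test.status === 'flaky');

console.log(flakyTests.map(t => t.title)); // → [ 'applies discount code' ]
```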


// ── Flake detection — how Cypress Cloud identifies flaky tests ──

/*
  A test is marked FLAKY when:
    - It fails on the first attempt
    - It passes on an automatic retry (configured via retries in cypress.config)
    - Cypress Cloud records: "this test needed a retry to pass"

  Dashboard shows:
    - Flaky test count per run
    - Flaky rate trend over time (should decrease, not increase)
    - Top flaky tests ranked by frequency
    - For each flaky test: the error it throws, the screenshot at failure,
      and the command log showing exactly where it failed before passing on retry

  This is how you find and fix the tests that erode trust in your suite.
*/
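Flake detection only works if retries are enabled, because "passed on retry" is the signal Cypress Cloud looks for. A minimal `cypress.config.js` sketch follows: the `retries` option (with `runMode`/`openMode`) is a real Cypress setting, while the `projectId` value is a placeholder you would take from your own Cypress Cloud project settings.

```javascript
// cypress.config.js (CommonJS) — minimal sketch.
// `retries` is a real Cypress config option; the projectId is a placeholder.
module.exports = {
  projectId: 'abc123',   // placeholder — from your Cypress Cloud project settings
  retries: {
    runMode: 2,          // in CI (`cypress run`): up to 2 retries per failing test
    openMode: 0,         // interactively (`cypress open`): fail fast, no retries
  },
};
```

With `runMode: 2`, a test that fails once and then passes is recorded as flaky rather than failed, which is exactly the distinction the dashboard reports on.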


// ── Key dashboard metrics ──

const DASHBOARD_METRICS = [
    {
        metric: 'Run Duration Trend',
        insight: 'Is the suite getting slower? A rising trend means new tests are added without optimisation',
        action: 'Split slow specs, increase parallelism, or optimise test setup',
    },
    {
        metric: 'Pass Rate Over Time',
        insight: 'Consistent 95%+ = healthy suite. Declining = growing flakiness or real defects',
        action: 'Investigate runs with lower pass rates; correlate with code changes',
    },
    {
        metric: 'Flaky Test Rate',
        insight: 'Percentage of tests that pass only after retry. Target: < 2%',
        action: 'Fix top 5 flaky tests each sprint; use the dashboard to prioritise by frequency',
    },
    {
        metric: 'Top Failures',
        insight: 'Which tests fail most often? These are either real bugs or poorly written tests',
        action: 'Fix the test or file a defect; a consistently failing test provides no value',
    },
    {
        metric: 'Spec Duration Ranking',
        insight: 'Which specs take the longest? These are candidates for splitting or optimisation',
        action: 'Split specs over 2 minutes into smaller files; speed up setup in slow specs',
    },
    {
        metric: 'Branch Comparison',
        insight: 'Did this PR introduce new test failures compared to main?',
        action: 'Require zero new failures before merging; flag regressions in PR review',
    },
];
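The flaky-test-rate metric above is simple arithmetic: flaky results divided by total tests in the run. A small sketch, assuming a run summary shaped like the pass/fail counts shown in the hierarchy:

```javascript
// Sketch: compute the flake rate for a run and check it against the
// < 2% target mentioned above. The `summary` shape is an assumption.
function flakeRate(summary) {
  const total = summary.passed + summary.failed + summary.flaky;
  return total === 0 ? 0 : summary.flaky / total;
}

const run = { passed: 195, failed: 3, flaky: 2 };
const rate = flakeRate(run);

console.log(`Flake rate: ${(rate * 100).toFixed(1)}%`); // 2 / 200 → "Flake rate: 1.0%"
console.log(rate < 0.02 ? 'within target' : 'over target');
```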


// ── Using dashboard data in sprint retrospectives ──

const RETRO_TEMPLATE = `
Sprint Quality Metrics (from Cypress Cloud):
  Total runs this sprint:     42
  Average pass rate:          96.8%
  Flaky test rate:            1.4% (target: < 2%) ✅
  Top flaky test:             test_checkout_discount (failed 6 times, passed on retry)
  New failures introduced:    2 (both fixed by end of sprint)
  Suite duration trend:       12 min avg → 13 min avg (slight increase — new specs added)

Actions:
  1. Fix test_checkout_discount flakiness (owner: QA lead)
  2. Split product-search.cy.ts (3.5 min) into two smaller specs
  3. Target: maintain flaky rate below 2% next sprint
`;

console.log('Cypress Cloud Dashboard Metrics:');
DASHBOARD_METRICS.forEach(m => {
  console.log(`\n  ${m.metric}`);
  console.log(`    Insight: ${m.insight}`);
  console.log(`    Action:  ${m.action}`);
});
Note: Cypress Cloud’s flake detection is automatic and requires no configuration beyond enabling retries in your cypress.config. When a test fails on the first attempt but passes on a retry, Cypress Cloud flags it as flaky — not as a failure. This distinction is critical: a “flaky” result means the test found a timing or environment issue (not a product defect), while a “failed” result means the test found a genuine problem. Separating signal from noise is the dashboard’s core value.

Tip: Review the “Top Flaky Tests” report weekly and assign the top 3-5 to team members for investigation. Each flaky test has a full diagnostic package: the error message, a screenshot at the moment of failure, a video of the failed attempt, and the complete command log. This evidence usually makes the root cause obvious — a missing wait, a data dependency, or an animation race condition. Fixing 5 flaky tests per sprint drives the flake rate down consistently over time.

Warning: Cypress Cloud’s free tier includes 500 test recordings per month. A 200-test suite running 3 times per day consumes 600 recordings per day — exceeding the free tier in a single day. Calculate your expected recording volume before choosing a tier: tests_per_run × runs_per_day × 30 days. For high-volume teams, consider sorry-cypress (self-hosted, unlimited recordings) or a paid Cypress Cloud plan.
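The warning's formula is easy to turn into a quick budget check. The 500-recording limit and the 200-test, 3-runs-per-day example come from the text above; the function name is just illustrative.

```javascript
// Estimate monthly recording volume: tests_per_run × runs_per_day × 30 days.
function monthlyRecordings(testsPerRun, runsPerDay) {
  return testsPerRun * runsPerDay * 30;
}

const FREE_TIER_LIMIT = 500; // recordings per month on the free tier

const needed = monthlyRecordings(200, 3); // the example suite from the warning
console.log(`Estimated: ${needed} recordings/month`); // → Estimated: 18000 recordings/month
console.log(needed > FREE_TIER_LIMIT
  ? 'Exceeds free tier — consider sorry-cypress or a paid plan'
  : 'Fits within free tier');
```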

Common Mistakes

Mistake 1 — Not reviewing flaky test data from the dashboard

❌ Wrong: Enabling retries and Cypress Cloud recording but never opening the dashboard to review flaky tests — flakiness accumulates silently.

✅ Correct: Checking the flaky test report weekly, assigning top offenders for investigation, and tracking the flake rate as a team metric in sprint retrospectives.

Mistake 2 — Ignoring suite duration growth

❌ Wrong: Adding 20 new tests per sprint without monitoring total suite duration — the pipeline grows from 10 minutes to 30 minutes over six months.

✅ Correct: Tracking run duration in the dashboard and taking action when it exceeds the target (e.g., 15 minutes). Actions include splitting slow specs, optimising test setup, and increasing parallelism.
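The duration guard described above can be sketched as a simple threshold check. The 15-minute target is the example figure from the text; averaging over recent runs (rather than alerting on a single slow run) is an assumption of this sketch.

```javascript
// Sketch: flag when the average run duration crosses a target threshold.
const TARGET_MINUTES = 15; // example target from the text

function durationAlert(recentRunMinutes) {
  const avg =
    recentRunMinutes.reduce((sum, m) => sum + m, 0) / recentRunMinutes.length;
  return { avg, overTarget: avg > TARGET_MINUTES };
}

// Five recent runs trending upward, but still under the target on average
console.log(durationAlert([12, 13, 14, 16, 17])); // avg 14.4, overTarget: false
```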

🧠 Test Yourself

Cypress Cloud marks a test as “flaky” rather than “failed.” What does this specifically mean?