Flaky Tests and Pipeline Stability

Flaky tests are one of the biggest threats to a healthy CI/CD pipeline because they erode trust: if a test sometimes fails for no good reason, people start ignoring all failures, including real ones. QA engineers must learn how to detect, manage, and eliminate flakiness.

Understanding and Managing Flaky Tests

A flaky test is one that can both pass and fail on the same code, typically due to timing issues, environment instability, test-data clashes, or hidden dependencies. In CI/CD, such tests cause intermittent red builds, wasted reruns, and frustration.
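As a minimal sketch of the timing case, the test below races against work that takes a variable amount of time. The `fetch_status` stub is hypothetical, standing in for any slow external call; the point is that the same assertion can pass or fail depending on a delay the test does not control.

```python
import random

def fetch_status(simulated_delay=None):
    """Hypothetical stub for a slow external call: returns 'ready' only
    if its (variable) work finished quickly enough."""
    delay = simulated_delay if simulated_delay is not None else random.random()
    return "ready" if delay < 0.5 else "pending"

def test_status_flaky():
    # Same code, two possible outcomes: the delay is random, so this
    # assertion sometimes holds and sometimes does not.
    assert fetch_status() == "ready"
```

Run repeatedly, `test_status_flaky` passes on some runs and fails on others, which is exactly the "intermittent red build" pattern described above.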

```yaml
# Example: marking a flaky job for investigation (conceptual)
jobs:
  e2e_tests:
    runs-on: ubuntu-latest
    continue-on-error: true  # temporarily, while investigating
    steps:
      - name: Run E2E tests
        run: npm run test:e2e
      - name: Upload flaky test report
        if: failure()
        run: ./scripts/collect-flaky-tests.sh
```
Note: Treat flakiness as a bug in the test or environment, not as something β€œnormal” that teams must live with.
Tip: Track flaky tests in a shared list, add temporary annotations if needed, but always assign an owner and a deadline for fixing them.
Warning: Simply adding retries without analysis can hide real product issues and keep bad tests in the suite indefinitely.
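One way to keep the shared list honest is to make owner and deadline mandatory fields. The record shape below is purely illustrative, not a standard format; the test name and team names are hypothetical.

```python
from datetime import date

# Hypothetical shape for a shared flaky-test registry entry.
# Field names are illustrative; the point is that "owner" and "fix_by"
# are always present, so no flaky test sits in the list unassigned.
flaky_registry = [
    {
        "test": "checkout_e2e::test_payment_redirect",
        "first_seen": date(2024, 3, 1).isoformat(),
        "owner": "qa-payments",                    # always assign an owner...
        "fix_by": date(2024, 3, 15).isoformat(),   # ...and a deadline
        "suspected_cause": "timing: redirect polled with a fixed sleep",
    },
]

def validate_registry(entries):
    """Reject entries missing an owner or a fix-by date."""
    return all(e.get("owner") and e.get("fix_by") for e in entries)
```

A check like `validate_registry` can run in CI so the list cannot silently accumulate ownerless entries.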

Typical fixes include improving waits, stabilising test data, isolating environments and removing hidden dependencies on time or external systems.
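The "improving waits" fix usually means replacing a fixed sleep with a bounded polling wait. Below is a minimal sketch of such a helper; the name `wait_until` and its parameters are illustrative, and many test frameworks ship an equivalent built-in.

```python
import time

def wait_until(predicate, timeout=5.0, interval=0.1):
    """Poll `predicate` until it returns True or `timeout` seconds elapse.

    Replaces a fixed sleep, which is either too short (flaky) or too
    long (slow): here the test proceeds as soon as the condition holds,
    but still fails deterministically after the timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return predicate()  # one last check at the deadline
```

A test would then write `assert wait_until(lambda: page.is_loaded())` instead of `time.sleep(3)`, trading a guessed duration for an explicit, bounded condition.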

Common Mistakes

Mistake 1 β€” Accepting flaky tests as β€œjust how UI tests are”

This destroys confidence.

❌ Wrong: Ignoring flaky failures or always clicking β€œrerun” without investigation.

βœ… Correct: Log, triage and resolve flaky tests as high-priority work.

Mistake 2 β€” Overusing retries to β€œfix” flakiness

This hides problems.

❌ Wrong: Setting high retry counts so tests eventually pass.

βœ… Correct: Use limited retries mainly for diagnostics, then fix root causes.

🧠 Test Yourself

What is the healthiest way to treat flaky tests in CI/CD?