Code Quality for Test Code — Linting, Reviews, Refactoring and Technical Debt

Test code that is not reviewed, linted, or refactored decays just like production code — except faster, because test suites grow rapidly and rarely get the same engineering attention. An SDET treats test code as a first-class software product: it has coding standards, automated linting, mandatory code reviews, regular refactoring, and tracked technical debt. This discipline is what separates a framework that lasts years from one that is rewritten every six months.

Engineering Practices for Test Code Quality

The same practices that keep production code healthy apply to test code — with some test-specific additions.

# ── PRACTICE 1: Linting and formatting ──

LINTING_CONFIG = {
    "Python": {
        "linter": "ruff or flake8",
        "formatter": "black or ruff format",
        "config": "pyproject.toml with max-line-length=120",
        "ci": "Run on every PR; block merge on violations",
    },
    "TypeScript/JavaScript": {
        "linter": "ESLint with eslint-plugin-cypress (plugin:cypress/recommended)",
        "formatter": "Prettier",
        "config": ".eslintrc.json with cypress rules enabled",
        "ci": "Run on every PR; block merge on violations",
    },
}
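
Concretely, the Python config entry above might look like the sketch below. Note that ruff's key is `line-length` (flake8 uses `max-line-length` in its own config file); the rule selection is an assumption to tailor to your team's standards:

```toml
[tool.ruff]
line-length = 120

[tool.ruff.lint]
# E/F: pycodestyle + pyflakes basics; PT: flake8-pytest-style checks
# that catch common test-code smells (bad fixtures, assert misuse)
select = ["E", "F", "PT"]
```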


# ── PRACTICE 2: Code review checklist for test code ──

REVIEW_CHECKLIST = [
    "Does the test name describe the expected behaviour? (test_valid_login_shows_dashboard)",
    "Does the test have a single assertion focus? (one behaviour per test)",
    "Are locators using stable selectors? (data-cy, data-testid — not fragile classes)",
    "Are there no hardcoded waits? (time.sleep / cy.wait(ms) without justification)",
    "Is test data generated uniquely? (no shared accounts, timestamps in emails)",
    "Are page objects used? (no raw find_element / cy.get in tests)",
    "Are assertions in the test, not in page objects?",
    "Is the test independent? (can run alone, in any order, in parallel)",
    "Are custom commands / page objects documented with docstrings?",
    "Is the test readable without consulting the page object code?",
]
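
Checklist item 5 in practice: a small helper (the name and domain are illustrative) that guarantees no two tests ever share an account, keeping tests order-independent and safe to parallelise:

```python
import uuid

def unique_email(prefix: str = "qa") -> str:
    """Build a collision-free address so parallel runs never reuse accounts."""
    return f"{prefix}+{uuid.uuid4().hex[:8]}@example.test"

# Two calls, two distinct accounts: the tests stay independent
first, second = unique_email(), unique_email()
assert first != second
```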


# ── PRACTICE 3: Refactoring triggers ──

REFACTORING_TRIGGERS = [
    {
        "smell": "Duplicated setup code across 5+ test files",
        "refactoring": "Extract into a custom command, fixture, or beforeEach hook",
    },
    {
        "smell": "Page object with 200+ lines",
        "refactoring": "Split into smaller page objects by page section or component",
    },
    {
        "smell": "Test file with 500+ lines",
        "refactoring": "Split into separate files by feature area or user flow",
    },
    {
        "smell": "Same locator used in 3+ page objects",
        "refactoring": "Extract into a shared component class (NavBar, Footer)",
    },
    {
        "smell": "Flaky test fixed by adding sleep/retry instead of proper wait",
        "refactoring": "Replace with explicit wait, intercept, or DOM assertion",
    },
    {
        "smell": "Test that takes 60+ seconds to execute",
        "refactoring": "Replace UI setup with API calls; split into smaller tests",
    },
]
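
The first trigger in practice: the same login setup, copied across five files, collapsed into one helper. In pytest this would typically be a fixture; the names and the fake token here are illustrative:

```python
def make_logged_in_session(email: str = "qa+shared@example.test") -> dict:
    """One shared setup helper replacing the setup lines duplicated
    across test files. A real suite would call the login API here."""
    return {"email": email, "token": "fake-token", "authenticated": True}

# Every test now gets its session from one place; a change to the
# login flow means editing one function, not five files
session = make_logged_in_session()
assert session["authenticated"] is True
```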


# ── PRACTICE 4: Technical debt tracking ──

TECH_DEBT_CATEGORIES = [
    {
        "category": "Flaky tests",
        "metric": "Flake rate (% of runs that pass only after a retry)",
        "target": "< 2%",
        "action": "Fix top 5 flakiest tests each sprint",
    },
    {
        "category": "Slow tests",
        "metric": "Tests exceeding 30 seconds",
        "target": "< 5% of tests",
        "action": "Optimise setup with API calls; split long tests",
    },
    {
        "category": "Skipped tests",
        "metric": "Count of @skip / @pending tests",
        "target": "< 3% of total tests",
        "action": "Fix or delete; skipped tests provide no value and rot silently",
    },
    {
        "category": "Suppressed linting rules",
        "metric": "Count of eslint-disable / noqa comments",
        "target": "Decrease each sprint",
        "action": "Fix the underlying issue rather than suppressing the warning",
    },
    {
        "category": "Outdated dependencies",
        "metric": "Cypress / Selenium / plugin versions behind latest",
        "target": "Within 2 minor versions of latest",
        "action": "Quarterly dependency update with test verification",
    },
]
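
The targets above can be checked mechanically. A sketch that computes the headline debt metrics from a list of result records (the record shape is an assumption; adapt it to your runner's report format):

```python
def debt_metrics(results):
    """Compute debt percentages from records shaped like
    {"name": str, "passed_on_retry": bool, "skipped": bool, "duration_s": float}."""
    total = len(results)
    flaky = sum(r["passed_on_retry"] for r in results)
    skipped = sum(r["skipped"] for r in results)
    slow = sum(r["duration_s"] > 30 for r in results)
    return {
        "flake_rate_pct": round(100 * flaky / total, 1),
        "skipped_pct": round(100 * skipped / total, 1),
        "slow_pct": round(100 * slow / total, 1),
    }

runs = [
    {"name": "t1", "passed_on_retry": False, "skipped": False, "duration_s": 4},
    {"name": "t2", "passed_on_retry": True,  "skipped": False, "duration_s": 12},
    {"name": "t3", "passed_on_retry": False, "skipped": True,  "duration_s": 0},
    {"name": "t4", "passed_on_retry": False, "skipped": False, "duration_s": 45},
]
metrics = debt_metrics(runs)
```

Feeding these numbers into the sprint report turns "we should clean up the tests" into concrete, trackable figures.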

print("Code Review Checklist for Test Code:")
for i, item in enumerate(REVIEW_CHECKLIST, 1):
    print(f"  {i:>2}. {item}")

print("\n\nRefactoring Triggers:")
for trigger in REFACTORING_TRIGGERS:
    print(f"\n  Smell: {trigger['smell']}")
    print(f"  Fix:   {trigger['refactoring']}")

print("\n\nTechnical Debt Categories:")
for cat in TECH_DEBT_CATEGORIES:
    print(f"\n  {cat['category']} — Target: {cat['target']}")
    print(f"    Metric: {cat['metric']}")
    print(f"    Action: {cat['action']}")

Note: Code reviews for test code should be as rigorous as reviews for production code. A bad locator, a missing wait, or an assertion hidden in a page object can silently break dozens of tests. The review checklist above covers the ten most common test code quality issues — use it as a PR template for test code changes. Teams that skip test code reviews consistently end up with a framework that passes on the reviewer’s machine but fails in CI, contains duplicated logic, and is impossible for new team members to understand.

Tip: Track test technical debt as a sprint metric alongside product technical debt. Report flake rate, slow test count, skipped test count, and suppressed linting rules in retrospectives. Making these metrics visible creates accountability: the team sees the debt accumulating and can prioritise cleanup before it becomes a crisis. A dashboard showing “12 skipped tests, 3.5% flake rate, 8 tests over 60 seconds” is more actionable than a vague “we should clean up the tests.”

Warning: Skipped tests (@pytest.mark.skip, it.skip(), @Disabled) are the most dangerous form of test debt. They silently rot: the feature they tested may have changed, new bugs may have been introduced in their coverage area, and nobody notices because skipped tests produce no output. If a test is skipped for more than two sprints, it should be either fixed or deleted. A deleted test is honest — it tells the team there is no coverage. A skipped test is deceptive — it suggests coverage exists when it does not.

Common Mistakes

Mistake 1 — Not reviewing test code in pull requests

❌ Wrong: PR reviews focus only on production code; test code changes are approved without examination.

✅ Correct: Every test code change is reviewed against the 10-point checklist. Bad test code that passes review becomes permanent technical debt.

Mistake 2 — Treating skipped tests as “will fix later”

❌ Wrong: 30 skipped tests accumulate over 6 months — nobody remembers why they were skipped or whether they are still relevant.

✅ Correct: Every skipped test has a Jira ticket with a fix target date. If the date passes without a fix, the test is deleted and the coverage gap is documented.
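
The fix-target-date policy can be enforced in code. A sketch of a helper that refuses to honour a skip once the ticket's deadline has passed (the ticket format and policy are assumptions; in pytest this would wrap pytest.mark.skipif):

```python
from datetime import date

def skip_allowed(ticket: str, fix_by: str, today=None) -> bool:
    """True while the skip is inside its agreed window.

    Every skip carries a ticket reference and an ISO fix-by date; once
    the date passes, the test must run again (and fail loudly) or be
    deleted, rather than continuing to rot silently.
    """
    today = today or date.today()
    return today <= date.fromisoformat(fix_by)

# Within the window the skip is honoured; past the deadline it expires
assert skip_allowed("QA-123", "2099-01-01", today=date(2025, 6, 1)) is True
assert skip_allowed("QA-123", "2025-01-01", today=date(2025, 6, 1)) is False
```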

🧠 Test Yourself

A test suite has 200 tests, 15 of which are skipped with @skip. What is the correct approach?