Test code that is not reviewed, linted, or refactored decays just like production code — except faster, because test suites grow rapidly and rarely get the same engineering attention. An SDET treats test code as a first-class software product: it has coding standards, automated linting, mandatory code reviews, regular refactoring, and tracked technical debt. This discipline is what separates a framework that lasts years from one that is rewritten every six months.
Engineering Practices for Test Code Quality
The same practices that keep production code healthy apply to test code — with some test-specific additions.
# ── PRACTICE 1: Linting and formatting ──
LINTING_CONFIG = {
    "Python": {
        "linter": "ruff or flake8",
        "formatter": "black or ruff format",
        "config": "pyproject.toml with max-line-length=120",
        "ci": "Run on every PR; block merge on violations",
    },
    "TypeScript/JavaScript": {
        "linter": "ESLint with cypress/recommended plugin",
        "formatter": "Prettier",
        "config": ".eslintrc.json with cypress rules enabled",
        "ci": "Run on every PR; block merge on violations",
    },
}
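For the Python side, the table above translates into a short `pyproject.toml`. This is a minimal sketch; the rule selections and line length are illustrative, not a mandated house style:

```toml
[tool.ruff]
line-length = 120

[tool.ruff.lint]
# E = pycodestyle errors, F = pyflakes, I = isort, B = bugbear
select = ["E", "F", "I", "B"]

[tool.black]
line-length = 120
```

The `[tool.black]` section is only needed if you use black rather than `ruff format`; pick one formatter and enforce it in CI.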
# ── PRACTICE 2: Code review checklist for test code ──
REVIEW_CHECKLIST = [
    "Does the test name describe the expected behaviour? (test_valid_login_shows_dashboard)",
    "Does the test have a single assertion focus? (one behaviour per test)",
    "Are locators using stable selectors? (data-cy, data-testid — not fragile classes)",
    "Are there no hardcoded waits? (time.sleep / cy.wait(ms) without justification)",
    "Is test data generated uniquely? (no shared accounts, timestamps in emails)",
    "Are page objects used? (no raw find_element / cy.get in tests)",
    "Are assertions in the test, not in page objects?",
    "Is the test independent? (can run alone, in any order, in parallel)",
    "Are custom commands / page objects documented with docstrings?",
    "Is the test readable without consulting the page object code?",
]
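A test that satisfies the checklist looks roughly like the sketch below. `FakeLoginPage` and `unique_email` are illustrative stand-ins, not part of any real framework — the point is the shape: descriptive name, unique data, page object without assertions, one assertion in the test:

```python
import itertools
import time

_seq = itertools.count()

def unique_email(prefix: str = "qa") -> str:
    """Unique test data (checklist item 5): timestamp plus a counter,
    so parallel runs never collide on a shared account."""
    return f"{prefix}+{time.time_ns()}.{next(_seq)}@example.test"

class FakeLoginPage:
    """Illustrative page object: exposes actions and state, but deliberately
    contains no assertions — those belong in the test (checklist item 7)."""
    def __init__(self):
        self.heading = ""

    def login(self, email: str, password: str) -> None:
        # A real page object would drive the browser via stable
        # data-testid selectors; here we just simulate the outcome.
        self.heading = "Dashboard"

def test_valid_login_shows_dashboard():
    """Name states the expected behaviour (checklist item 1)."""
    page = FakeLoginPage()
    page.login(unique_email(), "S3cret!")
    assert page.heading == "Dashboard"  # single assertion focus (item 2)
```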
# ── PRACTICE 3: Refactoring triggers ──
REFACTORING_TRIGGERS = [
    {
        "smell": "Duplicated setup code across 5+ test files",
        "refactoring": "Extract into a custom command, fixture, or beforeEach hook",
    },
    {
        "smell": "Page object with 200+ lines",
        "refactoring": "Split into smaller page objects by page section or component",
    },
    {
        "smell": "Test file with 500+ lines",
        "refactoring": "Split into separate files by feature area or user flow",
    },
    {
        "smell": "Same locator used in 3+ page objects",
        "refactoring": "Extract into a shared component class (NavBar, Footer)",
    },
    {
        "smell": "Flaky test fixed by adding sleep/retry instead of proper wait",
        "refactoring": "Replace with explicit wait, intercept, or DOM assertion",
    },
    {
        "smell": "Test that takes 60+ seconds to execute",
        "refactoring": "Replace UI setup with API calls; split into smaller tests",
    },
]
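The first trigger — duplicated setup — is the most common. A minimal sketch of the extraction, with illustrative names (a real helper would create the user through the application's API rather than building a dict locally):

```python
import itertools

_user_ids = itertools.count(1)

def create_user(role: str = "member") -> dict:
    """Shared setup helper extracted from code duplicated across 5+ files.
    The body is a local stand-in so the sketch stays runnable; in a real
    suite it would POST to the application's user-creation endpoint.

    Expose it to tests via a fixture in conftest.py, e.g.:

        @pytest.fixture
        def admin_user():
            return create_user(role="admin")
    """
    uid = next(_user_ids)
    return {"id": uid, "email": f"user{uid}@example.test", "role": role}
```

Once the helper exists, the duplicated blocks in each test file collapse to a single fixture parameter.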
# ── PRACTICE 4: Technical debt tracking ──
TECH_DEBT_CATEGORIES = [
    {
        "category": "Flaky tests",
        "metric": "Flake rate (% of runs involving retries)",
        "target": "< 2%",
        "action": "Fix top 5 flakiest tests each sprint",
    },
    {
        "category": "Slow tests",
        "metric": "Tests exceeding 30 seconds",
        "target": "< 5% of tests",
        "action": "Optimise setup with API calls; split long tests",
    },
    {
        "category": "Skipped tests",
        "metric": "Count of @skip / @pending tests",
        "target": "< 3% of total tests",
        "action": "Fix or delete; skipped tests provide no value and rot silently",
    },
    {
        "category": "Suppressed linting rules",
        "metric": "Count of eslint-disable / noqa comments",
        "target": "Decrease each sprint",
        "action": "Fix the underlying issue rather than suppressing the warning",
    },
    {
        "category": "Outdated dependencies",
        "metric": "Cypress / Selenium / plugin versions behind latest",
        "target": "Within 2 minor versions of latest",
        "action": "Quarterly dependency update with test verification",
    },
]
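Tracking only works if the metrics are computed automatically from run results. A sketch of the computation, assuming a hypothetical per-test result record with `status`, `retries`, and `duration_s` fields (the field names are illustrative — adapt them to whatever your runner's JSON report actually emits):

```python
def debt_metrics(results: list[dict]) -> dict:
    """Compute the first three TECH_DEBT_CATEGORIES metrics as percentages.

    Each result is assumed to look like:
        {"name": "...", "status": "passed" | "failed" | "skipped",
         "retries": 0, "duration_s": 12.0}
    """
    total = len(results)
    flaky = sum(1 for r in results if r.get("retries", 0) > 0)
    slow = sum(1 for r in results if r.get("duration_s", 0) > 30)
    skipped = sum(1 for r in results if r["status"] == "skipped")
    return {
        "flake_rate_pct": 100 * flaky / total,
        "slow_pct": 100 * slow / total,
        "skipped_pct": 100 * skipped / total,
    }
```

Wire this into the nightly pipeline and fail (or at least alert) when a metric crosses its target — a dashboard nobody is forced to look at does not pay down debt.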
print("Code Review Checklist for Test Code:")
for i, item in enumerate(REVIEW_CHECKLIST, 1):
    print(f"  {i:>2}. {item}")

print("\n\nRefactoring Triggers:")
for trigger in REFACTORING_TRIGGERS:
    print(f"\n  Smell: {trigger['smell']}")
    print(f"  Fix:   {trigger['refactoring']}")

print("\n\nTechnical Debt Categories:")
for cat in TECH_DEBT_CATEGORIES:
    print(f"\n  {cat['category']} — Target: {cat['target']}")
    print(f"    Metric: {cat['metric']}")
    print(f"    Action: {cat['action']}")
Skipped tests (@pytest.mark.skip, it.skip(), @Disabled) are the most dangerous form of test debt. They rot silently: the feature they tested may have changed, new bugs may have been introduced in their coverage area, and nobody notices because skipped tests produce no output. If a test is skipped for more than two sprints, it should be either fixed or deleted. A deleted test is honest: it tells the team there is no coverage. A skipped test is deceptive: it suggests coverage exists when it does not.
Common Mistakes
Mistake 1 — Not reviewing test code in pull requests
❌ Wrong: PR reviews focus only on production code; test code changes are approved without examination.
✅ Correct: Every test code change goes through the 10-point review checklist above. Bad test code that passes review becomes permanent technical debt.
Mistake 2 — Treating skipped tests as “will fix later”
❌ Wrong: 30 skipped tests accumulate over 6 months — nobody remembers why they were skipped or whether they are still relevant.
✅ Correct: Every skipped test has a Jira ticket with a fix target date. If the date passes without a fix, the test is deleted and the coverage gap is documented.
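One way to make that policy self-enforcing is to require every skip to carry a ticket and a deadline. The helper below is an illustrative sketch, not a standard pytest feature — `skip_reason` simply builds the reason string, and flags overdue skips so they surface in reports:

```python
import datetime as dt

def skip_reason(ticket: str, fix_by: str) -> str:
    """Build a skip reason that always carries a ticket ID and a fix-by date.

    Usage (illustrative):
        @pytest.mark.skip(reason=skip_reason("QA-1234", "2025-06-30"))
    """
    deadline = dt.date.fromisoformat(fix_by)  # rejects malformed dates early
    overdue = dt.date.today() > deadline
    flag = " — OVERDUE: fix or delete" if overdue else ""
    return f"{ticket}, fix by {fix_by}{flag}"
```

Because the reason string appears in every test report, an overdue skip becomes visible on each run instead of rotting silently.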