Production quality is what your users actually experience once software is live. Traditional testing focuses on pre-release activities, but modern teams also use production signals such as uptime, error rates, and latency to understand whether the system is "good enough." Service-level objectives (SLOs) make this explicit.
From Testing Activities to Quality Outcomes
Running many tests does not guarantee a reliable service. SLOs define target levels for key user-centric metrics, such as request success rate or page load time, over a rolling window. They connect engineering work to user expectations and business risk.
Example SLOs
- 99.9% of checkout requests succeed over 30 days.
- 99% of homepage loads complete within 1.5 seconds.
- 99.5% of API calls respond without server error.
Error budget = 1 - SLO target (e.g., 0.1% allowed failures for a 99.9% SLO).
Error budgets express how much unreliability you are willing to tolerate within a period. They help teams balance feature delivery and reliability work by providing a shared, quantitative view of risk.
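The error-budget arithmetic above can be sketched in a few lines. This is an illustrative example, not any particular monitoring tool's API; the function names and the request counts are assumptions.

```python
# Error-budget arithmetic for a success-rate SLO over a fixed window.
# All names and numbers here are illustrative assumptions.

def error_budget(slo_target: float, total_requests: int) -> int:
    """How many failed requests the budget allows in the window."""
    # Error budget = 1 - SLO target (e.g., 0.001 for a 99.9% SLO).
    return int(total_requests * (1 - slo_target))

def budget_remaining(slo_target: float, total: int, failed: int) -> float:
    """Fraction of the error budget still unspent (negative means the SLO is breached)."""
    budget = error_budget(slo_target, total)
    return (budget - failed) / budget

# With 1,000,000 checkout requests in 30 days, a 99.9% SLO allows
# 1,000 failures; 400 observed failures leaves 60% of the budget.
print(error_budget(0.999, 1_000_000))           # 1000
print(budget_remaining(0.999, 1_000_000, 400))  # 0.6
```

A team that has spent most of its budget might pause risky releases; a team with a large surplus has quantitative cover to ship faster.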
QA's Role in Production Quality
QA engineers can help define meaningful SLOs, interpret production charts, and connect incidents back to gaps in test design or environments. This shifts the role from gatekeeping to partnership in ongoing quality improvement.
Common Mistakes
Mistake 1: Treating SLOs as purely an SRE concern
Quality is cross-functional.
✗ Wrong: Assuming testers have no role once code reaches production.
✓ Correct: Use SLOs and incident data to refine tests and test environments.
Mistake 2: Defining SLOs only in technical terms
Users feel outcomes, not implementation details.
✗ Wrong: Focusing solely on CPU usage or internal queue sizes.
✓ Correct: Anchor SLOs in user-visible behaviours like success rates and latency.
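A user-anchored SLO check can be sketched directly against a sample of request records. The record fields (`status`, `latency_s`) and thresholds below are assumptions for illustration, not a specific telemetry schema.

```python
# Checking user-visible SLOs (success rate, latency) against request
# records. Field names and thresholds are illustrative assumptions.

def success_rate(records) -> float:
    """Fraction of requests that completed without a server error."""
    ok = sum(1 for r in records if r["status"] < 500)
    return ok / len(records)

def latency_slo_met(records, threshold_s: float, target: float) -> bool:
    """True if at least `target` of requests finish within `threshold_s`."""
    fast = sum(1 for r in records if r["latency_s"] <= threshold_s)
    return fast / len(records) >= target

records = [
    {"status": 200, "latency_s": 0.8},
    {"status": 200, "latency_s": 1.2},
    {"status": 503, "latency_s": 2.4},
    {"status": 200, "latency_s": 0.5},
]
print(success_rate(records))                # 0.75
print(latency_slo_met(records, 1.5, 0.99))  # False (only 3 of 4 within 1.5 s)
```

Note that both checks are expressed in terms a user would recognise (did the request succeed, how long did it take), not internal metrics like CPU or queue depth.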