Observability is the ability to understand what is happening inside a system by looking at its outputs. Logs, metrics, and traces are the core building blocks. For QA, observability turns opaque systems into explainable ones, supporting faster debugging and better test design.
Logs, Metrics, and Traces
Logs record discrete events and messages, metrics capture numeric time series such as counts and durations, and traces show how a single request flows through services. Together, they give you multiple lenses on system behaviour.
Examples of observability signals:
Logs:
- Validation error messages with user IDs removed or anonymised.
- Warnings when timeouts or retries occur.
Metrics:
- Request rate, error rate, and latency percentiles.
- Queue lengths or worker utilisation.
Traces:
- End-to-end request timelines across microservices.
- Spans showing where time is spent.
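To make the three signal types concrete, here is a minimal, stdlib-only sketch of a request handler that emits all three. The in-memory sinks (`LOG_LINES`, `METRICS`, `SPANS`) are hypothetical stand-ins for a real observability backend, and the handler itself is invented for illustration:

```python
import json
import time
import uuid

# Hypothetical in-memory sinks standing in for a real observability backend.
LOG_LINES: list[str] = []
METRICS: dict[str, float] = {"requests_total": 0, "errors_total": 0}
SPANS: list[dict] = []

def handle_request(payload: dict) -> None:
    """Simulated handler that emits a log, a metric, and a trace span."""
    trace_id = uuid.uuid4().hex
    start = time.perf_counter()
    METRICS["requests_total"] += 1  # metric: request rate
    try:
        if "user" not in payload:
            raise ValueError("missing 'user' field")
    except ValueError as exc:
        METRICS["errors_total"] += 1  # metric: error rate
        # Log: structured, machine-parseable, no raw user data included.
        LOG_LINES.append(json.dumps({
            "level": "warning", "event": "validation_error",
            "trace_id": trace_id, "detail": str(exc),
        }))
    finally:
        # Trace span: where the time for this request was spent.
        SPANS.append({
            "trace_id": trace_id, "name": "handle_request",
            "duration_ms": (time.perf_counter() - start) * 1000,
        })

handle_request({})             # triggers a validation warning
handle_request({"user": "x"})  # succeeds quietly
```

In a real system each sink would be a logging pipeline, a metrics registry, and a tracing SDK respectively, but the shape of the data is the same.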
QA engineers benefit from understanding dashboards and query tools for the observability stack in use (for example, Prometheus, Grafana, OpenTelemetry-based systems, or log search tools). This helps connect test results with runtime behaviour.
Designing for Testability and Observability
Systems are easier to test when they emit clear, structured signals. Testers can influence designs by requesting meaningful error messages, correlation IDs, and structured logs that connect user actions to backend behaviour.
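One way to realise the correlation-ID request is with a logging filter that stamps every record for a given request, so one user action can be followed across log lines. This is a sketch using Python's stdlib `logging`; the `CorrelationFilter` and `ListHandler` names, and the "checkout" logger, are invented for illustration:

```python
import json
import logging
import uuid

class CorrelationFilter(logging.Filter):
    """Stamps every log record with the current request's correlation ID."""
    def __init__(self, correlation_id: str):
        super().__init__()
        self.correlation_id = correlation_id

    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = self.correlation_id
        return True  # keep the record

captured: list[str] = []

class ListHandler(logging.Handler):
    """Serialises records as structured JSON lines into a list."""
    def emit(self, record: logging.LogRecord) -> None:
        captured.append(json.dumps({
            "msg": record.getMessage(),
            "correlation_id": record.correlation_id,
        }))

logger = logging.getLogger("checkout")
logger.setLevel(logging.INFO)
logger.addFilter(CorrelationFilter(uuid.uuid4().hex))
logger.addHandler(ListHandler())

logger.info("basket validated")
logger.info("payment authorised")
```

Because both lines carry the same correlation ID, a tester can reconstruct the whole journey of one action with a single log query.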
Common Mistakes
Mistake 1: Treating observability as a post-launch add-on
Retrofitting logs, metrics, and traces after launch is slower and more error-prone than designing them in from the start.
❌ Wrong: Adding logs only after a major incident.
✅ Correct: Plan observability alongside features and tests.
Mistake 2: Ignoring observability tools during testing
Bug reports based on UI symptoms alone lose the runtime context that would explain the failure.
❌ Wrong: Relying solely on UI symptoms when investigating failures.
✅ Correct: Combine UI observations with logs, metrics, and traces.
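As a concrete example of combining signals, a tester who has a correlation ID from a failed UI step can pull just that request's log entries and read its whole story. The log lines and the `logs_for` helper below are hypothetical illustrations:

```python
import json

# Hypothetical structured log lines captured during a failed test run.
raw_logs = [
    '{"correlation_id": "abc123", "level": "info", "msg": "request received"}',
    '{"correlation_id": "def456", "level": "info", "msg": "request received"}',
    '{"correlation_id": "abc123", "level": "warning", "msg": "upstream timeout, retrying"}',
    '{"correlation_id": "abc123", "level": "error", "msg": "retry budget exhausted"}',
]

def logs_for(correlation_id: str, lines: list[str]) -> list[dict]:
    """Return only the entries belonging to one request's journey."""
    entries = (json.loads(line) for line in lines)
    return [e for e in entries if e["correlation_id"] == correlation_id]

story = logs_for("abc123", raw_logs)
```

The filtered view shows a timeout, a retry, and an exhausted retry budget, which is far more actionable than a UI-only report of "the page failed to load".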