Technology Radar
Mutation testing remains the most honest signal for evaluating the real fault-detection capability of a test suite. Unlike traditional code coverage, which only tracks line execution, this technique introduces deliberate bugs, or mutations, into source code to verify that tests fail when behavior breaks. If a mutation goes undetected, it reveals a gap in validation rather than just a lack of coverage. This distinction is critical in an era of AI-assisted development, where high coverage percentages can mask logically hollow tests or generated code that has never been meaningfully asserted.
With AI-generated test cases now commonplace, mutation testing acts as a reinforcement layer for catching "perpetually green" tests — those that pass regardless of logic changes due to missing assertions or decoupled mocks. By using tools such as Stryker, Pitest or cargo-mutants, we shift the focus from how much code is executed to how much code is actually verified, particularly in core domain logic. The goal is to ensure that a passing test suite is a reliable signal of functional correctness, rather than simply a report of which lines were executed.