Dane Powell

Principal Software Engineer

Dane has been building enterprise Drupal websites since 2008. He has led and enabled technical teams across a variety of industries on projects ranging in scale from one-off decoupled sites for conferences to multi-thousand tenant multisite platforms for financial services agencies. He works with customers and partners to transform business requirements into technical solutions that exceed expectations, enjoys solving hard problems, and always aims to enable customers and partners to do the same. Dane is also an active member of and regular contributor to Drupal and the larger open-source community. He has contributed and maintained several enterprise modules, contributed over 1000 patches, commits, or pull requests, and is currently a maintainer of BLT, an automation framework for Drupal. He has also presented at several Drupal camps, with a focus on configuration management and automation. Outside of Drupal, Dane is a regular contributor to open-source projects (such as Serverless and Symfony) and enjoys developing for a number of other ecosystems including Amazon Alexa, Google Home, and Android. He is the creator and maintainer of several Alexa Skills and Android apps on the Google Play store. Several of these are open source and make use of server-side components such as Node.js and MongoDB. Through his work, he has also discovered and responsibly disclosed multiple security vulnerabilities in popular enterprise applications. Prior to joining the Drupal community, Dane was a NASA Space Technology Research Fellow at Rice University and Johnson Space Center, where he researched human-computer interaction and developed haptic teleoperation interfaces for NASA’s Robonaut program. His other interests include rock climbing, traveling, and spending time outdoors.

Better tests through mutation testing; or, killing mutants for fun and profit

February 27, 2023

Quis custodiet ipsos custodes (who watches the watchers?)

Automated unit tests are one of the best ways of ensuring code quality. But how do you measure and ensure the quality of your unit tests?

The coverage metric is the traditional way of measuring test quality. Coverage measures what percentage of a codebase is executed during unit tests. For instance, if your codebase has 100 lines of code and only 90 of them run during unit tests, it is 90% covered.

Coverage alone isn't enough

Coverage alone is a poor measure of test quality because it doesn’t ensure the tests actually test your code, only that they run it. Having 100% coverage guarantees only that every line ran without a fatal error under the inputs your tests supplied. But saying that your code won’t burn down, fall over, and sink into the swamp is hardly a measure of quality.

Mutation testing complements coverage by ensuring that your tests... well, actually test something! 🤯

It does this by mutating (i.e., intentionally breaking) your codebase and then checking that your tests catch the breaking change. If the change is caught, the mutant is “killed”. If the change is not caught, the mutant “escapes”. The ratio of killed mutants to total mutants, expressed as a percentage, is the mutation score indicator (MSI). Like coverage, a higher MSI is better!
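The idea can be sketched by hand in a few lines of plain PHP. This is only an illustration of the concept, not how Infection works internally: the function under test, the four mutants, and the stand-in test suite are all hypothetical.

```php
<?php
declare(strict_types=1);

// Hypothetical function under test.
$original = fn (int $age): bool => $age >= 18;

// Hand-written mutants, each a small deliberate break of the original.
$mutants = [
    '>= becomes >'  => fn (int $age): bool => $age > 18,
    '>= becomes <'  => fn (int $age): bool => $age < 18,
    '18 becomes 19' => fn (int $age): bool => $age >= 19,
    '18 becomes 17' => fn (int $age): bool => $age >= 17,
];

// Stand-in for a test suite: returns true when every check passes.
$suite = fn (callable $isAdult): bool =>
    $isAdult(30) === true && $isAdult(5) === false;

$killed = 0;
foreach ($mutants as $label => $mutant) {
    // A mutant is killed when the suite fails against it.
    if ($suite($mutant)) {
        echo "$label: escaped\n";
    } else {
        echo "$label: killed\n";
        $killed++;
    }
}

// MSI: killed mutants as a percentage of all mutants.
$msi = 100 * $killed / count($mutants);
echo "MSI: $msi%\n";
```

Only the mutant that reverses the comparison is killed here, for an MSI of 25%. Adding boundary checks for ages 17 and 18 to the stand-in suite would kill the other three.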

Mutation testing by example

https://github.com/danepowell/mutation-example

Fortunately, mutation testing is easy to implement in PHP using the Infection framework. Consider this codebase that implements the card game War. Follow along by cloning it yourself (you’ll just need PHP 8.1 and the PHP pcov extension installed):

pecl install pcov # If you don't already have pcov installed
git clone https://github.com/danepowell/mutation-example.git --branch coverage-only
cd mutation-example
composer install
./vendor/bin/phpunit --coverage-text

If you check out the coverage-only branch and run ./vendor/bin/phpunit --coverage-text, you’ll see it has 100% coverage. Sounds great, right? But look more closely at the test cases. What do you notice?

public function testAnnounceWinner($card1, $card2, $expectedWinner): void {
  $war = new War($card1, $card2);
  $this->assertStringContainsString('!', $war->announceWinner());
}

None of the test cases have meaningful assertions. Imagine what would happen if we "accidentally" flip this inequality on line 36:

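public function announceWinner(): string
{
  $card1value = self::getCardValue($this->card1);
  $card2value = self::getCardValue($this->card2);
  // Line 36 of src/War.php: the inequality we imagine flipping.
  if ($card1value > $card2value) {
    return "Player 1 wins!";
  }
  if ($card1value < $card2value) {
    return "Player 2 wins!";
  }
  return "It's a war!";
}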

Our program would return the wrong winner 100% of the time. So much for great test coverage!

Mutation testing automates exactly this thought experiment. To see it in action, run ./vendor/bin/infection --show-mutations:

$ ./vendor/bin/infection --show-mutations
...
20 mutations were generated:
       8 mutants were killed
       0 mutants were configured to be ignored
       0 mutants were not covered by tests
      12 covered mutants were not detected
       0 errors were encountered
       0 syntax errors were encountered
       0 time outs were encountered
       0 mutants required more time than configured

Metrics:
         Mutation Score Indicator (MSI): 40%
         Mutation Code Coverage: 100%
         Covered Code MSI: 40%

Notice the 40% MSI alongside 100% code coverage. This tells a fuller story of our test "quality" (such as it is). Now look at the escaped mutants.
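As a rough check, the three metrics can be reproduced from the counts in the report above. This sketch assumes errors and timeouts are zero, as they are in this run; Infection's documentation describes how those outcomes are counted when they do occur. The variable names are illustrative.

```php
<?php
declare(strict_types=1);

// Counts copied from the Infection report above.
$total      = 20; // mutations generated
$killed     = 8;  // mutants killed
$notCovered = 0;  // mutants not covered by tests

// Mutants on lines that at least one test executes.
$covered = $total - $notCovered;

$msi              = 100 * $killed / $total;   // killed out of all mutants
$mutationCoverage = 100 * $covered / $total;  // covered out of all mutants
$coveredMsi       = 100 * $killed / $covered; // killed out of covered mutants

printf("Mutation Score Indicator (MSI): %d%%\n", $msi);
printf("Mutation Code Coverage: %d%%\n", $mutationCoverage);
printf("Covered Code MSI: %d%%\n", $coveredMsi);
```

Because every mutant in this run sits on a covered line, MSI and Covered Code MSI are the same. The two diverge only when some mutants land on code that no test executes.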

10) /Users/dane.powell/src/danepowell/mutation-example/src/War.php:36    [M] GreaterThanNegotiation

--- Original
+++ New
@@ @@
     {
         $card1value = self::getCardValue($this->card1);
         $card2value = self::getCardValue($this->card2);
-        if ($card1value > $card2value) {
+        if ($card1value <= $card2value) {
             return "Player 1 wins!";
         }
         if ($card1value < $card2value) {

Mutants like this one demonstrate the exact scenario we mentioned, i.e., changing the inequality so that the wrong winner is returned. Because our tests didn't fail in response to these mutations, they count as escaped mutants.

Now for the fun part: killing those mutants! The best way to kill a mutant is to make the same change as the mutation locally, i.e., in your IDE. Run your tests to ensure they still pass after making the "breaking" change. Now add an assertion such as this (or check out the main branch, which includes all assertions):

public function testAnnounceWinner($card1, $card2, $expectedWinner): void {
  $war = new War($card1, $card2);
  $this->assertSame($expectedWinner, $war->announceWinner());
}

Run your unit tests again to ensure they fail, as expected. Then re-run the mutation tests.

$ git checkout main
$ ./vendor/bin/infection --show-mutations
20 mutations were generated:
      20 mutants were killed
       0 mutants were configured to be ignored
       0 mutants were not covered by tests
       0 covered mutants were not detected
       0 errors were encountered
       0 syntax errors were encountered
       0 time outs were encountered
       0 mutants required more time than configured

Metrics:
         Mutation Score Indicator (MSI): 100%
         Mutation Code Coverage: 100%
         Covered Code MSI: 100%

You’ll see that by simply adding a few assert statements, we've increased the MSI to 100% and, more importantly, significantly improved the quality of our tests.
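To see why the stricter assertion makes the difference, here is a minimal standalone sketch. It uses simplified stand-ins for announceWinner() that take numeric card values directly. The function names and the sample values (13 against 2) are assumptions for illustration, not code from the example repository.

```php
<?php
declare(strict_types=1);

// Simplified stand-in for the original logic.
function announceOriginal(int $v1, int $v2): string
{
    if ($v1 > $v2) {
        return "Player 1 wins!";
    }
    if ($v1 < $v2) {
        return "Player 2 wins!";
    }
    return "It's a war!";
}

// The escaped mutant from src/War.php:36, where ">" became "<=".
function announceMutant(int $v1, int $v2): string
{
    if ($v1 <= $v2) {
        return "Player 1 wins!";
    }
    if ($v1 < $v2) {
        return "Player 2 wins!";
    }
    return "It's a war!";
}

$expected = "Player 1 wins!"; // 13 beats 2

// Original assertion: only checks for an exclamation mark.
$weakOnOriginal = str_contains(announceOriginal(13, 2), '!');
$weakOnMutant   = str_contains(announceMutant(13, 2), '!');

// Stricter assertion: compares against the expected winner.
$strictOnOriginal = announceOriginal(13, 2) === $expected;
$strictOnMutant   = announceMutant(13, 2) === $expected;

var_dump($weakOnOriginal, $weakOnMutant, $strictOnOriginal, $strictOnMutant);
```

The weak check passes for both versions, so the mutant escapes. The strict check passes for the original and fails for the mutant, which is what killing a mutant means.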

Don't be discouraged if you see a large number of mutants. Remember that Infection can mutate a single line of code multiple times (once per applicable mutator); a single assertion can often kill a half dozen mutants!

Mutators

Going back to the coverage-only branch, let's look at how the mutations were generated. You'll see a "mutator" name printed to the right of each mutation, such as GreaterThan and LessThan.

11) src/War.php:40    [M] LessThan

--- Original
+++ New
@@ @@
         if ($card1value > $card2value) {
             return "Player 1 wins!";
         }
-        if ($card1value < $card2value) {
+        if ($card1value <= $card2value) {
             return "Player 2 wins!";
         }
         return "It's a war!";
     }
 }
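A boundary mutant like this one changes behavior only when the two values are equal, so only a tie case can detect it. The sketch below illustrates that with a simplified stand-in for announceWinner(). The $mutated flag, the case names, and the numeric values are hypothetical and exist only for this example.

```php
<?php
declare(strict_types=1);

// Stand-in for announceWinner(); $mutated switches "<" to "<=".
function announce(int $v1, int $v2, bool $mutated = false): string
{
    if ($v1 > $v2) {
        return "Player 1 wins!";
    }
    $player2Wins = $mutated ? $v1 <= $v2 : $v1 < $v2;
    if ($player2Wins) {
        return "Player 2 wins!";
    }
    return "It's a war!";
}

// Hypothetical test cases: [value 1, value 2, expected result].
$cases = [
    'player 1 higher' => [13, 2, "Player 1 wins!"],
    'player 2 higher' => [2, 13, "Player 2 wins!"],
    'tie'             => [7, 7, "It's a war!"],
];

// Collect the cases whose expectation fails against the mutant.
$detecting = [];
foreach ($cases as $name => [$v1, $v2, $expected]) {
    if (announce($v1, $v2, true) !== $expected) {
        $detecting[] = $name;
    }
}

echo 'Cases that kill the mutant: ', implode(', ', $detecting), "\n";
```

Only the tie case fails against the mutated comparison. A suite that checks exact winners but never tests a tie would still let this mutant escape.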

The Infection library has dozens of built-in mutators, each of which tries to break your code in a unique way. In addition to GreaterThan and LessThan, which alter comparison operators (for example, < becomes <=), you’ll see, among others, DecrementInteger and IncrementInteger mutators, which work by changing the value of any integer they find.

8) src/War.php:28    [M] IncrementInteger

--- Original
+++ New
@@ @@
             'J' => 11,
             'Q' => 12,
             'K' => 13,
-            'A' => 14,
+            'A' => 15,
         };
     }
     public function announceWinner() : string

Implementing mutation testing

Keep reading for some tips and best practices, and when you're ready, refer to the thorough Infection documentation to get started.

The best way to implement mutation testing is via continuous integration. For a brand-new codebase, implement it from the start and require a minimum MSI for all pull requests. When adding mutation testing to an existing codebase, consider running Infection with the --git-diff-lines option so that only changed lines are mutated. These practices ensure that test quality gradually improves over time without requiring a major up-front investment in mutant-killing.

If you use GitHub Actions, make sure the annotations logger (enabled by default) is working so you can see escaped mutants right in the PR!

Also consider setting up a Stryker Dashboard account to see mutation results beautifully rendered and navigable, as in this example from Acquia CLI.

Finally, keep in mind that Infection simply runs your existing PHPUnit tests under the hood. Furthermore, it runs these tests in parallel, dozens of times each, and in random order. This means that your PHPUnit tests need to be idempotent and thread-safe, and any underlying stability or performance issues will be exacerbated by mutation testing.
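One common source of trouble is shared state on disk. Here is a minimal sketch of the safer pattern in plain PHP, without PHPUnit. The recordResult() helper and the file prefix are hypothetical.

```php
<?php
declare(strict_types=1);

// Hypothetical helper under test: appends one result line to a log file.
function recordResult(string $path, string $line): void
{
    file_put_contents($path, $line . PHP_EOL, FILE_APPEND);
}

// Fragile: a fixed path is shared by every parallel process and every
// rerun, so leftover or concurrent writes change what the test reads.
// $path = '/tmp/war-results.log';

// Safer: each run creates its own file and removes it afterwards.
$path = tempnam(sys_get_temp_dir(), 'war_');
try {
    recordResult($path, 'Player 1 wins!');
    $lines = file($path, FILE_IGNORE_NEW_LINES);
} finally {
    unlink($path);
}

echo count($lines), " line(s) recorded\n";
```

In a PHPUnit test, the same setup and cleanup would typically live in setUp() and tearDown(), so every run, in any order, starts from the same state.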

Happy hunting!

Mutation testing is one of the best ways to ensure test quality, and it takes just a few minutes to set up if your tests already follow best practices. Get started and get support by referring to the Infection docs and Infection GitHub page. As a reference, also check out Acquia CLI, an open-source Acquia product which implements mutation testing on its own codebase via a GitHub Actions workflow.

Now go forth, kill some mutants, and improve your tests. Happy hunting!