Dane Powell

Principal Software Engineer

Dane has been building enterprise Drupal websites since 2008. He has led and enabled technical teams across a variety of industries on projects ranging in scale from one-off decoupled sites for conferences to multi-thousand tenant multisite platforms for financial services agencies. He works with customers and partners to transform business requirements into technical solutions that exceed expectations, enjoys solving hard problems, and always aims to enable customers and partners to do the same. Dane is also an active member of and regular contributor to Drupal and the larger open-source community. He has contributed and maintained several enterprise modules, contributed over 1000 patches, commits, or pull requests, and is currently a maintainer of BLT, an automation framework for Drupal. He has also presented at several Drupal camps, with a focus on configuration management and automation. Outside of Drupal, Dane is a regular contributor to open-source projects (such as Serverless and Symfony) and enjoys developing for a number of other ecosystems including Amazon Alexa, Google Home, and Android. He is the creator and maintainer of several Alexa Skills and Android apps on the Google Play store. Several of these are open source and make use of server-side components such as Node.js and MongoDB. Through his work, he has also discovered and responsibly disclosed multiple security vulnerabilities in popular enterprise applications. Prior to joining the Drupal community, Dane was a NASA Space Technology Research Fellow at Rice University and Johnson Space Center, where he researched human-computer interaction and developed haptic teleoperation interfaces for NASA’s Robonaut program. His other interests include rock climbing, traveling, and spending time outdoors.

Mutation testing in the real world

May 3, 2024
9 minute read
In my last article, I helped you set up mutation testing using the Infection library on an example project that implemented a simple card game. Of course, the real world is not so simple. When you try to implement mutation testing in your own project, you’re likely to encounter stumbling blocks and questions such as:

How do I interpret a mutation test report?
How should I handle timeouts and errors?
Why do tests that pass in PHPUnit fail in Infection?
How do I exclude test cases from mutation testing?

These issues can seem overwhelming at first, but fear not! In this post I’ll walk you through how to get mutation tests running (and running well) on a real project.

Interpreting mutation reports

I’ve updated my example Mutation Testing application to demonstrate two of the most common problems in mutation testing: errors and timeouts. Try it yourself:

pecl install pcov # If you don't already have pcov installed
git clone https://github.com/danepowell/mutation-example.git --branch errors
cd mutation-example
composer install
./vendor/bin/infection --show-mutations

The results should look something like this:

It may not be immediately obvious what each part of this report indicates, and indeed, whether it’s good or bad. Let’s break it down:

The first few lines indicate that Infection was able to generate 43 mutations. This means that Infection attempted to mutate (break) your code 43 times. Recall that Infection has a finite set of mutators it can apply to your source code, so the number of mutations generated will be a function of the size of your code base and how many mutators are applicable. This number is not a reflection of code quality.
We then see 38 mutants were killed. This means that of those 43 mutations, 38 caused tests to fail. This is a good thing! Killed mutants indicate high-quality tests. Recall that the MSI is the share of mutations that were detected. Infection counts errors and timeouts as detected alongside killed mutants, so the MSI here is (38 + 1 + 1)/43, or 93%.
3 covered mutants were not detected. This means that of 43 mutations, three did not cause tests to fail even though the mutated code was covered by tests. These escaped mutants are bad and indicate gaps in your tests. See the first post in the series for tips on how to kill mutants.
1 errors were encountered. This means that when Infection mutated your code, it didn’t just break tests, it actually caused a fatal error such as memory exhaustion via an infinite loop. See the next section for an example and tips on how to fix these errors. Errors are bad because they slow down tests, but they don’t indicate any problem with your code.
1 time outs were encountered. A timeout means that, after a mutation was applied, the tests ran longer than the configured time limit. Often this is due to a mutation causing a loop or wait condition to never terminate. See the next section on how to fix them. Timeouts are bad because they slow down tests, but they don’t indicate any problem with your code.
0 mutants required more time than configured. Remember how Infection runs each test case once before applying mutators? Part of the purpose of this check run is to measure how long each test case takes. If the runtime is longer than the configured timeout, Infection skips the test entirely to avoid wasting time on tests it is reasonably certain will time out anyway. Skipped mutants are bad because they reduce test coverage. Try to speed up these tests or increase the timeout.
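To make the arithmetic behind the score concrete, here is a minimal sketch that recomputes the MSI from the counts in the report above. The function name is hypothetical, and it assumes (as I understand Infection's scoring) that errors and timeouts are counted as detected mutants alongside kills.

```php
<?php
declare(strict_types=1);

/**
 * Mutation Score Indicator as a percentage.
 * Assumption: errors and timeouts count as detected mutants, alongside kills.
 */
function mutationScoreIndicator(int $killed, int $errors, int $timeouts, int $total): float
{
    if ($total === 0) {
        return 0.0; // nothing was mutated, so there is nothing to score
    }

    return ($killed + $errors + $timeouts) / $total * 100;
}

// Counts from the report above: 38 killed, 3 escaped, 1 error, 1 timeout.
$total = 38 + 3 + 1 + 1; // 43 mutations

printf("MSI: %.0f%%\n", mutationScoreIndicator(38, 1, 1, $total)); // prints "MSI: 93%"
```

Note that the three escaped mutants are the only ones pulling the score down here, which is why they are the ones worth your attention.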

Handling timeouts and errors

As mentioned, timeouts and errors don’t indicate a problem with the quality of your source code; they are simply a byproduct of mutation testing. However, they are still a cause for concern because over time they can degrade the performance of mutation tests by increasing the amount of required time and resources (especially memory).

For instance, consider the following mutation which generates an error:

In this case, the Increment mutator causes an infinite loop, leading to memory exhaustion. Besides wasting time and resources on a test that you know will fail, this may have knock-on effects and cause stability issues on the machine running tests, so it’s best to address these errors by disabling mutators on the affected line using comments such as /** @infection-ignore-all */.

The same mutation could result in a timeout instead of an error if the process runs out of time before it runs out of other resources:

In either case, the easiest fix is to disable mutators for that line of code.
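As a rough illustration of the kind of loop involved, here is a hypothetical helper (not taken from the example project) with the annotation in place. It assumes the annotation applies to the statement that immediately follows it, so it is worth keeping the annotated statement as small as possible.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns the card indexes 0..$size-1.
 */
function buildDeck(int $size): array
{
    $cards = [];

    // Without the annotation, the Increment mutator could turn $i++ into $i--.
    // $i would then never reach $size, and $cards would keep growing until PHP
    // runs out of memory (an error) or the time limit is hit (a timeout).
    /** @infection-ignore-all */
    for ($i = 0; $i < $size; $i++) {
        $cards[] = $i;
    }

    return $cards;
}

echo implode(',', buildDeck(3)), "\n"; // prints "0,1,2"
```

The trade-off is that mutants on the ignored statement are no longer generated at all, so use the annotation only where a mutation is known to hang or crash.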

You might be inclined to increase the Infection timeout in order to “fix” timeouts. Unless you know that your code legitimately takes longer than the timeout to function, this is likely to just exacerbate the problem and either convert timeouts to errors (i.e., memory exhaustion) or make tests take longer.

Why tests pass in PHPUnit and fail in Infection

When you add mutation testing to an existing project, the problem you’re most likely to encounter is that mutation tests fail when running the initial test suite, resulting in a rather alarming error:

Before actually running any mutation tests, Infection ensures your tests are passing by running the initial test suite. This is an important step, because otherwise failing tests would appear as caught mutants, erroneously boosting your mutation score indicator (MSI).

You might wonder why your tests pass in PHPUnit and fail in Infection. The most likely answer is that Infection runs tests in parallel and random order, whereas PHPUnit by default runs tests serially in a static order.

To ensure PHPUnit behaves like Infection, use a tool like ParaTest to run tests in parallel and configure PHPUnit to run tests in random order by updating your phpunit.xml:

<phpunit executionOrder="random">
   <!--  ...  -->
</phpunit>

The Infection documentation provides guidance on the most common root causes of test failures and some workarounds. Having fully independent test cases that can run in parallel will make your tests much faster and more robust. For instance, Acquia CLI is a great example of how to implement mutation testing on a complex application and is able to run over 350 functional tests in less than 3 seconds.
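To see why test order matters, here is a plain-PHP sketch of a hidden order dependency. It deliberately avoids PHPUnit so it can run on its own; the DiscardPile class, the two closures standing in for test methods, and the runInOrder() helper are all hypothetical.

```php
<?php
declare(strict_types=1);

// Hypothetical class with static state that leaks between tests.
final class DiscardPile
{
    /** @var list<string> */
    private static array $cards = [];

    public static function add(string $card): void
    {
        self::$cards[] = $card;
    }

    public static function count(): int
    {
        return count(self::$cards);
    }

    public static function reset(): void
    {
        self::$cards = [];
    }
}

// Stand-ins for two test methods. The second silently assumes an empty pile.
$testAddsCard = static function (): bool {
    DiscardPile::add('7H');
    return DiscardPile::count() >= 1;
};
$testStartsEmpty = static function (): bool {
    return DiscardPile::count() === 0;
};

/** Runs the tests in the given order and reports whether all of them passed. */
function runInOrder(array $tests, bool $resetBetweenTests): bool
{
    DiscardPile::reset(); // simulate a fresh process
    foreach ($tests as $test) {
        if ($resetBetweenTests) {
            DiscardPile::reset(); // what a setUp() method would do
        }
        if (!$test()) {
            return false;
        }
    }

    return true;
}

var_dump(runInOrder([$testStartsEmpty, $testAddsCard], false)); // bool(true): the static order passes
var_dump(runInOrder([$testAddsCard, $testStartsEmpty], false)); // bool(false): the shuffled order fails
var_dump(runInOrder([$testAddsCard, $testStartsEmpty], true));  // bool(true): isolated tests pass in any order
```

In a real suite, the reset would typically live in setUp() or tearDown(), so that each test case starts from a known state no matter what ran before it.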

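If you follow the best practices defined here and run PHPUnit tests yourself prior to running Infection, you can save even more time by using the --skip-initial-tests flag. Because Infection then no longer runs the suite to collect coverage itself, you also need to point it at your existing coverage reports with the --coverage option.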

Excluding tests from Infection

If you have tests that cannot be parallelized, or that otherwise are incompatible with mutation testing, it’s easy to exclude them by adding the testFrameworkOptions directive to infection.json5. For instance, a real-world configuration file that excludes test cases annotated as “serial” might look like this:

{
    "$schema": "vendor/infection/infection/resources/schema.json",
    "source": {
        "directories": [
            "src"
        ]
    },
    "logs": {
        "stryker": {
            "report": "main"
        },
        "github":  true,
        "html": "var/infection.html"
    },
    "mutators": {
        "@default": true
    },
    "timeout": 300,
    "testFrameworkOptions": "--exclude-group=serial"
}

Any value you provide for testFrameworkOptions will be passed directly to PHPUnit, so you could also use --filter or similar arguments.
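For completeness, here is a sketch of what a test case in the “serial” group might look like. The class and its lock-file scenario are hypothetical, and it assumes PHPUnit 10 or later, where groups are declared with the Group attribute; on older versions the equivalent is an @group serial docblock annotation.

```php
<?php
declare(strict_types=1);

use PHPUnit\Framework\Attributes\Group;
use PHPUnit\Framework\TestCase;

// Hypothetical test that takes an exclusive lock on a shared file,
// so it cannot safely run alongside other copies of itself.
#[Group('serial')]
final class SharedLockFileTest extends TestCase
{
    public function testAcquiresExclusiveLock(): void
    {
        $path = sys_get_temp_dir() . '/mutation-example.lock';
        $handle = fopen($path, 'c'); // create the file if it does not exist

        self::assertNotFalse($handle);
        self::assertTrue(flock($handle, LOCK_EX | LOCK_NB));

        flock($handle, LOCK_UN);
        fclose($handle);
    }
}
```

With the configuration above, Infection would pass --exclude-group=serial to PHPUnit and skip this class, while a plain PHPUnit run would still execute it.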

Now it’s your turn

Hopefully this demystifies mutation testing and you're ready to go improve your tests. If you get stuck, be sure to check the comprehensive Infection documentation, and if you're still stuck, open an issue on GitHub.