Announcing Dekking: Next generation code coverage reports for Haskell

This post announces Dekking, a next-generation code coverage tool for Haskell. In the short term, Dekking unblocked me by letting me easily make multi-package code coverage reports where I had previously failed to do so. In the medium-to-long term, Dekking intends to replace HPC.

Why not "just" use HPC?

Multi-package code coverage reports

Most of my projects consist of multiple packages. The smos repository, for example, consists of more than 20 Haskell packages. Among those are:

  • Packages with only a library (smos-data)

  • Packages with a library and test suite (smos)

  • Packages with a library that exists only for testing (smos-data-gen)

  • Packages with a test suite for other packages (smos-data-gen tests smos-data)

To make a good code coverage report, I need to correctly:

  • Include smos-data in the coverables, even though it has no test suite to produce coverage.

  • Include both the coverables and the coverage for smos. This also produces coverage for smos-data that needs to be included as well.

  • Not include the library of smos-data-gen in the coverables, because its library exists only for testing.

  • Include the coverage information that smos-data-gen's test suite produces. This should also produce coverage information about smos-data, of course.
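
The requirements above boil down to two independent questions per package: does its library contribute coverables, and do its test suites contribute coverage? Here is a minimal Haskell sketch of that bookkeeping. The `Package` and `Role` types and all function names are hypothetical, made up for illustration; they are not taken from HPC, Stack, or Dekking.

```haskell
-- Hypothetical model, for illustration only.
data Role
  = UnderTest   -- library code we want to see covered
  | TestSupport -- library that exists only for testing
  deriving (Show, Eq)

data Package = Package
  { packageName  :: String
  , packageRole  :: Role
  , hasTestSuite :: Bool
  }
  deriving (Show, Eq)

-- Libraries under test contribute coverables, whether or not they have tests.
contributesCoverables :: Package -> Bool
contributesCoverables p = packageRole p == UnderTest

-- Any package with a test suite contributes coverage, whatever its role.
contributesCoverage :: Package -> Bool
contributesCoverage = hasTestSuite

-- The three example packages from the list above.
examplePackages :: [Package]
examplePackages =
  [ Package "smos-data" UnderTest False
  , Package "smos" UnderTest True
  , Package "smos-data-gen" TestSupport True
  ]

coverableSources :: [Package] -> [String]
coverableSources = map packageName . filter contributesCoverables

coverageSources :: [Package] -> [String]
coverageSources = map packageName . filter contributesCoverage

main :: IO ()
main = do
  print (coverableSources examplePackages) -- ["smos-data","smos"]
  print (coverageSources examplePackages)  -- ["smos","smos-data-gen"]
```

Note that smos-data shows up only as a source of coverables and smos-data-gen only as a source of coverage, which is exactly the asymmetry that makes these reports awkward to assemble by hand.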

I have read most of HPC's code and documentation, and I could not figure out how to make multi-package code coverage reports from Nix. (It's literally taken me less time to build a new tool than I've already wasted trying to use HPC for this.) Over the years I have tried to hack together multiple solutions but they have all turned out very messy.

There is only one tool I know about that can do this: Stack. However, calling Stack from Nix is a mess. So, as part of the "feature parity with Stack" issue, I figured we could address code coverage using a new tool.

Machine-readable output

HPC uses .tix, .mix, and .pix files. These are all custom formats that are only really readable using the Haskell library for hpc. This is not a problem for most uses of code coverage of course, but it makes hpc quite difficult to integrate with.

More machine-readable formats are required for advanced use-cases like custom displays of reports and mutation testing.

Confusing colours

HPC uses the following colour-coding to display coverage information:

  • Red: uncovered (Always false)

  • Green: uncovered (Always true)

  • Yellow: uncovered

  • Clear: covered OR un-coverable

This makes for the rather confusing situation in which any colour means uncovered, but some non-colour means covered.

[Figure: The colour coding that HPC uses]

In its report indexes, it also uses a different colour-coding:

  • Red: Uncovered

  • Green: Covered

This makes the meaning of red in the module reports even more confusing.

[Figure: A HPC coverage report index]

Unfixed Bugs

HPC has some unfortunate (minor) unfixed bugs:

Confusing coverables

HPC has support for three types of coverables:

  • Expressions

  • Top-level bindings

  • Alternatives

[Figure: Categories of HPC coverables]

Expression coverage is great. Let's keep that.

However, top-level binding coverage clutters the report while adding little actionable information. Top-level bindings are not magically more important for coverage in my view. They are also subsumed by expression coverage because top-level bindings consist of expressions as well. Alternatives suffer from the same issues.

Coverage report performance

Firefox can slow down considerably on big modules because hpc renders its module reports as very deeply nested HTML, for reasons unknown to me:

[Figure: Deeply nested HTML in a hpc coverage report]

Why use Dekking?

Decoupling from GHC

It would be great to be able to rip code coverage out of GHC, so that the GHC team has less code to maintain.

To this end, dekking is a post-parse source-to-source-transformation plugin. It transforms the code by replacing every coverable expression e by something like unsafePerformIO (markAsCovered "identifier for e" >> pure e). This turns a well-known pitfall of unsafePerformIO into a feature: the IO action only runs the first time the wrapped expression is evaluated, which keeps coverage reporting fast.

This way, code coverage does not have to be built into GHC, as long as this transformation is sound. (Foreshadowing: it isn't in general, see the known issues below.)
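
To make the idea concrete, here is a minimal, self-contained sketch of such a wrapper. It is not Dekking's actual implementation: the global `coverageLog`, the helper name `withCoverage`, and the identifier string are assumptions for illustration, and only `markAsCovered` echoes the name used above.

```haskell
import Control.Exception (evaluate)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical global coverage log; Dekking's real bookkeeping may differ.
coverageLog :: IORef [String]
coverageLog = unsafePerformIO (newIORef [])
{-# NOINLINE coverageLog #-}

-- Record that the expression with this identifier was evaluated.
markAsCovered :: String -> IO ()
markAsCovered ident = modifyIORef' coverageLog (ident :)

-- The wrapper from the text: the marking happens when the wrapped
-- expression is first demanded; after that the thunk is already a value,
-- so the IO action does not run again.
withCoverage :: String -> a -> a
withCoverage ident e = unsafePerformIO (markAsCovered ident >> pure e)
{-# NOINLINE withCoverage #-}

-- Uses one instrumented expression twice and returns the result together
-- with everything that was logged since the log was reset.
demo :: Int -> IO (Int, [String])
demo n = do
  writeIORef coverageLog []
  let x = withCoverage "n * 2" (n * 2) -- one shared thunk
  result <- evaluate (x + x)
  logged <- readIORef coverageLog
  pure (result, logged)

main :: IO ()
main = demo 21 >>= print
```

Although `x` is used twice, it is a single shared thunk, so the log ends up with one entry rather than two. The `NOINLINE` pragmas are the usual precaution for this pattern: they keep GHC from inlining and thereby duplicating the `unsafePerformIO` calls.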

See the relevant section of the README for more information.

Multi-package code coverage

Dekking is built from the ground up with multi-package code coverage in mind. It allows you to view all coverage summaries in the index, summaries per package in the per-package index, and per-module coverage in detailed per-module reports.

[Figure: A Dekking coverage report index]
[Figure: A HPC coverage report module]

Nix integration

Nix integration is an important deliverable for Dekking.

Here is the code that produces Smos' code coverage report:

coverage-report = pkgs.dekking.makeCoverageReport {
  name = "test-coverage-report";
  packages = [
    "smos"
    "smos-api"
    "smos-archive"
    "smos-calendar-import"
    "smos-client"
    "smos-cursor"
    "smos-data"
    "smos-github"
    "smos-notify"
    "smos-query"
    "smos-report"
    "smos-report-cursor"
    "smos-scheduler"
    "smos-server"
    "smos-single"
    # "smos-stripe-client" # No need for coverage for generated code
    "smos-sync-client"
    "smos-web-server"
    "smos-web-style"
  ];
  # No need for coverables for test packages
  coverage = [
    "smos-api-gen"
    "smos-cursor-gen"
    "smos-data-gen"
    "smos-report-cursor-gen"
    "smos-report-gen"
    "smos-server-gen"
    "smos-sync-client-gen"
    # Coverage for docs site is not interesting, but it runs parts of the rest
    "smos-docs-site"
  ];
};

Fitting colours

Dekking uses the following colour coding:

  • Clear: Uncoverable

  • Yellow: Uncovered

  • Green: Covered

I didn't use red to represent uncovered because not covering an expression is not necessarily a bad thing. I am still debating replacing green by blue or so, to also acknowledge that covering an expression isn't necessarily good either.

Only expression coverage

Dekking only uses expression coverables.

There is no distinction between top-level binding coverage, alternative coverage, and expression coverage. There is only coverage: An expression is either uncoverable, uncovered, or covered.
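
As a rough sketch of what this three-way classification amounts to, consider the following Haskell snippet. The `Location` and `Status` types, the function names, and the sample data are all hypothetical and only loosely modelled on a small example module; Dekking's internal representation may well look different.

```haskell
import Data.List (nub)

-- Hypothetical location type: (line, start column, end column).
type Location = (Int, Int, Int)

data Status = Uncoverable | Uncovered | Covered
  deriving (Show, Eq)

-- coverables: every location that was instrumented
-- covered:    every location that was reported as evaluated
status :: [Location] -> [Location] -> Location -> Status
status coverables covered loc
  | loc `notElem` coverables = Uncoverable
  | loc `elem` covered = Covered
  | otherwise = Uncovered

-- Number of covered coverables and total number of coverables.
summary :: [Location] -> [Location] -> (Int, Int)
summary coverables covered =
  let total = nub coverables
      hit = filter (`elem` covered) total
   in (length hit, length total)

main :: IO ()
main = do
  -- Made-up sample data: two coverables on line 7, two on line 10.
  let coverables = [(7, 11, 15), (7, 16, 18), (10, 13, 17), (10, 18, 20)]
      covered = [(7, 11, 15), (7, 16, 18)]
  print (status coverables covered (7, 11, 15))  -- Covered
  print (status coverables covered (10, 13, 17)) -- Uncovered
  print (status coverables covered (1, 1, 7))    -- Uncoverable
  print (summary coverables covered)             -- (2,4)
```

With only one kind of coverable, a summary is a single pair of numbers rather than three separate percentages.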

Machine-readable output and reports

All dekking-related files are machine readable.

  • The .coverables files are json files.

    {
        "coverables": [
            {
                "location": {
                    "end": 18,
                    "line": 7,
                    "start": 16
                },
                "value": "()"
            },
            [...]
            {
                "location": {
                    "end": 17,
                    "line": 10,
                    "start": 13
                },
                "value": "pure"
            }
        ],
        "module-name": "Lib",
        "package-name": "example-0.0.0",
        "source": "module Lib\n  ( covered,\n  )\nwhere\n\ncovered :: IO ()\ncovered = pure ()\n\nuncovered :: IO ()\nuncovered = pure ()\n"
    }
  • The .coverage files are text files where each line represents a unit of coverage:

    main Main 4 8 15
    [...]
    example-0.0.0 Lib 7 11 15
    
  • The coverage reports are first and foremost a JSON file: report.json.
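
To illustrate how little is needed to consume these files, here is a sketch of a parser for `.coverage` lines using only `base`. The reading of the five fields as package, module, line, start column, and end column is an assumption inferred from the samples above, and the `CoverageEntry` type and function names are hypothetical rather than part of Dekking's API.

```haskell
import Text.Read (readMaybe)

-- Hypothetical record; the field meanings are inferred from the sample lines.
data CoverageEntry = CoverageEntry
  { entryPackage :: String
  , entryModule  :: String
  , entryLine    :: Int
  , entryStart   :: Int
  , entryEnd     :: Int
  }
  deriving (Show, Eq)

-- Parse one line of the form "<package> <module> <line> <start> <end>".
parseCoverageLine :: String -> Maybe CoverageEntry
parseCoverageLine l = case words l of
  [pkg, modName, ln, start, end] ->
    CoverageEntry pkg modName
      <$> readMaybe ln
      <*> readMaybe start
      <*> readMaybe end
  _ -> Nothing

-- Parse a whole file, skipping blank lines and reporting the first bad line.
parseCoverage :: String -> Either String [CoverageEntry]
parseCoverage =
  traverse parseOne . filter (not . null . snd) . zip [1 :: Int ..] . lines
  where
    parseOne (i, l) =
      maybe
        (Left ("line " ++ show i ++ ": cannot parse " ++ show l))
        Right
        (parseCoverageLine l)

main :: IO ()
main = print (parseCoverage "main Main 4 8 15\nexample-0.0.0 Lib 7 11 15\n")
```

Returning `Either` instead of silently dropping malformed lines is a deliberate choice here: when coverage data feeds into other tooling, a truncated file should be an error rather than a quietly lower coverage number.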

Known issues

It turns out the code transformation mentioned in the README is not actually sound. There are expressions e which no longer type-check when you turn them into adaptValue "constant string" e, or even id e.

In the face of RankNTypes, GHC can sometimes no longer type-check an expression. Consider for example a function f with the following type:

f :: Int -> (forall a. a -> a)

GHC needs ImpredicativeTypes to be turned on in order to type-check id f.

But there are some expressions that don't even type-check with ImpredicativeTypes. Unfortunately, Servant's hoistServer is one of them, as is Yesod's loginErrorMessageI.

I have tried (and failed) to turn Dekking into a type-checking plugin in order to try to pinky-promise to GHC that this transformation will type-check. In the meantime, I have also implemented ways to turn off coverable generation selectively:

1. With an `--exception` for the plugin: `-fplugin-opt=Dekking.Plugin:--exception=My.Module`
2. With a module-level annotation: `{-# ANN module "NOCOVER" #-}`
3. With a function-level annotation: `{-# ANN hoistServerWithContext "NOCOVER" #-}`

Conclusion

Dekking is ready to try out! I already have code coverage reports for all my projects now, and they're being built in CI where that was previously impossible for me.
