Property testing in depth: genvalidity-criterion and genvalidity-* performance improvements

The genvalidity library and its companion libraries have recently received improvements to both their random distributions and their performance. This post announces genvalidity-criterion and gives an overview of the performance improvements.

In a previous blog post, I covered the size parameter for property testing generators in depth. That post covers some considerations that are important for the recent improvements to genvalidity-*, so you may want to read it first.

Generating a list

Generating a list of elements in QuickCheck looks like this:

listOf :: Gen a -> Gen [a]
listOf gen =
  sized $ \n -> do
    k <- choose (0,n)
    replicateM k gen

In other words, the size parameter is not distributed across the elements of the list: each element is generated with the full size parameter, and the same parameter is also used to choose the length of the list. This means that arbitrary :: Gen [a] will generate lists with lengths up to the size, arbitrary :: Gen [[a]] will generate lists whose total number of elements is quadratic in the size parameter, and so on.
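
To see this in action, here is a small standalone experiment (illustrative code only, not part of any library) that counts how many elements a nested list generator produces at size 100:

import Test.QuickCheck (Gen, arbitrary, generate, resize)

countNestedElements :: IO ()
countNestedElements = do
  -- The outer list gets up to 100 elements, and each inner list is
  -- generated at the full size of 100 as well, so the total number of
  -- Ints can be on the order of 100 * 100.
  xss <- generate (resize 100 (arbitrary :: Gen [[Int]]))
  print (length xss, sum (map length xss))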

In genvalidity, generating a list of elements starts with generating an arbitrary partition of the size parameter: a list of integers that sum to the size parameter. This partition is then used to distribute the total size across the elements of the list.

genListOf :: Gen a -> Gen [a]
genListOf func =
    sized $ \n -> do
        pars <- arbPartition n
        forM pars $ \i -> resize i func

The latest version of genvalidity (0.9.1.0 at the time of writing) still does this, but it generates the partition in a much smarter way. Previous versions would only generate fairly small lists, albeit with well-distributed sizes.

arbPartition :: Int -> Gen [Int]
arbPartition i = go i >>= shuffle
  where
    go k
        | k <= 0 = pure []
        | otherwise = do
            first <- choose (1, k)
            rest <- arbPartition $ k - first
            return $ first : rest

Indeed, this code makes it unlikely that a list with a length anywhere near the size parameter is generated. Most generated lengths ended up around 5.
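
You can check this with a quick standalone experiment (the code below re-implements the old partitioning locally; it is not library code):

import Test.QuickCheck (Gen, choose, generate, shuffle, vectorOf)

oldArbPartition :: Int -> Gen [Int]
oldArbPartition i = go i >>= shuffle
  where
    go k
      | k <= 0 = pure []
      | otherwise = do
          first <- choose (1, k)
          rest <- go (k - first)
          pure (first : rest)

-- Average partition length over 1000 runs at a given size.
-- The average grows only logarithmically with the size, so even for
-- large sizes the generated lists stay short.
averagePartitionLength :: Int -> IO Double
averagePartitionLength size = do
  lengths <- generate (vectorOf 1000 (length <$> oldArbPartition size))
  pure (fromIntegral (sum lengths) / 1000)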

The current version still distributes the size parameter across the elements, but it also generates lists with lengths all the way up to the size parameter.

arbPartition :: Int -> Gen [Int]
arbPartition 0 = pure []
arbPartition i = genListLengthWithSize i >>= go i
  where
    go :: Int -> Int -> Gen [Int]
    go size len = do
      us <- replicateM len $ choose (0, 1)
      let invs = map (invE 0.25) us
      -- Rescale the sizes to (approximately) sum to the given size.
      pure $ map (round . (* (fromIntegral size / sum invs))) invs

    -- Use an exponential distribution for generating the
    -- sizes in the partition.
    invE :: Double -> Double -> Double
    invE lambda u = - log (1 - u) / lambda

-- Generate a list length with the given size
genListLengthWithSize :: Int -> Gen Int
genListLengthWithSize maxLen = round . invT (fromIntegral maxLen) <$> choose (0, 1)
  where
    -- Use a triangle distribution for generating the
    -- length of the list
    -- with minimum length '0', mode length '2'
    -- and given max length.
    invT :: Double -> Double -> Double
    invT m u =
      let a = 0
          b = m
          c = 2
          fc = (c - a) / (b - a)
      in if u < fc
        then a + sqrt (u * (b - a) * (c - a) )
        else b - sqrt ((1 - u) * (b - a) * (b - c))

This code is considerably more complex, and it requires some knowledge of the inverse transform sampling trick as well as of triangle and exponential distributions. The idea is that we can generate a value from any distribution as long as we can compute the inverse of its cumulative distribution function: feed uniformly distributed values into that inverse, and out come values that follow the distribution. In this code, we use a triangle distribution to generate the length of the list, and an exponential distribution to generate the sizes of the elements in the list.
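
For the exponential distribution, for example, the cumulative distribution function is F(x) = 1 - exp(-lambda * x). Solving u = F(x) for x gives x = -log(1 - u) / lambda, which is exactly what invE computes: feed it uniformly distributed values from choose (0, 1) and it produces exponentially distributed sizes. The invT function applies the same trick to the piecewise inverse of a triangle distribution's cumulative distribution function.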

In short: list lengths are now much more likely to be close to the value of the size parameter, while the size parameter is still distributed across the elements of the list.

Benchmarking generators with genvalidity-criterion

When property test suites are slow, the problem is usually that the generators are slow. Some of the genvalidity-* generators could use improvement, so I set about speeding them up.

There is little that annoys me more about programming than developers spending time on performance. I have yet to meet a single developer who writes correct enough code that they should be allowed to care about performance. Nevertheless, sometimes performance is a feature, and in this case better performance will mean more correct code.

The first step, as usual, is to set up a feedback loop. How do I know if my performance improvements are indeed improvements? We use benchmarks. How do we benchmark property testing generators? Using genvalidity-criterion.

The new genvalidity-criterion library makes it simple to benchmark generators. It offers two important benchmark combinators:

  • genValidBench @a :: Benchmark

  • genBench :: String -> Gen a -> Benchmark

Here is an example:

{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE TypeApplications #-}

module Main where
import Criterion.Main as Criterion

import Data.GenValidity.Criterion

main :: IO ()
main =
  Criterion.defaultMain
    [ genValidBench @()
    , genValidBench @Bool
    , genValidBench @Ordering
    , genValidBench @Char
    , genValidBench @Int
    , genValidBench @Word
    ]
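
The genBench combinator lets you benchmark any generator by name. As a sketch (assuming genListOf and genValid are imported from Data.GenValidity), here is how one might compare genvalidity's size-distributing list generator with QuickCheck's own listOf:

import Criterion.Main (Benchmark)
import Data.GenValidity (genListOf, genValid)
import Data.GenValidity.Criterion (genBench)
import Test.QuickCheck (Gen, arbitrary, listOf)

-- Benchmark both list generators at the same element type.
listGenBenchmarks :: [Benchmark]
listGenBenchmarks =
  [ genBench "genListOf genValid :: Gen [Int]" (genListOf (genValid :: Gen Int))
  , genBench "listOf arbitrary :: Gen [Int]" (listOf (arbitrary :: Gen Int))
  ]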

This new little library allowed me to make, and accurately measure, the following improvements.

Speeding up generators

Speeding up genvalidity-text

Text values used to be generated using Data.Text.pack <$> genValid. That is to say, a list of characters was generated first and then packed into a Text value.

We were able to speed up text generation 3-5x by using Data.Text.unfoldrN directly. As a nice benefit, that approach works for any Gen Char:

genTextBy :: Gen Char -> Gen Text
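
Here is a sketch of the unfoldrN idea (not necessarily the exact genvalidity-text implementation): run the character generator once per position while threading the random seed, and let unfoldrN write the characters straight into the Text buffer instead of going through an intermediate String.

import Data.Text (Text)
import qualified Data.Text as T
import System.Random (split)
import Test.QuickCheck (Gen, choose, sized)
import Test.QuickCheck.Gen (Gen (..))
import Test.QuickCheck.Random (QCGen)

genTextSketch :: Gen Char -> Gen Text
genTextSketch (MkGen runChar) =
  sized $ \n -> do
    len <- choose (0, n)
    MkGen $ \qcgen size ->
      let step :: (QCGen, Int) -> Maybe (Char, (QCGen, Int))
          step (g, i)
            | i >= len = Nothing
            | otherwise =
                let (g1, g2) = split g
                 in Just (runChar g1 size, (g2, i + 1))
       in T.unfoldrN len step (qcgen, 0)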

The default GenValid instance for Char does not take the size parameter into account. (The QuickCheck Arbitrary instance, by contrast, mostly generates ASCII characters.)

Speeding up genvalidity-containers

Tree generation used to have a strong bias towards the root element being bigger than the subforest. We managed to speed up tree generation slightly while also fixing this problem: the sizes of the elements of the tree are now determined ahead of time, by generating a nonempty list first and then randomly turning it into a tree.

-- | Generate a tree of values that are generated as specified.
--
-- This takes the size parameter much better into account
genTreeOf :: Gen a -> Gen (Tree a)
genTreeOf func = do
    ne <- genNonEmptyOf func
    turnIntoTree ne
 where
    turnIntoTree :: NonEmpty a -> Gen (Tree a)
    turnIntoTree (e :| es) = do
      groups <- turnIntoGroups es
      subtrees <- mapM turnIntoTree groups
      pure (Node e subtrees)

    turnIntoGroups :: [a] -> Gen [NonEmpty a]
    turnIntoGroups = go []
      where
        go :: [a] -> [a] -> Gen [NonEmpty a]
        go acc [] =
          case NE.nonEmpty acc of
            Nothing -> pure []
            Just ne -> pure [ne]
        go acc (e:es) =
          frequency
            [ ( 1
              , do rest <- go [] es
                   pure ((e :| acc) : rest))
            , (4, go (e : acc) es)
            ]
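
To get a feel for the shape of the generated trees, you can print a few samples. This is a small sketch; the module name Data.GenValidity.Tree is my assumption of where genTreeOf is exported from.

import Data.GenValidity (genValid)
import Data.GenValidity.Tree (genTreeOf)
import Data.Tree (drawTree)
import Test.QuickCheck (Gen, generate, resize)

-- Generate one tree of Ints at size 15 and pretty-print it.
printExampleTree :: IO ()
printExampleTree = do
  tree <- generate (resize 15 (genTreeOf (genValid :: Gen Int)))
  putStrLn (drawTree (show <$> tree))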