This post announces
genvalidity-0.10.0.0 and recently released companion libraries.
This post explores the changes to the generators of
QuickCheck approach to generating values of type
Word uses the
-- | Generates an integral number. The number can be positive or negative -- and its maximum absolute value depends on the size parameter. arbitrarySizedIntegral :: Integral a => Gen a arbitrarySizedIntegral = sized $ \n -> inBounds fromInteger (choose (-toInteger n, toInteger n))
Values of types
Word64 are generated using the
(You don't really need to understand how this works.)
-- | Generates an integral number from a bounded domain. The number is -- chosen from the entire range of the type, but small numbers are -- generated more often than big numbers. Inspired by demands from -- Phil Wadler. arbitrarySizedBoundedIntegral :: (Bounded a, Integral a) => Gen a arbitrarySizedBoundedIntegral = withBounds $ \mn mx -> sized $ \s -> do let bits n | n == 0 = 0 | otherwise = 1 + bits (n `quot` 2) k = (toInteger s*(bits mn `max` bits mx `max` 40) `div` 80) -- computes x `min` (2^k), but avoids computing 2^k -- if it is too large x `minexp` k | bits x < k = x | otherwise = x `min` (2^k) -- x `max` (-2^k) x `maxexpneg` k = -((-x) `minexp` k) n <- choose (toInteger mn `maxexpneg` k, toInteger mx `minexp` k) return (fromInteger n)
All that matters from the above piece of code is that you realise that both of these generators mostly only generate very small values. Observe:
λ> sample (arbitrary :: Gen Word) 0 2 2 6 8 8 1 5 10 7 8
λ> sample (arbitrary :: Gen Word64) 1 1 5 10 10 193 405 1536 172 13414 7990
In particular, values around
maxBound are never tried.
As a result, the following property just passes:
λ> quickCheck $ \n -> abs (n :: Int) >= 0 +++ OK, passed 100 tests.
This is a clear example of a bad generator that hides a real failure.
Indeed, a better generator may have generated
minBoud :: Int to find the following failure:
λ> minBound :: Int -9223372036854775808 λ> abs (minBound :: Int) -9223372036854775808 λ> abs (minBound :: Int) >= 0 False
genUnchecked Int instance uses a more sophisticated approach.
It does not distinguish between
Int64, and uses the following helper function:
-- | Generate Int, Int8, Int16, Int32 and Int64 values smartly. -- -- * Some at the border -- * Some around zero -- * Mostly uniformly genIntX :: forall a. (Integral a, Bounded a, Random a) => Gen a genIntX = frequency [ (1, extreme) , (1, small) , (8, uniform) ] where extreme :: Gen a extreme = sized $ \s -> oneof [ choose (maxBound - fromIntegral s, maxBound) , choose (minBound, minBound + fromIntegral s) ] small :: Gen a small = sized $ \s -> choose (- fromIntegral s, fromIntegral s) uniform :: Gen a uniform = choose (minBound, maxBound)
This generator generates small values, just like
QuickCheck does, 10% of the time.
It generates extreme values, around the bounds, 10% of the time.
And finally, it generates uniformly distributed values the rest of the time.
This generator does not hide the failure from before, indeed it finds the
λ> quickCheck $ forAll genUnchecked $ \n -> abs (n :: Int) >= 0 *** Failed! Falsified (after 2 tests): -9223372036854775808
Word values, the new
genvalidity generators use a similar approach, but obviously do not try to generate negative values.
Generating floating point values has been controversial already.
To generate a floating point number,
QuickCheck uses the
-- | Generates a fractional number. The number can be positive or negative -- and its maximum absolute value depends on the size parameter. arbitrarySizedFractional :: Fractional a => Gen a arbitrarySizedFractional = sized $ \n -> let n' = toInteger n in do b <- choose (1, precision) a <- choose ((-n') * b, n' * b) return (fromRational (a % b)) where precision = 9999999999999 :: Integer
The biggest problem with this generator is that it mostly only generates small values. As a result, we get the following false negative:
λ> quickCheck $ \f -> (f :: Double) < 200 +++ OK, passed 100 tests.
The next big problem is that this generator never generates denormalised values. Hence the following false negative (with a counterexample):
λ> quickCheck $ \f -> f == (f :: Double) +++ OK, passed 100 tests. λ> let inf = read "NaN" :: Double in inf == inf False
The last problem is that this generator does not generate values around the bounds of
This poses the following false negative:
λ> quickCheck $ \f -> length (show (round (f :: Double) :: Integer)) < 100 +++ OK, passed 100 tests. λ> Prelude> length (show (round (1E200))) 200
Granted, this last one does not sound like a big problem, but a problem like this has already occurred in practice in a central package in the ecosystem.
genvalidity, the generators are implemented as
genFloatX castWord32ToFloat for
genFloatX castWord64ToDouble for
-- | Generate floating point numbers smartly: -- -- * Some denormalised -- * Some around zero -- * Some around the bounds -- * Some by encoding an Integer and an Int to a floating point number. -- * Some accross the entire range -- * Mostly uniformly via the bitrepresentation -- -- The function parameter is to go from the bitrepresentation to the floating point value. genFloatX :: forall a w. (Read a, RealFloat a, Bounded w, Random w) => (w -> a) -> Gen a genFloatX func = frequency [ (1, denormalised) , (1, small) , (1, aroundBounds) , (1, uniformViaEncoding) , (6, reallyUniform) ] where denormalised :: Gen a denormalised = elements [ read "NaN" , read "Infinity" , read "-Infinity" , read "-0" ] -- This is what Quickcheck does, -- but inlined so QuickCheck cannot change -- it behind the scenes in the future. small :: Gen a small = sized $ \n -> do let n' = toInteger n let precision = 9999999999999 :: Integer b <- choose (1, precision) a <- choose ((-n') * b, n' * b) pure (fromRational (a % b)) upperSignificand :: Integer upperSignificand = floatRadix (0.0 :: a) ^ floatDigits (0.0 :: a) lowerSignificand :: Integer lowerSignificand = - upperSignificand -- Floating point numbers are symmetric (lowerExponent, upperExponent) = floatRange (0.0 :: a) aroundBounds :: Gen a aroundBounds = do s <- sized $ \n -> oneof [ choose (lowerSignificand, lowerSignificand + fromIntegral n) , choose (upperSignificand - fromIntegral n, upperSignificand) ] e <- sized $ \n -> oneof [ choose (lowerExponent, lowerExponent + n) , choose (upperExponent - n, upperExponent) ] pure $ encodeFloat s e uniformViaEncoding :: Gen a uniformViaEncoding = do s <- choose (lowerSignificand, upperSignificand) e <- choose $ floatRange (0.0 :: a) pure $ encodeFloat s e -- Not really uniform, but good enough reallyUniform :: Gen a reallyUniform = func <$> choose (minBound, maxBound)
This generator generates some denormalised values, some small values (like QuickCheck does), some values around the boundaries of the type, some values via the encoding (no denormalised values), but most values in a truly uniform manner.
You may wonder what the point is of the truly uniform generator if there is already the
uniformViaEncoding generator will not generate different
Infinity values, while the
realyUniform generator certainly will generate different
I am pleased to say that this new generator has turned all the above false negatives into true positives:
λ> quickCheck $ forAllUnchecked $ \f -> (f :: Double) < 200 *** Failed! Falsified (after 1 test and 867 shrinks): 6.953710695e12 λ> quickCheck $ forAllUnchecked $ \f -> (f :: Double) == f *** Failed! Falsified (after 2 tests): NaN λ> quickCheck $ forAllUnchecked $ \f -> length (show (round (f :: Double) :: Integer)) < 100 *** Failed! Falsified (after 2 tests and 458 shrinks): 1.0005000767033805e99
Here is a sample of what is being generated:
λ> sample (genUnchecked :: Gen Double) NaN -1.5382021342352537e-256 -1.4064318498034637e-185 -1.8756650198989648 2.3970675652314379e297 -Infinity -0.0 1.5822616745384519e58 0.0 -4.436269974742419 Infinity
Natural values in
QuickCheck also uses the
That means that these generators have the same problems as discussed above, but most importantly:
They never generate the values that these types were made to contain.
If your application uses
Natural values, then I will argue that that means you expect to be dealing with values that are too big (or too small) for
arbitrary generator will never generate those:
λ> sample (arbitrary :: Gen Integer) 0 -2 0 -4 3 0 8 -3 -15 -6 -2
genvalidity generator treats
Integer like more of a list-like type:
genInteger :: Gen Integer genInteger = sized $ \s -> oneof $ (if s >= 10 then (genBiggerInteger :) else id) [ genIntSizedInteger , small ] where small = sized $ \s -> choose (- toInteger s, toInteger s) genIntSizedInteger = toInteger <$> (genIntX :: Gen Int) genBiggerInteger = sized $ \s ->do (a, b, c) <- genSplit3 s ai <- resize a genIntSizedInteger bi <- resize b genInteger ci <- resize c genIntSizedInteger pure $ ai * bi + ci
It generates small integers,
Int-sized integers and bigger integers:
λ> sample (genUnchecked :: Gen Integer) 0 1892136562016313391 -3 3 8567266908859058210 78953779483980475410579428557318571161 -8861699909278588009 -366627378773369755 8322693498186903360 -6 -12194936973173690077167751826204497798
Real world impact
If you are already using validity-based testing (kudos to you), then you should expect some more test failures when you upgrade to the latest version.
This upgrade has already found bugs in validity itself, in smos, in mergeless, in mergeful, in persistent, in intray and in tickler. This is essentially every project I tried it out on before releasing and one more (persistent).