In this post, I'll give a short update on the zigthesis library.
What is Shrinking?
Shrinking is the mechanism by which a property-based testing (PBT) library turns a falsifying case into a minimal one.
Let's assume we wish to test an implementation of a sort function. Unfortunately, the function is super buggy, and our PBT library returns a falsifying case along the lines of,
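{8431, -903, 762, 91, -45, 3, 57790, -2, 64, 1207}    (1)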
While correct, a better falsifying case is,
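{1, 0}    (2)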
Another example: suppose we want to test a distributive property with a typo in it, say,
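x * (y + z) = x * y + z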
This property is wrong, but instead of returning a seemingly random falsifying case such as,
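x = 1437, y = -92, z = 805    (3)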
many PBT libraries will return something of the form,
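x = 0, y = 0, z = 1    (4)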
While these examples are simple, they illustrate that a falsifying case can be too complex or indecipherable to be useful. Implementing a shrinking function is what allows a PBT library to return clearer results like (2) and (4) rather than (1) and (3).
How to Shrink?
I recently went through Edsko de Vries' blog post about Haskell's new PBT library, falsify, along with the accompanying paper. Moreover, I perused various posts by David R. MacIver on hypothesis.works.
To summarize the articles above at a very high level: generating inputs relies on pseudo-random number generators (PRNGs). Hypothesis introduced the idea that instead of shrinking generated values, one can shrink the samples produced by the PRNG. In this model, unfolding a PRNG gives us an infinite stream of samples, and once unfolded, the focus shifts to parsing that stream of samples into values. Haskell's falsify takes the same idea but posits that unfolding a PRNG gives us an infinite tree of samples. This change in the definition of an unfolding brings added benefits and hardships.
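To make the stream model concrete, here is a minimal sketch in Zig. The names (SampleStream, genList) and the decoding scheme are invented for illustration; this is not Hypothesis's actual API:

```zig
const std = @import("std");

// A generator is a parser over a buffer of raw PRNG samples.
const SampleStream = struct {
    samples: []const u64,
    pos: usize = 0,

    // Draw the next raw sample; an exhausted stream yields 0, so
    // shorter (or zeroed) streams decode to simpler values.
    fn next(self: *SampleStream) u64 {
        if (self.pos >= self.samples.len) return 0;
        defer self.pos += 1;
        return self.samples[self.pos];
    }
};

// Decode a length-3 list of integers in [0, 1000] from three raw samples.
fn genList(stream: *SampleStream) [3]u64 {
    var out: [3]u64 = undefined;
    for (&out) |*x| x.* = stream.next() % 1001;
    return out;
}

pub fn main() void {
    // Raw samples recorded by the PRNG while a failing case was found...
    var found = SampleStream{ .samples = &.{ 9203, 4076, 1030 } };
    std.debug.print("found:  {any}\n", .{genList(&found)});

    // ...shrinking then edits the *samples*; re-running the same parser
    // on smaller samples yields a simpler value automatically, with no
    // type-specific shrinker in sight.
    var shrunk = SampleStream{ .samples = &.{ 203, 0, 0 } };
    std.debug.print("shrunk: {any}\n", .{genList(&shrunk)});
}
```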
The articles cited above are well written, so I'd highly recommend giving them a look for greater understanding!
Shrinking in Zigthesis
The current zigthesis implementation of "shrinking" is a post-facto process: we do not shrink as part of the generating process. First, a falsifying test case is generated; it is then shrunk by applying basic math operations and retesting whether the new values still falsify the property. This is an extremely limiting methodology! See our current implementation of shrinking here.
To give an example of this limitation, take a test asserting that no list of length three has entries whose sum exceeds 100. Zigthesis first generates values until it finds a failing case such as {203, 76, 30}. "Shrinking" is then applied, giving us {0, 76, 30}. The simplest failing case, however, would be something like {0, 0, 101}.
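For illustration, a stripped-down version of this post-facto loop might look like the sketch below. It is not zigthesis's actual code, and property is a stand-in for the sum-at-most-100 test from the example above:

```zig
const std = @import("std");

// The property under test: the three entries must not sum past 100.
fn property(xs: [3]i64) bool {
    return xs[0] + xs[1] + xs[2] <= 100;
}

// Post-facto shrinking: given a failing case, repeatedly try simple
// arithmetic tweaks (zero an entry, halve an entry) and keep any
// tweak that still falsifies the property.
fn shrink(failing: [3]i64) [3]i64 {
    var current = failing;
    var progress = true;
    while (progress) {
        progress = false;
        for (0..current.len) |i| {
            const candidates = [_]i64{ 0, @divTrunc(current[i], 2) };
            for (candidates) |c| {
                if (c == current[i]) continue;
                var trial = current;
                trial[i] = c;
                if (!property(trial)) { // still failing: keep smaller case
                    current = trial;
                    progress = true;
                    break;
                }
            }
        }
    }
    return current;
}

pub fn main() void {
    // Reproduces the limitation above: {203, 76, 30} shrinks only to
    // {0, 76, 30}, never to a case like {0, 0, 101}, because each step
    // can only move entries toward zero independently.
    std.debug.print("{any}\n", .{shrink(.{ 203, 76, 30 })});
}
```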
Hypothesis and falsify, by contrast, shrink within the generating process:
- Hypothesis introduced the idea that we can unfold the PRNG into a stream of samples and then parse those samples to produce values.
- falsify does something similar but represents the unfolding of the PRNG as an infinite tree of samples. Thinking of it this way alleviates the burden of deciding how to distribute samples to parsers and helps with predictability (see the sketch after this list).
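As a rough sketch of that tree picture, one can represent the infinite tree lazily by giving each node a seed and deriving subtrees deterministically. The names and the splitmix64-based derivation below are invented for illustration; this is not falsify's real construction:

```zig
const std = @import("std");

// splitmix64: a standard way to derive pseudo-random values from a seed.
fn splitmix64(x: u64) u64 {
    var z = x +% 0x9e3779b97f4a7c15;
    z = (z ^ (z >> 30)) *% 0xbf58476d1ce4e5b9;
    z = (z ^ (z >> 27)) *% 0x94d049bb133111eb;
    return z ^ (z >> 31);
}

// A lazy infinite binary tree of samples: each node is just a seed.
// `sample` is the value at the node; `left`/`right` derive the subtrees.
const SampleTree = struct {
    seed: u64,

    fn sample(self: SampleTree) u64 {
        return splitmix64(self.seed);
    }
    fn left(self: SampleTree) SampleTree {
        return .{ .seed = splitmix64(self.seed ^ 1) };
    }
    fn right(self: SampleTree) SampleTree {
        return .{ .seed = splitmix64(self.seed ^ 2) };
    }
};

// A pair generator hands one subtree to each sub-generator, so neither
// competes with the other for samples; this is what makes the
// distribution of samples to parsers predictable.
fn genPair(t: SampleTree) [2]u64 {
    return .{ t.left().sample() % 1001, t.right().sample() % 1001 };
}

pub fn main() void {
    const root = SampleTree{ .seed = 42 };
    std.debug.print("{any}\n", .{genPair(root)});
}
```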
My understanding of falsify is still quite hazy. For example, I'm unsure how their process would fare if one wishes to generate, say, real numbers uniformly (i.e., unbiased) across a range. Even so, I like their take and hope to implement something like it within zigthesis soon.