Test-case reducers are underappreciated debugging tools (tratt.net)

151 points by ltratt 3 days ago | 9 comments

WalterBright 3 days ago
Dustmite is a fantastic tool for finding a bug in your program, by removing parts of the code until only what reproduces the bug is left.
https://dlang.org/blog/2020/04/13/dustmite-the-general-purpo...
Created by Vladimir Panteleev
mrkeen 3 days ago
> To make things even worse, the community that has most thoroughly embraced them are compiler authors, who many programmers think of as being an impossibly skilled elite
The article's approach seems super ad-hoc, leaving you to have to think hard, do all the work, and make all the mistakes.
If you were to go down the other path, you might try dividing and conquering the problem. An arbitrary Pair<A,B> is trivially constructed from an arbitrary A and an arbitrary B. So if you can generate a string, and a number, you could generate a User full of number and string fields. If your generate function accepts a number describing how complex a string to make, then you can also choose how complicated to make your User. That's all shrinking needs to be. Repeatedly trying smaller Ns while the problem still happens (the problem being one of your unit tests - not an additional "interestingness" test you need to write.)
You'll probably be way more likely to hit boundary cases by using the structure of the input and making interesting variations that way, rather than hoping you can permute the right bytes from the CLI.
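One way to read the sized-generator idea above, as a minimal sketch rather than how any particular library does it. The `User` type, the generator functions, and the failing property are all hypothetical, and real property-based testing frameworks shrink far more cleverly than simply retrying with a smaller size.

```python
import random
from dataclasses import dataclass


@dataclass
class User:
    name: str
    age: int


def gen_string(rng, size):
    # A string of at most `size` lowercase letters.
    length = rng.randint(0, size)
    return "".join(rng.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(length))


def gen_int(rng, size):
    # An integer between 0 and `size`, inclusive.
    return rng.randint(0, size)


def gen_user(rng, size):
    # A compound value is built from generators for its parts.
    return User(name=gen_string(rng, size), age=gen_int(rng, size))


def prop_name_is_short(user):
    # Hypothetical unit test: pretend names of 3+ characters hit a bug.
    return len(user.name) < 3


def find_failure(prop, size, tries=200, seed=0):
    # Generate up to `tries` users at this size; return the first that fails.
    rng = random.Random(seed)
    for _ in range(tries):
        user = gen_user(rng, size)
        if not prop(user):
            return user
    return None


def shrink_by_size(prop, start_size):
    # Keep lowering the size while a failing input can still be found.
    best = find_failure(prop, start_size)
    if best is None:
        return None
    size = start_size
    while size > 0:
        candidate = find_failure(prop, size - 1)
        if candidate is None:
            break
        best, size = candidate, size - 1
    return best


smallest = shrink_by_size(prop_name_is_short, 20)
```

Here the only "interestingness test" is the unit test itself: `shrink_by_size` stops at the smallest size for which `prop_name_is_short` can still be made to fail.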
- pfdietz 2 days ago
  For more than 20 years I've been doing automatic test input reduction as part of testing Common Lisp compilers. The reduction is on randomly generated inputs, but they are structured in such a way that reduction always gives a valid program that should (in the absence of compiler errors) not signal an error.
  It's a tremendously economical way to test compilers. For a modest and finite investment in testing infrastructure I get an unlimited number of tests. Over the years I've run many billions of test inputs on various Common Lisp implementations, although I'm mostly focusing on sbcl these days. When a bug is found the input quickly reduces to something small that usually immediately tells the developers where the problem is (usually but not always something introduced recently.)
  I also have a testing harness that cobbles together usually erroneous Lisp code and sees if the compiler blows up (the sbcl compiler as designed must never throw an error condition even on erroneous input.) This exploits a corpus of public Common Lisp code, combining and mutating the code in various ways.
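The idea of reduction that always yields a valid program can be sketched on a toy language. This is not the Common Lisp harness described above: the expression format (nested tuples of `+` and `*` over integers), the deliberately buggy evaluator, and every function name are assumptions made for illustration. Each reduction step replaces a subtree with one of its children or with a smaller literal, so every candidate is still a well-formed expression.

```python
def ref_eval(expr):
    # Reference semantics for the toy language.
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    x, y = ref_eval(left), ref_eval(right)
    return x + y if op == "+" else x * y


def buggy_eval(expr):
    # Hypothetical miscompilation: x * 2 is "optimised" to x + 2.
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    x, y = buggy_eval(left), buggy_eval(right)
    if op == "*" and y == 2:
        return x + 2
    return x + y if op == "+" else x * y


def is_interesting(expr):
    # Interesting means the two evaluators disagree.
    return ref_eval(expr) != buggy_eval(expr)


def variants(expr):
    # Yield strictly simpler expressions that are still well-formed.
    if isinstance(expr, int):
        if expr != 0:
            yield 0
        if abs(expr) > 1:
            yield 1
        return
    op, left, right = expr
    yield left   # hoist the left child
    yield right  # hoist the right child
    for smaller in variants(left):
        yield (op, smaller, right)
    for smaller in variants(right):
        yield (op, left, smaller)


def reduce_expr(expr, interesting):
    # Greedily take the first simpler variant that is still interesting.
    improved = True
    while improved:
        improved = False
        for candidate in variants(expr):
            if interesting(candidate):
                expr, improved = candidate, True
                break
    return expr


big = ("+", ("*", ("+", 3, 4), 2), ("*", 5, 6))
small = reduce_expr(big, is_interesting)
```

Because `variants` never produces a malformed tree, the interestingness test does not need to guard against syntax errors, which is the property the comment above relies on.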
skybrian 3 days ago
Property-based testing frameworks will often do test case reduction as well (called shrinking).
- Jtsummers 3 days ago
  Shrink Ray, described in the article, is developed by D.R. MacIver who also developed Hypothesis. I remember when it was announced a while back but had forgotten about it. I guess I have something to play with tonight.
  - akshayshah 3 days ago
    These days, he’s also working on Hegel - bringing test case reduction and PBT to more languages.
    https://hegel.dev
- macintux 3 days ago
  Brilliant tools, well worth investigating for any system-critical applications. They don't seem to get enough attention outside of the FP community.
hungryhobbit 3 days ago
I read the first part of this article, then gave up and Googled "Test-case Reducers".
I'm not sure if that's an article failure (that I didn't want to read a whole ton of text and C code details), or a success (as it got me interested in the topic). I guess both?
- Hnrobert42 3 days ago
  I read the whole article, and I am still confused. I get that test case reducers find the smallest error-causing string. I don't understand why that is particularly valuable.
  Also, do test case reducers work on integers or other numbers? What about reducing some other complexity? Is this for developing unit tests or just debugging?
  - kg 3 days ago
    A reduced test case means you run less code to process the test case, which means your breakpoints trigger less frequently (and the remaining breakpoint triggers are more likely to be relevant to the actual bug). It also means all your debugging steps are likely to run faster and produce less data to sort through. Your log files will be shorter and easier to read/grep, etc.
    Imagine being handed a sheet of 10 equations and being told "1 of these equations is wrong." Now imagine that someone came in and erased 8 of the correct equations - they just saved you a bunch of time.
  - seanhunter 2 days ago
    Making the smallest test case that reproduces some bug is hugely valuable when debugging complex systems, especially if you have a weird heisenbug that is hard to manifest reliably. Having a small reproducible case massively narrows the scope of the search for the bug.
    Similarly, narrowing test cases to the smallest case that reproduces a particular behaviour so you're only actually testing a very targeted thing will make the test suite faster and also make it easier to fix tests which break because they exercise a very narrow path.
  - Doxin a day ago
    Reduced test cases make it way easier to figure out what the actual bug is. E.g. a real world example is when I tested a "slugify" function with hypothesis. It almost immediately spat out this failing test case:
    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaẞ
    Boy I sure do wonder what character could possibly be causing issues there. Whereas without shrinking it might instead spit out something like
    ЁЂЃЄЅІЇЈЉЊЋЌЍЎЏАБВГДẞЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмноп
    Which makes finding the offending character a lot harder.
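To make the difference concrete, here is a minimal sketch of a greedy string reducer applied to the noisy input above. The `slugify` function and the property it violates are hypothetical stand-ins (the original function is not shown in the thread), and the reducer only deletes characters; it does not simplify them to `a` the way the Hypothesis output above appears to.

```python
def slugify(text):
    # Hypothetical stand-in: case-fold, then turn non-alphanumerics into '-'.
    return "".join(c if c.isalnum() else "-" for c in text.casefold())


def is_interesting(text):
    # Assumed failing property: a slug should never be longer than its input.
    return len(slugify(text)) > len(text)


def reduce_string(text, interesting):
    # Delete chunks of decreasing size, keeping any deletion that still fails.
    assert interesting(text)
    chunk = len(text) // 2
    while chunk >= 1:
        i = 0
        while i < len(text):
            candidate = text[:i] + text[i + chunk:]
            if interesting(candidate):
                text = candidate  # keep the deletion, retry at the same index
            else:
                i += chunk        # this chunk is needed, move on
        chunk //= 2
    return text


noisy = "ЁЂЃЄЅІЇЈЉЊЋЌЍЎЏАБВГДẞЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмноп"
reduced = reduce_string(noisy, is_interesting)
```

With this stand-in, the case-folded form of `ẞ` is two characters long, so it is the only character that makes the property fail and `reduced` ends up as the single character `ẞ`.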
- Jtsummers 3 days ago
  > I read the first part of this article, then gave up and Googled "Test-case Reducers".
  It's answered pretty early on:
  >> Test-case reducers try to reduce the length of an input
  If that still doesn't answer the question, try this extension:
  >> Test-case reducers try to reduce the length of an [error causing or interesting] input
- chriswarbo 2 days ago
  > I didn't want to read a whole ton of text and C code details
  There's no C in there? It seems to be Python and shell scripts.
nnunley 3 days ago
I have a similar tool to shrink ray, called [bonsai](https://github.com/nnunley/bonsai). I designed it to let me inline and reduce code, both for simplifying single-file examples and for working across multiple files. It uses Tree-Sitter for syntax awareness, and the [Perses algorithm](https://doi.org/10.1109/ICSE.2018.00046) as the methodology for simplification.
I'd love to get some feedback if anyone's interested.
- ltratt 2 days ago
  This looks interesting, and definitely useful for non-C/Python languages, which existing reducers I know of mostly don't have explicit support for! I can't get it to build though (I've filed an issue).
  I was also wondering: is there a UI while reduction happens? I've found Shrink Ray's UI improvements in the last year to be much more useful than I first expected: not just because it gives me something to look at, but because it really helps me understand if reduction is on the right path or not. [Some of the new Shrink Ray extras like being able to rewind reduction to a past point and to skip passes are also really useful.]
- anitil 2 days ago
  What a great idea to include tree sitter
anitil 2 days ago
> Unfortunately, Shrink Ray has no principled way for me to express this. Fortunately, I have no principles, and use unsafe hacks like this
I really appreciated the humour in this article. I've always wanted to use creduce, but it looks like Shrink Ray is easier to set up and get running (pip install, set up a harness, run)
sigbottle 3 days ago
I've only ever known about these through compilers, very cool.
On one project, through a variety of circumstances, dead code elimination was straight up not working, and we couldn't figure out why at the time. We still wanted to show the theoretical improvement of some approach (we did spend a whole week chasing down the root cause afterwards - maybe worth it in hindsight...).
We were doing it by hand at one point, but someone suggested using CReduce for shrinking the code. Definitely was an interesting test-iterate loop...
bobbiechen 3 days ago
Nice share. Increasingly I am thinking about ways to improve verification ("interestingness tests"), ever since reading https://www.jasonwei.net/blog/asymmetry-of-verification-and-...