This is a beautiful articulation of a major pet peeve when using these coding tools. One of my first review steps is just looking for all the extra optional arguments it's added instead of designing something good.
To solve this permanently, use a linter and apply a "ratchet" in CI so that the LLM can't get away with adding ignore comments.
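Concretely, the ratchet can be as simple as a script like this in CI (a minimal sketch; the baseline file name and suppression patterns are placeholders for whatever linter you actually use):

    #!/usr/bin/env python3
    """CI ratchet: fail the build if the count of lint-suppression comments grows."""
    import pathlib
    import re
    import sys

    # Placeholder patterns; swap in your linter's suppression syntax.
    SUPPRESSION = re.compile(r"#\s*(noqa|type:\s*ignore|pylint:\s*disable)")

    def count_suppressions(root: str = ".") -> int:
        total = 0
        for path in pathlib.Path(root).rglob("*.py"):
            text = path.read_text(errors="ignore")
            total += sum(1 for line in text.splitlines() if SUPPRESSION.search(line))
        return total

    def main() -> int:
        # Committed baseline count (hypothetical file name).
        baseline = int(pathlib.Path(".lint_ratchet").read_text().strip())
        current = count_suppressions()
        if current > baseline:
            print(f"Ratchet violated: {current} suppressions, baseline is {baseline}")
            return 1
        if current < baseline:
            print(f"Ratchet can be tightened: lower the baseline to {current}")
        return 0

    if __name__ == "__main__":
        sys.exit(main())

New suppressions fail the build, so the agent is forced to actually fix the warning instead of silencing it.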
I will often go back, after the fact, and ask for refactors and documentation.
It works. It's probably a lot slower than using agents, but I test every step, and it is a lot faster than I would do it unassisted.
I use a test harness, step through the code, look at debug logs, and abuse the code as much as possible.
Kind of a pain, but I find unit tests are a bit of a "false hope" kind of thing: https://littlegreenviper.com/testing-harness-vs-unit/
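Roughly the shape I mean by a harness: a standalone driver you can run under a debugger against hostile inputs, with debug logging turned all the way up. Everything below is made up for illustration.

    """Toy harness driver: exercise the code end to end and feed it abuse."""
    import logging

    logging.basicConfig(level=logging.DEBUG)
    log = logging.getLogger("harness")

    def parse_document(text: str) -> dict:
        """Stand-in for whatever module you're actually exercising."""
        return {"length": len(text), "lines": text.count("\n") + 1}

    ABUSE_CASES = [
        "",                        # empty input
        "a" * 1_000_000,           # absurdly large input
        "\x00\xff garbage \n\n",   # binary junk
        "perfectly normal input",  # the happy path, for contrast
    ]

    if __name__ == "__main__":
        for case in ABUSE_CASES:
            log.debug("feeding %r ...", case[:40])
            try:
                log.debug("result: %r", parse_document(case))  # set breakpoints and step in here
            except Exception:
                log.exception("blew up on this case")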
https://testing.googleblog.com/2025/10/simplify-your-code-fu...
The key insight of FCIS (functional core, imperative shell) is that complicated logic entangled with heavyweight dependencies leads to a large test suite that runs slowly. The solution is to isolate the complicated logic in a pure functional core and test it separately; the imperative shell gets simpler, more sequential tests.
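A toy sketch of that split (names and numbers are mine, not from the linked post):

    from dataclasses import dataclass

    # --- Functional core: pure, no I/O, cheap to test exhaustively ---
    @dataclass(frozen=True)
    class LineItem:
        price_cents: int
        quantity: int

    def order_total(items: list[LineItem], discount_pct: int = 0) -> int:
        """All the fiddly pricing rules live here, as a pure function."""
        subtotal = sum(i.price_cents * i.quantity for i in items)
        return subtotal - (subtotal * discount_pct) // 100

    # --- Imperative shell: thin and sequential, covered by a few slower tests ---
    def handle_checkout(order_id: str, db, payments) -> None:
        items = db.load_items(order_id)                         # I/O in
        total = order_total(items, db.discount_pct(order_id))   # pure core
        payments.charge(order_id, total)                        # I/O out

The core gets lots of fast unit tests against order_total; the shell only needs a couple of wiring tests with fakes standing in for db and payments.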
For instance, two things I'm currently working on:
- A reasonably complicated indie game project I've been doing solo for four years.
- A basic web API exposing data from a legacy database for work.
I can see how the API could be developed mostly by agents - it's a pretty cookie-cutter affair and my main value in the equation is just my knowledge of the legacy database in question.
But for the game... man, there's a lot of stuff in there that's very particular when it comes to performance and the logic flow. An example: entities interacting with each other. You have to worry about stuff like the ordering of events within a frame, what assumptions each entity can make about the other's state, when and how they talk to each other given there's job-based multi-threading, and a lot of performance constraints to boot (thousands of active entities at once). And that's just a small example from a much bigger iceberg.
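To give a flavor of just the frame-ordering part, here's a toy sketch (nothing from the actual project): interactions get queued during the parallel update and applied at one deterministic sync point, so the outcome doesn't depend on which job finished first.

    from dataclasses import dataclass, field

    @dataclass
    class Entity:
        id: int
        hp: int = 100

    @dataclass
    class DamageEvent:
        source: int
        target: int
        amount: int

    @dataclass
    class World:
        entities: dict[int, Entity] = field(default_factory=dict)
        pending: list[DamageEvent] = field(default_factory=list)

        def queue_damage(self, source: int, target: int, amount: int) -> None:
            # During the parallel update, jobs only read other entities and
            # queue interactions; nobody mutates someone else's state mid-frame.
            self.pending.append(DamageEvent(source, target, amount))

        def resolve_frame(self) -> None:
            # Single sync point: apply queued events in a deterministic order,
            # independent of which job happened to finish first.
            for ev in sorted(self.pending, key=lambda e: (e.target, e.source)):
                self.entities[ev.target].hp -= ev.amount
            self.pending.clear()

A real engine would also have to keep something like this allocation-free and cache-friendly at thousands of entities, which is where the constraints pile up.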
I'm pretty confident that if I leaned into using agents on the game I'd spend more time re-explaining things to them than I do just writing the code myself.
I was shocked recently when it helped me diagnose a musl compile issue, fork a sys package, and rebuild large parts of it in 2 hours. It would've taken me at least 2 weeks to do without AI.
Don't want to reveal the specific task, but it was a problem far outside the training data, and it still helped me take what would've normally taken 2 weeks down to 2 hours.
Since then I've been going pretty hard at maximizing my agent usage, and tend to have a few going at most times.
I started working on something today I hadn't touched in a couple years. I asked for a summary of code structure, choices I made, why I made them, required inputs and expected outputs. Of course it wasn't perfect, but it was a very fast way to get back up to speed. Faster than picking through my old code to re-familiarize myself for sure.
This is my take on how to not write slop.
In a blameless postmortem style process, you would look not just at the mistake itself but at the factors influencing the mistake and how to mitigate them. E.g., the doctor was tired AND the hospital demanded long hours AND the industry has normalized this.
So yes, the programmers need to hold the line AND ALSO the velocity of the tool makes it easy to get tired AND its confidence and often-good results promote laziness (or maybe folks just don't know better) AND it can thrash your context and bounce you around the code base, making it hard to remember the subtleties AND on and on.
Anyway, strong agree on “dude, review better” as a key part of the answer. Also work on all this other stuff and understand the cost of VeLOciTy…
If you're not familiar with the patch enough to answer any question about it, you shouldn't submit it for review.
And you’re completely right, humans are still the ones in control here. It’s entirely possible to use AI without lowering your standards.
I generally agree on this as best practice today, though I think it will become irrelevant in the next 2 generations of models.