The "professional report version nobody read" observation is important and undersold. Feedback only works if people actually process it, and sharp tone forces processing in a way that polite summaries don't. The risk is overshooting into mean-for-laughs rather than mean-for-change, but that's a calibration problem, not a direction problem.
The requirement inflation catch is the most genuinely useful thing here. "8 years of Kubernetes" is funny as an example, but the underlying pattern (requirements copy-pasted from a previous hire's resume rather than derived from what the role actually needs) is endemic and causes real damage. It shrinks the pool, filters for credential-holders over performers, and signals to good candidates that nobody thought carefully about the role. A roast that makes that legible is doing something useful.
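One reason this catch is a good fit for deterministic checks: the most blatant form of requirement inflation, demanding more years of experience than a technology has existed, is mechanically detectable. A minimal sketch, assuming a hypothetical `RELEASE_YEARS` lookup table and a simple regex (neither is from the tool itself):

```python
import re
from datetime import date

# Hypothetical release-year data for illustration; the real tool's
# criteria and data sources are not described in the post.
RELEASE_YEARS = {"kubernetes": 2014, "react": 2013, "rust": 2015}

# Matches phrases like "15+ years of Kubernetes" or "3 years React".
REQ_RE = re.compile(r"(\d+)\+?\s*years?\s+(?:of\s+)?([A-Za-z]+)", re.IGNORECASE)

def inflated_requirements(jd_text: str, today: date = date(2026, 1, 1)) -> list[str]:
    """Flag requirements that demand more years than the technology has existed."""
    flags = []
    for years, tech in REQ_RE.findall(jd_text):
        release = RELEASE_YEARS.get(tech.lower())
        if release is not None and int(years) > today.year - release:
            flags.append(f"{years} years of {tech} (exists for {today.year - release})")
    return flags
```

The point is not the regex; it's that this class of dishonesty can be flagged without an LLM in the loop at all, which keeps the catch consistent across every JD it scores.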
On the architecture decision: separating deterministic scoring from LLM commentary is the right call and worth explaining more prominently. It's what makes the scores comparable across JDs, which is what would make this useful at scale for recruiting teams rather than a one-off novelty. Right now it reads like a fun tool. That separation is what makes it a platform.
Remote clarity is worth adding. In 2026, "hybrid" without defining what it actually requires is its own category of dishonesty.