the caveman skill has been going crazy, so I built a claude-code harness to implement it on real tasks
result: 26.8 percent token reduction.
zero observed degradation. zero failed
parses. zero verifier regressions
across the test suite.
(credit to @juliusbrussee)