To reach true "Human Outside The Loop" stability, I'm experimenting with two "math gates":
Ambiguity Gate: The Socratic interview phase doesn't end until the calculated ambiguity drops below 0.2
Convergence Gate: Instead of running a fixed number of iterations, the loop stops when the system's schema stabilizes across generations
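The two gates could be sketched roughly like this. Everything below is a hypothetical stand-in, not the actual implementation: `ambiguity_score` uses a toy vague-word heuristic in place of a real ambiguity metric, and `interview_step` / `generate_schema` are placeholder callables supplied by the caller.

```python
def ambiguity_score(answers: list[str]) -> float:
    """Toy ambiguity metric: fraction of interview answers that still
    contain vague qualifiers. A real system would use something far
    richer (e.g. an LLM-judged ambiguity rating)."""
    vague = ("maybe", "probably", "not sure", "depends")
    if not answers:
        return 1.0  # no answers yet: maximally ambiguous
    flagged = sum(any(v in a.lower() for v in vague) for a in answers)
    return flagged / len(answers)


def run_gated_loop(interview_step, generate_schema,
                   ambiguity_threshold=0.2, stable_generations=3,
                   max_rounds=50):
    """Ambiguity Gate: keep the Socratic interview going until the
    ambiguity score drops below the threshold (0.2 here).
    Convergence Gate: then iterate generations until the schema is
    unchanged for `stable_generations` consecutive rounds, instead of
    running a fixed iteration count."""
    answers = []
    while ambiguity_score(answers) >= ambiguity_threshold:
        answers.append(interview_step(answers))

    prev, stable = None, 0
    for _ in range(max_rounds):
        schema = generate_schema(answers)
        stable = stable + 1 if schema == prev else 1
        prev = schema
        if stable >= stable_generations:
            break  # schema stabilized across generations
    return prev
```

The key design choice is that both stopping conditions are measured properties of the loop's own output, not wall-clock time or iteration counts, which is what makes unsupervised runs tractable.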
At a recent hackathon in Korea, I used this to let an agent run unsupervised for 7 hours overnight. It generated ~100k lines of code (including 70k lines of tests/mocks) and successfully built a hardware-integrated system while I slept.
The "Deep Interview" pattern from this project was recently merged into the oh-my-claudecode (OMC) v4.6.0 release as an official skill.

I'd love to hear your thoughts on how to measure "real" architectural progress in long-running agent loops versus shallow code churn.