3 pointsby smb062 hours ago2 comments

smb062 hours ago
It is a separation-of-duties argument for code written by AI agents and also reviewed by the same AI agents.
If you ask Agent1 to write code, the PR diff coming out of it should really be reviewed by Agent2 (or 3 or whatever), the reviewer should be a different system than the code gen agent.
The core claim is that a model is a poor judge of its own work — it shares the same priors and blind spots that produced a bug, so self-review trends toward a shallow review rather than catching blind spots.
The system that writes the code shouldn't also be the thing that signs off that it's safe to ship.
yxz123an hour ago
spot on! You shouldn't be grading your own homework