RFC for a stricter way to evaluate software agents: not just whether they finish with a correct patch, but whether they can operate safely over persistent state, recover from failure, obey policy, coordinate with other agents, and explain what changed with verifiable provenance.
Every write → immutable version → SHA-256 hash-chained audit entry Any version → restore in <50 ms (metadata flip, no blob copy)