Our new paper introduces a verification layer that checks every diagnostic claim an AI makes before it reaches a clinician. When our system says a diagnosis is supported, that diagnosis has been mathematically proven, not just guessed. Every model we tested improved significantly after verification, and our best result reached 99% soundness.
We're excited about what comes next in building verifiably correct AI systems.