2 points by aespira 5 hours ago | 1 comment
  • aespira 5 hours ago
    I built SENTINEL to solve a problem I kept hitting in healthcare AI: agents report high confidence on decisions backed by stale evidence. An agent claims 89% confidence on a prior-auth denial, but the payer policy it cites is 14 months old and retrieval surfaced only 40% of the step-therapy docs. Historically, decisions with that combination of flags are correct only 23% of the time. SENTINEL sits behind agentgateway as an MCP server (Go, Streamable HTTP) and runs a four-stage pipeline on every agent decision:

    1. Signal fidelity audit — checks evidence staleness, completeness, weight divergence, and confidence calibration
    2. Pattern classification — matches fidelity flags against learned failure signatures with historical accuracy data
    3. Reliability scoring — rolling per-agent × per-payer accuracy with drift detection and ECE tracking
    4. Authority gate — issues one of four verdicts: full autonomy, act-and-notify, human-required, or quarantine

    The key insight: an agent's stated confidence and its actual reliability are different things, and the gap grows silently as upstream data drifts. SENTINEL tracks that gap at the MCP protocol layer. It was built for healthcare prior auth, but the architecture is domain-agnostic — it applies anywhere an agent reasons over retrieved evidence before making a consequential decision. agentgateway handles RBAC (CEL policies per MCP tool), session management, and audit logging; SENTINEL handles the reasoning-quality audit. It integrates with Datadog (drift monitors), Braintrust (eval scoring), and Cleric (incident escalation). Repo: https://github.com/espirado/agent-secure Blog post with architecture details and demo walkthrough: https://espiradev.org/blog/sentinel-ai-reasoning-observatory...
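The confidence-reliability gap mentioned above is what expected calibration error (ECE) quantifies: bucket decisions by stated confidence, then compare each bucket's average confidence to its empirical accuracy. A minimal binned-ECE sketch, assuming per-decision (confidence, outcome) records — the `Record` type and function names are hypothetical:

```go
package main

import "fmt"

// Record pairs an agent's stated confidence with the decision outcome.
type Record struct {
	Confidence float64
	Correct    bool
}

// ECE computes expected calibration error over equal-width confidence
// bins: the sample-weighted mean of |avg confidence - accuracy| per bin.
func ECE(records []Record, bins int) float64 {
	type bucket struct {
		confSum float64
		hits    int
		n       int
	}
	b := make([]bucket, bins)
	for _, r := range records {
		i := int(r.Confidence * float64(bins))
		if i == bins {
			i = bins - 1 // confidence of exactly 1.0 lands in the top bin
		}
		b[i].confSum += r.Confidence
		if r.Correct {
			b[i].hits++
		}
		b[i].n++
	}
	var ece float64
	total := float64(len(records))
	for _, bk := range b {
		if bk.n == 0 {
			continue
		}
		avgConf := bk.confSum / float64(bk.n)
		acc := float64(bk.hits) / float64(bk.n)
		ece += float64(bk.n) / total * abs(avgConf-acc)
	}
	return ece
}

func abs(x float64) float64 {
	if x < 0 {
		return -x
	}
	return x
}

func main() {
	// An overconfident agent: ~0.9 stated confidence, 1 of 4 correct.
	recs := []Record{{0.9, false}, {0.9, false}, {0.9, true}, {0.9, false}}
	fmt.Printf("%.2f\n", ECE(recs, 10)) // 0.65
}
```

A rolling version of this per agent × payer, recomputed over a sliding window, is one way the "gap grows silently" failure mode becomes visible: ECE trends upward before any single decision looks obviously wrong.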