Agentura adds behavioral evals to your CI pipeline. On every PR, it runs your agent against expected outputs, compares the scores to your main branch baseline, and shows you exactly which cases regressed before you merge. It is 100% free and open source.
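To make the baseline comparison concrete, here is a minimal sketch of the idea in Python. It is not Agentura's actual implementation: the `find_regressions` function, the `Regression` type, the `tolerance` parameter, and the case names and scores are all hypothetical, chosen only to illustrate flagging cases whose score dropped relative to main.

```python
from dataclasses import dataclass


@dataclass
class Regression:
    """One eval case whose score dropped relative to the baseline (hypothetical type)."""
    case_id: str
    baseline: float
    current: float


def find_regressions(baseline, current, tolerance=0.0):
    """Return the cases whose score fell by more than `tolerance` versus the baseline.

    Both arguments are dicts mapping a case id to a numeric score.
    Cases missing from `current` are skipped rather than counted as regressions.
    """
    regressions = []
    for case_id, base_score in baseline.items():
        if case_id not in current:
            continue
        new_score = current[case_id]
        if base_score - new_score > tolerance:
            regressions.append(Regression(case_id, base_score, new_score))
    return regressions


# Made-up scores: main branch baseline versus the PR branch.
baseline_scores = {"refund-policy": 1.0, "greeting": 0.9, "escalation": 0.7}
pr_scores = {"refund-policy": 0.6, "greeting": 0.9, "escalation": 0.8}

for r in find_regressions(baseline_scores, pr_scores):
    print(f"{r.case_id}: {r.baseline:.2f} -> {r.current:.2f}")
```

With these made-up scores, only `refund-policy` is reported, since it is the only case that scored lower on the PR branch than on main.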
Try it locally:
npx agentura@latest init
npx agentura@latest run --local
Live playground (online): https://playground.agentura.run
Website: https://agentura.run
Repo: https://github.com/SyntheticSynaptic/agentura
All comments and feedback are welcome.