Author here. We built a pipeline that turns policy prose into structured, executable rules plus test cases (“policy → tests”), so governance can be enforced and measured in CI/CD and guardrails rather than living in PDFs. The paper walks through the DSL and the evaluation, including what broke and what worked. Feedback welcome, especially from folks doing policy-as-code, OPA, or AI eval work.