1 point by policycortex | 6 hours ago | 1 comment
    I've spent 12 years in DoD/DoE environments watching the same failure mode: a storage bucket gets misconfigured, five different tools alert on it, and it takes three days to fix because every tool detects but none of them acts. PolicyCortex is my attempt to build the tool that actually closes the loop.

    What it does

    PolicyCortex is an autonomous cloud engineer. When it detects a violation -- say, a publicly accessible Azure Storage account -- it doesn't just fire an alert. It authenticates with Azure via managed identity or service principal, analyzes the configuration, disables public blob access, creates a private endpoint, updates the associated NSG rules, verifies encryption is enabled, runs a compliance check, and generates an audit trail. That full 8-step sequence completes in under 3 minutes. No human touch required -- unless you want it.
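    The 8-step sequence can be sketched as an ordered pipeline where each step records its completion for the audit trail. This is an illustrative sketch only: the function and step names are my assumptions, not PolicyCortex's actual API, and the real steps would call the Azure management SDKs.

```python
# Hypothetical sketch of the 8-step remediation sequence for a publicly
# accessible storage account. Names are illustrative, not the product's API.
from dataclasses import dataclass, field

@dataclass
class RemediationContext:
    resource_id: str
    completed_steps: list = field(default_factory=list)

def run_remediation(ctx, steps):
    """Execute each step in order, recording completion for the audit trail."""
    for name, step in steps:
        step(ctx)                       # each step is idempotent
        ctx.completed_steps.append(name)
    return ctx

# Placeholder no-op steps standing in for real Azure management API calls.
STEPS = [
    ("authenticate",            lambda ctx: None),  # managed identity / service principal
    ("analyze_config",          lambda ctx: None),
    ("disable_public_blob",     lambda ctx: None),
    ("create_private_endpoint", lambda ctx: None),
    ("update_nsg_rules",        lambda ctx: None),
    ("verify_encryption",       lambda ctx: None),
    ("compliance_check",        lambda ctx: None),
    ("emit_audit_trail",        lambda ctx: None),
]

ctx = run_remediation(RemediationContext("storage-demo"), STEPS)
print(ctx.completed_steps)
```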

    The "safety sandwich" / Gated Mode

    The thing I was most careful about: any write operation requires explicit human approval before execution. The AI agent (we call it Xovyr) does all the analysis -- proposes the remediation steps, calculates blast radius, explains business and security impact -- then pauses and waits. A human reviews, approves, and then it executes.

    I'm calling this the "safety sandwich" internally. Autonomous analysis + human-gated writes + autonomous execution post-approval. The goal is to get the cognitive load off engineers without removing their override authority. This feels like the right default for anything touching production infrastructure.
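    A minimal sketch of the gate, assuming a proposal object that carries the analysis output and refuses to execute writes until a human signs off. Class and field names here are illustrative assumptions, not the product's real interface.

```python
# Sketch of the "safety sandwich": autonomous analysis, a hard human-approval
# gate on writes, then autonomous execution. Names are hypothetical.
class ApprovalRequired(Exception):
    pass

class GatedRemediation:
    def __init__(self, resource_id, proposed_steps, blast_radius):
        self.resource_id = resource_id
        self.proposed_steps = proposed_steps   # produced by the analysis phase
        self.blast_radius = blast_radius       # dependent resources the fix may touch
        self.approved_by = None

    def approve(self, reviewer):
        self.approved_by = reviewer            # human sign-off, recorded for audit

    def execute(self):
        if self.approved_by is None:           # the gate: no approval, no writes
            raise ApprovalRequired("write operations require human approval")
        return [f"executed:{step}" for step in self.proposed_steps]

r = GatedRemediation("stg-prod-01", ["disable_public_access"], blast_radius=["vm-app-01"])
try:
    r.execute()                                # blocked: no approval yet
except ApprovalRequired:
    pass
r.approve("alice@example.com")
print(r.execute())                             # runs only after sign-off
```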

    Technical architecture (current state)

    - Policy engine evaluates resources against a ruleset (currently 155 Azure governance checks). Violations are scored by an AI layer for confidence. Auto-remediation only triggers at 85%+ confidence; lower-confidence findings surface for human review.

    - Remediation actions call Azure Resource Manager APIs, Azure Storage Management APIs, and Azure Network Management APIs directly -- not a wrapper around the CLI. Each action is idempotent and logged.

    - Audit trail is generated as a byproduct of every operation: timestamp, caller identity, before/after state, API response codes. This is what feeds our ATO evidence packs (CMMC L2/L3, NIST 800-171, FedRAMP Moderate).

    - FinOps module polls Azure Cost Management APIs on a configurable interval for real-time anomaly detection and 30/60/90-day forecasting.

    - AI Observability tracks LLM API spend for teams running models in Azure (OpenAI, etc.).
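    Two of the pieces above, the 85% confidence gate and the audit-record-as-byproduct, can be sketched together. The 0.85 threshold is from the post; the field names and routing labels are my assumptions.

```python
# Sketch of confidence-gated routing and the audit record emitted by every
# operation. Field names are illustrative assumptions.
from datetime import datetime, timezone

AUTO_REMEDIATE_THRESHOLD = 0.85

def route(violation):
    """Auto-remediate only at 85%+ confidence; otherwise surface for review."""
    if violation["confidence"] >= AUTO_REMEDIATE_THRESHOLD:
        return "auto_remediate"
    return "human_review"

def audit_record(caller, action, before, after, status_code):
    """Emitted as a byproduct of every operation; feeds the ATO evidence packs."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "caller_identity": caller,
        "action": action,
        "before_state": before,
        "after_state": after,
        "api_response_code": status_code,
    }

print(route({"rule": "storage-public-access", "confidence": 0.92}))  # auto_remediate
print(route({"rule": "nsg-open-port", "confidence": 0.60}))          # human_review
```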

    Honest about stage

    Pre-revenue. Azure-only today. AWS is next. We have design partners but no paying customers yet. The autonomous remediation piece works in controlled environments; I'm being careful about hardening it before pushing for wider production use.

    The hardest problems so far: (1) making the AI's remediation reasoning auditable enough that a compliance officer will trust it, and (2) handling blast-radius edge cases where "fix the misconfiguration" has unintended downstream effects on dependent resources.

    What I'd love feedback on

    - What would make you trust an AI agent to hold write permissions in your production cloud?

    - How are you handling the ATO evidence problem today?

    - Anyone solved the "policy as code + real-time enforcement" problem in a way they're happy with?

    Demo and more at https://policycortex.com. Happy to answer technical questions here.

    -- Leonard (founder, 12 yrs DoD/DoE, Dallas TX)