Show HN: Multicorn Shield – Open-source permissions and approvals for AI agents (github.com)

1 point by rachelle-r 15 hours ago | 1 comment

rachelle-r 15 hours ago
I'm a backend engineer on Atlassian's Rovo Agents team. A few weeks ago, OpenAI acquired the OpenClaw project and I started thinking about what happens when agents get broad access to your data with no permission layer.
Six days later, Summer Yue (Director of Alignment at Meta) posted about her OpenClaw agent deleting 200+ emails while ignoring her stop commands. The root cause was context window compaction dropping her safety instruction. The agent kept working. It just lost the part where it was supposed to ask first.
Shield is a plugin that hooks into OpenClaw's tool system and intercepts every action before it executes. Permissions are enforced outside the model's context window, so they can't be compressed away. You set what each agent can access (read/write/execute per service), and anything outside those boundaries gets blocked or routed through an approval workflow with time-limited grants.
TypeScript, open source, MIT licensed. The plugin, dashboard, and docs are all live. Happy to answer questions about the architecture or how the OpenClaw Plugin API integration works.
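To make the permission model above concrete, here is a minimal sketch of a per-service read/write/execute check with time-limited grants. It is an illustration only, not Shield's actual schema or code: `AgentPolicy`, `Grant`, `decide`, and `approve` are hypothetical names, and the rule that only some services are eligible for approval is an assumption.

```typescript
// Hypothetical types for illustration; not Shield's real schema.
type Access = "read" | "write" | "execute";
type Decision = "allow" | "needs_approval" | "block";

interface Grant {
  service: string;
  access: Access;
  expiresAt: number; // epoch milliseconds
}

interface AgentPolicy {
  // Standing permissions per service, e.g. { email: ["read"] }
  allowed: Record<string, Access[]>;
  // Assumption: only these services may be escalated to a human approver
  approvable: string[];
  // Time-limited grants issued after an approval
  grants: Grant[];
}

// Decide what happens to one requested action at time `now`.
function decide(
  policy: AgentPolicy,
  service: string,
  access: Access,
  now: number,
): Decision {
  // Standing permission: allow immediately.
  if ((policy.allowed[service] ?? []).includes(access)) return "allow";

  // Otherwise look for a grant that has not yet expired.
  const hasLiveGrant = policy.grants.some(
    (g) => g.service === service && g.access === access && g.expiresAt > now,
  );
  if (hasLiveGrant) return "allow";

  // Out of bounds: escalate if the service is approvable, else block.
  return policy.approvable.includes(service) ? "needs_approval" : "block";
}

// Record an approval as a grant that expires after `ttlMs` milliseconds.
function approve(
  policy: AgentPolicy,
  service: string,
  access: Access,
  now: number,
  ttlMs: number,
): void {
  policy.grants.push({ service, access, expiresAt: now + ttlMs });
}
```

The policy lives in ordinary program state rather than in the prompt, so compacting the model's context has no effect on what `decide` returns.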
- verdverm 14 hours ago
  OpenClaw is a security nightmare; there are much better frameworks out there that don't need these afterthought add-ons. There have also been dozens of projects like this one in /show over the last month or more.
  Please tell me you are not using OpenClaw with Rovo when I see it show up in Atlassian products.
  What you are talking about is called "the lethal trifecta", worth looking up and understanding if you are not familiar.
  - rachelle-r 9 hours ago
    You're right that OpenClaw's security model is controversial, and the lethal trifecta is a real concern. Shield is specifically designed to break it.
    Simon Willison's lethal trifecta is the combination of private data access, untrusted content, and external communication. The recommended mitigation is to cut off at least one leg. That's exactly what Shield's per-service read/write/execute controls do. Revoke write access on email, and the agent can't exfiltrate through it. Revoke execute on terminal, and it can't run arbitrary commands. You choose which legs to cut per agent, per service.
    Shield doesn't solve prompt injection. Nothing does yet. But it constrains the blast radius, which is the same approach Willison, Google's agent security paper, and Meta's Rule of Two all recommend: architectural boundaries enforced outside the model, not prompt-level guardrails.
    On the architecture: Shield hooks into OpenClaw's native Plugin API (before_tool_call / after_tool_call), so it intercepts at the tool execution layer before the call reaches the system. It's not a wrapper or an afterthought.
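    A rough sketch of what interception at the tool execution layer means. This is not Shield's code and not the real OpenClaw Plugin API: `createHost`, `ToolCall`, `HookResult`, `shieldHook`, and the allowlist are hypothetical stand-ins, and only the before_tool_call hook name comes from the description above.

```typescript
// Hypothetical shapes; the real OpenClaw Plugin API may look different.
interface ToolCall {
  agent: string;
  tool: string;
  args: unknown;
}

type HookResult = { proceed: true } | { proceed: false; reason: string };
type BeforeToolCall = (call: ToolCall) => HookResult;
type Tool = (args: unknown) => string;

// Stand-in host: runs every registered hook before the tool itself.
function createHost(tools: Record<string, Tool>) {
  const hooks: BeforeToolCall[] = [];
  return {
    onBeforeToolCall(hook: BeforeToolCall): void {
      hooks.push(hook);
    },
    run(call: ToolCall): string {
      for (const hook of hooks) {
        const result = hook(call);
        // A refusal stops here, so the tool function is never invoked.
        if (!result.proceed) return `blocked: ${result.reason}`;
      }
      const tool = tools[call.tool];
      if (!tool) return `unknown tool: ${call.tool}`;
      return tool(call.args);
    },
  };
}

// Allowlist held in plugin state, outside anything the model can see or edit.
const allowedTools: Record<string, string[]> = {
  "inbox-agent": ["email.read"],
};

// Hypothetical before_tool_call handler: allow only listed tools per agent.
function shieldHook(call: ToolCall): HookResult {
  const allowed = allowedTools[call.agent] ?? [];
  return allowed.includes(call.tool)
    ? { proceed: true }
    : { proceed: false, reason: `${call.agent} may not call ${call.tool}` };
}
```

    The point of the sketch is the ordering: the check runs in the host before the tool function, so a refusal does not depend on the model remembering an instruction.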
    And to be clear, Shield is a personal project. It has nothing to do with Atlassian or Rovo. I mentioned my role because building agent infrastructure professionally is how I recognized the gap in the open-source ecosystem. Atlassian already has its own agent governance infrastructure. Most teams building on OpenClaw don't.
    If you want to see how Shield works under the hood: https://multicorn.ai/shield