    I built SafeRun Guard — a Claude Code plugin that intercepts dangerous commands and file operations before they execute. Pure bash + jq, zero dependencies, ~20ms latency.

    The problem: AI coding agents run shell commands autonomously. One `rm -rf /`, one `git push --force`, one leaked AWS key in a config file — and you're recovering for hours. The agent doesn't know what's dangerous. You can't watch every command.

    How it works: SafeRun Guard hooks into Claude Code's PreToolUse event. Every Bash command and every file write passes through a 4-tier decision engine:

    - REDIRECT — suggests a safer alternative (`--force` → `--force-with-lease`)
    - BLOCK — denies the command; the agent sees the reason and adapts
    - ASK — prompts the user for confirmation
    - ALLOW — silent passthrough (~95% of actions)
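
    For the mechanics: a Claude Code PreToolUse hook receives the pending tool call as JSON on stdin and answers with a permission decision on stdout. Here's a minimal sketch of how the tiers might map onto that contract, assuming the documented `hookSpecificOutput` response shape, with the rule matching reduced to a toy `case` statement (SafeRun Guard's real engine is of course richer):

    ```
    #!/usr/bin/env bash
    # Toy PreToolUse hook: read the tool call, answer with a decision.
    # The rule "engine" below is a stand-in, not SafeRun Guard's real one.
    input=$(cat)
    cmd=$(jq -r '.tool_input.command // empty' <<<"$input")

    case "$cmd" in
      *'rm -rf /'*)         decision=deny;  reason='Refusing to touch the filesystem root.' ;;
      *'git reset --hard'*) decision=ask;   reason='This discards uncommitted work. Proceed?' ;;
      *)                    decision=allow; reason='' ;;
    esac

    jq -n --arg d "$decision" --arg r "$reason" \
      '{hookSpecificOutput: {hookEventName: "PreToolUse",
        permissionDecision: $d, permissionDecisionReason: $r}}'
    ```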

    Key features that I haven't seen elsewhere:

    1. Compound command splitting. `echo ok && rm -rf /` is split on `&&`, `||`, and `;`, and each segment is checked independently; see the first sketch after this list. Pipes are NOT split (they're part of a single pipeline).

    2. Content scanning. File writes are scanned for 9 secret patterns (AWS keys, PEM private keys, GitHub tokens, OpenAI/Stripe keys, Slack tokens, DB connection strings) before they hit disk; the second sketch below gives the flavor.

    3. Redirect tier. Instead of just blocking `git push --force`, it tells the agent "use `--force-with-lease` instead", and the agent rewrites the command automatically. No human needed. The third sketch below shows the shape of a redirect rule.
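
    First, the splitting from feature 1. This naive version ignores quoting and heredocs (which a real parser must handle) and assumes GNU sed for `\n` in the replacement; `check_segment` is a trivial stand-in:

    ```
    # Split on "&&", "||", ";" but never on a single "|", so pipelines
    # stay intact. Naive on purpose: quoting is not respected here.
    split_segments() {
      sed -E 's/&&|[|][|]|;/\n/g' <<<"$1"
    }

    check_segment() { printf 'checking: %s\n' "$1"; }   # stand-in for the rule engine

    while IFS= read -r seg; do
      [[ -n "${seg// }" ]] && check_segment "$seg"
    done < <(split_segments 'echo ok && rm -rf / ; ls | wc -l')
    # Segments checked (modulo whitespace): "echo ok", "rm -rf /", "ls | wc -l"
    ```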
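
    Second, the content scan from feature 2. These regexes are widely known token shapes used as illustrative stand-ins; they are not SafeRun Guard's actual nine patterns:

    ```
    # Scan a would-be file write for secret-shaped strings before it
    # hits disk; a match maps to BLOCK (or ASK) in the decision engine.
    patterns=(
      'AKIA[0-9A-Z]{16}'                    # AWS access key ID
      '-----BEGIN [A-Z ]*PRIVATE KEY-----'  # PEM private key
      'ghp_[A-Za-z0-9]{36}'                 # GitHub personal access token
      'postgres(ql)?://[^ ]+:[^ ]+@'        # DB connection string with password
    )

    scan_content() {
      local p
      for p in "${patterns[@]}"; do
        grep -qE -- "$p" <<<"$1" && { echo "secret matched: $p"; return 1; }
      done
      return 0
    }
    ```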
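
    Third, a redirect rule from feature 3 is essentially a lookup from risky invocation to safer rewrite, delivered as a deny whose reason names the replacement so the agent can retry on its own (the `git clean` entry is a made-up example):

    ```
    # REDIRECT tier: deny, but tell the agent exactly what to run instead.
    # Real matching must take care not to re-trip on the rewrite itself
    # ("git push --force" is a substring of "git push --force-with-lease").
    declare -A redirects=(
      ['git push --force']='git push --force-with-lease'
      ['git clean -f']='git clean -fn'   # hypothetical: suggest a dry run
    )

    for pat in "${!redirects[@]}"; do
      if [[ "$cmd" == *"$pat"* ]]; then   # $cmd as read in the hook sketch above
        decision=deny
        reason="Blocked. Safer alternative: ${redirects[$pat]}"
        break
      fi
    done
    ```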

    Numbers: 112 safety rules, 9 secret detection patterns, 243 tests, ~20ms per check. Fail-open design: if jq crashes or the rules are corrupt, the command passes through. A safety tool should never block your work because of its own bug.
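
    Fail-open is easy to get backwards, so here is the shape of it, with `run_all_checks` as a hypothetical stand-in for the rule engine and the same response contract as the hook sketch above:

    ```
    # Fail-open: if the guard itself errors (jq crash, corrupt rules),
    # emit an explicit ALLOW rather than blocking the user's work.
    if ! verdict=$(run_all_checks "$input" 2>/dev/null); then
      # Hardcoded on purpose: can't rely on jq if jq is what broke.
      verdict='{"hookSpecificOutput":{"hookEventName":"PreToolUse","permissionDecision":"allow","permissionDecisionReason":""}}'
    fi
    printf '%s\n' "$verdict"
    ```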

    Stack: Bash + jq (whose regex functions use the Oniguruma engine). No Python, no Node, no API calls, no telemetry. Everything runs locally. MIT license.

    Install:

    ```
    claude plugin marketplace add Cocabadger/saferun-guard
    claude plugin install saferun-guard@saferun-guard
    ```

    I'd love feedback on the rule coverage — what dangerous patterns am I missing?

    Also thinking about the next step: right now the agent just gets blocked or redirected. But what if it could learn from those decisions — an "agent-in-the-loop" that negotiates with the guardrail instead of just retrying?

    Would that be useful in your workflow, or is a dumb firewall exactly what you want between an AI agent and your filesystem?