2 pointsby celestinestudio3 hours ago1 comment
  • celestinestudio3 hours ago
    I’ve been noticing a pattern in many tool-using LLM setups.

    We spend a lot of effort filtering model outputs, but relatively little on deciding whether the model should be allowed to attempt the action itself.

    This harness is a small local framework that evaluates action requests (deploy code, send emails, export data, financial operations, etc.) against pre-execution authorization signals and produces an audit trail explaining the decision.

    It’s intentionally simple and deterministic — not a product or policy engine. More of a thinking tool.

    Curious if others building agents or tool-connected systems have run into this boundary where the model becomes an operator instead of a requester.