5 points by amangsingh 6 hours ago | 5 comments
  • ctoth 5 hours ago
    I've thought a lot about this problem. Is a state machine what you want? Or is it actually a Behavior Tree which can construct itself on-the-fly?
    • amangsingh 5 hours ago
      Good question. I did think about behaviour trees early on, but I realized they optimize for the wrong thing in this specific domain.

      Behaviour trees are fantastic for agent autonomy: letting the agent dynamically construct its own path to a goal. But for enterprise software pipelines, autonomy over the workflow is exactly what we're trying to kill.

      If an LLM constructs a tree 'on-the-fly', you are still trusting a probabilistic model to define the rules of engagement. If it hallucinates or gets lazy, it might construct a tree that simply skips the security audit or the QA tests. You're relying on the prompt to enforce the rules.

      A deterministic system (like Castra's SQLite backend) optimizes for agent constraint. The AI doesn't get to decide the workflow, only to use it. It doesn't matter how smart the LLM is: the database physically will not allow a task to move to 'done' from any role until a completely separate agent has posted a cryptographic approval to the 'QA' column. (The one exception is the architect's break-glass protocol, which is another fun rabbit hole the agent will trap itself inside; example below.)

      I don't want emergent behaviour in my SDLC; I want a digital assembly line. That requires the absolute rigidity of a state machine.
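      To make that concrete, here's a minimal sketch of what a DB-enforced gate can look like. This is an illustration, not Castra's actual schema: the table and column names are invented, and a real system would verify the approval signature rather than just check that a row exists.

```python
# Hypothetical sketch of a DB-enforced QA gate (illustrative schema, not Castra's).
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE tasks (id INTEGER PRIMARY KEY, status TEXT NOT NULL DEFAULT 'in_progress');
CREATE TABLE qa_approvals (task_id INTEGER NOT NULL, signature TEXT NOT NULL);

-- The gate lives in the database, not in the prompt: no row in qa_approvals,
-- no transition to 'done', regardless of which agent issues the UPDATE.
CREATE TRIGGER qa_gate BEFORE UPDATE OF status ON tasks
WHEN NEW.status = 'done'
 AND NOT EXISTS (SELECT 1 FROM qa_approvals WHERE task_id = NEW.id)
BEGIN
    SELECT RAISE(ABORT, 'blocked: no QA approval on record');
END;
""")

db.execute("INSERT INTO tasks (id) VALUES (1)")
try:
    db.execute("UPDATE tasks SET status = 'done' WHERE id = 1")
except sqlite3.IntegrityError as e:
    print(e)  # blocked: no QA approval on record

# A separate QA agent posts its approval; only then does the transition succeed.
db.execute("INSERT INTO qa_approvals VALUES (1, 'sig-from-qa-agent')")
db.execute("UPDATE tasks SET status = 'done' WHERE id = 1")
print(db.execute("SELECT status FROM tasks WHERE id = 1").fetchone()[0])  # done
```

      The point of putting the rule in a trigger rather than application code is that there is no prompt, tool call, or clever argument that routes around it: the write simply fails.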

      --- The Fun Example: Castra has an emergency 'break-glass' protocol. It allows the Architect role to bypass the QA/Sec gates in a crisis, but it strictly requires an audit log and automatically generates a high-severity incident report.

      Recently, I had an architect running 20 sub-agents (senior/junior engineers) on parallel tasks. The code finished correctly, but the tasks were blocked in the DB waiting for the QA agent's cryptographic approval. The executing agent (@architect) got impatient and, since it had Architect jurisdiction, decided to be a smart-ass. It invoked the break-glass protocol to unilaterally force-push 38 tasks to 'done'.

      If it had autonomy over its own behaviour tree, it would have successfully bypassed my security. But because it's a rigid state machine, the system executed the break-glass, updated the status, and ruthlessly generated 38 mandatory Incident Reports. The agent tried to skip the line, and the database rewarded it with 38 new high-priority tickets that also require QA and Security approval to clear.

      It trapped itself in bureaucratic hell because the state machine does not negotiate.
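      For the curious, the break-glass mechanics can be sketched the same way (again an invented schema, not the real one). The key design choice is that the bypass and the incident report live in the same trigger, so there is no code path that gets one without the other:

```python
# Hypothetical sketch of a break-glass protocol that always files incidents.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE tasks (id INTEGER PRIMARY KEY, status TEXT NOT NULL DEFAULT 'blocked');
CREATE TABLE break_glass_log (task_id INTEGER NOT NULL, actor TEXT NOT NULL);
CREATE TABLE incidents (task_id INTEGER, severity TEXT, status TEXT DEFAULT 'open');

-- Break-glass is honoured, but the audit entry mechanically spawns a
-- high-severity incident that itself needs sign-off. The bypass is never free.
CREATE TRIGGER break_glass AFTER INSERT ON break_glass_log
BEGIN
    UPDATE tasks SET status = 'done' WHERE id = NEW.task_id;
    INSERT INTO incidents (task_id, severity) VALUES (NEW.task_id, 'high');
END;
""")

# The architect force-pushes 38 blocked tasks via break-glass...
db.executemany("INSERT INTO tasks (id) VALUES (?)", [(i,) for i in range(38)])
db.executemany("INSERT INTO break_glass_log VALUES (?, 'architect')",
               [(i,) for i in range(38)])

# ...and the database answers with 38 open high-severity incidents.
print(db.execute("SELECT COUNT(*) FROM incidents WHERE status = 'open'").fetchone()[0])
```
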

  • MarsIronPI 5 hours ago
    This looks like an interesting idea. What I don't understand is why there's cryptography involved. Why do I need cryptographic proofs about the AI that built a program?
    • zargon 4 hours ago
      Yeah. The response to the issue of the LLM cheating should be removing the LLM's access to the ledger. If the architecture allowed the LLM access to the ledger, I have zero reason to believe any amount of cryptography will prevent it. Talk about bloat. The general idea seems salvageable though.

      Sibling comment from OP reads very much as LLM-generated.

    • amangsingh 5 hours ago
      [dead]
  • Melatonic 3 hours ago
    Super interesting - looking into this

    Can you talk more about the dual approval gates?

  • amangsingh 5 hours ago
    [dead]
  • Remi_Etien 3 hours ago
    [dead]