2 points by mrxdev 5 hours ago | 1 comment
  • AgentNode 4 hours ago
    The worktree isolation + tmux orchestration is a nice combo.

    I do wonder how far this scales once you hit non-deterministic tool behavior though. In a lot of agent setups the hard part isn’t parallelism, it’s knowing whether a tool actually did what it claimed.

    A review loop helps, but if the reviewer is also an LLM you can end up with two layers of probabilistic validation.

    Have you experimented with any deterministic verification (tests, schema checks, etc.) inside the loop?

    • mrxdev 4 hours ago
      Great take. As it happens, I'm working with a number of other engineers on a solution that addresses exactly this. However, to dispel any false hope: static analysis + agentic development can only go so far. It's the holistic combo of those tools plus agentic AI as both an implementation AND a feedback mechanism that unlocks higher quality.

      If you're looking to see what that feedback mechanism might look like in action, you might like to check out one of my other projects, which pre-dates Orc: https://github.com/spencermarx/open-code-review

      Love where your head is at though! DEFINITELY an important problem we've got to solve here.

      • AgentNode 4 hours ago
        I like that framing. I agree the ceiling for static analysis is limited, especially once tools start interacting with real systems.

        What I am most interested in is the gap between "the agent completed the task" and "the system can actually prove the task was completed correctly." That is where a lot of multi-agent setups still feel fragile to me.

        LLM review definitely helps, but I think it gets much stronger when it is paired with deterministic checks in the loop, even simple ones like executable smoke tests, schema validation, contract checks, or replayable fixtures. Otherwise you can end up with persuasive agreement rather than real verification.
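        To make that concrete, here is a minimal sketch of what a deterministic gate in front of an LLM reviewer could look like. Everything here is illustrative and not from Orc or open-code-review: the field names in `REQUIRED_FIELDS`, the idea that agent results arrive as a dict with JSON-string artifacts, and the `verify` entry point are all assumptions made up for the example.

```python
import json

# Hypothetical result schema for an agent's "task completed" claim.
# These field names are invented for illustration.
REQUIRED_FIELDS = {"task_id": str, "status": str, "artifacts": list}

def schema_check(payload: dict) -> list[str]:
    """Return a list of violations; an empty list means the payload passes."""
    errors = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected):
            errors.append(
                f"{field}: expected {expected.__name__}, "
                f"got {type(payload[field]).__name__}"
            )
    return errors

def smoke_test(payload: dict) -> list[str]:
    """Cheap executable check: each claimed artifact must parse as JSON."""
    errors = []
    for i, artifact in enumerate(payload.get("artifacts", [])):
        try:
            json.loads(artifact)
        except (TypeError, ValueError):
            errors.append(f"artifact[{i}] is not valid JSON")
    return errors

def verify(payload: dict) -> bool:
    # Run every deterministic gate first; an LLM reviewer would only
    # ever see work that has already cleared these checks.
    return not (schema_check(payload) + smoke_test(payload))

good = {"task_id": "t1", "status": "done", "artifacts": ['{"ok": true}']}
bad = {"task_id": "t2", "artifacts": ["not json"]}  # missing status, bad artifact
print(verify(good))  # True
print(verify(bad))   # False
```

        The point of the ordering is that the probabilistic layer (LLM review) never gets a chance to "persuasively agree" with output that fails a cheap mechanical check.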