1 point by fitz28827 hours ago | 1 comment
    Hey HN — Dave here. While building a multi-agent system recently, I kept noticing that the diagrams of agents passing feedback to each other looked just like the electrical circuit diagrams from control theory. I got curious whether the established math transferred, and to my surprise it did. LoopGain is the first product of that research: an open-source library that replaces the max_iterations=N cap on agent loops with an actual measurement of whether the loop is still improving.

    The cap is how nearly every verify-revise loop gets stopped today, and it's wrong in both directions. Stop too early and you clip a loop that was still improving. Stop too late and you pay for iterations after the loop already found its best answer — and ship the final attempt, which is sometimes worse than one it already had.

    On a 2,000-trial benchmark (paired real-API runs, five loop patterns across six framework adapters, three model providers, pre-registered protocol with kill criteria), LoopGain cut total API spend by 92.8% vs max_iterations=20 ($27.05 → $1.94) and median wall-clock time by ~15× (30.9s → 2.1s). A cross-vendor judge preferred LoopGain's outputs, with a weighted-average preference of 0.678 across 1,800 pairwise comparisons. That is mostly because LoopGain returns the iteration with the lowest error it saw, while a fixed cap ships whatever the final iteration produced. The raw data and methodology are public, and the full run is browsable at dashboard.loopgain.ai/benchmark.

    How it works: each iteration, LoopGain takes the ratio of the current error to the previous error — the loop's empirical loop gain (Aβ, borrowed from control theory). Aβ<1 means the error shrank — the loop is improving. Aβ≥1 means it held or grew — the loop is stuck or making things worse. A trajectory classifier reads the recent Aβ values, labels the loop (FAST_CONVERGE / CONVERGING / STALLING / OSCILLATING / DIVERGING), and decides whether to keep going, stop here, or stop and roll back to the lowest-error output so far.
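
    To make the ratio concrete, here is a minimal Python sketch of the idea. It is not LoopGain's actual classifier: the function names, the three-sample window, and the 0.5 / 0.95 thresholds are assumptions picked for illustration. Only the five label names come from the description above.

```python
from typing import List, Optional


def loop_gains(errors: List[float]) -> List[float]:
    """Ratio of each error to the previous one (the empirical loop gain)."""
    gains = []
    for prev, cur in zip(errors, errors[1:]):
        if prev == 0:
            break  # error already hit zero; later ratios are undefined
        gains.append(cur / prev)
    return gains


def classify(errors: List[float], window: int = 3,
             fast: float = 0.5, stall: float = 0.95) -> Optional[str]:
    """Label a trajectory from its most recent gains.

    Thresholds are illustrative assumptions, not LoopGain's real values.
    """
    gains = loop_gains(errors)[-window:]
    if not gains:
        return None  # need at least two error observations

    def side(g: float) -> int:
        # -1: clearly shrinking, +1: clearly growing, 0: roughly flat
        if g < stall:
            return -1
        if g > 1 / stall:
            return 1
        return 0

    sides = [side(g) for g in gains]
    alternating = (
        len(sides) >= 2
        and 0 not in sides
        and all(a != b for a, b in zip(sides, sides[1:]))
    )
    if alternating:
        return "OSCILLATING"

    mean_gain = sum(gains) / len(gains)
    if mean_gain > 1 / stall:
        return "DIVERGING"
    if mean_gain >= stall:
        return "STALLING"
    if mean_gain <= fast:
        return "FAST_CONVERGE"
    return "CONVERGING"


print(classify([10, 4, 1.5, 0.5]))   # gains well below 1
print(classify([10, 10, 10, 10]))    # gains equal to 1
print(classify([10, 5, 10, 5]))      # gains flip around 1
```

    The deadband around 1 (between `stall` and `1 / stall`) is a design choice in this sketch so that small wobbles in a flat error signal are not mistaken for oscillation.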

    Integration is a few lines around any loop that produces an error signal:

      from loopgain import LoopGain
      lg = LoopGain(target_error=0.0)
      output = generate(task)                # first attempt
      while lg.should_continue():
          errors = verify(output)            # e.g. count of failing tests
          lg.observe(errors, output=output)  # the only LoopGain call in the loop
          output = revise(output, errors)
      result = lg.result   # best_output, outcome, convergence_profile, savings_vs_fixed_cap
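
    To try the same loop shape without installing anything, a toy stand-in is enough. `BestSoFar` below is hypothetical and is not the LoopGain API or its stopping logic: it stops when the error reaches the target or has not improved for `patience` iterations (an assumed rule), and it remembers the lowest-error output. The error sequence is simulated.

```python
class BestSoFar:
    """Toy stand-in (NOT the LoopGain API): tracks the lowest-error output."""

    def __init__(self, target_error=0.0, patience=2, max_iterations=20):
        self.target_error = target_error
        self.patience = patience            # assumed stop rule for this sketch
        self.max_iterations = max_iterations
        self.best_error = None
        self.best_output = None
        self.iterations = 0
        self._since_best = 0

    def should_continue(self):
        if self.iterations >= self.max_iterations:
            return False
        if self.best_error is not None and self.best_error <= self.target_error:
            return False
        return self._since_best < self.patience

    def observe(self, error, output):
        self.iterations += 1
        if self.best_error is None or error < self.best_error:
            self.best_error, self.best_output = error, output
            self._since_best = 0
        else:
            self._since_best += 1


# Simulated verify-revise loop: the error dips to 1, then gets worse.
simulated_errors = [5, 3, 1, 2, 4, 6]
tracker = BestSoFar()
step = 0
output = f"draft-{step}"
while tracker.should_continue() and step < len(simulated_errors):
    tracker.observe(simulated_errors[step], output=output)
    step += 1
    output = f"draft-{step}"        # stands in for revise(output, errors)

print(tracker.best_output, tracker.best_error, tracker.iterations)
```

    In this run the tracker stops after five observations and keeps the third draft, whereas a fixed cap would have shipped whatever the last iteration produced.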
    
    Honest limits, because they matter more than the headline: LoopGain detects convergence, not correctness, so it inherits your verifier's blind spots. When I re-graded my own benchmark, 4.5% of "converged" code-gen runs had passed every check the loop ran but failed a fuller held-out test suite. Savings also depend on workload: failure-heavy loops save ~78–84%, not 92.8%. The blog has a writeup on designing verifiers that are strong enough to trust.
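
    One way to estimate that gap on your own workload is to re-grade the runs the loop called converged against a stricter suite it never saw. A minimal sketch, where `Run`, `loop_passed`, and `heldout_passed` are hypothetical names and the sample data is synthetic, not taken from the benchmark:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Run:
    loop_passed: bool      # passed every check the loop's verifier ran
    heldout_passed: bool   # passed the fuller held-out suite


def false_convergence_rate(runs: List[Run]) -> float:
    """Share of loop-converged runs that fail the held-out suite."""
    converged = [r for r in runs if r.loop_passed]
    if not converged:
        return 0.0
    missed = sum(1 for r in converged if not r.heldout_passed)
    return missed / len(converged)


# Synthetic example: 20 converged runs, 2 of which fail held-out tests.
runs = (
    [Run(True, True)] * 18
    + [Run(True, False)] * 2
    + [Run(False, False)] * 5
)
rate = false_convergence_rate(runs)
print(f"{rate:.1%}")
```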

    Apache-2.0, pip install loopgain. Adapters for LangGraph, CrewAI, AutoGen, LangChain, OpenAI Agents SDK, and the Claude Agent SDK; the raw API works for anything with a measurable error.

    What I'd really love from HN: if you run production agent loops, I'm interested in whether the stop decisions match what you see empirically — and what your loops' error signals actually look like, since the verifier is what makes or breaks the stop. Happy to answer anything.