2 points by bhaviav100 2 hours ago | 3 comments
  • DarthCeltic85 an hour ago
    I had gotten a student/ultra code from an Antigravity promo for three months, so I was using that, but it finally ran out this month. Currently I'm using windstream and flipping between Claude as my left brain for code extraction, and the higher-context but cheaper-ish models there.

    Honestly though, I'm getting to the point where I'm running custom project MDs that flip between different models for different things, using list outputs depending on what it finds and runs. (I have two monorepo projects, and one that's a polyglot microengine whose pieces communicate over gRPC.)

    The MDs are highly specialized for each project, since each project deals with vastly different issues. Cycling through the different Pro accounts and keeping the MDs in place over it all is helping me not kill my wallet.
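The per-project "flip between different models for different things" idea above can be sketched as a tiny routing table. Everything here is hypothetical: the task labels, model names, and fallback are stand-ins for whatever a real project MD would encode, not any actual tool's config.

```python
# Hypothetical sketch of per-task model routing, standing in for the
# per-project MD files described above. All names are illustrative.

ROUTES = {
    "refactor": "big-context-model",    # large, pricier model for wide edits
    "summarize": "cheap-small-model",   # small model for compaction work
    "codegen": "general-coding-model",  # default coding assistant
}
DEFAULT_MODEL = "cheap-small-model"     # cheap fallback keeps costs bounded

def route(task_type: str) -> str:
    """Pick a model for a task, falling back to the cheap default."""
    return ROUTES.get(task_type, DEFAULT_MODEL)
```

The useful property is that unknown task types fall through to the cheapest model rather than the most capable one, which matches the cost-control goal described above.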

    • bhaviav100 an hour ago
      Hmm, interesting. Model routing + specialized MDs makes sense for cost efficiency.

      I'm seeing a different failure mode though: even with good routing, agents loop or retry and burn my money.

  • rox_kd an hour ago
    In what setting do you mean? There are multiple strategies; building your own compaction layer in front seems a bit of overkill. Have you considered implementing some cache strategy, or else summary pipelines? I once built an agent that, based on the messages, routed things to a smaller model for compaction/summaries, to bring the context down for the main agent.

    But also make sure you start fresh context threads instead of banging through a single one until your whole feature is done. Working in small, atomic increments works pretty well.
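The compaction pipeline described above can be sketched roughly as follows. This is a minimal illustration, not a real agent framework: the message budget, the `summarize` stand-in (which would really call a cheaper model), and the keep-recent count are all assumed values.

```python
# Hedged sketch of the compaction idea: when the running message history
# exceeds a budget, hand the older messages to a (hypothetical) small
# summarizer model and keep only the summary plus the most recent turns.

MAX_MESSAGES = 6   # illustrative budget, not a real limit
KEEP_RECENT = 2    # recent turns preserved verbatim

def summarize(messages):
    # Stand-in for a call to a cheaper summarization model.
    return "summary of %d earlier messages" % len(messages)

def compact(history):
    """Collapse old turns into one summary message when over budget."""
    if len(history) <= MAX_MESSAGES:
        return history
    old, recent = history[:-KEEP_RECENT], history[-KEEP_RECENT:]
    return [summarize(old)] + recent
```

Running `compact` before each main-agent call keeps the expensive model's context bounded, which is the cost lever this comment is pointing at; starting a fresh thread per increment is the even simpler version of the same idea.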

    • bhaviav100 an hour ago
      Yes, compaction and smaller models help with cost per step.

      But my issue wasn't just inefficiency; it was agents retrying when they shouldn't.

      I needed visibility + limits per agent/task, and the ability to cut it off, not just optimize it.
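The "limits per agent/task, plus the ability to cut it off" requirement can be sketched as a small budget guard. This is an assumption-laden illustration: the dollar cap, retry cap, and `BudgetExceeded` exception are invented for the example, not part of any real agent SDK.

```python
# Minimal sketch of per-agent limits with a hard cut-off: track spend
# and retries per task, and raise once either cap is hit, instead of
# letting the agent keep looping. All caps here are illustrative.

class BudgetExceeded(RuntimeError):
    """Raised when an agent exceeds its cost or retry budget."""

class AgentBudget:
    def __init__(self, max_cost=1.00, max_retries=3):
        self.max_cost = max_cost        # dollars allowed for this task
        self.max_retries = max_retries  # retry attempts allowed
        self.spent = 0.0
        self.retries = 0

    def charge(self, cost):
        """Record spend; cut the agent off once the cap is exceeded."""
        self.spent += cost
        if self.spent > self.max_cost:
            raise BudgetExceeded("cost cap hit: $%.2f" % self.spent)

    def retry(self):
        """Record a retry; stop runaway retry loops at the cap."""
        self.retries += 1
        if self.retries > self.max_retries:
            raise BudgetExceeded("retry cap hit")
```

Wrapping every model call in `charge()` and every re-attempt in `retry()` gives the visibility and kill switch described above, separate from any routing or compaction optimizations.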
