As an average dev, what can I even do with 10 agents? It's like managing 10 toddlers who can code: it looks good, but it becomes hard to manage because you have limited context in your own brain.
Two is the best setup if you can afford it. One can write the tests and the other can write the code. This is better because if you use the same agent instance for both, it's not going to write good tests; it will just write tests that its own code passes. It's different for everyone, but for me, two is the best setup for TDD.
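For anyone who wants to see the shape of that two-agent split, here is a minimal sketch. Everything in it is an assumption for illustration: `Agent` is just "a function that takes a prompt and returns text", and `tdd_round`, `fake_test_agent`, `fake_code_agent`, and `run_in_namespace` are hypothetical names, not part of any real agent tool. The only point is that the test writer never sees the implementation.

```python
from typing import Callable, Optional, Tuple

# Hypothetical agent interface: prompt in, text out. Not a real API.
Agent = Callable[[str], str]


def tdd_round(
    spec: str,
    test_agent: Agent,
    code_agent: Agent,
    run_tests: Callable[[str, str], bool],
    max_attempts: int = 3,
) -> Tuple[Optional[str], str, int]:
    """Test agent sees only the spec; code agent sees the spec plus the tests."""
    tests = test_agent(spec)  # written before any implementation exists
    for attempt in range(1, max_attempts + 1):
        code = code_agent(f"{spec}\n\nMake these tests pass:\n{tests}")
        if run_tests(code, tests):
            return code, tests, attempt
    return None, tests, max_attempts  # gave up; a human should look at it


# Stand-ins so the sketch runs without any model behind it.
def fake_test_agent(prompt: str) -> str:
    return "assert add(2, 3) == 5"


def fake_code_agent(prompt: str) -> str:
    return "def add(a, b):\n    return a + b"


def run_in_namespace(code: str, tests: str) -> bool:
    """Run the code, then the tests, in one throwaway namespace."""
    namespace: dict = {}
    try:
        exec(code, namespace)
        exec(tests, namespace)
        return True
    except Exception:
        return False


if __name__ == "__main__":
    code, tests, attempt = tdd_round(
        "add two integers", fake_test_agent, fake_code_agent, run_in_namespace
    )
    print(f"passed on attempt {attempt}")
```

In a real setup the two fakes would be two separate agent sessions with separate context, and `run_tests` would shell out to your actual test runner instead of using `exec`.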
Apart from that, you can just go ahead and do it your own way. I have found that many senior engineers think they are special when they can make Claude Code do something. They credit their setup, but I am usually able to replicate the result without any setup or Agents.md/Claude.md. The models are good enough without any complex setup.
I agree that extensive modding isn’t required. Just maintaining my claude.md seems to do the trick.
It’s like a fire extinguisher that helps engineers manage the problems they created to begin with.
(I ended up just using the Claude web interface and making it use a checklist; it took 8 hours.)
I’ll occasionally have it write a little regex for me, which it does a decent job with. That’s its main use.
That is, don’t judge the LLM hype by Copilot; it’s staggeringly bad.
Copilot is all I have to go by, due to work restrictions. It took a good year to get that. I don’t know if anything else is in the works.
I’ve made a couple things outside of work with ChatGPT, but they were so basic that it was hardly something to get excited about. If it can’t help me at work, it’s hard for me to care much.
The choices those processes make about tool calls, using subagents, etc. make a huge difference in the quality of the result you’ll get. Copilot is just an extremely bad agent compared to the SOTA agents.
Several of the open source agents will outperform Copilot regularly (try aider, OpenCode, or Cline if you can). It’s really just a baffling own goal by Microsoft in how they’ve managed this.
I use them only for early prototypes that we discard early, but can’t use them with legacy codebases because reasons.
But if you are worried, you can use an inference-only solution like Groq.
For personal use, VS Code + GitHub Copilot Pro+ works great (highest limits available for code generation, for $40) and includes over 10 models.
Because it’s been updated even in just the past couple of weeks, everything is there: agents, Codex, Claude.
I only have 16 GB of RAM and I’m coding like 4 projects at once if I’m crazy enough.
These agents work best when you know what you want done, specifically on implementation. If you know what you are doing, some code (such as frontend) can be one-shotted >90% of the time with minimal checks.
Anything lower than that must be checked over by a human + agent; otherwise you risk introducing a critical leak, a bug, or a new security issue.