If the LLM fails, either you didn't describe your desired outcome sufficiently, or it misinterpreted what you said, or it couldn't do it (rare).
Common errors should be encoded as context for future similar tasks; don't bloat skills with anything that isn't shown to be necessary.
I prefer the start-small-and-iterate approach to arrive at a result.
Then I ask it to summarize. Sometimes after that I ask it to generalize.
This is not true for anything complex. They’re instruction followers, of which task completion is just one facet.
They’re also extremely eager to complete tasks without enough information, and do it wrongly. In the case of just describing task completion, despite your best efforts, there are always some oversights or things you didn’t even realize were underspecified.
So it helps a lot to add some process around it, e.g. "look up relevant project conventions and information. think through how to complete the task. ask me clarifying questions to resolve ambiguities. blah blah".
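A concrete version of that kind of process wrapper, as it might appear in a project prompt or instructions file (the exact wording and file references are illustrative, not a canonical template):

```text
Before writing any code:
1. Read the relevant project conventions (e.g. CONTRIBUTING.md, plus an
   existing module that does something similar).
2. Restate the task in your own words and outline your plan.
3. Ask me clarifying questions about anything ambiguous or underspecified.
4. Wait for my answers before implementing.
```

The point of step 4 is to block the model's eagerness to complete the task before the ambiguities are actually resolved.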
Yes, not everything I use LLMs for is going to have the same level of ambiguity or complex requirements. Optimizing by choosing to skip parts of the process is exactly what Addy is talking about in this article.
If Addy reads this, how do you pitch this vs. Superpowers? https://github.com/obra/superpowers
"If you think there is even a 1% chance a skill might apply to what you are doing, you ABSOLUTELY MUST invoke the skill."

It shouldn't be your default, but it should absolutely be tried when your skill/agent test suite shows evidence that the skill isn't being reliably invoked without it.
I showed up on the agentic dev scene prior to superpowers, and I am getting concerned that >50% of my self-rolled processes are now covered by superpowers.
I no longer trust gh stars, can anyone chime in? Is superpowers now truly adopted?
If it is truly valuable, why hasn't Boris integrated the concepts yet?
To give back as much as I can, I use the two built-in CC review processes when appropriate. But, those only do "is this PR good code?"
Far too late, I finally rolled my own custom review skill that tests: "does this PR accomplish what the specs required?"
If I could ask for one more vanilla CC skill, it might be that. However, maybe rolling your own repo-aware skill via prompt is better?
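For what it's worth, a hypothetical sketch of such a repo-aware, spec-checking review skill, in the SKILL.md style that CC skills use (the skill name, frontmatter wording, and spec locations are assumptions about one particular project layout, not a confirmed convention):

```markdown
---
name: spec-review
description: Review a PR against the original spec, not just code quality.
---

When reviewing a PR with this skill:

1. Locate the spec the PR claims to implement (linked issue, design doc,
   or a `specs/` directory; paths are project-specific).
2. List each requirement stated in the spec.
3. For each requirement, cite the diff hunks that satisfy it, or flag it
   as unmet.
4. Flag changes in the diff that no requirement asked for.
5. Output a table: requirement | status (met / partial / unmet) | evidence.
```

The "evidence" column is the useful part: it forces the review to point at concrete hunks rather than assert compliance.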
Curious how normal that is; it would only take a couple of these to really fill the context a lot.
That being said, this post is full of reasonable assertions, so I'm looking forward to experimenting with this... whatever it is.
Very grateful for this repository and everyone who contributed to it!
This (sdlc == working backwards & bar raiser) is so horribly wrong, that I hope this was an LLM hallucination.
In general, I'm starting to see these agent scaffolding systems as an anti-pattern: people obsess over systems for guiding agents and construct elaborate Rube Goldberg machines, and then others cargo-cult them wholesale, in an effort to optimize and control a random process and minimize human involvement.