(Waiting for the Cerebras coding plan to stop being sold out ;)
I've used them for smaller tasks (making small edits), and the "realtime" aspect of it does provide a qualitative difference. It stops being async and becomes interactive.
A sufficient shift in quantity produces a phase shift in quality.
--
That said, the main issue I find with agentic coding is my mental model getting desynchronized. No matter how fast the models get, it takes a fixed amount of time for me to catch up and understand what they've done.
The most enjoyable way I've found of staying synced is to stay in the driver's seat and command many small, rapid edits manually. (i.e. I have my own homebrew "agent" that's just a loop: I prompt it, it proposes edits, I accept or edit them, repeat.)
So the "synchronization" of mental state happens continuously; there is no opportunity for desynchronization, because you are the one driving. I call that approach semi-auto, or Power Coding (akin to Power Armor, which is wielded manually but greatly enhances speed and strength).
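Concretely, the whole thing is barely more than a REPL. A minimal sketch in Python, where call_model() and apply_patch() are hypothetical placeholders for whatever model client and diff-apply logic you wire in:

    # Sketch of the "semi-auto" / Power Coding loop described above.
    # call_model() and apply_patch() are placeholders, not any real tool's API.

    def call_model(instruction):
        raise NotImplementedError("plug in your LLM client here")

    def apply_patch(patch):
        raise NotImplementedError("plug in your diff-apply logic here")

    def power_coding_loop():
        while True:
            instruction = input("you> ")        # one small, rapid edit at a time
            if instruction in ("q", "quit"):
                break
            patch = call_model(instruction)     # model proposes an edit
            print(patch)                        # you read every proposed change
            if input("[a]ccept / anything else to skip> ") == "a":
                apply_patch(patch)              # only accepted edits land on disk

The point is that nothing happens without you looking at it, so there is never a pile of unread changes to catch up on.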
Most agent tools right now don't give you good visibility into what sub-agents are doing or what decisions they're making. You zoom out, let it run, come back to a mess. Tools like OpenCode and Amazon's CLI Agent Orchestrator are trying to fix this - letting you watch what each agent is actually doing and step in to correct or redirect.
OpenCode actually removed the ability to message sub-agents directly. I get why - people would message one after it finished, the conversation would fork off, and the main orchestrator lost track. But I don't love that fix because being able to correct or pivot a sub-agent before it finishes was genuinely useful. They bandaided a real problem by removing a good feature.
Honestly the model that works best for me is treating agents like junior devs working under a senior lead. The expert already knows the architecture and what they want. The agents help crank through the implementation, but you're reviewing everything and holding them to passing tests. That's where the productivity gain actually is. When non-developers try to use agents to produce entire systems with no oversight, that's where things fall apart.
So I wouldn't want agent tools to be "calm" and fade into the background. I want full transparency into what they're doing at all times because that's how you catch wrong turns early. The tooling is still early and rough but it keeps getting better at supporting experts rather than trying to replace them.
I tried to approach it that way as well, but I am realizing that when I let the agent do the implementation, even with clear instructions, I might miss all the "wrong" design decisions it takes, because if I only review and do not implement, I do not discover the "right" way to build something. Especially in places where I am not so familiar myself, and those are the places where it is most tempting to rely on an agent.
The current tools are the infancy of AI-assisted coding. It's like the MS-DOS era. Over time, maybe backpropagating from "your comfort language" to the "target language" could become commonplace.
To be fair, that's not part of the article's title, but rather the title of the website that the article was posted to.
It was a good article though
I've recently been using this tiny[1] skill to generate an order in which to review a PR, and it has been very helpful to me.
If you are going to do a big build-out of something, spec up front, at least enough to have a clear idea of the application's architectural boundaries.
If you are adding features to a mature code base, then the general order of the day is: First have the AI scout all the code related to the thing you are changing. Then have it give you a summary of its general plan of action. Then fire it off and review the results (or watch it, though that's less needed now).
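In code form, that sequence is roughly this (a rough sketch; run_agent() is a hypothetical wrapper around whatever coding agent or CLI you drive, and the prompts are just illustrative):

    # Sketch of the scout -> plan -> execute order described above.
    # run_agent() is a hypothetical stand-in for your agent of choice.

    def run_agent(prompt):
        raise NotImplementedError("plug in your coding agent / CLI here")

    def add_feature(feature):
        # 1. Scout: have it map the code the change actually touches.
        survey = run_agent("Find and summarize all code related to: " + feature)
        # 2. Plan: a short plan you can sanity-check before any edits happen.
        plan = run_agent("Given this survey:\n" + survey +
                         "\noutline your plan of action for: " + feature)
        print(plan)
        # 3. Execute, then review the resulting changes yourself.
        if input("Fire it off? [y/N] ") == "y":
            run_agent("Implement this plan:\n" + plan)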
For smaller edits or even significant features, I often just give it very short instructions of a few sentences; if I have done my job well, the code is fairly opinionated and the models pick up the patterns well, so I don't really have to give much guidance. I'll usually just ask for a few touch-ups like introducing some fluent API niceties.
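(By "fluent API niceties" I just mean the usual builder-style chaining. A made-up Python illustration, not from any real codebase:

    # Methods return self so calls chain into one readable expression.

    class Query:
        def __init__(self, table):
            self.table = table
            self.filters = []
            self.limit_n = None

        def where(self, clause):
            self.filters.append(clause)
            return self        # returning self is what makes it "fluent"

        def limit(self, n):
            self.limit_n = n
            return self

    q = Query("orders").where("status = 'open'").limit(10)
)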
That being said, I do tend to make a few surgical requests of the AI when I review the PR, usually around abstraction seams.
(For my play projects I don't even look at the code any more unless I hit a wall, and I haven't really hit a wall since Opus 4.5, though I do have a material physics simulator that Opus 4.5 wrote that runs REALLY slow that I should muck around in, but I'm thinking of seeing if Opus 4.6 can move it to the GPU by itself first.)
So if I were doing an interview with an interview question, I would probably do a "let's break down what we know", "what can we apply to this", "ok, let's start with x", and then iterate quickly and look at the code to validate as needed.
The same goes for using Claude in a programming interview. If the interview environment is not representative of how people actually work, then the interview needs to be changed.
But the hard part is designing the problem so that it exercises skill.
Why not? It sounds like a skill issue to me.
>It ideally also requires iterating upon the prompt to refine it before execution.
I don't understand. It's not like you would need to one-shot it.