3 points by Anon84 8 hours ago | 1 comment
  • bisonbear 7 hours ago
    For agentic development teams, I see two ways to measure performance:

    How good is the human at using the agent, and how good is the agent itself?

    I agree with the thesis here that the traditional DORA metrics don't carry as much signal in an agentic world. I like the metrics mentioned in the article. Another one I would propose is "number of turns": if the agent goes off course, the human has to spend more turns course-correcting it, whereas if the agent is aligned, the conversation stays short.
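    A "number of turns" metric could be sketched roughly like this, assuming a transcript is stored as a list of role-tagged messages (a hypothetical format; the function name and message schema are illustrative, not from the article):

```python
# Minimal sketch of a "number of turns" metric. The transcript format and
# function name are assumptions for illustration.

def count_human_turns(transcript):
    """Count how often the human had to intervene.

    Few human turns after the initial prompt suggests the agent stayed on
    course; many turns suggest repeated course correction.
    """
    return sum(1 for msg in transcript if msg["role"] == "human")

transcript = [
    {"role": "human", "content": "Add retry logic to the HTTP client."},
    {"role": "agent", "content": "Done: added exponential backoff."},
    {"role": "human", "content": "No, only retry on 5xx responses."},
    {"role": "agent", "content": "Updated to retry only on 5xx."},
]
print(count_human_turns(transcript))  # 2: the initial ask plus one correction
```

    In practice you'd probably want to subtract the initial prompt, so the score measures only corrective turns.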

    For the "measuring the agent itself" part, I'm convinced that traditional benchmarks are broken, and that we need a way to measure our coding agents on our own tasks; anything else is irrelevant noise.
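    Measuring an agent on your own tasks might look something like the sketch below: pair each task drawn from your codebase with a pass/fail check (e.g. a test suite) and report the pass rate. Everything here (`run_agent`, the task shape) is a hypothetical stand-in, not anything the article specifies:

```python
# Hedged sketch of a task-based eval harness for a coding agent.
# `run_agent` and the task format are assumptions for illustration.

def run_agent(prompt):
    # Placeholder for invoking your actual coding agent on a prompt;
    # it would return the produced code or diff.
    raise NotImplementedError

def evaluate(tasks, agent=run_agent):
    """Score an agent by pass rate on your own tasks.

    Each task pairs a prompt with a `check` callable that returns True
    when the agent's output is acceptable (e.g. the tests pass).
    """
    passed = sum(1 for task in tasks if task["check"](agent(task["prompt"])))
    return passed / len(tasks)
```

    The point is that the task set and the checks come from your repo, so the score reflects performance on the work you actually care about rather than on a public benchmark.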