6 points by alienll 2 days ago | 4 comments
  • Fawlty 2 days ago
    Karpathy mentioned he hasn't written a line of code since December [0] and says agents only really started working around then. The trajectory in this report lines up with that pretty well: throughput was basically flat for most of 2025, picked up ~25% in Q4, then more than doubled in Q1 2026.

    I've talked to people at a couple of companies that lean heavily on Claude Code internally, and fixing already-merged code still eats a big chunk of their week, mostly because nobody's really tracking what's landing in the codebase anymore. So I went into this report expecting that pain to show up as a spike in "fixes." It kinda doesn't, though? Fixes only went up 3pp (15→18%), while "growth" work picked up 7pp. So either the cleanup is happening earlier, in code review before merge, or commit data just can't see it. Curious if either of these matches what you're seeing.

    [0] https://x.com/karpathy/status/2026731645169185220
    • petermalina 21 hours ago
      The total performance includes fixes, so fixes effectively more than doubled in absolute numbers. At the same time, it's never been easier to build new features. One more assumption: since people don't read as much code anymore, they don't refactor as much either. Things we considered "smelly" in the past are now just part of the codebase. Same goes for fixes: in the past, we fixed bugs proactively because we saw them. Now we wait for them to backfire. Let's see what Q2 brings.
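A rough sketch of the arithmetic behind "more than doubled in absolute numbers," using only the figures quoted in this thread. It assumes the 15% and 18% are shares of total throughput before and after the Q1 2026 jump, and it treats "more than doubled" as exactly 2.0x, so the result is a lower bound and not a number from the report. The function name is made up for illustration.

```python
def absolute_growth(share_before: float, share_after: float,
                    throughput_multiplier: float) -> float:
    """Growth factor of a category's absolute volume, given its share of
    total throughput before and after, and how much the total grew."""
    return throughput_multiplier * share_after / share_before


# Assumption: total throughput exactly doubled (the thread says "more than
# doubled", so 2.0 is a floor, not a measured value).
THROUGHPUT_MULTIPLIER = 2.0

# Fixes went from 15% to 18% of total throughput (figures from the thread).
fix_growth = absolute_growth(0.15, 0.18, THROUGHPUT_MULTIPLIER)

# Everything that is not a fix went from 85% to 82% of the total.
non_fix_growth = absolute_growth(0.85, 0.82, THROUGHPUT_MULTIPLIER)

print(f"fixes: at least {fix_growth:.2f}x")
print(f"non-fix work: at least {non_fix_growth:.2f}x")
```

Under those assumptions, a 3pp rise in share on top of a doubled total means the absolute volume of fixes grew faster than everything else combined, even though the share shift looks small.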
    • alienll 2 days ago
      [flagged]
  • alienll 2 days ago
    We believe that in order to improve something, you need to measure it.

    For a long time, engineering was running on vibes and was the only department without KPIs. I understand that performance (ETV) is just one piece of the puzzle and that it is much more important to build the right things, but to start figuring out this problem, you need to understand the story behind the commit. If you understand the story, you can tie it to plans (OKRs, epics, Jira tickets, etc.). We are actually doing that, but it is not used for this research.

  • michalstencl 2 days ago
    Finally, a tool that can measure coding performance.
  • alienll 2 days ago
    [dead]