33 points by adocomplete 4 hours ago | 6 comments
  • xlii 31 minutes ago
    > We've been running Code Review internally for months: on large PRs (over 1,000 lines changed), 84% get findings, averaging 7.5 issues. On small PRs under 50 lines, that drops to 31%, averaging 0.5 issues. Engineers largely agree with what it surfaces: less than 1% of findings are marked incorrect.

    So the takeaway would be that 84% of heavily Claude-driven PRs are riddled with ~7.5 issue-worthy bugs each.

    Not a great ad for agent-based development quality.

  • CharlesW 3 hours ago
    Interesting: "Reviews are billed on token usage and generally average $15–25, scaling with PR size and complexity."
    • cbovis 2 hours ago
      This cost seems wild. For comparison GitHub Copilot Code Review is four cents per review once you're outside of the credits included with your subscription.
    • Twixes an hour ago
      Average _per review_? Insane costs, that's potentially thousands per developer. Am I missing something?
      • remus 12 minutes ago
        I haven't used it so just spitballing, but surely it depends on the quality of the review? If it picks up lots of issues and prevents downtime then it could work out as worthwhile. What would it cost an engineer with deep knowledge of the codebase to do a similar job? You could spend an hour really digging into a PR, poking around, testing stuff out etc. I'm guessing most engineers are paid more than $15-25/hr, not to mention the opportunity cost.
    • karmakaze 3 hours ago
      At those prices I wonder if it also reviews the design for performance inefficiencies or poor decomposition into maintainable units, besides catching the bugs.

      Also the examples are weird IMO. Unless it was an edge/corner case, the authentication bug would be caught in even a smoke test. And for the ZFS encryption refactor I'd expect a statically typed language to catch type errors unless they're casting from `void*` or something. Seems like they picked examples more by how important/newsworthy the areas were than by the technicality of the finds.

    • atonse an hour ago
      Wait, what? So if I'm a paying Max user, I'd still have to pay more? Don't see the value. Would rather have a repo skill to do the code review with existing Claude Max tokens.
  • cpncrunch an hour ago
    Does AI review of AI-generated code even make sense?
  • lowsong 29 minutes ago
    > Reviews are billed on token usage and generally average $15–25, scaling with PR size and complexity.

    You've got to be completely insane to use AI coding tools at this point.

    This is the subsidised cost to get users to use it; it could trivially end up ten times this amount. Plus, you've got the ultimate perverse incentive where the company that is selling you the model time to create the PRs is also selling you the review of the same PR.

  • Bnjoroge 2 hours ago
    What are the implications for the tens of code review platforms that have recently raised at sky-high valuations?
  • simianwords 3 hours ago
    nice but why is this not a system prompt? what's the value add here?
    • NoahZuniga 2 hours ago
      You're paying the same token rate for this as you would if it was just a system prompt. Clearly the scaffolding adds something.

      (They mention their GitHub Action, which seems more like a system prompt.)

      • simianwords 2 hours ago
        seems like a very small value add. why is this a blog post - i could do this myself.
      • sixothree an hour ago
        Does this only work with GitHub Actions? What about DevOps and GitLab?