18 points by jonbaer 3 hours ago | 5 comments
  • miguel_martin 26 minutes ago
    It’s unfortunate that they didn’t eval using subagents/orchestration for such a complex set of tasks (from what I can tell), e.g. analyze the program to produce an initial spec -> code -> review, and rinse & repeat, with each of those steps allocated to a separate subagent.

    I would be interested to see if there’s a significant quantifiable difference.
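
    A minimal sketch of the pipeline described above (all function names and logic are stand-ins invented for illustration; a real setup would run each step in a separate subagent session against a model):

```python
# Hypothetical orchestration loop: analyze -> code -> review, rinse & repeat.
# Each function stands in for a separately allocated subagent.

def analyze(program: str) -> str:
    return f"spec for: {program}"          # stub "spec" subagent

def code(spec: str) -> str:
    return f"implementation of ({spec})"   # stub "coder" subagent

def review(impl: str) -> bool:
    return "spec" in impl                  # stub "reviewer" subagent

def orchestrate(program: str, max_rounds: int = 3) -> str:
    spec = analyze(program)
    for _ in range(max_rounds):            # repeat until the reviewer approves
        impl = code(spec)
        if review(impl):
            return impl
    raise RuntimeError("reviewer never approved")
```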

  • _pdp_ an hour ago
    I am not surprised but this one sticks out...

    > Models favor monolithic, single-file implementations that diverge sharply from human-written code.

    Well, all of our code is monolithic, with some files close to 20K lines of code, and we do use coding agents - not for the original code, but as of late. I've always had a hunch that splitting everything into tiny files does not improve AI coding agent performance, although that feels counterintuitive given model context constraints.

    To me the important parts of a program should be clustered together so the implementation is obvious. Scattering the implementation across various files all over the source tree does not help much with building a mental model.

    That also closely matches how software used to be written in the past.

    • Garlef 7 minutes ago
      > Scattering the implementation in various files all over the source tree

      If you treat the source tree seriously, you can communicate a lot through how it is structured.

    • BurningPenguin 36 minutes ago
      Kinda surprising to me, since I had some trouble with Cursor & Co. once the file went over ~800 lines. It repeatedly failed to edit it until I split it up into multiple logical components. As it should have been from the beginning...

      Though it was some time ago, so things might have improved?

      • _pdp_ 13 minutes ago
        In VSCode, basically any model can edit the 20K-line file without issues. The coding harness does not read the entire file at once, though; it reads chunks of it, so file size does not really matter. What matters is how close together the things the agent needs to make the edit are.
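
        A minimal sketch of the chunked-read pattern described above (the function name, signature, and chunk size are hypothetical, not any real harness's API):

```python
# Hypothetical sketch: instead of loading all 20K lines into context,
# a harness can read a bounded window of lines around the edit site.

def read_chunk(path: str, start_line: int, num_lines: int = 200) -> str:
    """Return up to `num_lines` lines starting at 1-indexed `start_line`."""
    with open(path, "r", encoding="utf-8") as f:
        lines = f.readlines()
    return "".join(lines[start_line - 1 : start_line - 1 + num_lines])
```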
  • luca-ctx 44 minutes ago
    RE: monolithic, single-file implementations

    We have a lint rule that caps source files at 650 LOC, and it works really well.
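
    Such a cap can be sketched as a tiny standalone check (this is a hypothetical illustration, not the commenter's actual lint; the 650 limit comes from the comment above):

```python
# Minimal LOC-cap lint sketch: report every file exceeding the limit.
import sys
from pathlib import Path

MAX_LOC = 650  # cap from the comment; adjust to taste

def check(paths):
    """Return (path, loc) pairs for files over the cap."""
    failures = []
    for p in paths:
        loc = sum(1 for _ in Path(p).open(encoding="utf-8"))
        if loc > MAX_LOC:
            failures.append((str(p), loc))
    return failures

if __name__ == "__main__":
    bad = check(sys.argv[1:])
    for p, loc in bad:
        print(f"{p}: {loc} lines exceeds cap of {MAX_LOC}")
    sys.exit(1 if bad else 0)
```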

  • vatsachak an hour ago
    In before "but they did not use my agent swarm"
    • red75prime 4 minutes ago
      In science, N=1 is statistically insignificant. In business, it might mean you have a product.
    • makerofthings an hour ago
      It’s the annoying thing about AI. If it works, the AI is magic. If it doesn’t work, you’re using it wrong.
  • keyle an hour ago
    How long until AI is not even writing code but producing machine code?

    Think about it: all these compilers and tooling, what a waste!

    I imagine a future where chipset makers will provide a model you can just prompt to "act upon that chipset" and voila, "You're absolutely right! Here is your binary."

    We won't be developers, we won't be devops, we'll be rollmops! /s

    • _pdp_ an hour ago
      Coding agents can write ASM. But if you mean writing the actual byte-code, that would require a very different approach, at a very different level of abstraction, one that LLMs are not designed for. Keep in mind that all LLMs are trained first on text and then fine-tuned on code.
      • keyle an hour ago
        Good point! Long live ASM! Wasm everything!!1 /jk
    • quinnjh an hour ago
      My hunch is that it would take years of hundreds of thousands of developers working with machine code, posting Stack Overflow questions about machine code, and publishing GitHub repos written in it with documentation. That's all the free labor LLMs leveraged to use high-level langs.

      >We won't be developers, we won't be devops, we'll be modelops! /s

      I can still see this happening with higher-level langs. The thing is, the compiler is not replaced in the training data; more likely, LLMs will give rise to semideterministic layers on top of compilers.

      I could see Nvidia achieving this first, given how nice the devex is with CUDA.

      • osti 37 minutes ago
        I heard they are already proficient at assembly languages.