Eight years of wanting, three months of building with AI(lalitm.com)

936 pointsby brilee2 days ago76 comments

Aurornis2 days ago
Refreshing to see an honest and balanced take on AI coding. This is what real AI-assisted coding looks like once you get past the initial wow factor of having the AI write code that executes and does what you asked.
This experience is familiar to every serious software engineer who has used AI code gen and then reviewed the output:
> But when I reviewed the codebase in detail in late January, the downside was obvious: the codebase was complete spaghetti14. I didn’t understand large parts of the Python source extraction pipeline, functions were scattered in random files without a clear shape, and a few files had grown to several thousand lines. It was extremely fragile; it solved the immediate problem but it was never going to cope with my larger vision,
Some people never get to the part where they review the code. They go straight to their LinkedIn or blog and start writing (or having ChatGPT write) posts about how manual coding is dead and they’re done writing code by hand forever.
Some people review the code and declare it unusable garbage, then also go to their social media and post how AI coding is completely useless and they’re not going to use it for anything.
This blog post shows the journey that anyone not in one of those two vocal minorities is going through right now: A realization that AI coding tools can be a large accelerator but you need to learn how to use them correctly in your workflow and you need to remain involved in the code. It’s not as clickbaity as the extreme takes that get posted all the time. It’s a little disappointing to read the part where they said hard work was still required. It is a realistic and balanced take on the state of AI coding, though.
- yojo2 days ago
  +1
  I’ve been driving Claude as my primary coding interface the last three months at my job. Other than a different domain, I feel like I could have written this exact article.
  The project I’m on started as a vibe-coded prototype that quickly got promoted to a production service we sell.
  I’ve had to build the mental model after the fact, while refactoring and ripping out large chunks of nonsense or dead code.
  But the product wouldn’t exist without that quick and dirty prototype, and I can use Claude as a goddamned chainsaw to clean up.
  On Friday, I finally added a type checker pre-commit hook and fixed the 90 existing errors (properly, no type ignores) in ~2 hours. I tried full-agentic first, and it failed miserably, then I went through error by error with Claude, we tightened up some exiting types, fixed some clunky abstractions, and got a nice, clean result.
  AI-assisted coding is amazing, but IMO for production code there’s no substitute for human review and guidance.
  - theshrike794 minutes ago
    [delayed]
  - figassis2 days ago
    My process: start ideating and get the AI to poke holes in your reasoning, your vision, scalability, etc. do this for a few days while taking breaks. This is all contained in one Md file with mermaid diagrams and sections.
    Then use ideation to architect, dive into details and tell the AI exactly what your choices are, how certain methods should be called, how logging and observability should be setup, what language to use, type checking, coding style (configure ruthless linting and formatting before you write a single line of code), what testing methodology, framework, unit, integration, e2e. Database, changes you will handle migrations, as much as possible so the AI is as confined as possible to how you would do it.
    Then, create a plan file, have it manage it like a task list, and implement in parts, before starting it needs to present you a plan, in it you will notice it will make mistakes, misunderstand some things that you may me didn’t clarify before, or it will just forget. You add to AGENTS.md or whatever, make changes to the ai’s plan, tell it to update the plan.md and when satisfied, proceed.
    After done, review the code. You will notice there is always something to fix. Hardcoded variables, a sql migration with seed data that should actually not be a migration, just generally crazy stuff.
    The worst is that the AI is always very loose on requirements. You will notice all its fields are nullable, records have little to no validation, you report an error when testing and it tried to solve it with an brittle async solution, like LISTEN/NOTIFY or a callback instead of doing the architecturally correct solution. Things that at scale are hell to debug, especially if you did not write the code.
    If you do this and iterate you will gradually end up with a solid harness and you will need to review less.
    Then port it to other projects.
    ElFitza day ago
    > After done, review the code. You will notice there is always something to fix. Hardcoded variables, a sql migration with seed data that should actually not be a migration, just generally crazy stuff. > > The worst is that the AI is always very loose on requirements. You will notice all its fields are nullable, records have little to no validation, you report an error when testing and it tried to solve it with an brittle async solution, like LISTEN/NOTIFY or a callback instead of doing the architecturally correct solution. Things that at scale are hell to debug, especially if you did not write the code.
    For that I usually get it reviewed by LLMs first, before reviewing it myself.
    Same model, but clean session, different models from different providers. And multiple (at least 2) automated rounds of review -> triage by the implementing session -> addressing + reasons for deferring / ignoring deferred / ignored feedbacks -> review -> triage by the implementing session -> …
    Works wonders.
    Committing the initial spec / plan also helps the reviewers compare the actual implementation to what was planned. Didn’t expect it, but it’s worked nicely.
    deadbabea day ago
    LISTEN/NOTIFY is not brittle, we use it for millions of events per day.
    JoelJacobsona day ago
    I agree! It should be very stable, IMO. If not, then please send a bug report and we'll look into it. Also, now it scales well with the number of listening connections (given clients listen on unique channel names): https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit...
    deadbabea day ago
    The LISTEN/NOTIFY feature really just doesn’t get enough PR. It is perfectly suitable for production workloads yet people still want to reach for more complicated solutions they don’t need.
    figassisa day ago
    It's not the feature itself, it's how/what the llm tries to use it for. It uses it to cross any and all architectural boundaries.
    jamiemallersa day ago
    [dead]
    irishcoffee2 days ago
    I find it very interesting that you assume this method would branch out to other projects. I find it even more interesting that you assume all software codebases use a database, give a damn about async anything, and that these ideas percolate out to general software engineering.
    Sounds like a solid way to make crud web apps though.
    klardotsha day ago
    GP is clearly providing examples of categories of tasks. Sure, not all languages do “async fn foo()”, but almost all problem domains involve some sort of making sure the right things happen at the right times, which is in a similar ballpark.
    Holier than thou “yeah well I work on stuff that doesn’t use databases, checkmate!” doesn’t really land - data still gets moved around somehow, and often over a network!
    irishcoffeea day ago
    Not trying to "land" anything.
  - camdenreslink2 days ago
    I’ve found that LLMs will frequently do extremely silly things that no person would do to make typescript code pass the typechecker.
    gck114 hours ago
    I've noticed this too, but not necessarily type checkers, but more with linters. And can't really figure out if there's even a way to solve it.
    If you set up restrictive linters and don't explicitly prohibit agents from adding inline allows, most LOC will be allow comments.
    Based on this learning, I've decided to prohibit any inline allows. And then agents started doing very questionable things to satisfy clippy.
    Recent example:
    - Claude set up a test support module so that it could reuse things. Since this was not used in all tests, rust complained about dead_code. Instead of making it work, claude decided to remove test support module and just... blow up each test.
    If you enable thinking summaries, you'll always see agent saying something like: "I need to be pragmatic", which is the right choice 50% of the time.
    akdev1l2 days ago
    You need to very specific and also question the output if it does something insane
    justonceokay2 days ago
    This decade’s version of “works on my box”
    mexicocitinlueza day ago
    I've found it's less about specificity and more about removing the # of critical assumptions it needs to make. Being too specific can be a hindrance in it's own regard.
    And that's also a decent barometer for what it's good at. The more amount of critical assumptions AI needs to make, the less likely it is to make good ones.
    For instance, when building a heat map, I don't have to get specific at all because the amount of consequential assumptions it needs to make is slim. I don't care or can change the colors, or the label placement.
    dpkirchnera day ago
    I caught it using Parameters<typeof otherfn>[2] the other day. It wanted to avoid importing a type, so it did this nonsense. (I might have the syntax slightly wrong here, I'm writing from memory.)
    But it's not all bad news. TIL about Parameters<T>.
    9rxa day ago
    Yeah, I've found LLMs cannot write good Typescript code period. The good news is that they are excellent at some other languages.
    darkstarsysa day ago
    I can't agree here. https://pelorus-nav.com/ (one of my side projects) is 95-98% written by Claude Opus 4.6, all in very nice typescript which I carefully review and correct, and use good prompting and context hygiene to ensure it doesn't take shortcuts. It's taken a month or so but so worth it. And my packing list app packzen.org is also pretty decent typescript all through.
    9rxa day ago
    > which I carefully review and correct
    So you do agree? If you are having to review and correct then it's not really the LLM writing it anymore. I have little doubt that you can write good Typescript, but that's not what I said. I said LLMs cannot write good Typescript and it seems you agree given your purported actions towards it. Which is quite unlike some other languages where LLMs write good code all the time — no hand holding necessary.
    darkstarsysan hour ago
    I find correction is rarely necessary with Opus 4.6. Definitely not so much that "it's not really the LLM writing it anymore." More like it's the author and I'm the editor (in this limited case -- of course architecturally the ideas are all mine.) But I totally respect that my prompt style, the type of app I'm writing, and other factors could be influencing my success vs. others' lack of success.
    camdenreslinka day ago
    I think it can write working TypeScript code, and it can write good TypeScript code if it is guided by a knowledgable programmer. It requires actually reviewing all the code and giving pointed feedback though (which at that point is only slightly more efficient than just writing it yourself).
    9rxa day ago
    > It requires actually reviewing all the code and giving pointed feedback though
    Exactly. You can write good Typescript, no doubt, but LLMs cannot. This is not like some other languages where LLM generated code is actually consistently good without needing to become the author.
  - ffsm82 days ago
    Fwiw, the article mirrors my experience when I started out too, even exactly with the same first month of vibecoding, then the next project which I did exactly like he outlined too.
    Personally, I think it's just the natural flow when you're starting out. If he keeps going, his opinion is going to change and as he gets to know it better, he'll likely go more and more towards vibecoding again.
    It's hard to say why, but you get better at it. Even if it's really hard to really put into words why
    snovv_crash2 days ago
    Given how addictive vibecoding is, I think it's very hard to be objective about the results if you are involved in the process.
    delusional2 days ago
    It's a little like asking a cokehead how the addiction is going for him while he is high. Obviously he's going to say it's great because the consequences haven't hit him. Some percentage of addicts will never realize it was a problem at all.
    Its not random that AI happens to be built by the very same people that turned internet forums into the most addictive communication technology ever.
    59nadira day ago
    > he'll likely go more and more towards vibecoding again
    I think "more and more" is doing some very heavy lifting here. On the surface it reads like "a lot" to many people, I think, which is why this is hard to read without cringing a bit. Read like that it comes off as "It's very addictive and eventually you get lulled into accepting nonsense again, except I haven't realized that's what's happening".
    But the truth is that this comment really relies entirely on what "more and more" means here.
    devmor2 days ago
    You can’t put it into words? Why? Perhaps you haven’t looked at it objectively?
    It may actually be true. Your feeling might be right - but I strongly caution you against trusting that feeling until you can explain it. Something you can’t explain is something you don’t understand.
    ffsm82 days ago
    really?
    have you ever learned a skill? Like carving, singing, playing guitar, playing a video game, anything?
    It's easy to get better at it without understanding why you're better at it. As a matter of fact, very very few people master the discipline enough to be able to grasp the reason for why they're actually better
    Most people just come up with random shit which may or may not be related. Which I just abstained from.
    SpicyLemonZesta day ago
    I've learned a number of skills, and for me none of them worked in the way you're describing. I didn't learn to cut good miter joints by randomly vibe-sawing wood until I unlocked miter joints in the skill tree. I carefully studied the errors I made, and adjusted in ways I thought might correct them, some of which helped some of which did not. Then eventually I understood the relationship between my actions and the underlying principles in enough detail to consistently hit 45 degrees.
    eszeda day ago
    Isn't that example pretty reductive, in that you have a directly-measurable output? I mean, the joint is either 45° (well, 90°) or it's not. Zoom out a bit, and the skill-set becomes much less definable: are my cabinets good - for some intersection of well-proportioned, elegantly-finished, and fit for purpose, with well-chosen wood and appropriate hardware.
    Mind you, I don't think the process of improvement in those dimensions is fundamentally different, just much less direct and not easily (or perhaps even at all) articulable.
    devmor2 days ago
    You can get better at something without understanding why, but you should be able to think about it and determine why fairly easily.
    This is something everyone who cares about improving in a skill does regularly - examine their improvement, the reasons behind it, and how to add to them. That’s the basis of self-driven learning.
    danielmarkbrucea day ago
    This is an absurd statement. There are many complex undertakings in sport where even the very best get better with practice and can't tell you why. In fact, the ones who think they can tell you why are the one's to be most skeptical of.
    You are just making stuff up or regurgitating material from a pop science book.
    skydhasha day ago
    They can't tell you (not everyone is eloquent), but they sure know why. Struggling to put something in word is not the same as not knowing.
    threethirtytwoa day ago
    Much of human behavior is evolved so that we don't understand why. For example human morality is an evolved trait, but you wouldn't know it.
    Please explain walking to me so that I can explain it to a person who forgot how to walk such that he can walk after the explanation.
    danielmarkbrucea day ago
    Nope, they don't.
    devmor17 hours ago
    Instead of accusing others of making things up, perhaps step back and re-evaluate the conversation you're taking part in. In this instance, it appears that you misunderstood or skipped over the word "learning".
    ffsm82 days ago
    Not really. I can obviously say something, like you learn which features the models are able to actually implement, and you learn how to phrase and approach trickier features to get the model too do what you want.
    And that's not really explainable without exploring specific examples. And now we're in thousands of words of explanation territory, hence my decision to say it's hard to put it into words.
    devmor2 days ago
    I think you’re handwaving away vague, ungrounded intuition and calling it learning.
    For instance, if I say “I noticed I run better in my blue shoes than my red shoes” I did not learn anything. If I examine my shoes and notice that my blue shoes have a cushioned sole, while my red shoes are flat, I can combine that with thinking about how I run and learn that cushioned soles cause less fatigue to the muscles in my feet and ankles.
    The reason the difference matters is because if I don’t do the learning step, when buy another pair of blue shoes but they’re flat soled, I’m back to square one.
    Back to the real scenario, if you hold on to your ungrounded intuition re what tricks and phrasing work without understanding why, you may find those don’t work at all on a new model version or when forced to change to a different product due to price, insolvency, etc.
    eszeda day ago
    You're always free to stop at the level of abstraction at which you find a certain answer to be satisfying, but you can also keep digging. Why are flat shoes better? Well, it's to do with my gait. Ok, but why is my gait like that? Something-something musculoskeletal. Why is my body that way? Something-something genetic. OK, but why is that? And so on.
    Pursued far enough, any line of thought will reach something non-deterministic - or, simply, That's The Way It Is - however unsatisfying that is to those of us who crave straightforward answers. Like it or not, our ground truth as human beings ultimately rests on intuition. (Feel free to say, "No, it's physics", or "No, it's maths", but I'll ask you if you're doing those calculations in your head as you run!)
    devmora day ago
    It is very silly to treat zero grounding the same as accepting core, proven concepts. Your PoV here is no different than saying "It rains because god is sad and crying" is an appropriate thing to believe.
    If you want to say "god is responsible for creating the precipitation cycle", sure. But we don't disregard understanding that exists to substitute intuition.
    a day ago
    undefined
- libraryofbabel2 days ago
  Agree. This is such a good balanced article. The only things that still make the insights difficult to apply to professional software development are: this was greenfield work and it was a solo project. But that’s hardly the author’s fault. It would however be fantastic to see more articles like this about how to go all in on AI tools for brownfield projects involving more than one person.
  One thing I will add: I actually don’t think it’s wrong to start out building a vibe coded spaghetti mess for a project like this… provided you see it as a prototype you’re going to learn from and then throw away. A throwaway prototype is immensely useful because it helps you figure out what you want to build in the first place, before you step down a level and focus on closely guiding the agent to actually build it.
  The author’s mistake was that he thought the horrible prototype would evolve into the real thing. Of course it could not. But I suspect that the author’s final results when he did start afresh and build with closer attention to architecture were much better because he has learned more about the requirements for what he wanted to build from that first attempt.
  - argee2 days ago
    This wasn't even just greenfield work, it included the exact type of work where AI arguably excels: extracting working code from an extant codebase (SQLite) as a reusable library. (It also included the type of work AI is really bad at: designing APIs sensibly.)
  - szundi2 days ago
    [dead]
- csallen2 days ago
  I'll take the other side of this.
  Professional software engineers like many of us have a big blind spot when it comes to AI coding, and that's a fixation on code quality.
  It makes sense to focus on code quality. We're not wrong. After all, we've spent our entire careers in the code. Bad code quality slows us down and makes things slow/insecure/unreliable/etc for end users.
  However, code quality is becoming less and less relevant in the age of AI coding, and to ignore that is to have our heads stuck in the sand. Just because we don't like it doesn't mean it's not true.
  There are two forces contributing to this: (1) more people coding smaller apps, and (2) improvements in coding models and agentic tools.
  We are increasingly moving toward a world where people who aren't sophisticated programmers are "building" their own apps with a user base of just one person. In many cases, these apps are simple and effective and come without the bloat that larger software suites have subjected users to for years. The code is simple, and even when it's not, nobody will ever have to maintain it, so it doesn't matter. Some apps will be unreliable, some will get hacked, some will be slow and inefficient, and it won't matter. This trend will continue to grow.
  At the same time, technology is improving, and the AI is increasingly good at designing and architecting software. We are in the very earliest months of AI actually being somewhat competent at this. It's unlikely that it will plateau and stop improving. And even when it finally does, if such a point comes, there will still be many years of improvements in tooling, as humanity's ability to make effective use of a technology always lags far behind the invention of the technology itself.
  So I'm right there with you in being annoyed by all the hype and exaggerated claims. But the "truth" about AI-assisted coding is changing every year, every quarter, every month. It's only trending in one direction. And it isn't going to stop.
  - mjr002 days ago
    > However, code quality is becoming less and less relevant in the age of AI coding, and to ignore that is to have our heads stuck in the sand. Just because we don't like it doesn't mean it's not true.
    Strongly disagree with this thesis, and in fact I'd go completely the opposite: code quality is more important than ever thanks to AI.
    LLM-assisted coding is most successful in codebases with attributes strongly associated with high code quality: predictable patterns, well-named variables, use of a type system, no global mutable state, very low mutability in general, etc.
    I'm using AI on a pretty shitty legacy area of a Python codebase right now (like, literally right now, Claude is running while I type this) and it's struggling for the same reason a human would struggle. What are the columns in this DataFrame? Who knows, because the dataframe is getting mutated depending on the function calls! Oh yeah and someone thought they could be "clever" and assemble function names via strings and dynamically call them to save a few lines of code, awesome! An LLM is going to struggle deciphering this disasterpiece, same as anyone.
    Meanwhile for newer areas of the code with strict typing and a sensible architecture, Claude will usually just one-shot whatever I ask.
    edit: I see most replies are saying basically the same thing here, which is an indicator.
    59nadira day ago
    I agree entirely with your statement that structure makes things easier for both LLMs and humans, but I'd gently push back on the mutation. Exactly as mutation is fine for humans it also seems to be fine for LLMs in that structured mutation (we know what we can change, where we can change it and to what) works just fine.
    Your example with the dataframes is completely unstructured mutation typical of a dynamic language and its sensibilities.
    I know from experience that none of the modern models (even cheap ones) have issues dealing with global or near-global state and mutating it, even navigating mutexes/mutices, conds, and so on.
    csallen2 days ago
    > LLM-assisted coding is most successful in codebases with attributes strongly associated with high code quality: predictable patterns, well-named variables, use of a type system, no global mutable state, very low mutability in general, etc.
    That's all very true, but what you're missing is that the proportion of codebases that need this is shrinking relative to the total number of codebases. There's an incredible proliferation of very small, bespoke, simple, AI-coded apps, that are nonetheless quite useful. Most are being created by people who have never written a line of code in their life, who will do no maintenance, and who will not give two craps how the code looks, any more than the average YouTuber cares about the aperture of their lens or the average forum commenter care about the style of their prose.
    We don't see these apps because we're professional software engineers working on the other stuff. But we're rapidly approaching a world where more and more software is created by non-professionals.
    mjr002 days ago
    > That's all very true, but what you're missing is that the proportion of codebases that need this is shrinking relative to the total number of codebases. There's an incredible proliferation of very small, bespoke, simple, AI-coded apps, that are nonetheless quite useful. Most are being created by people who have never written a line of code in their life, who will do no maintenance, and who will not give two craps how the code looks, any more than the average YouTuber cares about the aperture of their lens or the average forum commenter care about the style of their prose.
    I agree that there will be more small, single-use utilities, but you seem to believe that this will decrease the number or importance of traditional long-lived codebases, which doesn't make sense. The fact that Jane Q. Notadeveloper can vibe code an app for tracking household chores is great, but it does not change the fact that she needs to use her operating system (a massive codebase) to open Google Chrome (a massive codebase) and go to her bank's website (a massive codebase) to transfer money to her landlord for rent (a process which involves many massive software systems interacting with each other, hopefully none of which are vibe coded).
    The average YouTuber not caring about the aperture of their lens is an apt comparison: the median YouTube video has 35 views[0]. These people likely do not care about their camera or audio setup, it's true. The question is, how is that relevant to the actual professional YouTubers, MrBeast et al, who actually do care about their AV setup?
    [0] https://www.intotheminds.com/blog/en/research-youtube-stats/
    csallena day ago
    This is where I get into much more speculative land, but I think people are underestimating the degree to which AI assistant apps are going to eat much of the traditional software industry. The same way smart phones ate so many individual tools, calculators, stop watches, iPods, etc.
    It takes a long time for humanity to adjust to a new technology. First, the technology needs to improve for years. Then it needs to be adopted and reach near ubiquity. And then the slower-moving parts of society need to converge and rearrange around it. For example, the web was quite ready for apps like Airbnb in the mid 90s, but the adoption+culture+infra was not.
    In 5, maybe 10, certainly 15 years, I don't think as many people are going to want to learn, browse, and click through a gazillion complex websites and apps and flows when they can easily just tell their assistant to do most of it. Google already correctly realizes this as an existential threat, as do many SaaS companies.
    AI assistants are already good enough to create ephemeral applications on the fly in response to certain questions. And we're in the very, very early days of people building businesses and infra meant to be consumed by LLMs.
    mjr00a day ago
    > In 5, maybe 10, certainly 15 years, I don't think as many people are going to want to learn, browse, and click through a gazillion complex websites and apps and flows when they can easily just tell their assistant to do most of it.
    And how do you think their assistant will interact with external systems? If I tell my AI assistant "pay my rent" or "book my flight" do you think it's going to ephemerally vibe code something on the banks' and airlines' servers to make this happen?
    You're only thinking of the tip of the iceberg which is the last mile of client-facing software. 90%+ of software development is the rest of the iceberg, unseen beneath the surface.
    I agree there will be more of this but again, that does not preclude the existence of more of the big backend systems existing.
    csallena day ago
    I don't think we disagree. We still have big mainframe systems from the 70s and beyond that a powering parts of society. I don't think all current software systems are just going to die or disappear, especially not the big ones. But I do think significant double digit percentages of software engineers are working on other types of software that are at risk of becoming first- or second- or third-order casualties in a world where ephemeral AI assistant-generated software and vibe coded bespoke software becomes increasingly popular.
    grey-area20 hours ago
    You are vastly overstating the capabilities of LLMs and the capacity and desire of non-technical individuals to use them to create applications.
    csallen16 hours ago
    What's even the point of vague replies like this that disagree with no real evidence, arguments, or examples?
    joe_the_usera day ago
    The thing, everything you describe may be easy for an average person in the future. But just having your single AI agent do all of that will be even easier and that seems like where things will go.
    datsci_est_2015a day ago
    Just like everyone has a 3D printer at home?
    People want convenience, not a way to generate an application that creates convenience.
    HeyLaughingBoy18 hours ago
    And perhaps they'll get that convenience from an application that they don't even know came into existence because they asked their agent to do something.
    datsci_est_201517 hours ago
    What, in practice, is the difference between AGI and what you’re suggesting will exist in terms of agent automation?
  - zozbot2342 days ago
    However, code quality is becoming less and less relevant in the age of AI coding
    It actually becomes more and more relevant. AI constantly needs to reread its own code and fit it into its limited context, in order to take it as a reference for writing out new stuff. This means that every single code smell, and every instance of needless code bloat, actually becomes a grievous hazard to further progress. Arguably, you should in fact be quite obsessed about refactoring and cleaning up what the AI has come up with, even more so than if you were coding purely for humans.
    bananamogula day ago
    Even non-frontier models now offer a context window of 1 million tokens. That's 100K-300K LOCs. I would not call that a limited context.
  - vlovich1232 days ago
    > However, code quality is becoming less and less relevant in the age of AI coding, and to ignore that is to have our heads stuck in the sand. Just because we don't like it doesn't mean it's not true.
    Strong disagree. I just watched a team spend weeks trying to make a piece of code work with AI because the vibe coded was spaghetti garbage that even the AI couldn’t tell what needed to be done and was basically playing ineffective whackamole - it would fix the bug you ask it by reintroducing an old bug or introducing a new bug because no one understood what was happening. And humans couldn’t even step in like normal because no one understood what’s going on.
    csallen2 days ago
    Okay, so you observed one team that had an issue with AI code quality. What's your point?
    In 1998, I'm sure there were newspaper companies who failed at transitioning online, didn't get any web traffic, had unreliable servers crashed, etc. This says very little about what life would be like for the newspaper industry in 1999, 2000, 2005, 2010, and beyond.
    vlovich1232 days ago
    Im arguing that code quality very much still matters and will only continue to matter.
    AI will get better at making good maintainable and explainable code because that’s what it takes to actually solve problems tractably. But saying “code quality doesn’t matter because AI” is definitely not true both experientially and as a prediction. Will AI do a better job in the future? Sure. But because their code quality improves not because it’s less important.
    csallen2 days ago
    Well then sure, we can agree there, it's just a matter of phrasing then.
    vlovich1232 days ago
    Then you may want to clarify what your phrasing meant because I couldn’t find a more charitable interpretation
    csallen2 days ago
    More and more software will be built by non-experts, software that has smaller user bases and simpler use cases and doesn't need to be maintained as much if at all. "Poor AI code quality" matters much less for these than for say, software written by developers at FAANG companies, since literally nobody will ever even look at the code.
    Where we're headed is toward a world where a ton of software is ephemeral, apps literally created by AI out of thin air for a single use, and then gone.
    fn-mote2 days ago
    Ephemeral in the same way the electrical wiring in an old house is ephemeral.
    Which is to say, not at all.
    Original wiring done by a professional, later changes by “vibe electrician” homeowners.
    Every circuit might be a custom job, but they all accumulate into something a SWE calls “technical debt”.
    Don’t like how the toaster and the microwave are on the same circuit even though they are in different parts of the kitchen? You’re lucky if you can even follow the wiring back to the circuit box to see how it was done. The electrical box is so much of a mess where would you even run a new circuit?
    That’s the future we’re looking at.
    csallena day ago
    No ephemeral as in: I'll ask the AI to check my email, and it'll create a bespoke table UI on the fly right inside my AI assistant, and populate it with relevant email data. And I'll use it, and then it will disappear. Software created and destroyed in a moment.
    Not all software is meant to be some permanent building block upon which other software sits.
    When new technology arrives that makes earlier ways of doing things obsolete, the consistent pattern throughout history has been that existing experts and professionals significantly underestimate the changes to come, in large part because (a) they don't like those changes, and (b) they're too used to various constraints and priorities that used to be important but no longer are. In other words, they're judging the new tech the lens of an older world, rather than through the lens of a newer world created by the new tech.
    strokirka day ago
    Yeah, I’ve built many one-off scripts in my day, and these days they take 100x less time.
    atomicnumber32 days ago
    There's almost no point in arguing about this anymore. Neither you nor the other person are going to be convinced. We just have to wait and see if a new crop of 100x productivity AI believer companies come along and unseat all the incumbents.
    layer82 days ago
    It seems that your opinion is based on expectations for the future then, which is notoriously difficult to predict.
    csallen2 days ago
    It's not that hard to predict that obviously useful new technology is going to improve over time.
    Guns, wheels, cars, ships, batteries, televisions, the internet, smartphones, airplanes, refrigeration, electric lighting, semiconductors, GPS, solar panels, antibiotics, printing presses, steam engines, radio, etc. The pattern is obvious, the forces are clear and well-studied.
    If there is (1) a big gap between current capabilities and theoretical limits, (2) huge incentives for those who to improve things, (3) no alternative tech that will replace or outcompete it, (4) broad social acceptance and adoption, and (5) no chance of the tech being lost or forgotten, then technological improvement is basically a guarantee.
    These are all obviously true of AI coding.
    ThrowawayR22 days ago
    That list cherry picks all the successful cases where the technology improved while ignoring the many, many others where it didn't and the technology improved no further. That's dishonest.
    It isn't even a good job of cherry picking: we never got mainstream supersonic passenger aircraft after the Concorde because aerospace technology hasn't advanced far enough to make it economically viable and the decrease in progress and massively increasing costs in semiconductors for cutting edge processes is very well known.
    csallen2 days ago
    You're not factoring in the list of constraints I provided.
    There's no broad social acceptance of supersonic flight because it creates incredibly loud sonic booms that the public doesn't want to deal with. And despite that, it's still a bad counterexample, as companies continue to innovate in this area e.g. Boom Supersonic.
    At best you can say, "It's taking longer than expected," but my point was never that it will happen on any specific schedule. It took 400 years for guns to advance from the primitive fire lances in China to weapons with lock mechanisms in the 1400s. Those long time frames only prove my point even more strongly. Progress WILL happen, when there is appetite and acceptance and incentive and room to grow, and time is no obstacle. It's one of the more certain things in human history, and the forces behind it have been well studies.
    Just as certain: the people and jobs who are obsoleted by these new technologies often remain in denial until they are forgotten.
    vlovich1232 days ago
    If code quality only stops mattering in 400 years (whatever that definition happens to be) then the prediction that it makes is worthless in terms of what you should do today. You use it to argue it’s unimportant deal with it, but if it’s a 400 year payoff you’ve made the wrong bet.
    csallen2 days ago
    Surely you don't think AI coding technology will be as slow to develop as guns were.
    We're obviously talking about 1-10 years here, not 100-1000 years.
    vlovich1232 days ago
    It’s really hard to predict where exponential progress will freeze. I was reading the other day that the field seems to have stagnated again in terms of no really meaningful ideas to overcome the inherent bottlenecks we’ve hit now in terms of diminishing returns for scaling. I’m not a pessimist or unbridled optimist but I think it’s fundamentally difficult to predict and the law of averages suggests someone will end up crowing about being right
    duskdozera day ago
    In contrast to AI/AI companies, which have no negative externalities?
    21 hours ago
    undefined
    mnsc2 days ago
    But hindsight is 20/20 as they say. In 2020 people predicted that Facebook Horizon would only go one direction, always improve and become as pervasive as the internet. So when you predict that the design and architecture capabilities of models will continue to improve, thus making code quality irrelevant, you sound very confident. And if in five years you are right, you will brag about it here. If not, well I for one will not track you down and rub it in your face. Peace out.
    csallen2 days ago
    You're confusing betting on a company/product vs betting on technological improvement in general.
    It is absolutely the case that virtual reality technology will only get better over time. Maybe it'll take 5, or 10, or 20, or 40 years, but it's almost a certainty that we'll eventually see better AR/VR tech in the future than we have in the past.
    Would you bet against that? You'd be crazy to imo.
    slfnflctd2 days ago
    There's a kid outside the window of the place I'm staying who's been in the yard playing and talking with people online through his VR headset for like 2+ hours. He's living in the future. Whatever happens, he and his friends are going to continue to be interested in more of this.
    Whether what they're using in 20 years is produced by the company formerly known as Facebook or not is a whole different question.
    NoMoreNicksLefta day ago
    The newspaper industry is the perfect analogy, because it is effectively dead. Wholesale dead. Here and there, the biggest, most world-renowned papers are still alive, on life-support... NYT, WSJ, etc. But they're all dead. Their death has caused the absolute destruction of an entire industry sector and has given gangrene to adjacent industries that they will soon succumb to. The point about 1998 wasn't that there was this transition that demanded careful attention and wise strategy, but that death was coming for it no matter what anyone did to stop it.
    The death of newspapers is quite the spectacle too. No one seems to understand how bad it is... the youngest generation can't even seem to recognize that anything is missing. We've effectively amateurized journalism so that only grifters and talentless hacks want to attempt it, and only in tiny little soundbites on Twitter or other social media (and they're quickly finding out how it might be more lucrative to do propaganda for foreign governments or MLM charlatanism). When the death of the software industry is complete, it too will have been completely amateurized, the youngest generation will not even appreciate that people used to make it for a living, and the few amateurs doing it will start to comprehend how much more lucrative it will be to just make poorly disguised malware.
  - Philip-J-Fry2 days ago
    I don't buy this at all. Code quality will always matter. Context is king with LLMs, and when you fill that context up with thousands of lines of spaghetti, the LLM will (and does) perform worse. Garbage in, garbage out, that's still the truth from my experience.
    Spaghetti code is still spaghetti code. Something that should be a small change ends up touching multiple parts of the codebase. Not only does this increase costs, it just compounds the next time you need to change this feature.
    I don't see why this would be a reality that anyone wants. Why would you want an agent going in circles, burning money and eventually finding the answer, if simpler code could get it there faster and cheaper?
    Maybe one day it'll change. Maybe there will be a new AI technology which shakes up the whole way we do it. But if the architecture of LLMs stays as it is, I don't see why you wouldn't want to make efficient use of the context window.
    csallen2 days ago
    I didn't say that you "want" spaghetti code or that spaghetti code is good.
    I said that (a) apps are getting simpler and smaller in scope and so their code quality matters less, and (b) AI is getting better at writing good code.
    camdenreslink2 days ago
    Apps are getting bigger and more ambitious in scope as developers try to take advantage of any boost in production LLMs provide them.
    csallen2 days ago
    Every metric I've seen points to there being an explosion in (a) the number of apps that exist and (b) the number of people making applications.
    pessimizer2 days ago
    What relevance do either of those claims have to the claim of the comment you are responding to?
    Are you trying to imply that having more things means that each of them will be smaller? There are more people than there were 500 years ago - are they smaller, or larger?
    Also, the printing press did lead to much longer works. There are many continuous book series that have run for decades, with dozens of volumes and millions of words. This is a direct result of the printing press. Just as there are television shows that have run with continuous plots for thousands of hours. This is a consequence of video recording and production technologies; you couldn't do that with stage plays.
    You seem to be trying to slip "smaller in scope" into your statement without backing, even though I'd insist that applications individuals wrote being "smaller in scope" was a obvious consequence of the tooling available. I can't know everything, so I have to keep the languages and techniques limited to the ones that I do know, and I can't write fast enough to make things huge. The problems I choose to tackle are based on those restrictions.
    Those are the exact things that LLMs are meant to change.
    csallen2 days ago
    The average piece written and published today today is much shorter than the average piece from the past. Look at Twitter. Social media in general. Internet forums. Blog posts. Emails. Chats. Etc. The amount of this content DWARFS other content.
    The same is true of most things that get democratized. Look at video. TikTok, YouTube, YouTube shorts.
    Look at all the apps people are building are building for themselves with AI. They are typically not building Microsoft Word.
    Of course there will be some apps that are bigger and more ambitious than ever. I myself am currently building an app that's bigger an more ambitious than I would have tried to build without AI. I'm well aware of this use case.
    But as many have pointed out, AI is worse at these than at smaller apps. And pretending that these are the only apps that matter is what's leading developers imo to over-value the importance of code quality. What's happening right now that's invisible to most professional engineers is an explosion in the number of time, bespoke personal applications being quickly built by non-developers are that are going to chip away at people's reasons to buy and use large, bloated, professional software with hundreds of thousands of users.
    camdenreslink2 days ago
    > Look at all the apps people are building are building for themselves with AI.
    The apps those people were making before LLMs became ubiquitous were no apps. So by definition they are now larger and more ambitious.
    leptons2 days ago
    There's already been an explosion of apps - and most of them suck, are spam, or worse, will steal your data.
    We don't need more slop apps, we already have that and have for years.
    tadfisher2 days ago
    The Jevons paradox says otherwise. As producing apps becomes cheaper, we will not be able to help ourselves: we will make them larger until they fill all available space and cost just as much to produce and maintain.
    csallen2 days ago
    That's the incorrect application of the Jevons Paradox. We won't get bigger apps, we'll get more apps.
    Think about what happened to writing when we went from scribes to the printing press, and from the printing press to the web. Books and essays didn't get bigger. We just got more people writing.
  - Gigachada day ago
    I’ve been told repeatedly now that if AI coding isn’t working for me it’s because my projects code quality is too poor so the agents can’t understand it.
    Now I’m being told code quality doesn’t matter at all.
  - HeyLaughingBoy18 hours ago
    Controversy much :-)
    I completely agree. Just going through the beginner & hobbyist forums, the change from "can you help me with code to do X" to "I used ChatGPT/Claude/Copilot to write code to do X" happened with absolutely startling speed, and it's not slowing down. There was clearly a pent-up demand here that wasn't being met otherwise.
    People are using AI to get code written. They have no idea what code quality is and only care that what they built works.
    AFAICT, every time technology has allowed non-technical people to do more, it's opened up new opportunities for programmers. I don't expect this to be any different, I just want to know where the opportunities are.
  - vehementia day ago
    Nothing you wrote seems to support what you said at the start there. Why is the importance of code quality decreasing?
  - wiether2 days ago
    > However, code quality is becoming less and less relevant in the age of AI coding, and to ignore that is to have our heads stuck in the sand. Just because we don't like it doesn't mean it's not true. > [...] > We are increasingly moving toward a world where people who aren't sophisticated programmers are "building" their own apps with a user base of just one person. In many cases, these apps are simple and effective and come without the bloat that larger software suites have subjected users to for years. The code is simple, and even when it's not, nobody will ever have to maintain it, so it doesn't matter. Some apps will be unreliable, some will get hacked, some will be slow and inefficient, and it won't matter. This trend will continue to grow.
    I do agree with the fact that more and more people are going to take advantage of agentic coding to write their own tools/apps to maker their life easier. And I genuinely see it as a good thing: computers were always supposed to make our lives easier.
    But I don't see how it can be used as an argument for "code quality is becoming less and less relevant".
    If AI is producing 10 times more lines that are necessary to achieve the goal, that's more resources used. With the prices of RAM and SSD skyrocketing, I don't see it as a positive for regular users. If they need to buy a new computer to run their vibecoded app, are they really reaping the benefits?
    But what's more concerning to me is: where do we draw the line?
    Let's say it's fine to have a garbage vibecoded app running only on its "creator" computer. Even if it gobbles gigabytes of RAM and is absolutely not secured. Good.
    But then, if "code quality is becoming less and less relevant", does this also applies to public/professional apps?
    In our modern societies we HAVE to use dozens of software everyday, whether we want it or not, whether we actually directly interact with them or not.
    Are you okay with your power company cutting power because their vibecoded monitoring software mistakenly thought you didn't paid your bills?
    Are you okay with an autonomous car driving over your kid because its vibecoded software didn't saw them?
    Are you okay with cops coming to your door at 5AM because a vibecoded tool reported you as a terrorist?
    Personally, I'm not.
    People can produce all the trash they want on their own hardware. But I don't want my life to be ruled by software that were not given the required quality controls they must have had.
    csallen2 days ago
    > If AI is producing 10 times more lines that are necessary to achieve the goal, that's more resources used. With the prices of RAM and SSD skyrocketing, I don't see it as a positive for regular users. If they need to buy a new computer to run their vibecoded app, are they really reaping the benefits?
    I mean, I agree, but you could say this at any point in time throughout history. An engineer from the 1960s engineer could scoff at the web and the explosion in the number of progress and the decline in efficiency of the average program.
    An artist from the 1700s would scoff at the lack of training and precision of the average artist/designer from today, because the explosion in numbers has certain translated to a decline in the average quality of art.
    A film producer from the 1940s would scoff at the lack of quality of the average YouTuber's videography skills. But we still have millions of YouTubers and they're racking up trillions of views.
    Etc.
    To me, the chief lesson is that when we democratize technology and put it in the hands of more people, the tradeoff in quality is something that society is ready to accept. Whether this is depressing (bc less quality) or empowering (bc more people) is a matter of perspective.
    We're entering a world where FAR more people will be able to casually create and edit the software they want to see. It's going to be a messier world for sure. And that bothers us as engineers. But just because something bothers us doesn't mean it bothers the rest of the world.
    > But then, if "code quality is becoming less and less relevant", does this also applies to public/professional apps?
    No, I think these will always have a higher bar for reliability and security. But even in our pre-vibe coded era, how many massive brandname companies have had outages and hacks and shitty UIs? Our tolerance for these things is quite high.
    Of course the bigger more visible and important applications will be the slowest to adopt risky tech and will have more guardrails up. That's a good thing.
    But it's still just a matter of time, especially as the tools improve and get better at writing code that's less wasteful, more secure, etc. And as our skills improve, and we get better at using AI.
  - hokkos2 days ago
    If strongly typed languages are preferred for AI coding, maybe the fixation on code quality make LLM produce better code.
  - danielmarkbrucea day ago
    Maybe, but how exactly are you defining "code quality" ?
  - subhobroto2 days ago
    > nobody will ever have to maintain it, so it doesn't matter
    I'm curious about software that's actively used but nobody maintains it. If it's a personal anecdote, that's fine as well
    csallen2 days ago
    I mean I've written some scripts and cron jobs for websites that I manage that have continued trucking for years with no changes or monitoring on my end. I suppose it's a bit easier on the web.
  - lowsong2 days ago
    > However, code quality is becoming less and less relevant in the age of AI coding, and to ignore that is to have our heads stuck in the sand. Just because we don't like it doesn't mean it's not true.
    It's the opposite, code quality is becoming more and more relevant. Before now you could only neglect quality for so long before the time to implement any change became so long as to completely stall out a project.
    That's still true, the only thing AI has changed is it's let you charge further and further into technical debt before you see the problems. But now instead of the problems being a gradual ramp up it's a cliff, the moment you hit the point where the current crop of models can't operate on it effectively any more you're completely lost.
    > We are in the very earliest months of AI actually being somewhat competent at this. It's unlikely that it will plateau and stop improving.
    We hit the plateau on model improvement a few years back. We've only continued to see any improvement at all because of the exponential increase of money poured into it.
    > It's only trending in one direction. And it isn't going to stop.
    Sure it can. When the bubble pops there will be a question: is using an agent cost effective? Even if you think it is at $200/month/user, we'll see how that holds up once the cost skyrockets after OpenAI and Anthropic run out of money to burn and their investors want some returns.
    Think about it this way: If your job survived the popularity of offshoring to engineers paid 10% of your salary, why would AI tooling kill it?
    csallen2 days ago
    > That's still true, the only thing AI has changed is it's let you charge further and further into technical debt before you see the problems. But now instead of the problems being a gradual ramp up it's a cliff, the moment you hit the point where the current crop of models can't operate on it effectively any more you're completely lost.
    What you're missing is that fewer and fewer projects are going to need a ton of technical depth.
    I have friends who'd never written a line of code in their lives who now use multiple simple vibe-coded apps at work daily.
    > We hit the plateau on model improvement a few years back. We've only continued to see any improvement at all because of the exponential increase of money poured into it.
    The genie is out of the bottle. Humanity is not going to stop pouring more and more money into AI.
    > Sure it can. When the bubble pops there will be a question: is using an agent cost effective? Even if you think it is at $200/month/user, we'll see how that holds up once the cost skyrockets after OpenAI and Anthropic run out of money to burn and their investors want some returns.
    The AI bubble isn't going to pop. This is like saying the internet bubble is going to pop in 1999. Maybe you will be right about short term economic trends, but the underlying technology is here to stay and will only trend in one direction: better, cheaper, faster, more available, more widely adopted, etc.
    lowsong2 days ago
    > What you're missing is that fewer and fewer projects are going to need a ton of technical depth. > I have friends who'd never written a line of code in their lives who now use multiple simple vibe-coded apps at work daily.
    Again it's the opposite. A landscape of vibe coded micro apps is a landscape of buggy, vulnerable, points of failure. When you buy a product, software or hardware, you do more than buy the functionality you buy the assurance it will work. AI does not change this. Vibe code an app to automate your lightbulbs all you like, but nobody is going to be paying millions of dollars a year on vibe coded slop apps and apps like that is what keeps the tech industry afloat.
    > Humanity is not going to stop pouring more and more money into AI.
    There's no more money to pour into it. Even if you did, we're out of GPU capacity and we're running low on the power and infrastructure to run these giant data centres, and it takes decades to bring new fabs or power plants online. It is physically impossible to continue this level of growth in AI investment. Every company that's invested into AI has done so on the promise of increased improvement, but the moment that stops being true everything shifts.
    > The AI bubble isn't going to pop. This is like saying the internet bubble is going to pop in 1999.
    The internet bubble did pop. What happened after is an assessment of how much the tech is actually worth, and the future we have now 26 years later bears little resemblance to the hype in 1999. What makes you think this will be different?
    Once the hype fades, the long-term unsuitability for large projects becomes obvious, and token costs increase by ten or one hundred times, are businesses really going to pay thousands of dollars a month on agent subscriptions to vibe code little apps here and there?
    csallen2 days ago
    > Again it's the opposite. A landscape of vibe coded micro apps is a landscape of buggy, vulnerable, points of failure. When you buy a product, software or hardware, you do more than buy the functionality you buy the assurance it will work. AI does not change this. Vibe code an app to automate your lightbulbs all you like, but nobody is going to be paying millions of dollars a year on vibe coded slop apps and apps like that is what keeps the tech industry afloat.
    This is what everyone says when technology democratizes something that was previously reserved for a small number of experts.
    When the printing press was invented, scribes complained that it would lead to a flood of poorly written, untrustworthy information. And you know what? It did. And nobody cares.
    When the web was new, the news media complained about the same thing. A landscape of poorly researched error-ridden microblogs with spelling mistakes and inaccurate information. And you know what? They were right. That's exactly what the internet led to. And now that's the world we live in, and 90% of those news media companies are dead or irrelevant.
    And here you are continuing the tradition of discussing a new landscape of buggy, vulnerable products. And the same thing will happen and already is happening. People don't care. When you democratize technology and you give people the ability to do something useful they never could do before without having to spend years becoming an expert, they do it en masse, and they accept the tradeoffs. This has happened time and time again.
    > The internet bubble did pop... the future we have now 26 years later bears little resemblance to the hype in 1999. What makes you think this will be different?
    You cut out the part where I said it only popped economically, but the technology continued to improve. And the situation we have now is even better than the hype in 1999:
    They predicted video on demand over the internet. They predicted the expansion of broadband. They predicted the dominance of e-commerce. They predicted incumbents being disrupted. All of this happened. Look at the most valuable companies on earth right now.
    If anything, their predictions were understated. They didn't predict mobile, or social media. They thought that people would never trust SaaS because it's insecure. They didn't predict Netflix dominating Hollywood. The internet ate MORE than they thought it would.
    2ccg2 days ago
    Your whole argument is based on 'the technology improves'.
    Ok, so another fundamental proposition is monetary resources are needed to fund said technology improvement.
    Whats wrong with LLMs? They require immense monetary resources.
    Is that a problem for now? No because lots of private money is flowing in and Google et al have the blessing of their shareholders to pump up the amount of cash flows going into LLM based projects.
    Could all this stop? Absolutely, many are already fearing the returns will not come. What happens then? No more huge technology leaps.
    csallen2 days ago
    This has literally never happened in the history of humanity. Name one technology where development permanently stopped due to lack of funding, despite there being...
    1. lots of room for progress, i.e. the theoretical ceiling dwarfed the current capabilities
    2. strong incentives to continue development, i.e. monetary or military success
    3. no obviously better competitors/alternatives
    4. social/cultural tolerance from the public
    Literally hasn't happened. Even if you can find 1 or 2 examples, they are dwarfed by the hundreds of counter examples. But more than likely, you won't find any examples, or you'll just find something recent where progress is ongoing.
    Useful technology with room to improve almost always improves, as people find ways to make it better and cheaper. AI costs have already fallen dramatically since LLMs first burst on the scene a few years back, yet demand is higher than ever, as consumers and businesses are willing to pay top dollar for smarter and better models.
    lowsong2 days ago
    AI has none of these things.
    1. As I said before, we've long since reached diminishing returns on models. We simply don't have enough compute or training data left to make them dramatically better.
    2. This is only true if it actually pans out, which is still an unknown question.
    3. Just... not using it? It has to justify its existence. If it's not of benefit vs. the cost then why bother.
    4. The public hates AI. The proliferation of "AI slop" makes people despise the technology wholesale.
    csallen2 days ago
    1. Saying that AI will never approach its theoretical limits because XYZ tech is approaching diminishing returns, is like saying guns would never get better than the fire sticks of China in 1000 AD because the then-current methods hit their theoretical limits. You're betting against tens of thousands of the smartest minds of a generation across the entire planet. I will happily take the other side of this bet.
    2. Sure, depends on #1. But the incentive is undeniable.
    3. It has. Do you think people are using Claude Code in incredible numbers for no reason?
    4. The public and businesses are adopting AI en masse. It's incredibly useful. Demand is skyrocketing. I don't think you could show that negative public sentiment has been sufficient to stop this, any more than negative sentiment about TVs, headphones, bicycles, etc (which was significant).
    With the exception of #1, I feel like you're arguing that things won't happen, where the numbers show they've already have happened and are accelerating.
    2ccg2 days ago
    Thanks for jumping in fella. Agree on all points.
    lowsong2 days ago
    > This is what everyone says when technology democratizes something that was previously reserved for a small number of experts.
    What part of renting your ability to do your job is "democratizing"? The current state of AI is the literal opposite. Same for local models that require thousands of dollars of GPUs to run.
    Over the past 20 years software engineering has become something that just about anyone can do with little more than a shitty laptop, the time and effort, and an internet connection. How is a world where that ability is rented out to only those that can pay "democratic"?
    > When the printing press was invented, scribes complained that it would lead to a flood of poorly written, untrustworthy information. And you know what? It did. And nobody cares.
    A bad book is just a bad book. If a novel is $10 at the airport and it's complete garbage then I'm out $10 and a couple of hours. As you say, who cares. A bad vibe coded app and you've leaked your email inbox and bank account and you're out way more than $10. The risk profile from AI is way higher.
    Same is even more true for businesses. The cost of a cyberattack or a outage is measured in the millions of dollars. It's a simple maths, the cost of the risk of compromise far oughtweights the cost of cheaper upfront software.
    > You cut out the part where I said it only popped economically, but the technology continued to improve.
    The improvement in AI models requires billions of dollars a year in hardware, infrastructure, end energy. Do you think that investors will continue to pour that level of investment into improving AI models for a payout that might only come ten to fifteen years down the road? Once the economic bubble pops, the models we have are the end of the road.
    2ccg2 days ago
    Dont waste your time on him. He reminds me of people who are so concentrated on one part of the picture, they can't see the whole damn thing and how all the pieces fit and interact with each other.
    csallen2 days ago
    You're describing yourself imo. Your point ignores hundreds of years of history and says zero about the forces that shape technological development and progress, which have been studied fairly exhaustively.
    2ccg2 days ago
    [flagged]
    zozbot2342 days ago
    "Thousands of dollars of GPU" as a one-time expense (not ongoing token spend) is dirt cheap if it meaningfully improves productivity for a dev. And your shitty laptop can probably run local AI that's good enough for Q&A chat.
    lowsong2 days ago
    On a SWE salary maybe. If the baseline cost of doing business is a $5k GPU you've excluded like a quarter of the US working population immediately.
    csallen2 days ago
    > What part of renting your ability to do your job is "democratizing"? The current state of AI is the literal opposite. Same for local models that require thousands of dollars of GPUs to run.
    "Renting your ability to do your job"?
    I think you're misunderstanding the definition of democratization. This has nothing to do with programmers. It has nothing to do with people's jobs. Democratizing is defined as "the process of making technology, information, or power accessible, available, or appealing to everyone, rather than just experts or elites."
    In other words, democratizing is not about people who who have jobs as programmers. It's about the people who don't know how to code, who are not software engineers, who are suddenly gaining the ability to produce software.
    Three years ago, you could not pay money to produce software yourself. You either had to learn and develop expertise yourself, or hire someone else. Today, any random person can sit down and build a custom to-do list app for herself, for free, almost instantly, with no experience.
    > The improvement in AI models requires billions of dollars a year in hardware, infrastructure, end energy. Do you think that investors will continue to pour that level of investment into improving AI models for a payout that might only come ten to fifteen years down the road? Once the economic bubble pops, the models we have are the end of the road.
    10-15 year payouts? Uhhh. Maybe you don't know any AI investors, but the payout is coming NOW. Many tens of thousands of already gotten insanely rich, three years ago, and two years ago, and last year, and this year. If you think investors won't be motivated, and there aren't people currently in line to throw their money into the ring, you're extremely uninformed about investor sentiment and returns lol.
    You can predict that the music will stop. That's fair. But to say that investors are worried about long payout times is factually inaccurate. The money is coming in faster and harder than ever.
    pessimizer2 days ago
    I have no idea what this flood of personal-use software is that you think normal people want to produce. Normal people don't even think about software doing a thing until they see an advertisement about software that does a thing. And then they'd rather pay 10 bucks for it than to invent a shittier version of it themselves for $500.
    And I'm not being condescending about normal people. Developers often don't think about the possibility of making software that does a particular thing until they actually see software that does that thing. And they're going to also going to prefer to buy than vibe code unless the program is small and insignificant.
    csallen2 days ago
    Go look at the numbers from Lovable and Replit and Claude Code and similar companies. Quite staggering.
    I myself have run an online community for early-stage startup founders for over a decade. The number of ambitious people who would love to build something but don't know how to code and in the last year or two have started cranking out applications is tremendous. That number is far higher than the number of software engineers who existed before.
    skydhasha day ago
    That's very much an echo chamber you find yourself in. I'm far away from any technological center and the main use of LLM for people is the web search widget, spell checking and generating letters. Also kids cheating on their homework.
    lowsong2 days ago
    > Democratizing is defined as "the process of making technology, information, or power accessible, available, or appealing to everyone, rather than just experts or elites."
    Your definition only supports my point. The transfer of skill from something you learn to something you pay to do is the exact and complete opposite of your stated definition. It turns the activity from something that requires you to learn it to one that only those that can afford to pay can do.
    It is quite literally making this technology, information, and power available to only the elite.
    > Uhhh. Maybe you don't know any AI investors, but the payout is coming NOW.
    What payout? Zero AI companies are profitable. If you're invested in one of these companies you could be a billionaire on paper, but until it's liquid it's meaningless. There's plenty of investors who stand to make a lot of money if these big companies exit, but there's no guarantee that will happen.
    The only people making money at the moment are either taking cash salaries from AI labs or speculating on Nvidia stock. Neither of which have much do with the tech itself and everything to do with the hype.
    csallen2 days ago
    > It is quite literally making this technology, information, and power available to only the elite.
    I don't know what to say to you. More people are coding now with AI than ever coded before. If your argument was true, then that would just mean that there are more elites than ever. Obviously that's not what's happening.
    > What payout? Zero AI companies are profitable.
    Because they're reinvesting profits into continued R&D, not because their current products are unprofitable. You're failing to understand basic high-growth business models.
    > If you're invested in one of these companies you could be a billionaire on paper, but until it's liquid it's meaningless.
    Plenty of AI companies have exited, and plenty of other AI companies offer tender offers where shareholders have been able to sell their shares to new investors. Again, it sounds like you just aren't really educated on what's happening. Plenty of people are millionaires in real life, not just on paper. You're massively incorrect about the payout landscape that investors are considering.
    > The only people making money at the moment are either taking cash salaries from AI labs or speculating on Nvidia stock.
    No, founders, early-stage investors, and employees with stock have cashed out in many cases. Again, it just feels like you're not aware of what's happening on the ground.
    > Neither of which have much do with the tech itself and everything to do with the hype.
    That's a very different argument. If you want to say that the investment is unsound, then fine, that's your opinion, but trying to say that investors have no appetite because they have to wait 10 to 15 years for a payout is incredibly incorrect.
    lowsong2 days ago
    > I don't know what to say to you. More people are coding now with AI than ever coded before. If your argument was true, then that would just mean that there are more elites than ever. Obviously that's not what's happening.
    I don't know how I can explain this any more clearly.
    If you need AI to create software, and the cost of AI is $200/month, then only people who can afford $200/month can create software.
    Costs will increase. The current cost is substituted by investor funding. Sell at a loss to get people hooked on the product and then raise the price to make money, a "high-growth business model" as you say.
    The cost to make a competitor to Anthropic or OpenAI is tens or hundreds of billions of dollars upfront. There will be few competitors and minimal market pressure to reduce prices, even if the unit costs of inference are low.
    $200/month is already out of reach of the majority of the population. Increases from here means only a small percentage of the richest people can afford it.
    I don't know what definition of "elite" you're using but, "technology limited so that only a small percentage of the population can afford it" is... an elite group.
    This is fun and all, but I think we've reached the end of the productive discussion to be had and I don't have much more to say. Charitably, we're leaving in completely different realities. I just hope when the bubble pops the fall isn't too hard for you.
    csallen19 hours ago
    > I don't know how I can explain this any more clearly. If you need AI to create software, and the cost of AI is $200/month, then only people who can afford $200/month can create software.
    Your entire hypothetical is based on "ifs" that aren't true. Nothing in this sentence is true. You don't need AI to create software, the cost of AI development is much less than $200/month on average, and many more people can afford AI dev than programming bootcamps or classes or degrees.
    > Costs will increase. The current cost is substituted by investor funding. Sell at a loss to get people hooked on the product and then raise the price to make money, a "high-growth business model" as you say.
    Inference is already profitable at current pricing. Most funding goes toward R&D for new model training, not inference.
    Also, inference costs dropped over 280x between Nov 2022 and Oct 2024. Inference will continue to get cheaper as we develop more specialized hardware and efficient models.
    This is not Uber, subsidizing the cost of human drivers. This is real tech, chips and servers and software. Costs fall over time, not rise. Innovation does not go backwards.
    duggana day ago
    > $200/month is already out of reach of the majority of the population.
    1. You can build small applications with the $20/month sub, much more with the $100/month. Competition and technology improvements will inevitably improve the price to value ratio.
    2. Cable sports subscriptions are in a similar price range. Expensive, but not exclusive to “the elites”.
    lowsonga day ago
    The median per capita income in the United States is $37,683/year.[0] Depending on your state, after taxes, that's something like ~$2,600/month. You're asking almost 10% of their post-tax income to this just for the opportunity to create software. With rent, food, and other living expenses many households at that income level simply cannot afford this.
    This is the median income. If it's a struggle for someone on this income then it's worse for half of all Americans, and American incomes are higher than most of the rest of the world.
    [0]: https://en.wikipedia.org/wiki/Per_capita_personal_income_in_...
    duggana day ago
    The bar for "create software" up to this last year or so was "learn software development" or "pay someone else".
    Personally, I think millions more people having the ability to create some subset of software is an incredible shift.
    tatrionsa day ago
    [flagged]
    baggy_trougha day ago
    > $200/month is already out of reach of the majority of the population. Increases from here means only a small percentage of the richest people can afford it.
    This is an absurd claim. There are many things the majority of the population spends money on that cost more than this.
    lowsonga day ago
    I'm going to take your comment at face value, and I'm also going to assume that you're US-based.
    You need to take a step back and look at the economic reality of the majority of Americans today. Many live paycheck-to-paycheck, even those with "middle class" incomes. For many a $200 one-off bill is debilitating, yet alone a recurring subscription. If you don't know that, you have a dangerously narrow view of the economy.
    baggy_trougha day ago
    If you think that a $200/month subscription is "out of reach" for the majority of Americans, you are just plainly and simply wrong about that. They might have to make some tradeoffs by reducing spending in other areas, but that's part of life.
- hbarka2 days ago
  > Some people never get to the part where they review the code. They go straight to their LinkedIn or blog and start writing (or having ChatGPT write) posts about how manual coding is dead and they’re done writing code by hand forever. Some people review the code and declare it unusable garbage, then also go to their social media and post how AI coding is completely useless and they’re not going to use it for anything. This blog post shows the journey that anyone not in one of those two vocal minorities is going through right now.
  What’s really happening is that you’re all of those people in the beginning. Those people are you as you go through the experience. You’re excited after seeing it do the impossible and in later instances you’re critical of the imperfections. It’s like the stages of grief, a sort of Kübler-Ross model for AI.
- atomicnumber32 days ago
  I'm deeply convinced that there's 2 reasons we don't see real takes like this: 1) is because these people are quietly appreciating the 2-50% uplift you get from sanely using LLMs instead of constantly posting sycophantic or doomer shit for clout and/or VC financing. 2) is because the real version of LLM coding is boring and unsexy. It either involves generating slop in one shot to POC, then restarting from scratch for the real thing or doing extensive remediation costing far more than the initial vibe effort cost; or it involves generally doing the same thing we've been doing since the assembler was created except now I don't need to remember off-hand how to rig up boilerplate for a table test harness in ${current_language}, or if I wrote a snippet with string ops and if statements and I wish it were using regexes and named capture groups, it's now easy to mostly-accurately convert it to the other form instead of just sighing and moving on.
  But that's boring nerd shit and LLMs didn't change who thinks boring nerd shit is boring or cool.
  - zozbot2342 days ago
    > because the real version of LLM coding is boring and unsexy
    Some people do find it unfun, saying it deprives them of the happy "flow" of banging out code. Reaching "flow" when prompting LLMs arguably requires a somewhat deeper understanding of them as a proper technical tool, as opposed to a complete black box, or worse, a crystal ball.
    cindyllm2 days ago
    [dead]
  - marhee2 days ago
    Software engineering is only about 20% writing code (the famous 40-20-40 split). Most people use it only for the first 40%, and very succesfully (im in that camp). If you use it to write your code you can theorettically maybe get 20% time improvement initially, but you loose a lot of time later redoing it or unraveling. Not worth bothering.
    bdangubic2 days ago
    20% is one of those cool lies SWEs have been able to push through (like “our jobs are oh so very special we can’t really estimate it, we’ll create an entire sub-industries with our industry to make sure everyone knows we can’t estimate”).
    SWEs spend 20% of the time writing code for exactly the same reason brick-layers spend 20% of their time laying bricks
    skydhasha day ago
    The other 80% is spent on the following:
    - A lot of research. Libraries documentation, best practice, sample solutions, code history,... That could be easily 60% of the time. Even when you're familiar with the project, you're always checking other parts of the codebase and your notes.
    - Communication. Most projects involve a team and there's a dependency graph between your work. There may be also a project manager dictating things and support that wants your input on some cases.
    - Thinking. Code is just the written version of a solution. The latter needs to exists first. So you spend a lot of time wrangling with the problem and trying to balance tradeoffs. It also involves a lot of the other points.
    Coding is a breeze compared to the others. And if you have setup a good environment, it's even enjoyable.
  - devmor2 days ago
    There’s also just the negative association factor.
    I use LLMs in my every day work. I’m also a strong critic of LLMs and absolutely loathe the hype cycle around them.
    I have done some really cool things with copilot and Claude and I keep sharing them to within my working circle because I simply don’t want to interact that much with people who aren’t grounded on the subject.
    cossray2 days ago
    I would be interested to hear your take on Copilot vs Claude. I have used Copilot (trial) in VS Code and I found it to mostly meet my needs. It could generate some plans and code, which I could review on the go. I found this very natural to me as I never felt 'left behind' in whatever code the AI was generating. However, most of the posts I see here are on Claude (I haven't tried it) and very few mentions of Copilot. What is your impression about them and the use cases each is strong in?
    cassidy_co2 days ago
    (Context: I'm a different person, but have thoughts on this)
    I started using Copilot at work because that's what the company policy was. It's a pretty strict environment, but it's perfectly serviceable and gets a lot of fresh, vetted updates. IDE integration with vs code was a huge plus for me.
    Claude code is definitely a messier, buggier frontend for the LLM. It's clunkier to navigate and it has much more primitive context management tools. IDE integration is clunky with vs code, too.
    However, if you want to take advantage of the Anthropic subscription services, I've found Claude Code is the way to go... Simply because Anthropic works hard to lock you into their ecosystem if you want the sweet discounts. I'm greedy, so I bit the bullet for all of the LLM coding stuff I do in my personal life.
    devmor2 days ago
    Copilot isn’t really a competing product to Claude - in fact I use Claude through copilot.
    I have found in general that for the type of work I do (senior to staff level engineering, 90-10 research to programming) that Claude Opus is the only model really worth my time - but I just really like the Copilot CLI tooling.
    avadodin2 days ago
    So, are you using it for the 10%?
    I do use LLMs to learn about new subjects but we already only bill 10% for "coding" and that's inflating it to cover other parts.
    I can't imagine that slopping it up would be a great decision. Having alien code that no one ever understood between a bug report and a solution. Anthropic isn't going to give us money for our lost contracts, is it?
    devmora day ago
    I would say I'm using it for about half of the "10%" and and a quarter of the "90%".
    > I can't imagine that slopping it up would be a great decision. Having alien code that no one ever understood between a bug report and a solution. Anthropic isn't going to give us money for our lost contracts, is it?
    Absolutely, that's a real concern. The only time I will let it loose on something is a throwaway project to test something, or a small tool that I know I can write deterministic tests for.
    On codebases of any significant size, I'm using it more like a custom domain Stackoverflow search engine.
  - megous2 days ago
    It can be any number of things. From spending hour or two just writing requirements, to giving an example of existing curated code from another project you wrote and would like to emulate, or rewriting existing apps in a different language/architecture (sort of like translating), to serving as a QA agent or reviewer for the LLM agent, or vice versa.
    I kinda like how you can just use it for anything you like. I have bazillion personal projects, I can now get help with, polish up, simplify, or build UI for, and it's nice. Anything from reverse engineering, to data extraction, to playing with FPGAs, is just so much less tedious and I can focus on the fun parts.
- zahlman2 days ago
  I feel like recently HN has been seeing more takes like this one and at least slightly less of the extremist clickbaity stuff. Maybe it's a sign of maturity. (Or maybe it's just fatigue with the cycle of hyping the absolute-latest model?)
  - senko2 days ago
    It takes time for people to go through these experiences (three months, in OP's case), and LLMs have only been reasonably good for a few months (since circa Nov'25).
    Previously, takes were necessarily shallower or not as insightful ("worked with caveats for me, ymmv") - there just wasn't enough data - although a few have posted fairly balanced takes (@mitsuhiko for example).
    I don't think we've seen the last of hypers and doomers though.
    deauxa day ago
    > LLMs have only been reasonably good for a few months (since circa Nov'25).
    Ironically this itself is one of the hyper/doomer takes.
    senkoa day ago
    But is it a hyper or a doomer take?
    There seems to be a concencus among people I follow/read, that somewhere around that time was an inflexion point in coding LLMs, and this matches my personal experience.
    (My comment was in the context of LLMs being used to generate non-throwaway code, not general GenAI use - apologies if that was unclear.)
    deaux6 hours ago
    > There seems to be a concencus among people I follow/read, that somewhere around that time was an inflexion point in coding LLMs, and this matches my personal experience.
    LLMs have been a very useful tool to generate non-throwaway code for at least 2 years. It's a question of Overton Window, of mainstream acceptance, that has made you and the people you're talking about come to a "consensus" that there was an agentic LLM coding step change around that time. There was indeed an inflection point, but it was a social one: a window shift, where it became socially acceptable among mainstream SWEs to hold this sentiment. It had already been the case for a long time, but for those who understood this, expressing that was still not deemed acceptable. Doing so would brand one as a hyper, at least in the particular circles you were in.
    You were in reality part of the doomers, just as much as the current doomers are still doomers.
  - tipiirai2 days ago
    Can you point to one other post like this? Curious. Thanks
- sharperguy2 days ago
  It's actually common for human-written projects to go through an initial R&D phase where the first prototypes turn into spaghetti code and require a full rewrite. I haven't been through this myself with LLMs, but I wonder to what extent they could analyse the codebase, propose and then implement a better architecture based on the initial version.
  - Deukhoofd2 days ago
    Let's be real, a lot of organizations never actually finish that R&D phase, and just continue iterating on their prototypes, and try to untangle the spaghetti for years.
    I recently had to rewrite a part of such a prototype that had 15 years of development on it, which was a massive headache. One of the most useful things I used LLMs for was asking it to compare the rewritten functionality with the old one, and find potential differences. While I was busy refactoring and redesigning the underlying architecture, I then sometimes was pinged by the LLM to investigate a potential difference. It sometimes included false positives, but it did help me spot small details that otherwise would have taken quite a while of debugging.
  - zozbot2342 days ago
    If you write that first prototype in Rust, with the idiomatic style of "Rust exploratory code" (lots of defensive .clone()ing to avoid borrowck trouble; pervasive interior mutability; gratuitous use of Rc<> or Arc<> to simplify handling of the objects' lifecycle) that can often be incrementally refactored into a proper implementation. Very hard to do in other languages where you have no fixed boilerplate marking "this is the sloppy part".
    klabb32 days ago
    Rust is a language for fast prototyping? That’s the one thing Rust is absolutely terrible at imo, and I really like the production/quality/safety aspects of Rust.
    zozbot2342 days ago
    It's not specialized to fast prototyping for sure, but you can use for that with the right boilerplate.
- prpla day ago
  I find the people who end up with spaghetti code did so because they didn’t translate their normal processes over.
  Being completely methodical about development really helps. obra/superpowers, for example, gets close but I think it overindexes on testing and doesn’t go far enough with design document templates, planning, code style guides, code reviews, and more.
  Being methodical about it takes more time, but prevents a good bit of the tech debt.
  Planning modes help, but they are similarly not methodical enough.
  - ianhorna day ago
    That works until you make a plan/tests/etc, set the thing loose, and then when it has trouble it decides "actually the pragmatic thing would be [diverge from the plan/change the tests/etc]" and goes off the rails. I'm so frustrated by these things right now.
    prpla day ago
    I have honestly not had that problem much. Being specific, concise, and strong with your prompts helps out a lot.
- nickstinematesa day ago
  There is a lot you can do to shape the end result to not have these faults. In the end, the engineering mind and rigor still needs to apply, so the hard work doesn't go away.
  But, the errors that are described - no architecture adhesion, lack of comprehension, random files, etc. are a matter of not leveling up the sophistication of use further, not a gap in those tools.
  As an example. Very clearly laying out your architecture principles, guidance, how code should look on disk, theory on imports, etc. And then - objectively analyzing any proposed change against those principles, converges toward sane and understandable.
  We've been calling it adversarial testing across a number of dimensions - architecture, security, accessibility, among other things. Every pr gets automatically reviewed and scored based on these perspectives. If an adversary doesn't OK the PR, it doesn't get merged.
- socalgal221 hours ago
  Unfortunately there is a ton of pressure not to review code. AI enthusiasts see themselves in a race against others and code review = going slower. I recently attended a company wide "let's all use AI" conference and several talks were about how they were not reviewing much code anymore because the PRs were coming in 10x to 50x more and they just couldn't keep up.
  Scary AF
- 2 days ago
  undefined
- airstrike2 days ago
  It's a very accurate and relatable post. I think one corollary that's important to note to the anti-AI crowd is that this project, even if somewhat spaghettified, will likely take orders of magnitude less time to perfect than it would for someone to create the whole thing from scratch without AI.
  I often see criticism towards projects that are AI-driven that assumes that codebase is crystalized in time, when in fact humans can keep iterating with AI on it until it is better. We don't expect an AI-less project to be perfect in 0.1.0, so why expect that from AI? I know the answer is that the marketing and Twitter/LinkedIn slop makes those claims, but it's more useful to see past the hype and investigate how to use these tools which are invariably here to stay
  - kaoD2 days ago
    > this project, even if somewhat spaghettified, will likely take orders of magnitude less time to perfect than it would for someone to create the whole thing from scratch without AI
    That's a big leap of faith and... kinda contradicts the article as I understood it.
    My experience is entirely opposite (and matches my understanding of the article): vibing from the start makes you take orders of magnitude more time to perfect. AI is a multiplier as an assistant, but a divisor as an engineer.
    airstrike2 days ago
    vibing is different from... steering AI as it goes so it doesn't make fundamentally bad decisions
    0xbadcafebee2 days ago
    Both of these are not really the right way to use AI to code with. There are two basic ways to code with AI that work:
    1. Autocomplete. Pretty simple; you only accept auto-completes you actually want, as you manually write code.
    2. Software engineering design and implementation workflow. The AI makes a plan, with tasks. It commits those plans to files. It starts sub-agents to tackle the tasks. The subagents create tests to validate the code, then writes code to pass the tests. The subagents finish their tasks, and the AI agent does a review of the work to see if it's accurate. Multiple passes find more bugs and fix them in a loop, until there is nothing left to fix.
    I'm amazed that nobody thinks the latter is a real thing that works, when Claude fucking Code has been produced this way for like 6 months. There's tens of thousands of people using this completely vibe-coded software. It's not a hoax.
    airstrike2 days ago
    #2 does not negate my steering suggestion, so I'm not sure how you can conclude nobody thinks it's a real thing that works
    also Claude Code is notoriously poorly built, so I wouldn't tout it as SOTA
    0xbadcafebee2 days ago
    I have worked at companies from startups to fortune 500. They all have garbage code. Who cares? It works anyway. The world is held together with duct tape, and it's unreasonably effective. I don't believe "code quality" can be measured by how it looks. The only meaningful measure of its quality is whether it runs and solves a user's problem.
    Get the best programmer in the world. Have them write the most perfect source code in the world. In 10 years, it has to be completely rewritten. Why? The designer chose some advanced design that is conceptually superior, but did not survive the normal and constant churn of advancing technology. Compare that to some junior sysadmin writing a solution in Perl 5.x. It works 30 years later. Everyone would say the Perl solution was of inferior quality, yet it provides 3x more value.
    airstrike2 days ago
    I hear you about "it just works" mattering infinitely more than some arbitrary code quality metric
    but I'm not judging Claude Code by how it looks. I kinda like the aesthetics. I'm talking about how slow, resource hungry and finnicky/flickery it is. it's objectively sloppy
    zozbot2342 days ago
    > when Claude fucking Code has been produced this way for like 6 months
    And people can look at the results (illegally) because that whole bunch of code has been leaked. Let's just say it's not looking good. These are the folks who actually made and trained Claude to begin with, they know the model more than anyone else, and the code is still absolute garbage tier by sensible human-written code quality standards.
    0xbadcafebee2 days ago
    Yet it works anyway. What does that say about human code quality standards?
    SpicyLemonZesta day ago
    Human code quality standards are built around the knowledge that humans prefer polished products that work consistently. You can get away without code quality in the short term, especially if you have no real competitors - to a lot of people, there just aren't any models other than Anthropic's which are particularly useful for software development. But in the long term it gets you into a poor quality trap that's often impossible to escape without starting over from scratch.
    (Anthropic, of course, believes that advances in AI capability over the next few years will so radically reshape society that there's no point worrying about the long term.)
- throwatdem123112 days ago
  For me it’s just a matter of “does this actually save me time at all?”
  If it generates the slop version in a week but it takes me 3 more weeks to clean it up, could I have I just done it right the first time myself in 4 weeks instead? How much money have I wasted in tokens?
  - sunaurus2 days ago
    I've been arguing that it's POSSIBLE to get a small (but meaningful) uplift in productivity on average if you are careful with how you use LLMs, but at the same time, it's also extremely easy to actually negatively impact your productivity.
    In both cases, you feel super productive all the time, because you are constantly putting in instructions and getting massive amounts of output, and this feels like constant & fast progress. It's scary how easy it is to waste time on LLMs while not even realizing you are wasting time.
    2ccg2 days ago
    [dead]
  - 0xbadcafebee2 days ago
    A car saves you time in getting to and from the store. But if you don't learn to drive, and just hop in the car and press things, you're going to crash, and that definitely won't save you time. Cars are also more expensive than walking or a bike, yet people still buy them.
    throwatdem123112 days ago
    I already know how to drive stick (trad coding), I don’t feel like I’m gaining much by switching to automatic transmission.
    0xbadcafebee2 days ago
    Yeah that's not the difference, lol. With AI coding you can get the same work done in an order of magnitude less time, without even knowing how to program.
    The only comparison I can come up with is 3D printers, but even that's not as ridiculously fast and easy as AI coding. An average person can ask an agent to write a program, in any popular language, and it'll do it, and it'll work. We still need people intelligent enough to steer the agent, but you do not need to edit a single line of code anymore.
    skydhasha day ago
    Most people that don't know how to program have no real desire in coding with AI (unless to pose as a SWE and get that sweet money). Most of them don't even like computers. Yes they do some tasks on it, but they're not that attached to the tool and its capabilities.
  - zephen2 days ago
    > does this actually save me time at all?
    Soooooo....
    As one who hasn't taken the plunge yet -- I'm basically retired, but have a couple of projects I might want to use AI for -- "time" is not always fungible with, or a good proxy for, either "effort" or "motivation"
    > How much money have I wasted in tokens?
    This, of course, may be a legitimate concern.
    > If it generates the slop version in a week but it takes me 3 more weeks to clean it up, could I have I just done it right the first time myself in 4 weeks instead?
    This likewise may be a legitimate concern, but sometimes the motivation for cleaning up a basically working piece of code is easier to find that the motivation for staring at a blank screen and trying to write that first function.
    throwatdem123112 days ago
    Well for me, the amount of time/effort as a function my of my motivation has acted as a natural gatekeeper to bad ideas. Just because I can do something with AI now doesn’t necessarily mean that I should. I am also weary of trading time and effort for outright money right out of my own pocket to find out, especially when I find the people I’d be giving money to so reprehensible. I don’t live somewhere where developers make a lot of money. I’m not poor in any stretch but not rich enough that I can waste money on slop for funsies. But I can spend a month on validating a side project because I find coding as a hobby enjoyable in and of itself, and I don’t care if I throw out a few thousand lines of code after a little while and realize I’m wasting my time.
    Cleaning up agent slop code by hand is also a miserable experience and makes me hate my job. I do it already because at $DAYJOB because my boss thinks “investing” in third worlders for pennies on the dollar and just giving them a Claude subscription will be better than investing in technical excellence and leadership. The ROI on this strategy is questionable at best, at least at my current job. Code Review by humans is still the bottleneck and delivering proper working features has not accelerated because they require much more iteration because of slop.
    Would much rather spend the time making my own artisanal tradslop instead if it’s gonna take me the same amount of time anyway - at least it’s more enjoyable.
    zephena day ago
    Your position makes an immense amount of sense for your described situation.
    As I said, I'm retired, and so I've never had to clean up AI slop at $DAYJOB.
    Since the whole AI thing would be a learning experience for me, it would include trying to toilet train the AI itself, as others have intimated can be done in some cases, rather than dealing with a bunch of already-checked-into-the-repo-slop.
    And that may be a losing proposition. I don't know; haven't tried it yet.
    > Would much rather spend the time making my own artisanal tradslop instead if it’s gonna take me the same amount of time anyway - at least it’s more enjoyable.
    Although I haven't had the AI experience you describe, I have had a similar experience with coworkers who moved fast and broke all kinds of shit. That was similarly no fun. It's like trying to work on your wife's minivan, but she won't pull over and let you properly fix it.
    Given sufficient time, I enjoy polishing/perfecting/refactoring code. My final output often looks radically different from my prototype. It is clear to me that I would hate the situation you describe. It is not clear to me that starting with prompted slop and wrangling it into submission would be much less enjoyable to me than writing my own slop and then wrangling it into submission.
    > especially when I find the people I’d be giving money to so reprehensible.
    This is a bit of a concern, but I'm pretty sure that, at the moment, every token you burn costs them more than you.
    throwatdem12311a day ago
    The biggest thing that has changed in my experience (at least in a professions setting) is now that people have AI agents they don’t really have any motivation to improve. If you tell them something that needs to be changed they just reprompt the agent until it’s good enough - but the most sinister thing is they keep making the same mistakes over and over again. There is no growth, no shared understanding that disseminates through review - just re-prompting. They often just directly use my review comments as prompts! People don’t understand code they generated themselves just a few days later. But not in a “oh just let me reread this again real quick” kind of way but a “I have absolutely no clue wtf I am even looking at” way.
    I’ve been sounding the alarm in my own circles about the lack of junior roles now because of AI - which will lead to a shortage of seniors in just a few years - but there is something even more sinister: juniors no longer improve enough to be intermediates and seniors, and worse…seniors and intermediates have regressed to juniors through laziness and cognitive offloading.
    Like if I’m just sending code review to a middle man prompter - why not just skip the middle man? I’m already wrangling a handful of AI agents myself every day, so what is even the point of this extra person anyway? I don’t want to replace people with AI but if the person is so lazy that even I would probably prefer just doing the prompting myself then why shouldn’t I replace them with AI?
    zephen17 hours ago
    That does sound like an intractable problem.
    My problem, if and when I get started, would be tangential to this. It is clear that communication with LLMs is changing so rapidly that there may not be any universal long-lived lessons to be learned from optimizing your interactions with a particular model.
    I know that one-shotting things is probably not best, but determining how far to take it and when to cut over and finish it myself is something that I want to learn, but perhaps not too well.
    My skills are an eclectic mix of high- and low- level. I know exactly what, for example, a frequency analyzer can do for me, but controlling the $400K frequency analyzer is often best left to the guy who lives and breathes it.
    Likewise, my debugging skills are exceptional, but I am not as proficient with any particular debugger as are people who live in the debugger daily because they write terrible code. My debugging skills are mostly predicated on a big part of your daily life -- reading code.
    (To be fair, I have known a very few people who live in the debugger because they are dealing with intractable problems caused by other people, but those are the rarities. I, myself, used to live in the debugger a lot when I was writing graphics drivers for the mostly undocumented Windows 3.1.)
    Which brings us to your reports and/or co-workers. These people have always existed. They pride themselves on and partly base their value on and derive their value from the tools they think they know inside-out.
    In truth, they don't know the tools, but they are intimately familiar with the controls of the tool, like a child who knows how to make a smartphone do exactly what their parent needs it to do.
    So, as long as it's a tool you need, but it's too painful for you to control directly, these people are useful. In your case, you already have cause to use the LLM directly on a regular basis, so, as you point out, the value of these people is diminishing and maybe already negative.
    > why shouldn’t I replace them with AI?
    You probably should. Or, at a minimum, if possible, you should restructure things so that the people who are doing things that you are already proficient at are doing them for someone else who isn't as proficient at the tools, and you can get out of that loop.
    One reason I am not yet completely insane is that I realized about 40 years ago that the place I hated most being was inside someone else's debug loop. Because most people are objectively stupid, and this goes double for people who need you in that loop. So I always work to structure my responsibilities and work setup to avoid this. If I find a bug in an internal supplier's code, I create an MVCE and hand it over to them. If an internal customer claims to find a bug in my code and doesn't provide an MVCE, I figure out what they are attempting to do, create my own MVCE for their function, and either fix it if it was really my problem, or hand it back to them, and ask them to expand on it until it breaks and get back to me.
    Reflecting on this, I realize that I am probably not too likely to succumb to interminable prompting loops, because that wouldn't feel much different to what I nave avoided most of my life. On the few occasions over the last four decades where being involved in someone else's debug loop was completely unavoidable, the most useful thing I brought to the table when they were out of ideas and ready to throw a lot of effort at trying random things was a series of questions like "What are you going to learn from that? What will your decision points be?"
    And I'm not much of a gambler, so I won't be spending too many tokens hoping "the next time, for sure!"
  - hypersoloa day ago
    [dead]
- cush2 days ago
  Good code will stand the test of time. No matter how great a project is, there’s no world in which I adopt anything written over a few months and just released, for maintenance reasons alone.
- wolttam2 days ago
  > you need to learn how to use them correctly in your workflow and you need to remain involved in the code
  I completely agree that this is the case right now, but I do wonder how long it will remain the case.
- te_chris2 days ago
  Without wanting to sound rude: I think the mistake people make with AI prototypes is keeping the code at all.
  The AI’s are more than capable of producing a mountain of docs from which to rebuild, sanely. They’re really not that capable - without a lot of human pain - of making a shit codebase good.
- bmitc17 hours ago
  > This blog post shows the journey that anyone not in one of those two vocal minorities is going through right now: A realization that AI coding tools can be a large accelerator but you need to learn how to use them correctly in your workflow and you need to remain involved in the code. It’s not as clickbaity as the extreme takes that get posted all the time. It’s a little disappointing to read the part where they said hard work was still required. It is a realistic and balanced take on the state of AI coding, though.
  I appreciate the balanced takes and also the notion that one can use these AI tools to build software with principled use.
  However, what I am still failing to see is concrete evidence that this is all faster and cheaper than just a human learning and doing everything themself or with a small team. The cat is out of the bag, so to speak, but I think it's still correct to question these things. I am putting in a _lot_ of work to reach a principled status quo with these tools, and it is still quite unclear whether it's actually improvement versus just a side quest to wrangle tools that everyone else is abusing.
- joe_the_user2 days ago
  This blog post shows the journey that anyone not in one of those two vocal minorities is going through right now
  Is there evidence these groups are a minority? I mean, the OP sounds like they are taking the right approach but I suspect it requires both skill/experience and an open mind to take their approach.
  Just because an approach has good use-cases doesn't mean those are going predominate.
- moron4hire2 days ago
  Someone used Claude Code to generate a very simple staffing management app. The sort of thing that really wouldn't take that long to make, but why pay for any software when you can just ignore the problem, amiright? Anyway, the code that got generated was full of SQL injection issues for the most absurd sorts of things. It would have 80% of the database queries implemented through the ORM, but then the leftover stuff was raw string concat junk, for no good reason because it wasn't even doing any dynamic query or anything that the ORM couldn't do.
- order-mattersa day ago
  i feel like it's worth noting that for a long time prior to AI, I heard a ton of anecdotes about codebases at big companies being very shitty spaghetti more often than not, and if one of the maintainers left it was often a nightmare of managing, refactoring, or in some cases re-writing whenever issues popped up with new integrations
  If AI needs to re-write everything from scratch everytime you make a design change, that may have some obvious inefficiencies and limitations to it but if it can also do that in a few hours or a week, is it really that bad comapred to months of stalling and excuses from devs trying to understand the work of someone before them who wasnt given enough time to make well documented clean code to begin with?
  Like it is undoubtedly worse for hobby projects to rely on the AI output 100%, but im actually not so sure for commercial products. It'll be the same type of spagetti garbage everywhere. There will be patterns in its nonsense that over years people will start to get accustomed to. Youll have people specialize in cleaning up AI generated code, and itll more or less be a relatively consistent process compared to picking up random developer spaghetti
  maybe this is a hot take though
- vasco2 days ago
  Those extreme takes are taken mostly for clicks or are exaggerated second hand so the "other side's" opinion is dumber than it is to "slam the naysayers". Most people are meh about everything, not on the extremes, so to pander to them you mock the extremes and make them seem more likely. It's just online populism.
- beepbooptheory2 days ago
  I'm sorry, who is this for? Aren't you maybe a little tired of talking about this, if not totally <expletive> bored??
  There is something at this point kind of surreal in the fact that you know everyday there will be this exact blog post and these exact comments.
  Like, its been literal years and years and yall are still talking about the thing thats supposed to do other things. What are we even doing anymore? Is this dead internet? It boggles the mind we are still at this level of discourse frankly.
  Love 'em hate 'em I don't care yall need to freaking get a grip! Like for the love god read a book, paint a picture! Do something else! This blog is just a journey to snooze town and we all must at some level know that. This feels like literal brain virus.
- iamandoni2 days ago
  This is exactly why I built https://github.com/andonimichael/arxitect . I’ve found that agents by default produce tactical but brittle software. But if you teach agents to prioritize software architecture and design patterns, their code structure becomes much much better. Additionally, better structured code becomes more token efficient, requires less context to make changes, and coding agents become more accurate.
dirtbag__dad2 days ago
> Tests created a similar false comfort. Having 500+ tests felt reassuring, and AI made it easy to generate more. But neither humans nor AI are creative enough to foresee every edge case you’ll hit in the future; there are several times in the vibe-coding phase where I’d come up with a test case and realise the design of some component was completely wrong and needed to be totally reworked. This was a significant contributor to my lack of trust and the decision to scrap everything and start from scratch.
This is my experience. Tests are perhaps the most challenging part of working with AI.
What’s especially awful is any refactor of existing shit code that does not have tests to begin with, and the feature is confusing or inappropriately and unknowingly used multiple places elsewhere.
AI will write test cases that the logic works at all (fine), but the behavior esp what’s covered in an integration test is just not covered at all.
I don’t have a great answer to this yet, especially because this has been most painful to me in a React app, where I don’t know testing best practices. But I’ve been eyeing up behavior driven development paired with spec driven development (AI) as a potential answer here.
Curious if anyone has an approach or framework for generating good tests
- suzzer992 days ago
  I've always thought that writing good tests (unit, integration or e2e) is harder than the actual coding by maybe an order of magnitude.
  The tricky part of unit tests is coming up with creative mocks and ways to simulate various situations based on the input data, w/o touching the actual code.
  For integration tests, it's massaging the test data and inputs to hit every edge case of an endpoint.
  For e2e tests, it's massaging the data, finding selectors that aren't going to break every time the html is changed, and trying to winnow down to the important things to test - since exhaustive e2e tests need hours to run and are a full-time job to maintain. You want to test all the main flows, but also stuff like handling a back-end system failure - which doesn't get tested in smoke tests or normal user operations.
  That's a ton of creativity for AI to handle. You pretty much have to tell it every test and how to build it.
- viktorianer2 days ago
  The false comfort usually comes from line coverage. I had a model at 97.2% coverage, 92 tests. Ran equivalence partitioning on it (partition inputs into classes, test one from each) and found 6 real gaps: a search scope testing 1 of 5 fields, a scope checking return type but not filtering logic, a missing state machine branch. SimpleCov said the file was covered. The logical input space was not. The technique is old (ISTQB calls it specification-based testing) but the manual overhead made it impractical until recently. Agents made it possible to apply across 60+ models, which is the one thing they have changed for testing so far.
- michaelsalim20 hours ago
  Personally i think the challenge of testing never really changed with AI. You need to know what you want to specifically test before writing/vibe coidng with it. Otherwise it'll just manufacture tests that always passes and are of 0 value.
  If some component doest benefit from being extensively tested, then it's still the same today. The difference is now it's so easy to generate something, no matter how useless it is. Worse part is, no one cares. Test passes, it doesn't affect production, line coverage increases, managers think the software is more tested, developers just let a prompt do everything. It's all just testing theatre.
  I think E2E is the more important than ever. AI is pretty good at getting the local behaviour correct. So unit tests are of less value. Same can't be said for the system as a whole. The best part is, AI is actually pretty good at writing E2E tests. Ofc, given that you already know what you want to test
- whattheheckheck2 days ago
  Use tla+ and have it go back and forth with you to spec out your system behavior then iterate on it trying to link the tla+ spec with the actual code implementing it
  Pull out as many pure functions as possible and exhaustively test the input and output mappings.
  - gck17 hours ago
    The problem with specs for me is always with boundaries. How many specs do you have for a complex project? How do they reference each other? What happens when requirements cross boundaries?
    And finally, how do you address spec drift?
- szundi2 days ago
  [dead]
rokob2 days ago
> architecture is what happens when all those local pieces interact, and you can’t get good global behaviour by stitching together locally correct components
This is a great article. I’ve been trying to see how layered AI use can bridge this gap but the current models do seem to be lacking in the ambiguous design phase. They are amazing at the local execution phase.
Part of me thinks this is a reflection of software engineering as a whole. Most people are bad at design. Everyone usually gets better with repetition and experience. However, as there is never a right answer just a spectrum of tradeoffs, it seems difficult for the current models to replicate that part of the human process.
- physicles2 days ago
  I’ve had a couple wins with AI in the design phase, where it helped me reach a conclusion that would’ve taken days of exploration, if I ever got there. Both were very long conversations explicitly about design with lots of back and forth, like whiteboarding. Both involved SQL in ClickHouse, which I’m ok but not amazing at — for example I often write queries with window functions, but my mental model of GROUP BY is still incomplete.
  In one of the cases, I was searching for a way to extract a bunch of code that 5-6 queries had in common. Whatever this thing was, its parameters would have to include an array/tuple of IDs, and a parameter that would alter the table being selected from, neither of which is allowed in a clickhouse parameterized view. I could write a normal view for this, but performance would’ve been atrocious given ClickHouse’s ok-but-not-great query optimizer.
  I asked AI for alternatives, and to discuss the pros and cons of each. I brought up specific scenarios and asked it how it thought the code would work. I asked it to bring what it knew about SQL’s relational algebra to find the an elegant solution.
  It finally suggested a template (we’re using Go) to include another sql file, where the parameter is a _named relation_. It can be a CTE or a table, but it doesn’t matter as long as it has the right columns. Aside from poor tooling that doesn’t find things like typos, it’s been a huge win, much better than the duplication. And we have lots of tests that run against the real database to catch those typos.
  Maybe this kind of thing exists out there already (if it does, tell me!) but I probably wouldn’t have found it.
  - neonstatic2 days ago
    This rings a bell. The model is a search engine, that understands concepts - to some degree. It can find a concept, that you currently need.
- laanako08a day ago
  [dead]
- MeetRickAI2 days ago
  [dead]
lubujackson2 days ago
Long term, I think the best value AI gives us is a poweful tool to gain understanding. I think we are going to see deep understanding turn into the output goal of LLMs soon. For example, the blocker on this project was the dense C code with 400 rules. Work with LLMs allowed the structure and understanding to be parsed and used to create the tool, but maybe an even more useful output would be full documentation of the rules and their interactions.
This could likely be extracted much easier now from the new code, but imagine API docs or a mapping of the logical ruleset with interwoven commentary - other devtools could be built easily, bug analysis could be done on the structure of rules independent of code, optimizations could be determined on an architectural level, etc.
LLMs need humans to know what to build. If generating code becomes easy, codifying a flexible context or understanding becomes the goal that amplifies what can be generated without effort.
- thunfischbrot2 days ago
  Looks like a clear divide in people‘s experiences based on how they use these new tools:
  1) All-knowing oracle which is lightly prompted and develops whole applications from requirements specification to deployable artifacts. Superficial, little to no review of the code before running and committing.
  2) An additional tool next to their already established toolset to be used inside or alongside their IDE. Each line gets read and reviewed. The tool needs to defend their choices and manual rework is common for anything from improving documentation to naming things all the way to architectural changes.
  Obviously anything in between as well being viable. 1) seems like a crazy dead-end to me if you are looking to build a sustainable service or a fulfilling career.
mossBenchwrighta day ago
This is a really good article but one of the paragraphs at the end rubs me the wrong way.
> In theory, you can try to preserve this context by keeping specs and docs up to date. But there’s a reason we didn’t do this before AI: capturing implicit design decisions exhaustively is incredibly expensive and time-consuming to write down. AI can help draft these docs, but because there’s no way to automatically verify that it accurately captured what matters, a human still has to manually audit the result. And that’s still time-consuming.
I agree that it's time consuming and we don't have a good solution yet, but my guess is that a huge part of the next 3 years of iteration in the craft of Software Engineering is going to be creating tools and practices to make this possible. Especially as AIs get better at the actual writing of the code, the key failure mode for agentic coding is going to be the intent gap between what you asked for and what you wanted.
PaulHoule2 days ago
Note I believe this one because of the amount of elbow grease that went into it: 250 hours! Based on smaller projects I’ve done I’d say this post is a good model for what a significant AI-assisted systems programming project looks like.
jillesvangurp2 days ago
This is the hardest it's ever going to be. That's been my mode for the last year. A lot of what I did in the last month was complete science fiction as little as six months ago. The scope and quality of what is possible seems to leap ahead every few weeks.
I now have several projects going in languages that I've never used. I have a side project in Rust, and two Go projects. I have a few decades experience with backend development in Java, Kotlin (last ten years) and occasionally python. And some limited experience with a few other languages. I know how to structurer backend projects, what to look for, what needs testing, etc.
A lot of people would insist you need to review everything the AI generates. And that's very sensible. Except AI now generates code faster than I can review it. Our ability to review is now the bottleneck. And when stuff kind of works (evidenced by manual and automated testing), what's the right point to just say it's good enough? There are no easy answers here. But you do need to think about what an acceptable level of due diligence is. Vibe coding is basically the equivalent of blindly throwing something at the wall and seeing what sticks. Agentic engineering is on the opposite side of the spectrum.
I actually emphasize a lot of quality attributes in my prompts. The importance of good design, high cohesiveness, low coupling, SOLID principles, etc. Just asking for potential refactoring with an eye on that usually yields a few good opportunities. And then all you need to do is say "sounds good, lets do it". I get a little kick out of doing variations on silly prompts like that. "Make it so" is my favorite. Once you have a good plan, it doesn't really matter what you type.
I also ask critical questions about edge cases, testing the non happy path, hardening, concurrency, latency, throughput, etc. If you don't, AIs kind of default to taking short cuts, only focus on the happy path, or hallucinate that it's all fine, etc. But this doesn't necessarily require detailed reviews to find out. You can make the AI review code and produce detailed lists of everything that is wrong or could be improved. If there's something to be found, it will find it if you prompt it right.
There's an art to this. But I suspect that that too is going to be less work. A lot of this stuff boils down to evolving guardrails to do things right that otherwise go wrong. What if AIs start doing these things right by default? I think this is just going to get better and better.
- jaccola2 days ago
  But why are you making projects in so many languages? The language is very rarely the barrier to performance, especially if you don't even understand the language.
  - jillesvangurp2 days ago
    I try to pick the language best to the situation rather than giving into my own biases. I need to broaden my horizon to be able to cover the full stack of stuff that I need, not just the things I've been doing myself a lot for years. There's a lot of stuff that used to be out of my comfort zone that I can now tackle easily. Stepping over my own biases is part of that.
    I know not everybody is quite ready for this yet. But I'm working from the point of view that I won't be manually programming much professionally anymore.
    So, I now pick stuff I know AIs supposedly do well (like Go) with good solid tool and library ecosystems. I can read it well enough; it's not a hard language and I've seen plenty other languages. But I'm clearly not going to micro manage a Go code base any time soon. The first time I did this, it was an experiment. I wanted to see how far I could push the notion. I actually gave it some thought and then I realized that if I was going to do this manually I would pick what I always pick. But I just wasn't planning to do this manually and it wasn't optimal for the situation. It just wasn't a valid choice anymore.
    Then I repeated the experiment again on a bigger thing and I found that I could have a high level discussion about architectural choices well enough that it did not really slow me down much. The opposite actually. I just ask critical questions. I try to make sure to stick with mainstream stuff and not get boxed into unnecessary complexity. A few decades in this industry has given me a nose for that.
    My lack of familiarity with the code base is so far not proving to be any issue. Early days, I know. But I'm generating an order of magnitude more code than I'll ever be able to review already and this is only going to escalate from here on. I don't see a reason for me to slow down. To be effective, I need to engineer at a macro level. I simply can't afford to micro manage code bases anymore. That means orchestrating good guard rails, tests, specifications, etc. and making sure those cover everything I care about. Precisely because I don't want to have to open an editor and start fixing things manually.
    As for Rust, that was me not thinking about my prompt too hard and it had implemented something half decent by the time I realized so I just went with it. To be clear, this one is just a side project. So, I let it go (out of curiosity) and it seems to be fine as well. Apparently, I can do Rust now too. It's actually not a bad choice objectively and so far so good. The thing is, I can change my mind and redo the whole thing from scratch and it would not be that expensive if I had to.
  - nsvd2a day ago
    In my experience Rust and Go are both opinionated languages with strong types which makes them work well with agentic coding.
moshib2 days ago
> There’s an uncomfortable parallel between using AI coding tools and playing slot machines28. You send a prompt, wait, and either get something great or something useless. I found myself up late at night wanting to do “just one more prompt,” constantly trying AI just to see what would happen even when I knew it probably wouldn’t work. The sunk cost fallacy kicked in too: I’d keep at it even in tasks it was clearly ill-suited for, telling myself “maybe if I phrase it differently this time.”
Oof, this hit very close to home. My workplace recently got, as a special promotion, unlimited access to a coding agents with free access to all the frontier models, for a limited period of time. I find it extremely hard to end my workday when I get into the "one more prompt" mindset, easily clocking 12-hour workdays without noticing.
ang_cire2 days ago
It's a huge mistake to start building with Claude without mapping out a project in detail first, by hand. I built a pretty complex device orchestration server + agent recently, and before I set Claude to actually coding I had ~3000 lines of detailed design specs across 7 files that laid out how and what each part of the application would do.
I didn't have to review the code for understanding what Claude did, I reviewed it for verifying that it did what it had been told.
It's also nuts to me that he had to go back in later to build in tests and validation. The second there is an input able to be processed, you bet I have tests covering it. The second a UI is being rendered, I have Playwright taking screenshots (or gtksnapshot for my linux desktop tools).
I think people who are seeing issues at the integration phase of building complex apps are having that happen because they're not keeping the limited context in mind, and preempting those issues by telling their tools exactly how to bridge those gaps themselves.
- lalitmagantia day ago
  > It's a huge mistake to start building with Claude without mapping out a project in detail first, by hand.
  I agree with you in theory but in my opinion, it doesn't work so well when you don't even know what exactly you are looking for at the start. Yes I knew I wanted a formatter, linter, parse but which language should those be written in, should they be one project or many, how the pieces should fit together, none of that was clear to me.
  As I pointed out in the article, in these sort of "greenfield projects" I work a lot better with concrete prototypes and code in front of me I can dissect instead of trying to endlessly play with designs in my head.
  > It's also nuts to me that he had to go back in later to build in tests and validation.
  I think this is a little misleading. Yes I did do some testing retroactively (i.e. the upstream validation testing) but I was using TDD + verifying outputs immediately, even during the vibe coding phase. The problem as I point out is that this is not enough. Even when I had unit tests written at the same time as they code, they had lots of holes and over time, I kept hitting SQL statements which failed which the testing did not cover.
  - ang_cire3 hours ago
    > in these sort of "greenfield projects" I work a lot better with concrete prototypes and code in front of me I can dissect instead of trying to endlessly play with designs in my head
    I'd really recommend separating prototyping work like that out into a pre-design phase. Do the prototypes and figure out the direction for the actual project, but then come back in with a clean repo and design docs built off the prototypes, for claude to work from. I started out using claude to refactor my old projects (or even my codex ones) before I realized it worked better starting fresh.
    I think sometimes it silently decides that certain pieces of code or design are absolute constraints, and won't actually remove or change them unless you explicitly tell it to. Usually I run into this towards the end of implementation, when I'll see something I don't expect to and have to tell it to rip it out.
    One example recently was an entire messaging queue (nats jetstream) docker image definition that was sitting in the deploy files unused, but claude didn't ever mention or care about it as it worked on those files; it just silently left it sitting there.
    Another example was an auth-bypass setting I built in for local testing during prototyping, being not just left alone by Claude but actually propagated into other areas of the application (e.g. API) without asking.
- DANmodea day ago
  Do you write anywhere else?
  Really enjoy your style.
  - ang_cirea day ago
    Thank you! Not anywhere but forums. I used to have a blog, but I haven't posted anything in years.
cloche2 days ago
Really great to see a realistic experience sans hype about AI tools and how they can have an impact.
> But when I reviewed the codebase in detail in late January, the downside was obvious: the codebase was complete spaghetti...It was extremely fragile; it solved the immediate problem but it was never going to cope with my larger vision...I decided to throw away everything and start from scratch
This part was interesting to me as it lines up with Fred Brooks "throw one away" philosophy: "In most projects, the first system built is barely usable. Hence plan to throw one away; you will, anyhow."
As indicated by the experience, AI tools provide a much faster way of getting to that initial throw-away version. That's their bread and butter for where they shine.
Expecting AI tools to go directly to production quality is a fool's errand. This is the right way to use AI - get a quick implementation, see how it works and learn from it but then refactor and be opinionated about the design. It's similar to TDD's Red, Green, Refactor: write a failing test, get the test passing ASAP without worrying about code quality, refactor to make the code better and reliable.
In time, after this hype cycle has died down, we'll come to realize that this is the best way to make use of AI tools over the long run.
> When I had energy, I could write precise, well-scoped prompts and be genuinely productive. But when I was tired, my prompts became vague, the output got worse
This part also echoes my experience - when I know well what I want, I'm able to write more specific specifications and guide along the AI output. When I'm not as clear, the output is worse and I need to spend a lot more time figuring it out or re-prompting.
- GrumpyYoungMan2 days ago
  > Fred Brooks "throw one away" philosophy
  Everybody remembers that soundbite but nobody remembers that he changed his mind about it later and switched to advocating iterative refinement.
  - cloche2 days ago
    Oh interesting, I hadn't heard that. Do you know where he said that? A quick Google search doesn't turn up anything.
- mockinglorisa day ago
  @cloche... spot on.
  If we are all honest, it seems to be the case - most of the time:
  - Refactoring (Sometimes starting again.. this is rarely starting from scratch as there would have been some insights and personal design decisions garnered from the previous experience) - Specificity (It is heavily influenced by energy which also is different depending time of day or on the individual level)
  At the end of the day, it takes taste + experience of the user to make anything of notable complexity(architecture) with AI.(For now and the nearest future at least).
  I find reading articles as this gives me a renewed sense of agency as a technologist and my growing list of passions.
  A solid thank you to Lalit Maganti for sharing and the better HN community. I found a lot to steal reuse from the material/banter.
zellyn2 days ago
Does SQLite not have a lemon parser generated for its SQL?
When I ported pikchr (also from the SQLite project) to Go, I first ported lemon, then the grammar, then supporting code.
I always meant to do the same for its SQL parser, but pikchr grammar is orders of magnitude simpler.
- lalitmagantia day ago
  When I refer to "extracting sources from the SQLite codebase" a big part of that was indeed referring to compiling Lemon and executing it against a custom implementation of parse.y [1].
  The problem comes from how SQLite's upstream parse.y works. Becuase it doesn't actually generate the parse tree, instead generating the bytecode directly, the intepretation of any node labelled "id" or "nm" is buried inside the source code behind many layers of functions. You can see for yourself by looking at SQLite's parse.y [2]
  [1] https://github.com/LalitMaganti/syntaqlite/tree/main/syntaql... [2] https://sqlite.org/src/file?name=src/parse.y&ci=trunk
  - zellyn21 hours ago
    Ah, that makes sense. Thanks for the details. I see now that your article basically had all the information I needed to figure this out if I’d thought a bit harder!
    Also, nice work: this makes the world just a little nicer!
- smartmic2 days ago
  Correct[0]. This was also my first thought after reading
  > Unfortunately, unlike many other languages, SQLite has no formal specification describing how it should be parsed. It doesn’t expose a stable API for its parser either. In fact, quite uniquely, in its implementation it doesn’t even build a parse tree at all9! The only reasonable approach left in my opinion is to carefully extract the relevant parts of SQLite’s source code and adapt it to build the parser I wanted
  Did they made a proper problem research in the first place?
  [0]: https://sqlite.org/lemon.html
  - afc2 days ago
    I was also baffled. "No formal specification"? Two minutes of browsing is enough to find it: https://github.com/sqlite/sqlite/blob/master/src%2Fparse.y
    lalitmaganti2 days ago
    I'm very well aware of parse.y, if you look into the syntaqlite code, you'd find it's a critical part of how the whole "source extraction" mentioned in the article works [1]
    To be clear when I say "formal specification", I'm not just talking about the formal grammar rules but also how those interpreted in practice. Something closer to the ECMAScript specification (https://ecma-international.org/publications-and-standards/st...).
    [1] https://github.com/LalitMaganti/syntaqlite/blob/93638c68f9a0...
smj-edison2 days ago
The description of working with AI tools really resonates with me. It's dangerous to work on my codebase when I'm tired, since I don't feel like doing it properly, so I play slots with Claude, and stay up later than I should. I usually come back later and realize the final code that gets generated is an absolute mess.
It is really good for getting up to speed with frameworks and techniques though, like they mentioned.
- zozbot2342 days ago
  You should take advantage of these states of cognitive exhaustion by asking Claude to document and explain the codebase to you, and checking whether it still makes sense. If there are things that you have trouble understanding in that state, make a note of them to check later whether they can be simplified.
  - danparsonsona day ago
    Or, you know, rest :-)
    shimmana day ago
    No, it's much better for the business magnates if you spend all your short lifespan on solving problems every waking hour. Don't worry about experiencing your life, you need to generate value!
- ulf-777232 days ago
  Same for me. What I liked about the article was the emphasis on the mental model. Staying up late using the a lot machine is not helping me to remember the model better
darkstarsysa day ago
This post is excellent, and accurately describes my experience writing pcons (pcons.org) as a side project. I was one of the original developers of SCons and have wanted to rebuild it better for more than a decade. All the same roadblocks Maganti describes kept me from starting it, and Claude Opus 4.6 suddenly opened the door, and now it's live and people are starting to use it as a cmake or scons replacement. My experience over the last few months mirror Maganti's in many ways: ease of refactoring, investigating many more design ideas, getting frustrated with blind alleys and its not understanding the big picture, and ultimately getting a product I'm proud of.
Vision, taste and good judgment are going to be the key skills for software developers from now on.
bytefish2 days ago
This resonates with my experience.
I have several Open Source projects and wanted to refactor them for a decade. A week ago I sat down with Google Gemini and completely refactored three of my libraries. It has been an amazing experience.
What’s a game changer for me is the feedback loop. I can quickly validate or invalidate ideas, and land at an API I would enjoy to use.
- suzzer992 days ago
  Did you already have good integration tests?
  - bytefisha day ago
    I had, yes. But those had also been rewritten with Gemini from ground up, because it caught way more edge cases, than I did. Which is fascinating and frightening at the same time.
    I think my vibe-coding success also has to do with the problem being not that “novel” and prior art exists.
    Nevertheless still impressive.
alexpotatoa day ago
Been using LLMs both at work (FinTech DevOps/SRE) and on side projects (big data, games, websites) and here has been my "arc"
- first used copy and paste in and out of Grok
- started using CLI tools e.g. Claude and OpenCode
- move up to using 3 and sometimes 4 agents at the same time
- considered going to the agents managing agents
- have settled on having LLMs build tools that are both deterministic, usable by humans and the agent, and also faster (b/c there is less "back and forth")
Honestly, it feels a LOT like when Kubernetes came out. e.g. you stopped running containers on a box using Docker Compose plus scripts/configs etc. Instead gave a large part of the operation to an "agent" (in this case k8s) that managed all of the details you didn't need to care about anymore.
I've also realized that while the LLMs can crank out code at a very high rate, someone still needs to make sure everything is running, debug issues etc. You could set up agents to monitor what the agents do but then you still end up with someone needing to keep an eye on everything. If anything, you need MORE people b/c now you can just keep spinning up new components etc.
Also, was in a discussion with one of the best developers I've ever worked with. It came down to the following point:
"Programming is rapidly becoming a hobby. Software engineering is becoming more important than ever."
pwr12 days ago
This resonates. I had a project sitting in my head for years and finally built it in about 6 weeks recently. The AI part wasn't even the hard part honestly, it was finally commiting to actually shipping instead of overthinking the architecture. The tools just made it possible to move fast enough that I didn't lose momentum and abandon it like every other time.
DareTheDev2 days ago
This is very close to my experience. And I agree with the conclusion I would like to see more of this
tech_kena day ago
> When I was working on something I already understood deeply, AI was excellent. I could review its output instantly, catch mistakes before they landed and move at a pace I’d never have managed alone.
This precisely captures my experience with AI tools. When I understand the domain very deeply, AI feels like magic. I can tell it exactly how I want something implemented and it just appears in 30 seconds. When I don't understand something very well, however, I get easily misled by bogus design choices that I've delegated to the AI. It's so easy for me to spend 4 hours drafting some prototype in an almost dreamlike state of productive bliss, only for it to crash apart when I discover some fundamental bug in the thing I've vibecoded.
- chappyasel21 hours ago
  [dead]
billylo2 days ago
Thank you. The learning aspect of reading how AI tackles something is rewarding.
It also reduces my hesitation to get started with something I don't know the answer well enough yet. Time 'wasted' on vibe-coding felt less painful than time 'wasted' on heads-down manual coding down a rabbit hole.
bigcat12345678a day ago
> Unfortunately, unlike many other languages, SQLite has no formal specification describing how it should be parsed.
BorgCfg had exactly the same situation.
mpvl (borgcfg original author, author of https://cuelang.org/) and others had tried to refine bcl while bcl itself is underspecified.
Eventually, the team built a drop-in replacement of bcl and specced out the language almost entirely.
The biggest lesson to me was that engineering never has any short cut.
noritaka8812 hours ago
Feels like the real bottleneck is shifting.
AI makes it easy to generate working code quickly, but the limiting factor becomes understanding and trust.
You end up with systems that “work”, but nobody fully understands how or why.
At some point, the problem isn’t generation anymore — it’s verification.
Curious how others are dealing with this in practice.
myultidevhq2 days ago
The 8-year wait is the part that stands out. Usually the question is "why start now" not "why did it take 8 years". Curious if there was a specific moment where the tools crossed a threshold for you, or if it was more gradual.
- bdcravens2 days ago
  For me, the amount of tedium that comes with any new project before I can get to the "good stuff" is a blocker. It's so easy to sit down with excitement, and then 3 hours later, you're still wrestling with basic dependencies, build pipelines, base CSS, etc.
  - 8organicbits2 days ago
    Have you tried using starting templates for projects? For many platforms there are cookiecutters or other tools to jump over those.
- jayd162 days ago
  It's kind of click bait tho. "I took 3 months and AI to build a SQLite tool" is not going to stand out. The 8 year wait gives a sense of scale or difficulty but that's actually an illusion and does not reflect the task itself.
bvan2 days ago
This a very insightful post. Thanks for taking the time to share your experience. AI is incredibly powerful, but it’s no free-lunch.
simondotau2 days ago
This essay perfectly encapsulates my own experience. My biggest frustration is that the AI is astonishingly good at making awful slop which somehow works. It’s got no taste, no concern for elegance, no eagerness for the satisfyingly terse. My job has shifted from code writer to quality control officer.
Nowhere is this more obvious in my current projects than with CRUD interface building. It will go nuts building these elaborate labyrinths and I’m sitting there baffled, bemused, foolishly hoping that THIS time it would recognise that a single SQL query is all that’s needed. It knows how to write complex SQL if you insist, but it never wants to.
But even with those frustrations, damn it is a lot faster than writing it all myself.
- pizzafeelsright2 days ago
  Trim your scope and define your response format prior to asking or commanding.
  Most of my questions are "in one sentence respond: long rambling context and question"
- mtrifonova day ago
  The "no taste" thing is real when AI is in generate-for-me mode. It's trying to fulfill your request, and won't evaluate it unsolicited. But if you change the relationship and let it react to what you're building instead of building it for you, aesthetic judgment shows up immediately. It'll tell you something is ugly or overengineered or missing the point. The taste was always in the model, it just can't express it while it's busy being obedient.
- lyricalstringa day ago
  This is the thing that gets me. The code compiles. Passes tests. So you stop reading it. Why wouldn't you.
  Then three weeks later you're tracing some control flow that makes no sense and nobody knows why it's structured that way. Not you, not the model. I've been treating it like code from a contractor now, review every line same as a junior dev's PR. Gets tedious but the alternative is worse.
  - simondotaua day ago
    I’ve been treating it like a glorified autocomplete, or a glorified search and replace. Everything else is saxophone jazz when I’m writing for a string quartet: useful for inspiration, useful for understanding what isn’t clearly explained, sometimes it builds a decent first attempt, occasionally it gets shockingly close, but I’ve learned to never let my guard down. Go too far and untangling its slop becomes burdensome. Leave it to its own devices for more than a few rounds and it can become so unfixable it’s easier to start from scratch.
dcre2 days ago
"Knowing where you are on these axes at any given moment is, I think, the core skill of working with AI effectively."
I like this a lot. It suggests that AI use may sometimes incentivize people to get better at metacognition rather than worse. (It won't in cases where the output is good enough and you don't care.)
sebastianconcpt2 days ago
In a not so far future, people will be amazed that these dense pieces of source code were done by hand and meant to be maintained by people. Same type of amazing you see when thinking in the internals of The Silver Swan or any other famous mechanical automaton.
stepan_l2 days ago
I had the same experience, been working on my project for a few months and it started very easy and then I lost control of the code base. Had to rewrite a lot of things. The code AI writes does not look bad, but there is something wrong about it. It just does not feel right. You still need to steer it a lot. But I am very happy that I could write a quite complex project with almost no dependencies at all. Only used Electron. I don't even use npm. That is very promising how far you can get without relying on any libraries/frameworks. You can check it here https://github.com/AgentWFY/AgentWFY MIT license.
The_Goonies19852 days ago
The author mentions a C codebase. Is AI good at coding in C now? If so, which AI systems lead in this language?
Ideally: local; offline.
Or do I have to wrestle it for 250 hours before it coughs up the dough? Last time I tried, the AI systems struggled with some of the most basic C code.
It seemed fine with Python, but then my cat can do that.
- Morpheus_Matrix2 days ago
  [flagged]
  - The_Goonies19852 days ago
    Thanks Morpheus_Matrix. I'll take a look at Qwen 2.5 Coder 32B for offline C. I appreciate your guidance.
    By extraordinary coincidence, I was just a moment ago part-of-the-way through re-watching The Matrix (1999) and paused it to check Hacker News. There your reply greeted me.
    Wild glitch!
    adrian_b2 days ago
    There is also a successor to that: Qwen3-Coder-Next, which is a newer and bigger model, but it obviously requires more hardware resources, being an 80B model.
    However it is likely to be the most powerful open weights coding assistant that you can run locally, without having to worry about token price or reaching the subscription limits in the most inconvenient moment.
nektroa day ago
> I’ve long been puzzled that no one has invested in building a really good developer experience for it.
https://sqlitebrowser.org/
> Unfortunately, unlike many other languages,
what
> SQLite has no formal specification describing how it should be parsed.
https://sqlite.org/syntax.html
- lalitmagantia day ago
  > https://sqlitebrowser.org/
  sqlitebrowser.org is cool but it's not the sort of developer tools I'm talking about. As I clarify in the side notes, I'm looking for a formatter, linter, LSP, not an IDE.
  > https://sqlite.org/syntax.html
  As I replied to some other comment, I'm very aware that there is a syntax diagram but that really only tells half the story. If you actually look at those diagrams into detail, or you look into the the actual parse.y grammar (https://sqlite.org/src/file?name=src/parse.y&ci=trunk), you'll find that they're missing a lot of information which is required for you to actually interpret the SQL into an AST.
  When I say "formal specification", I'm not just talking about the formal grammar rules but also how those interpreted in practice. Something closer to the ECMAScript specification (https://ecma-international.org/publications-and-standards/st...).
eviksa day ago
> spent weeks in the early days following AI down dead ends, exploring designs that felt productive in the moment but collapsed under scrutiny
> I paid for that with a total rewrite.
With so much waste and not a single example of the "brilliant at giving you the right answer to a specific technical question"
> The takeaway for me is simple: AI is an incredible force multiplier
Seems more like a feel multiplier, rather than force.
> 500 tests, many of which I felt I could reuse
Indeed, feeling is the only saving grace for a mountain of random unreviewed tests
- lalitmagantia day ago
  > With so much waste and not a single example of the "brilliant at giving you the right answer to a specific technical question"
  In my opinion, "giving me a better understanding for the architecture of the project" is reasonable technical compensation.
  > Indeed, feeling is the only saving grace for a mountain of random unreviewed tests
  I think I say a line or two above that this was after a review of the codebase so I did review these tests.
  - eviksa day ago
    How many of 500 tests were actually reviewed/tested and found good? The code review results were: don't understand code and/or it's pretty bad. Then 0 of those 500 tests were used due to the full rewrite. So nothing to extrapolate usefulness from, all that's left is a feel...
    > giving me better understanding
    Examples of that would also be nice (I don't doubt the personal feel that waste was justified)
    > JOURNAL before: ...
    > JOURNAL after: ... > was wrong here, learned this
    lalitmagantia day ago
    > How many of 500 tests were actually reviewed/tested and found good?
    Essentially ~all of the tests were found to be useful but in a more "smoke test" capacity i.e. they provided good "basic" coverage but it was clear that it was also not sufficient.
    Which is why in the rewrite: 1) I built a TCL driver that run the upstream SQLite tests and verified we accepted or rejected the SQL in the same way as SQLite.
    2) I wrote a test runner which checked for "idempotence" i.e. run the formatter over all the SQL from all the other types of tests then verify that the AST was identical in the input and output.
    3) I also wrote a script which ran the formatter over the PerfettoSQL standard library [1], a real world SQLite-based codebase that I knew and deeply understood so I could go through each file and manually check the output.
    > Examples of that would also be nice (I don't doubt the personal feel that waste was justified)
    Some things learned concretely:
    1) C was not going to work for the higher level parts of the project, even the formatter was not pleasant to read or write in C, the validator was much worse
    2) Doing the SQLite source extraction in the same language meant that I could ship a really cool feature where the syntaqlite CLI could "generate dialect extensions" without people needing to download a separate script, run their own extraction on the SQLite source code, or worse yet, need to fork syntaqlite. This actually makes it technically possible for people in the web playground to dynamically build extensions to SQLite (though I haven't ended up plumbing that feature through yet)
    3) Having a DSL [2] for extensions of SQLite (that e.g. PerfettoSQL could use) was the correct way to go rather than using YAML/JSON/XML etc becaue of how much clarity it provided and how AI took a lot of the annoyance of maintaining a DSL away.
    4) I need to invest much more in testing from the start and also more testing where the correctness can be "proved" in some way (e.g. idempotence testing or SQLite upstream testing as described above)
    [1] https://github.com/google/perfetto/tree/main/src/trace_proce... [2] https://docs.syntaqlite.com/v0.2.15/guides/custom-dialects/
2 days ago
undefined
javierhonduco2 days ago
Great write-up. As a side note (not a Googler myself and this is 100% my opinion) Lalit’s team was hiring in London, UK. If you are interested in working in low level performance tools, this might be a very cool opportunity!
moropexa day ago
Had a similar experience recently. AI-generated code that worked, tests passing, but I couldn't explain how half of it worked. Starting over with a clear mental model and using AI as an accelerator instead of a replacement made all the difference.
throwaway470012 days ago
I appreciate these kind of fact-based posts. Thank you for this.
Unfortunately, AI seems to be divisive. I hope we will find our way back eventually. I believe the lessons from this era will reverberate for a long time and all sides stand to learn something.
As for me, I can’t help but notice there is a distinct group of developers that does not get it. I know because they are my colleagues. They are good people and not unintelligent, but they are set in their ways. I can imagine management forcing them to use AI, which at the moment is not the case, because they are such laggards. Even I sometimes want to “confront” them about their entire day wasted on something even the free ChatGPT would have handled adequately in a minute or two. It’s sad to see actually.
We are not doing important things and we ourselves are not geniuses. We know that or at least I know that. I worry for the “regular” developer, the one that is of average intellect like me. Lacking some kind of (social) moat I fear many of us will not be able to ride this one out into retirement.
- vaylian2 days ago
  > because they are such laggards
  I am a technologist. But I am seriously concerned about the ecological consequences of the training and usage of AI. To me, the true laggards are those, who have not understood yet, that climate change requires a prudent use of our resources.
  I don't mind people having fun or being productive with AI. But I do mind it when AI is presented as the only way of doing things.
  - throwaway47001a day ago
    I get that, I do. I hate to say it, but our entire civilization is about not giving a damn about resources. Cars, flying on holidays, industrialization in general I guess. Many things are a ridiculous waste of resource if you think about it.
    The counter here would be, what if AI could be made efficient? Suddenly OK then? Is it truly about the resources?
    Walking to the nearest farm with my horse is much, much more sustainable than maintaining a sprawling toxic civilizational level infrastructure so I can go into my car to the supermarket. I get your point, but nearly every aspect of our world is filled to the brim with mind boggling complexity and corresponding resource usage and we get used to it.
  - wiether2 days ago
    Don't waste time thinking about the comment you replied to.
    Only an AI would bother to create a throwaway account to post such a shallow comment that is mostly fearmongering to push people to use AI.
abc123abc123a day ago
I agree. Keeping it short, and keeping it checkable works for me. Once it starts to become long and fuzzy, AI gets derailed.
amaia day ago
The takeaway from the article:
"AI is an incredible force multiplier for implementation, but it’s a dangerous substitute for design."
looshch2 days ago
completely off-topic, but i love the fact that this blog has the exact shade of black for the background as my site loosh.ch. Guess we both took it from some of the Google product’s night theme
zer00eyz2 days ago
This article is describing a problem that is still two steps removed from where AI code becomes actually useful.
90 percent of the things users want either A) dont exist or B) are impossible to find, install and run without being deeply technical.
These things dont need to scale, they dont need to be well designed. They are for the most part targeted, single user, single purpose, artifacts. They are migration scripts between services, they are quick and dirty tools that make bad UI and workflows less manual and more managable.
These are the use cases I am seeing from people OUTSIDE the tech sphere adopt AI coding for. It is what "non techies" are using things like open claw for. I have people who in the past would have been told "No, I will not fix your computer" talk to me excitedly about running cron jobs.
Not everything needs to be snap on quality, the bulk of end users are going to be happy with harbor freight quality because it is better than NO tools at all.
- throw52 days ago
  > This article is describing a problem that is still two steps removed from where AI code becomes actually useful.
  But it does a good job of countering the narrative you often see on LinkedIn, and to some extent on HN as well, where AI is portrayed as all-capable of developing enterprise software. If you spend any time in discussions hyping AI, you will have seen plenty of confident claims that traditional coding is dead and that AI will replace it soon. Posts like this is useful because it shows a more grounded reality.
  > 90 percent of the things users want either A) dont exist or B) are impossible to find, install and run without being deeply technical. These things dont need to scale, they dont need to be well designed. They are for the most part targeted, single user, single purpose, artifacts.
  Yes, that is a particular niche where AI can be applied effectively. But many AI proponents go much further and argue that AI is already capable of delivering complex, production-grade systems. They say, you don't need engineers anymore. They say, you only need product owners who can write down the spec. From what I have seen, that claim does not hold up and this article supports that view.
  Many users may not be interested in scalability and maintainability... But for a number of us, including the OP and myself, the real question is whether AI can handle situations where scalability, maintainability and sound design DO actually matter. The OP does a good job of understanding this.
simonreiff2 days ago
Just wanted to say thanks to @brlee for the nice write-up and congrats on the release
4b11b42 days ago
Great write-up with provenance
FurstFlya day ago
Very refreshing take on ai coding
senthilnayagam2 days ago
when he decided on rust, he could have looked up sqlite port, libsqlite does a pretty good job.
forrestthewoods2 days ago
Really great post. Thanks for sharing.
edfletcher_t1372 days ago
> Of all the ways I used AI, research had by far the highest ratio of value delivered to time spent.
Seconded!
soursoup2 days ago
The author apparently skipped ai-assisted refactoring and auditing before moving to prod.
FpUser2 days ago
I do not have anything resembling problems described. Before I ask AI to create new code (except super trivial things). I first split application into smaller functional modules. I then design structure of the code down to main classes and methods and their interaction. Also try to keep scope small. Then AI just fills out the actual code. I have no problems reviewing it. Sometimes I discover some issues - like using arrays instead of maps leading to performance issues but it is easily spotted.
holoduke2 days ago
A key take away from this article is that you as a developer spending as much time on refactoring as on the actual feature. You are constantly requesting code reviews, architectural assessements, consolidations, extractions etc. only then you can empower AI to become a force multiplier. And prevent slop and spaghetti code to be created. Nice article
Adam_ciphera day ago
[dead]
panavma day ago
[dead]
Sim-In-Silico2 days ago
[dead]
dsteel2 days ago
[dead]
nightrate_aia day ago
[dead]
TraceAgently2 days ago
[dead]
meidad_g2 days ago
[dead]
techpulselab2 days ago
[dead]
meidad_g2 days ago
[dead]
afron_manyu2 days ago
[dead]
ryguz2 days ago
[dead]
toniantunovi2 days ago
[dead]
maryjeiela day ago
[dead]
aplomb10262 days ago
[dead]
visargaa day ago
[dead]
wei032882 days ago
[dead]
MarcelinoGMX3Ca day ago
[dead]
2 days ago
undefined
vlubeschanin2 days ago
[dead]
alejandrosplitt2 days ago
[dead]
maxbeech2 days ago
[dead]
2033journeya day ago
[dead]
huflungdung2 days ago
[dead]
Yash_Claw2 days ago
[dead]
BrookHoustona day ago
[dead]
rlenf2 days ago
[flagged]
- adrian_b2 days ago
  Unlike many claims that AI works that are clearly bogus, this actually seems quite credible, because TFA describes in detail many problems encountered, which could have easily lead to a failure of the project, if not properly addressed.
  There is no doubt that when used in the right way an AI coding assistant can be very helpful, but using it in the right way does not result in the fantastic productivity-increasing factors claimed by some. TFA describes a way of using AI that seems right and it also describes the temptations of using AI wrong, which must be resisted.
  More important is whether the productivity improvement is worth a subscription price. Nothing that I have seen until now convinces me about this.
  On the other hand, I believe that running locally a good open-weights coding assistant, so that you do not have to worry about token price or about exceeding subscription limits in a critical moment, is very worthwhile.
  Unfortunately, thieves like Altman have ensured that running locally has become much more difficult than last year, due to the huge increases in the prices of DRAM and of SSDs. In January I have been forced to replace an old mini-PC, but I was forced to put in the new mini-PC only 32 GB of DDR5, the same as in the 7-year old replaced mini-PC. If I had made the upgrade a few months earlier, I would have put in it 96 GB, which would have made it much more useful. Fortunately, I also have older computers with 64 GB or 128 GB DRAM, where bigger LLMs may be run.
  - steveBK1232 days ago
    > More important is whether the productivity improvement is worth a subscription price. Nothing that I have seen until now convinces me about this. On the other hand, I believe that running locally a good open-weights coding assistant, so that you do not have to worry about token price or about exceeding subscription limits in a critical moment, is very worthwhile.
    This is one thing I also wonder about. If it's a really good programming helper, making 20% of your job 5x faster, then you can compute the value. Say for a $250K SWE this looks like $40k/year roughly. You don't want to hand 100% of that value to the LLM providers or you've just broken even, so then maybe it is worth $200/mo.
    adrian_b2 days ago
    Such a reckoning is possible when the cost of a subscription is truly predictable.
    For now, there is a lot of unpredictability in the future cost of AI, whenever you do not host it yourself.
    If you pay per token, it is extremely hard to predict how many tokens you will need. If you have an apparently fixed subscription, it is very hard to predict whether you will not hit limits in the most inconvenient moment, after which you will have to wait for a day or so for the limits to be reset.
    Recently, there have been a lot of stories where the AI providers seem to try to reduce continuously the limits allowed by a subscription. There is also a lot of incertitude about future raises of the subscription prices, as the most important providers appear to use prices below their expenses, for now.
    Therefore, while I agree with you that when something provides definite benefits you should be able to assess whether paying for it provides a net gain for you, I do not believe that using an externally-hosted AI coding assistant qualifies for such an assessment, at least not for now.
    adrian_b2 days ago
    EDIT:
    After I have written the above, that the future cost of externally-hosted AI coding assistants is unpredictable, what I have written was confirmed by an OpenAI press release that the existing Codex users will be migrated during the following weeks towards token-based pricing rates.
    Such events will not affect you if you use an open-weights assistant running on your own HW, when you do not have to care about token usage.
    zozbot2342 days ago
    You do have to care about token usage when chosing how to scale your hardware. If you do a negligible amount of AI inference for occasional simple Q&A (which is what most people do), you can get away with a very lean and cheap setup even when running very large, sophisticated models. Agentic use with function calls and responses etc. raises the amount of tokens you use over time by at least one order of magnitude.
- NewsaHackO2 days ago
  It's funny that he used Claude instead of gemini for this. Idk if his company is happy with free advertisement of a competitor
  - What12932 days ago
    Google owns 14% of Anthropic:
    https://techcrunch.com/2025/03/11/google-has-given-anthropic...
    They don't care. They want software engineers replaced by any means necessary. They know generative AI isn't a big business, that is why they slowwalk it themselves.
    Replacement won't work of course, that is why marketing blog posts are needed.
    NewsaHackO2 days ago
    But they own 100% of Google, correct?
intensifier2 days ago
article looks like a tweet turned into 30 paragraphs. hardly any taste.
- cloche2 days ago
  This is what a lot of business books are TBH
- throw52 days ago
  Yes, how dare someone take an idea, develop it, and publish it outside the algorithm-driven rage pit. Truly terrible behavior! /s
  Expanding a thought beyond 280 characters and publishing it somewhere other than the X outrage machine is something we should be encouraging.
2 days ago
undefined
clawfund2 days ago
The 8 years part is the real story. A lot of founders we talk to have the same pattern — technically capable, clear idea, but never shipped. The common explanation is "I was waiting until I had more time" but when AI removed the time constraint, the same projects still didn't ship. What actually changed here wasn't just velocity, it was that vibe coding lowered the psychological cost of starting on something uncertain. The spaghetti codebase problem is real but secondary — you can refactor bad code. You can't refactor years of not starting.