Two kinds of vibe coding (davidbau.com)

137 points by jxmorris12 2 months ago | 26 comments

WhyOhWhyQ2 months ago
"the last couple weeks"
When I ran this experiment it was pretty exhilarating for a while. Eventually it turned into QA testing the work of a bad engineer and became exhausting. Since I had sunk so much time into it I felt pretty bad afterwards that not only did the thing it made not end up being shippable, but I hadn't benefitted as a human being while working on it. I had no new skills to show. It was just a big waste of time.
So I think the "second way" is good for demos now. It's good for getting an idea of what something can look like. However, in the future I'll be extremely careful about not letting that go on for more than a day or two.
- danabramov2 months ago
  I believe the author explicitly suggests strategies to deal with this problem, which is the entire second half of the post. There’s a big difference between when you act as a human tester in the middle vs when you build out enough guardrails that it can do meaningful autonomous work with verification.
  - WhyOhWhyQ2 months ago
    I'm just extremely skeptical about that because I had many ideas like that and it still ended up being miserable. Maybe with Opus 4.5 things would go better though. I did choose an extremely ambitious project to be fair. If I were to try it again I would pick something more standard and a lot smaller.
    I put like 400 hours into it by the way.
    stantonius2 months ago
    This is so relatable it's painful: many many hours of work, overly ambitious project, now feeling discouraged (but hopefully not willing to give up). It's some small consolation to me to know others have found themselves in this boat.
    Maybe we were just 6 months too early to start?
    Best of luck finishing it up. You can do it.
    WhyOhWhyQ2 months ago
    Thanks! Yes I won't give up. The plan now is to focus on getting an income and try again in the future.
  - irrationalfab2 months ago
    +1... like with a large enough engineering team, this is ultimately a guardrails problem, which in my experience with agentic coding is very solvable, at least in certain domains.
    majormajor2 months ago
    Like with large engineering teams I have little faith people will suddenly get the discipline to do the tedious, annoying, difficult work of building good enough guardrails now.
    We don't even build guardrails that keep humans who test stuff as they go from introducing subtle bugs by accident; removing more eyes from that introduces new risks (although LLMs are also better at avoiding certain types of bugs, like copypasta shit).
    "Test your tests" gets very difficult as a product evolves and increases in complexity. Few contracts (whether at the unit test level or the "automation clicking on the element on the page" level) are static enough to avoid needing to rework the tests, which means reworking the testing of the tests, ...
    I think we'll find out just how low the general public's tolerance for bugs and regressions is.
    davidbaua month ago
    No question this will be hard to do.
    But I am not so pessimistic. I do think it will be possible, because it is more fun to test your tests now than in the pre-LLM era. You just need a little bit of knowledge and patience, and the LLM absorbs most of the psychic pain.
    If programmers get accustomed to doing their tests of tests, software might actually get better.
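One concrete reading of "test your tests" is mutation testing: break the code on purpose and check that the suite notices. The sketch below is a minimal illustration under that assumption, not something the post or the commenters provide. `clamp`, the two mutants, and both suites are hypothetical names invented for the example.

```python
# Hypothetical example: none of these names come from the thread.

def clamp(x, lo, hi):
    """Function under test: restrict x to the range [lo, hi]."""
    return max(lo, min(x, hi))


def mutant_drop_upper(x, lo, hi):
    """Deliberately broken variant: forgets the upper bound."""
    return max(lo, x)


def mutant_drop_lower(x, lo, hi):
    """Deliberately broken variant: forgets the lower bound."""
    return min(x, hi)


def weak_suite(fn):
    """Checks an in-range value and a too-small value, but never a too-large one."""
    return fn(5, 0, 10) == 5 and fn(-3, 0, 10) == 0


def strong_suite(fn):
    """Adds the missing too-large case on top of the weak suite."""
    return weak_suite(fn) and fn(42, 0, 10) == 10


def surviving_mutants(suite, mutants):
    """Return the names of mutants that the suite fails to catch."""
    return [m.__name__ for m in mutants if suite(m)]


MUTANTS = [mutant_drop_upper, mutant_drop_lower]

if __name__ == "__main__":
    print("weak suite misses:", surviving_mutants(weak_suite, MUTANTS))
    print("strong suite misses:", surviving_mutants(strong_suite, MUTANTS))
```

Both suites go green on the real `clamp`, so passing alone does not tell them apart. Only the surviving mutant exposes the gap in the weak suite.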
- newspaper12 months ago
  I've had the opposite results. I used to "vibe code" in languages that I knew, so that I could review the code and, I assumed, contribute myself. I got good enough results that I started using AI to build tools in languages I had no prior knowledge of. I don't even look at the code any more. I'm getting incredible results. I've been a developer for 30+ years and never thought this would be possible. I keep making more and more ambitious projects and AI just keeps banging them out exactly how I envision them in my mind.
  To be fair I don't think someone with less experience could get these results. I'm leveraging everything I know about writing software, computer science, product development, team management, marketing, written communication, requirements gathering, architecture... I feel like vibe coding is pushing myself and AI to the limits, but the results are incredible.
  - WhyOhWhyQ2 months ago
    I've got 20 years of experience, but w/e. What have you made?
    newspaper12 months ago
    I don't want to dox myself since I'm doing it outside my regular job for the most part, but frameworks, apps (on those frameworks), low level systems stuff, linux-y things, some P2P, lots of ai tools. One thing I find it excels at is web front-end (which is my least favorite thing to actually code), easily as good as any front-end dev I've ever worked with.
    WhyOhWhyQ2 months ago
    I think my fatal error was trying to make something based on "novel science" (I'll be similarly vague). It was an extremely hard project to be fair to the AI.
    It is my life goal to make that project though. I'm not totally depressed about it because I did validate parts of the project. But it was a let down.
    newspaper12 months ago
    Baby steps are key for me. I can build very ambitious things but I never ask it to do too much at once. Focus a lot on having it get the docs right before it writes any code (it'll use the docs), and make the instructions reflexive (i.e. "update the docs when done"). Make libraries, composable parts... I don't want to be condescending since you may have tried all of that, but I feel like I'm treating it the same as when I architect things for large teams, thinking in layers and little pieces that can be assembled to achieve what I want.
    I'll add that it does require some banging your head against the wall at times. I normally will only test the code after doing a bunch of this stuff. It often doesn't work as I want at that point and I'll spend a day "begging" it to fix all of the problems. I've always been able to get over those hurdles, and I have it think about why it failed and try to bake the reasoning into the docs/tests... to avoid that in the future.
    WhyOhWhyQ2 months ago
    I did make lots of design documents and sub-demos. I think I could have been cleverer about finding smaller pieces of the project which could be deliverables in themselves and which the later project could depend on as imported libraries.
- stantonius2 months ago
  This happened to me too in an experimental project where I was testing how far the model could go on its own. Despite making progress, I can't bear to look at the thing now. I don't even know what questions to ask the AI to get back into it, I'm so disconnected from it. It's exhausting to think about getting back into it; I'd rather just start from scratch.
  The fascinating thing was how easy it was to lose control. I would set up the project with strict rules and md files and tell myself to stay fully engaged, but out of nowhere I slid into compulsive accept mode, or worse, told the model to blatantly ignore the rules I had set out. I knew better, yet it happened over and over. Ironically, it was as if my context window was so full of "successes" I forgot my own rules; I reward-hacked myself.
  Maybe it just takes practice and better tooling and guardrails. And maybe this is the growing pains of a new programmer's mindset. But it left me a little shy to try full delegation any time soon, certainly not without a complete reset on how to approach it.
  - parpfish2 months ago
    I’ll chime in to say that this happened to me as well.
    My project would start good, but eventually end up in a state where nothing could be fixed and the agent would burn tokens going in circles to fix little bugs.
    So I’d tell the agent to come up with a comprehensive refactoring plan that would allow the issues to be recast in more favorable terms.
    I’d burn a ton of tokens to refactor, little bugs would get fixed, but it’d inevitably end up going in circles on something new.
    FrinkleFrankle2 months ago
    That's kind of what learning to code is like, though. I assume you're using an llm because you don't know enough to do it entirely on your own. At least that's where I'm at and I've had similar experiences to you. I was trying to write a Rust program and I was able to get something in a working state, but wasn't confident it was secure.
    I've found getting the llm to ingest high quality posts/books about the subject and use those to generate anki cards has helped a lot.
    I've always struggled to learn from that sort of content on my own. That was leading me to miss some fundamental concepts.
    I expect to restart my project several more times as I find out more of what I need to know to write good code.
    Working with llms has made this so much easier. It surfaces ideas and concepts I had no idea about and makes it easy to convert them to an ingestible form for actual memorization. It makes cards with full syntax highlighting. It's delightful.
    WhyOhWhyQ2 months ago
    (I know you're replying to another guy but I just saw this.) I've been programming for 20 years, but I like the LLM as a learning assistant. The part I don't like is when you just come up with craftier and craftier ways to yell at it to do better, without actually understanding the code. The project I gave up on was at almost a million lines of code generated by the LLM, so it would have been impossible to easily restart it.
    danabramov2 months ago
    Curious if you have thoughts on the second half of the post? That’s exactly what the author is suggesting a strategy for.
    majormajor2 months ago
    "Test the tests" is a big ask for many complex software projects.
    Most human-driven coding + testing takes heavy advantage of being white-box testing.
    For open-ended complex-systems development turning everything into black-box testing is hard. The LLMs, as noted in the post, are good at trying a lot of shit and inadvertently discovering stuff that passes incomplete tests without fully working. Or if you're in straight-up yolo mode, fucking up your test because it misunderstood the assignment, my personal favorite.
    We already know it's very hard to have exhaustive coverage for unexpected input edge cases, for instance. The stuff of a million security bugs.
    So as the combinatorial surface of "all possible actions that can be taken in the system in all possible orders" increases because you build more stuff into your system, so does the difficulty of relying on LLMs looping over prompts until tests go green.
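To put rough numbers on that growth, here is a small worked count. The counting model is an assumption made for illustration and is not from the comment: every action may repeat, and a test scenario is any ordered sequence of 1 to `max_length` actions.

```python
def ordered_sequences(num_actions: int, max_length: int) -> int:
    """Count ordered action sequences of length 1..max_length, repeats allowed.

    This is the geometric sum n + n^2 + ... + n^k for n actions and length k.
    """
    return sum(num_actions ** length for length in range(1, max_length + 1))


if __name__ == "__main__":
    # Doubling the number of actions multiplies the surface by far more than 2.
    for n in (5, 10, 20):
        print(n, "actions, sequences up to length 3:", ordered_sequences(n, 3))
```

Under this model, 5 actions give 155 sequences of length up to 3, while 10 actions give 1110. That is the sense in which each added feature makes "loop until green" cover a smaller share of real behavior.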
  - davidbaua month ago
    "I reward-hacked myself" is a great way to put it!!
    AI is too aware of human behavior, and it is teaching us that willpower and config files are not enough. When the agent keeps producing output that looks like progress, it is hard not to accept. We need something external that pushes back when we don't.
    That is why automated tests matter: not just because they catch bugs (though they do), but because they are a commitment device. The agent can't merge until the tests pass. "Test the tests" matters because otherwise the agent just games whatever shallow metric we gave it, or when we're not looking, it guts the tests.
    The discipline needs to be structural, not personal. You cannot out-willpower a system that is totally optimized to make you say yes.
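As a minimal sketch of what "structural, not personal" could look like, the gate below refuses a merge when any test fails and also when tests have quietly disappeared. It is an illustration under assumed names, not the author's actual setup: `merge_blockers` and the name-to-result dictionaries are hypothetical, and a real pipeline would collect them from its own test runner.

```python
def merge_blockers(before: dict, after: dict) -> list:
    """Return the reasons a merge should be blocked (empty list means allowed).

    `before` and `after` are hypothetical snapshots mapping test name -> passed,
    taken on the base branch and on the agent's branch respectively.
    """
    blockers = []

    # Rule 1: every test on the agent's branch must pass.
    failing = sorted(name for name, passed in after.items() if not passed)
    if failing:
        blockers.append("failing tests: " + ", ".join(failing))

    # Rule 2: no test that existed before may vanish ("gutting the tests").
    removed = sorted(set(before) - set(after))
    if removed:
        blockers.append("tests removed: " + ", ".join(removed))

    return blockers


if __name__ == "__main__":
    base = {"test_login": True, "test_logout": True}
    gutted = {"test_login": True}  # all green, but only because a test is gone
    print(merge_blockers(base, gutted))
```

The second rule is the part that covers the "when we're not looking" case: a branch can be fully green and still be rejected.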
- vhill2 months ago
  > not only did the thing it made not end up being shippable
  The difference between then and now is that often with the latest models, it is shippable without bugs within a couple of LLM reviews.
  I’m ok doing the work of a dev manager while holding the position of developer.
  I’m sure there was someone that once said “The washing machine didn’t do a good job, and I wasn’t proud of myself when I was using it”, but that didn’t stop washing machines from spreading to most homes in first-world countries.
  - WhyOhWhyQ2 months ago
    Who are you arguing against? When did I say it wouldn't? I'm glad you like it. No need to fight me?
- imiric2 months ago
  > I think the "second way" is good for demos now.
  It's also good for quickly creating legitimate-looking scam and SEO spam sites. When they stop working, throw them away, and create a dozen more. Maintenance is not a concern. Scammers love this new tech.
  - keyle2 months ago
    Advertising campaigns as well, which, arguably, fits your categories.
  - yen2232 months ago
    This argument can be used to shut down anything that makes coding faster or easier. It's not a convincing argument to me.
arkensaw2 months ago
> As AI edges humans out of the business of thinking, I think we need to be wary of losing something essential about being human
If AI edges humans out of the business of thinking, then we're all in deep shit, because it doesn't think, it just regurgitates previous human thinking. With no humans thinking, no advances in code will be possible. It will only be possible to write things which are derivatives of prior work.
(cue someone arguing with me that everything humans do is a derivative of prior work)
- adlpz2 months ago
  Agreed, conceptually.
  BUT. For 99% of tasks I'm totally certain there's people out there that are orders of magnitude better at them than me.
  If the AI can regurgitate their thinking, my output is better.
  Humans may need to think to advance the state of the art.
  Humans may not need to think to just... do stuff.
  - latexr2 months ago
    > For 99% of tasks I'm totally certain there's people out there that are orders of magnitude better at them than me.
    And LLMs slurped some of those together with the output of thousands of people who’d do the task worse, and you have no way of forcing it to be the good one every time.
    > If the AI can regurgitate their thinking, my output is better.
    But it can’t. Not definitively and consistently, so that hypothetical is about as meaningful as “if I had a magic wand to end world hunger, I’d use it”.
    > Humans may not need to think to just... do stuff.
    If you don’t think to do regular things, you won’t be able to think to do advanced things. It’s akin to any muscle; you don’t use it, it atrophies.
    acoard2 months ago
    > And LLMs slurped some of those together with the output of thousands of people who’d do the task worse, and you have no way of forcing it to be the good one every time.
    That's solvable though, whether through changing training data or RL.
    adlpz2 months ago
    > And LLMs slurped some of those together with the output of thousands of people who’d do the task worse
    Theoretically fixable, then.
    > But it can’t. Not definitively and consistently
    Again, it can't, yet, but with better training data I don't see a fundamental impossibility here. The comparison with any magic wand is, in my opinion, disingenuous.
    > If you don’t think to do regular things, you won’t be able to think to do advanced things
    Humans already don't think for a myriad of critical jobs. Once expertise is achieved on a particular task, it becomes mostly mechanical.
    -
    Again, I agree with the original comment I was answering to in essence. I do think AI will make us dumber overall, and I sort of wish it was never invented.
    But it was. And, being realistic, I will try to extract as much positive value from it as possible instead of discounting it wholly.
  - y0eswddl2 months ago
    Only if you're less intelligent than the average. The problem with LLMs is that they will always fall to the average/mean/median of information.
    And if the average person is orders of magnitude better than you at thinking, you're right... you should let the AI do it lol
    adlpz2 months ago
    Your comment is nonsensical. Have you ever used any LLM?
    Ask the LLM to... I don't know, to explain to you the chemistry of aluminium oxides.
    Do you really think the average human will even get remotely close to the knowledge an LLM will return to such a simple question?
    Ask an LLM to amend a commit. Ask it to initialize a rails project. Have it look at a piece of C code and figure out if there are any off-by-one errors.
    Then try the same to a few random people on the street.
    If you think the knowledge stored in the LLM weights for any of these questions is that of the average person I don't even know what to say. You must live in some secluded community of savant polymaths.
    y0eswddla month ago
    "they will always fall to the average/mean/median of *information*."
    woopwoop2 months ago
    Do you think that the average person can get a gold on the IMO?
  - djaouen2 months ago
    > Humans may not need to think to just... do stuff.
    God forbid we should ever have to think lol
    gedy2 months ago
    It is concerning how some people really don't want to think about some things, and just "do".
    alchemism2 months ago
    Very Zen of you to say
  - toobulkeh2 months ago
    Imagine if everyone got the opportunity to work on SOTA. What a world that would be.
    Unfortunately that’s not where we’re headed.
    adlpz2 months ago
    We've never been there.
    With AI and robotics there may be the slim chance we get closer to that.
    But we won't. Not because AI, but because humans, of course.
- xtiansimon2 months ago
  > “…regurgitates previous human thinking.”
  I was thinking about this after watching YouTube short verticals for about 2 hours last night: ~2min clips from different TV series, movies, SNL skits, music insider clips (Robert Trujillo auditions for Metallica, 2003. LOL). My friends and I often relate in regurgitated human sound bites. Which is fine when I’m sitting with friends driving to a concert. Just wasting time.
  I’m thinking about this time suck, and my continual return/revisiting to my favorite hard topics in philosophy over and over. It’s certainly what we humans do. If I think deeply and critically about something, it’s from the perspective of a foundation I made for myself from reading and writing, or it was initialized by a professor and coursework.
  Isn’t it all regurgitated thinking all the way down?
  - metadope2 months ago
    > a foundation I made for myself
    Creative thinking requires an intent to be creative. Yes, it may be a delusion to imagine oneself as creative, one's thoughts to be original, but you have to begin with that idea if you're going to have any chance of actually advancing human knowledge. And the stronger wider higher you build your foundation-- your knowledge and familiarity with the works of humans before your time-- the better your chance of successful creativity, true originality, immortality.
    Einstein thinks nothing of import without first consuming Newton and Galileo. While standing on their shoulders, he could begin to imagine another perspective, a creative reimagining of our physical universe. I'm fairly sure that for him, like for so many others, it began as a playful, creative thought stream, a What If juxtaposition between what was known and the mystery of unexplored ideas.
    Your intent to create will make you creative. Entertain yourself and your thoughts, and share when you dare, and maybe we'll know if you're regurgitating or creating. But remember that you're the first judge and gatekeeper, and the first question is always, are you creative?
  - arkensaw2 months ago
    > Isn’t it all regurgitated thinking all the way down?
    there it is
- palmotea2 months ago
  > If AI edges humans out of the business of thinking, then we're all in deep shit
  Also because we live under capitalism, and you need something people need you to do to be allowed to live.
  For a century+, "thinking" was the task that was supposed to be left to humans, as physical labor was automated. If "AI edges humans out of the business of thinking", what's left for humans, especially those who still need to work for a living because they don't have massive piles of money?
- spicyusername2 months ago
  > If AI edges humans out of the business of thinking
  This will never happen because the business of thinking is enjoyable and the humans whose thinking matters most will continue to be intrinsically motivated to do it.
  - palmotea2 months ago
    > This will never happen because the business of thinking is enjoyable and the humans whose thinking matters most will continue to be intrinsically motivated to do it.
    What world do you live in, where you get paid doing the things that are enjoyable to you, because they're enjoyable?
  - immibis2 months ago
    Humans draw, but humans have been edged out of the business of drawing long ago.
- jagoff2 months ago
  > With no humans thinking, no advances in code will be possible.
  What? Coding is like the one thing that RL can do without any further human input because there is a testable, provable ground truth: run the code.
zkmon2 months ago
I became a big fan of David Bau about 10 years back, when I came across his conformal mapping page. My understanding of complex numbers changed forever.
There is an amusing parallel with his views on vibe coding. Back in the '90s and 2000s I noticed a pattern with the code developed by the huge influx of inexperienced programmers jumping on the dotcom bandwagon. The code could only be maintained by the same people who wrote it. There was no documentation, no intuition, no best practices, and others wouldn't know where to fix things if there was an issue. Probably the code aligned with the programmer's cultural habits and values (what's ok, what's not ok), which others might lack. Ironically, this kind of provided job security for them, as it was difficult for others to deal with that code.
I guess the LLMs are also into this "job-security" trick, by ensuring only LLMs can manage the LLM-generated code.
keyle2 months ago
I find it's ok to vibe code something digestible like a ZSH function to do X or Y. An image converter, or something along those lines.
Anything that involves multiple days of work, or that you plan on working on it further, should absolutely not be vibe coded.
A) you'll have learnt pretty much nothing, or will retain nothing. Writing stuff by hand is a great way to remember. A painful experience worth having is one you've learnt from.
B) you'll find yourself distanced from the project, and the lack of personal involvement, of 'being in the trenches', means you'll stop progressing on the software and move back to something that makes you feel something.
Humans are by nature social creatures, but alone they want to feel worthwhile too. Vibe coding takes away from this positive reinforcement loop that is necessary for sticking with long-running projects to completion.
Emotions drive needs, which drive change and results. By vibe coding a significant piece of work, you'll blow away your emotions towards it and that'll be the end of it.
For 'projects' and things running where you want to be involved, you should be in charge, and only use LLMs for deterministic auto-completion, or research, outside of the IDE. Just like managing state in complex software, you need to manage LLMs' input to be 'boxed in' and not let it contaminate your work.
My 5c. Understanding the human's response to interactions with the machines is important in understanding our relationship with LLMs.
- newspaper12 months ago
  I get a huge emotional reward by conjuring up something that I dreamed of but wouldn't have had time to build otherwise. The best description I can give is back in the day when you would beat a video game to see the ending.
- deegles2 months ago
  not to be antagonistic but are we paid to learn stuff or to build stuff? I think it's the latter. if we have to learn something it's only so that we can build something in the end.
  - zeta01342 months ago
    I am absolutely paid by the hour to learn stuff. The things I'm learning are mostly messy business domain bits: how does our internal API work, who wrote it, what were the constraints, which customer requested this feature, how much SLA will we lose if we break it to hotfix this CVE...
    Yes the end result is at some small moment in time a thing that was built. But the value prop of the company isn't the software, it's the ability to solve business problems. The software is a means to that end. Understanding the problems is almost the entire job.
    davidbaua month ago
    Agreed. The question, for me:
    Is it possible to vibe code (the second way, without looking at 90% of the code) and still learn the important things?
    I think the keys to the castle will come from figuring out how to do this.
    orjfi2hsbfith2 months ago
    > But the value prop of the company isn't the software, it's the ability to solve business problems.
    Clearly it's critical to the job, but to take your point to its limits: imagine the business has a problem to solve and you say "I have learned how to solve it but I won't solve it nor help anyone with it." Your employer would not celebrate this, because they don't pay you for the private inner workings of your own mind, they pay you for the effects your mind has on their product. Learning is a means to an end here, not the end itself.
    zeta01342 months ago
    Helpfully, "I won't solve it nor help anyone with it" isn't actually normal either. That's what documentation, mentorship, peer review and coaching are for. Someone has to actually write all that stuff. If I solved it initially, that someone is me. Now it's got my name on it (right there in the docs, as the author) and anyone else can tap on my shoulder. I'm happy to clarify (and improve that documentation) if something is unclear.
    Here, of course, is finally where AI can plausibly enter the picture. It's pretty good at search! So if someone has learned, and understood, and written it down, that documentation can be consumed, surfaced, and learned by a new hire. But if the new hire doesn't actually learn from that, then they can't improve it with their own understanding. That's the danger.
    ModernMech2 months ago
    > "I have learned how to solve it but I won't solve it nor help anyone with it."
    Intrinsic in learning is teaching. You haven't learned something until you've successfully taught it to someone else.
  - keyle2 months ago
    That's just Stockholm syndrome. Your mind went straight to the job being soulless and coping with it.
Dr_Birdbrain2 months ago
I’m unclear what has been gained here.
- Is the work easier to do? I feel like the work is harder.
- Is the work faster? It sounds like it’s not faster.
- Is the resulting code more reliable? This seems plausible given the extensive testing, but it’s unclear if that testing is actually making the code more reliable than human-written code, or simply ruling out bugs an LLM makes but a human would never make.
I feel like this does not look like a viable path forward. I’m not saying LLMs can’t be used for coding, but I suspect that either they will get better, to the point that this extensive harness is unnecessary, or they will not be commonly used in this way.
- peacebeard2 months ago
  I have a feeling that the real work of writing a complex application is in fully understanding the domain logic in all its gory details and creating a complete description of your domain logic in code. This process that OP is going through seems to be "what if I materialize the domain logic in tests instead of in code." Well, at first blush, it seems like maybe this is better because writing tests is "easier" than writing code. However, I imagine the biggest problem is that sometimes it takes the unyielding concreteness of code to expose the faults in your description of the domain problem. You'd end up interacting with an intermediary, using the tests as a sort of interpreter as you indirectly collaborate with the agent on defining your application. The cost of this indirection may be the price to pay for specifying your application in a simpler, abstracted form. All this being said, I would expect the answers to "is it easier? is it faster?" would be: well, it depends. If it can be better, it's certainly not always better.
  - joshribakoff2 months ago
    It's not mutually exclusive. We write tests precisely because expressing a complex application is hard without them. But to your point, we should not wave away applications that cannot be understood with extra tests. I agree.
- stephendause2 months ago
  > - Is the work faster? It sounds like it’s not faster.
  The author didn't discuss the speed of the work very much. It is certainly true that LLMs can write code faster than humans, and sometimes that works well. What would be nice is an analysis of the productivity gains from LLM-assisted coding in terms of how long it took to do an entire project, start to finish.
rrix22 months ago
I've been asking for little tutorials or implementation plans for things, and demanding that the model not write any code itself. Following the advice of Geoffrey Litt.[1] I find reviewing code written by my coworkers to be difficult when i'm being paid for it, surely i'm not gonna review thousands of lines of auto-generated code and the comprehensive tests required to trust them in my free time...!
So I've been learning kotlin & android development in the evenings and i find this style of thing to be so much more effective as a dev practice than claude code and a better learning practice than following dev.to tutorials. I've been coding for almost 20 years and find most tutorial or documentation stuff either targeted to someone who has hardly programmed at all, or just plain old API docs.
Asking the langlemangler to generate a dev plan, focusing on idiomatic implementation details and design questions rather than lines of code, and to let me fill in the algorithm implementations, it's been nice. I'll use the jetbrains AI autocomplete stuff for little things or ask it to refactor a stinky function but mostly I just follow the implementation plan so that the shape of the whole system is in my head.
Here's an example:
> i have scaffolded out a new project, an implementation of a library i've written multiple times in the last decade in multiple languages, but with a language i haven't written and with new design requirements specified in the documentation. i want you to write up an implementation plan, an in-depth tutorial for implementing the requirements in a Kotlin Multi Platform library.
> i am still learning kotlin but have been programming for 20 years. you don't need to baby me, but don't assume i know best practices and proper idioms for kotlin. make sure to include background context, best practices, idioms, and rationale for the design choices and separation of concerns.
This produced a 3kb markdown file that i've been following while I develop this project.
[1]: https://x.com/geoffreylitt/status/1991909304085987366
- 3vidence2 months ago
  This is a really great idea; going to try this out. I similarly just cannot mentally stand reviewing vibe coding PRs all day, but this sounds genuinely useful.
busfahrer2 months ago
As someone new to this topic, I'd find it interesting to see the actual chat/cli log of a successful LLM-created project like this one, especially when it comes to the meta-layers of tests and testing the tests, etc.
Can someone recommend any good resources on this? Google wasn't too helpful, or my google-fu is lacking.
ofconsequence2 months ago
> I dislike the term "vibe coding". It means nothing and it's vague.
It has a clear and specific definition. People just misuse and abuse the term.
Karpathy coined it to describe when you put a prompt into an LLM and then either run it or continue to develop on top of it without ever reviewing the output code.
I am unable to tell from TFA if the author has any knowledge or skills in programming and looked at the code or if they did in fact "vibe code".
charcircuit2 months ago
>keeping yourself as the human "real programmer" fully informed and in control.
That's not vibe coding. That is just AI assisted coding.
pessimizer2 months ago
I'm only doing the first kind right now - I'm not really letting the thing loose ever, even when I'm not great at the language it's writing in. I'm constantly making it refactor and simplify things.
But I'm optimistic about the second way. I'm starting to think that TDD is going to be the new way we specify problems, i.e., by writing constraints. LLMs are going to keep hacking at those constraints until they're all satisfied, and periodically the temperature will have to be jiggled to knock the thing out of a loop.
The big back and forth between human and machine would be in the process of writing the constraints, which they will be bad at if you're doing anything interesting, and good at if you're doing something routine.
The big question for me is "Is there a way to write complete enough tests that any LLM would generate nearly the same piece of software?" And to follow up, can the test suite be the spec? Would that be an improvement on the current situation, or just as much work? Would that mean that all capable platforms would be abstracted? Does this mean the software improves on its own when the LLM improves, or when you switch to a better LLM, without any changes to the tests?
If the future is just writing tests, is there a better way to do it than we currently do? Are tests the highest-level language? Is this all just Prolog?
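A minimal sketch of the "tests as spec" loop described above, assuming a toy problem (a slug function). Everything here is hypothetical: `CONSTRAINTS`, `CANDIDATES`, `failing`, and `first_satisfying` are illustrative names, and the hand-written candidates stand in for successive LLM attempts rather than calling any real model.

```python
from typing import Callable, Iterable, List, Optional

Impl = Callable[[str], str]
Constraint = Callable[[Impl], bool]

# The "spec": each constraint is a test a candidate implementation must pass.
# (Hypothetical example problem: turn a title into a URL slug.)
CONSTRAINTS: List[Constraint] = [
    lambda f: f("Hello") == "hello",
    lambda f: f("two words") == "two-words",
    lambda f: f("  padded  ") == "padded",
]

# Stand-ins for successive LLM attempts; no model is called here.
CANDIDATES: List[Impl] = [
    lambda s: s.lower(),
    lambda s: s.lower().replace(" ", "-"),
    lambda s: s.strip().lower().replace(" ", "-"),
]


def failing(impl: Impl, constraints: List[Constraint]) -> List[int]:
    """Return the indexes of the constraints this candidate does not satisfy."""
    return [i for i, check in enumerate(constraints) if not check(impl)]


def first_satisfying(
    candidates: Iterable[Impl], constraints: List[Constraint]
) -> Optional[Impl]:
    """Keep trying candidates until one satisfies every constraint."""
    for impl in candidates:
        if not failing(impl, constraints):
            return impl
    return None


if __name__ == "__main__":
    for n, impl in enumerate(CANDIDATES):
        print(f"attempt {n}: failing constraints {failing(impl, CONSTRAINTS)}")
    winner = first_satisfying(CANDIDATES, CONSTRAINTS)
    print("accepted:", winner is not None)
```

Note that these three constraints say nothing about punctuation or Unicode, so two candidates could both pass and still behave differently on untested input. That gap is exactly what the "complete enough tests" question is asking about.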
wrs2 months ago
Aaargh, I hate it when useful terms get diffused to meaninglessness. No, there’s one kind of vibe coding. The definition of vibe coding is letting the LLM write the code and not looking at it. That’s what the word “vibe” is there for.
- doctoboggan2 months ago
  I agree with you that there is one original definition, but I feel like we've lost this one and the current accepted definition of vibe coding is any code that is mostly or exclusively produced by an LLM.
  I think I've seen people use the term "vibe engineering" to differentiate whether the human has viewed/comprehended/approved the code, but I am not sure if that's taken off.
- platevoltage2 months ago
  I have no idea why an experienced developer who uses LLMs to make them more productive would want to degrade their workflow by calling it "vibe coding".
  - ares6232 months ago
    It’s a chance to become the next Uncle Bob in a new era of software
- christophilus2 months ago
  Yeah. I agree the distinction is important, but it’s already been lost. Maybe, we need a new phrase to describe “a product you absolutely cannot trust because I blindly followed a non-deterministic word generator.”
  Maybe, “dingus coding”?
- pessimizer2 months ago
  > Aaargh, I hate it when useful terms get diffused to meaninglessness.
  I think that when you say this, you have an obligation to explain how the term "vibe coding" is useful, and is only useful by the definition that you've become attached to.
  I think that the author is accepting that there's no such thing as the vibe coding that you've defined (except for very short and very simple scripts), and that in all other cases of "vibe coding" there will be a back and forth between you and the machine where you decide whether what it has done has satisfied your requirements. Then they arbitrarily distinguish between two levels of doing that: one where you never let the LLM out of the yard, and the other where you let the LLM run around the neighborhood until it gets tired and comes back.
  I think that's a useful distinction, and I think that the blog makes a good case for it being a useful distinction. I don't find your comment useful, or the strictness of definition that it demands. It's unrealistic. Nobody is asking an LLM to do something, and shipping whatever it brings back without any follow-up. If nobody is doing that, a term restricted to only that is useless.
  - wrs2 months ago
    People definitely are doing that. Anyone who is not a programmer and asks the LLM to write a program is doing exactly that. The LLM will do that itself behind the scenes nowadays (yesterday Claude wrote a Python program when I simply asked it to give me the document it wrote in Word format!).
    References: This is the original definition ("forget that the code even exists"). [0] Simon Willison wrote a much longer version of my comment. [1] He also suggested the term "vibe engineering" for the case where you are reviewing the LLM output. [2]
    [0] https://x.com/karpathy/status/1886192184808149383
    [1] https://simonwillison.net/2025/Mar/19/vibe-coding/
    [2] https://simonwillison.net/2025/Oct/7/vibe-engineering/
- brikym2 months ago
  It needs a short concise name. Vibe-cod-ing is catchy. Ell-Ell-Em-Cod-ing isn't.
- francisofascii2 months ago
  At this point I think it is no longer a binary yes/no but rather a nebulous percentage. For example, this codebase was 90% vibe coded, leaving 10% that was edited manually or reviewed.
- dbtc2 months ago
  In that case, "blind" would be more accurate.
- hackable_sand2 months ago
  I'm ngl, when I first heard "vibe coding" I immediately imagined programming from memory.
  - parpfish2 months ago
    My mind went… elsewhere. Specifically, the gutter.
    https://en.wikipedia.org/wiki/Teledildonics
    bitwize2 months ago
    Unsurprisingly, the Rust community has you covered there also:
    https://github.com/buttplugio/buttplug
    https://github.com/Gankra/cargo-mommy (has integration with the former)
    hackable_sand2 months ago
    Ooooh very interesting
- exe342 months ago
  you're still allowed to alternate between letting it do and consolidating, no?
  - acedTrex2 months ago
    no, vibe coding is explicitly NOT looking at the output.
    MisterTea2 months ago
    From my understanding, the vibe part means you go along with the vibe of the LLM, meaning you don't question the design choices the LLM makes and you just go with the output it hands you.
    Izkata2 months ago
    This is where the term came from: https://x.com/karpathy/status/1886192184808149383?lang=en
    exe342 months ago
    In that case I cannot be accused of vibe-coding!
    exe342 months ago
    okay so I'm not vibe coding, I'm just writing shittier code than before.
gaigalas2 months ago
There's something wrong with this vibe coded stuff, any kind of it.
_It limps faster than you can walk_, in simple terms.
At each model release, it limps faster, but still can't walk. That is not a good sign.
> Do we want this?
No. However, there's a deeper question: do people even recognize they don't want this?
dash22 months ago
I tried the new vibe-coded Mandelbrot viewer and it didn't seem to work well on Safari. I could only get it to zoom in once, and most of the keys didn't work. Maybe the author hasn't done enough manual testing?
- davidbaua month ago
  Right. Just saw this thread. Yesterday asked claude+codex to add a fallback to WebGL support (another 5000 LoC!). So now it works a bit better on Linux, Safari, though the WebGL impl is not as smooth as WebGPU.
cornhole2 months ago
it took me a bit to figure out my aversion to ai usage in programming, but it comes down to the fact that i like building the thing instead of telling the computer to do it in detailed ways. if i wanted to tell people what to do, I would’ve become a manager
satisfice2 months ago
This article is premised on a shallow notion of testing. Because of this, the author lacks a conceptual toolkit to discuss product risk. He speaks of the most important part of the testing process (human thinking and judgment) as if it were “the most boring job in the world” and then later contradicts that by speaking of “testing the tests” as if that were a qualitatively different process (it’s not, it’s exactly the same cognitive process as what he called boring).
The overall effect is to use the word “test” as if it were a magical concept that you plaster onto your work to give it unearned prestige.
What the article demonstrates is that vibe coding is a way to generate orders of magnitude of complexity that no one in the world can understand and no one can take real responsibility for, even in principle.
I call it slop-coding, and I am happy to slop-code throwaway tools. I instruct Claude never to “test” anything I ask it to create, because I need to test it myself in order to apply it responsibly and feel close to it. If I want automated output checking (a waste of time with most tools I create), I can slop-code a framework for that, a la carte.
This way it burns fewer tokens of silly shallow testing.
- davidbaua month ago
  Fair. Let me be more precise.
  The distinction is between two ways of deploying human thinking. In the first, you are the test oracle: think about every test, repeat every five minutes, or every 30. In the second, you design the evaluation infrastructure: what to measure, what's untested, what hypotheses to prioritize. Both require judgment. But the first scatters your attention; the second concentrates it. I disagree that an LLM cannot be used to write tests, but as with any battery of tests, you cannot trust it blindly. You need to think about how to test the tests.
  As for product risk: I do not know what is hiding in 13,000 lines I haven't read; there are certainly bugs to be found. But that is just as true when you manage a big team. The solution has never been to read every line your collaborators write. You need to invest in (technical and human) systems that give you confidence without requiring you to personally verify everything. The question is how to build systems that are good enough.
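One common way to "test the tests" is mutation testing: break the code on purpose and check that the test suite notices. The sketch below is a hand-rolled illustration of that idea, not a description of what was actually done for the project discussed here. All names (`clamp`, `MUTANTS`, `weak_suite`, `stronger_suite`, `surviving_mutants`) are hypothetical.

```python
from typing import Callable, Dict, List

ClampFn = Callable[[int, int, int], int]


def clamp(x: int, lo: int, hi: int) -> int:
    """Reference implementation: restrict x to the range [lo, hi]."""
    return max(lo, min(x, hi))


# Deliberately broken variants ("mutants") of clamp. A good test suite
# should fail on every one of them.
MUTANTS: Dict[str, ClampFn] = {
    "ignores_lower_bound": lambda x, lo, hi: min(x, hi),
    "ignores_upper_bound": lambda x, lo, hi: max(lo, x),
    "swapped_bounds": lambda x, lo, hi: max(hi, min(x, lo)),
}


def weak_suite(f: ClampFn) -> bool:
    """Only checks a value that is already in range."""
    return f(5, 0, 10) == 5


def stronger_suite(f: ClampFn) -> bool:
    """Also checks values below and above the range."""
    return f(5, 0, 10) == 5 and f(-3, 0, 10) == 0 and f(42, 0, 10) == 10


def surviving_mutants(
    suite: Callable[[ClampFn], bool], mutants: Dict[str, ClampFn]
) -> List[str]:
    """Names of mutants the suite fails to detect (it still passes on them)."""
    return sorted(name for name, mutant in mutants.items() if suite(mutant))


if __name__ == "__main__":
    print("weak suite misses:", surviving_mutants(weak_suite, MUTANTS))
    print("stronger suite misses:", surviving_mutants(stronger_suite, MUTANTS))
```

Both suites pass on the real `clamp`, so a green run alone cannot tell them apart. The list of surviving mutants can: it shows which behaviors a suite leaves untested, without anyone having to read every line.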
jackfranklyn2 months ago
The "going in circles" problem several people mention seems like the key thing to watch for. In my experience it tends to happen when the codebase outgrows what the model can hold in context effectively.
What's worked better for me: treating it like onboarding a contractor. Very specific, bounded tasks with clear acceptance criteria. The moment you're spending more time explaining context than it would take to just write the code yourself, that's the signal to switch back.
- darkstar_162 months ago
  I do the same. Directed tasks with smaller context and start a new "chat" when it's done what I want.
predkambrij2 months ago
These approaches have changed, and will change more, as LLMs get better. I got some unbelievably good results back in March, then I gave LLMs problems that were too hard and got a bunch of frustration, then I learned to prompt better (to give LLMs methods to test their own work). It's an art to strike a good balance in how much time you spend writing prompts that will work. A prompt could be "fix all the issues on GitHub", but maybe it's going to fail :)
agumonkey2 months ago
I asked ChatGPT 4o an absurd question when it came out, mixing terminology from Haskell and Lisp books (say, design an isomorphic contravariant base class so that every class fills <constraint>). The result was somehow consistent, and it suddenly opened my brain to what kind of stupid things I could explore.
Felt like I became a phd wannabe in 5 minutes
alyxya2 months ago
I don’t think the two kinds of vibe coding are entirely separate. There’s a spectrum of how much context you care to understand yourself, and it’s feasible to ask a lot of questions to gain more understanding or let loose and give more discretion to the LLM.
TYPE_FASTER2 months ago
> It is when you use a coding agent to build towers of complexity that go beyond what you have time to understand in any detail.
I think the quality of the product depends on the person (or people) responsible for it understanding the details.
anthk2 months ago
LLMs forget the context fast. Better to learn programming the hard way, with books, either physical or PDF/EPUB.
geldedus2 months ago
Neither is vibe coding. Both are AI-assisted programming, a different species. Classical mislabelling.
latexr2 months ago
> The second type of vibe coding is what I am interested in. It is when you use a coding agent to build towers of complexity that go beyond what you have time to understand in any detail. I am interested in what it means to cede cognitive control to an AI.
What an absolutely abhorrent way of thinking. “I am interested in turning off my brain to create unstable Jenga towers of complexity that I’ll have no ability to fix when they inevitably fail”.
As if software in general isn’t a big enough pile of garbage already. One day, every single one of us will be seriously bitten by bugs created by this irresponsible approach.
Being able to understand what you build is a feature, not a bug.
epgui2 months ago
Am I the only one who, rather than being impressed, is recoiling in horror?
bloppe2 months ago
Someone should start an anthology of posts claiming "I vibe-coded this toy project. Software Engineering is dead."
- ubertaco2 months ago
  I bet we could vibe-post a bunch of them, even! Blogging is dead!