> Why am I doing this? Understanding the business problem and value
> What do I need to do? Designing the solution conceptually
> How am I going to do it? Actually writing the code
> For decades, that last bucket consumed enormous amounts of our time. We’d spend hours, days or weeks writing, debugging, and refining. With Claude, that time cost has plummeted to nearly zero.
That last part is actually the easiest, and if you're spending an inordinate amount of time there, that usually means the first two were not done well or you're not familiar with the tooling (language, library, IDE, test runner,...).
There's some drudgery involved in manual code editing (renaming variables, extracting functions,...) but those are already solved in many languages with IDEs and indexers that automate them. And so many editors have programmable snippet support. I can genuinely say in all of my programming projects, I spent more time understanding the problem than writing code. I even spent more time reading library code than writing my own.
The few roadblocks I had when writing code were solved by configuring my editor.
For me the most important part of a project is working out the data structures and how they are accessed. That's where the rubber meets the road, and is something that AI struggles with. It requires a bit too high a level of abstract thinking and whole problem conceptualization for existing LLMs. Once the data structures are set the coding is easy.
I don't always find this, because there's a lot of "inside baseball" and accidental complexity in modern frameworks and languages. AI assist has been very helpful for me.
I'm fairly polyglot and do maintenance on a lot of codebases. I'm comfortable with several languages and have been programming for 20 years but drop me in say, a Java Spring codebase and I can get the job done but I'm slow. Similarly, I'm fast with TypeScript/CDK or Terraform but slow with cfndsl because I skipped learning Ruby because I already knew Python. I know Javascript and the DOM and the principles of React but mostly I'm backend. So it hurts to dive into a React project X versions behind current and try to freshen it up because in practice you need reasonably deep knowledge of not just version X of these projects but also an understanding of how they have evolved over time.
So I'm often in a situation where I know exactly what I want to do, but I don't know the idiomatic way to do it in a particular language or framework. I find for Java in particular there is enormous surface area and lots of baggage that has accumulated over the years which experienced Java devs know but I don't, e.g. all the gotchas when you upgrade from Spring 2.x to 3.x, or what versions of ByteBuddy work with blah maven plugin, etc.
I used to often experience something like a 2x or 3x hit vs a specialised dev but with AI I am delivering close to parity for routine work. For complex stuff I would still try to pair with an expert.
For me this becomes more and more relevant as I go into languages and frameworks I'm not familiar with.
Having said that you do need to be vigilant. LLMs seem to love generating code that contains injection vulnerabilities. It makes you wonder about the quality of the code it's been trained on...
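To make the injection point concrete, here is a minimal sketch of the pattern to watch for in generated code, using Python's standard `sqlite3` module. The `users` table and the function names are hypothetical, made up purely for illustration; the only claim is the general one that string-built SQL is injectable and parameterized queries are not.

```python
import sqlite3


def make_demo_db():
    # Hypothetical in-memory table, only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: the user-supplied value is spliced into the SQL text itself.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized: the value is passed separately and never parsed as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    payload = "nobody' OR '1'='1"
    print(len(find_user_unsafe(conn, payload)))  # the OR clause matches every row
    print(len(find_user_safe(conn, payload)))    # the payload is just an odd name
```

The unsafe version is the shape that tends to slip through review, because it works fine on every ordinary input.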
My use of esoteric C++ has exploded. Good thing I will have even better models to help me read my code next week.
The much lowered bar to expanding one’s toolkit is certainly noticeable, across all forms of tool expansion.
>In October 2015, Byte Buddy was distinguished with a Duke's Choice award by Oracle. The award appreciates Byte Buddy for its "tremendous amount of innovation in Java Technology". We feel very honored for having received this award and want to thank all users and everybody else who helped making Byte Buddy the success it has become. We really appreciate it!
Don't misread me. It's solid software. And an instance of a well-structured object-oriented code base.
But it's impossible to do anything without having a deep and wide understanding of the class hierarchy (which is just as deep and wide). Out of 1475 issues on the project's Github page, 1058 are labelled as questions. You can't just start with a few simple bricks and gradually learn the framework. The learning curve is super steep from the get go, all of the complexity is thrown into your face as soon as you enter the room.
This is the kind of space where LLMs would shine
Or they're the kind of people who rushed to step 3 too fast, substantially skipping steps 1 and/or 2 (more often step 2). I've worked with a lot of people like that.
I mean "I don't know what I'm doing, but gotta start now." If that's "move fast and break things," it's even dumber than I thought.
I am using a modified form of TDD's red/green refactor, specifically with an LLM interface independent of my IDE.
While I err on the side of good code over prompt engineering, I use the need to submit it to refine both the ADT and the domain tests. After creating a draft of those, I submit them to the local LLM and continue on with my own code.
If I finish first I will quickly review the output to see if it produced simpler code or if my domain tests ADT are problematic. For me this avoids rat holes and head of line blocking.
If the LLM finishes first, I approach the output as a code base needing a full refactor, keeping myself engaged with the code.
While rarely is the produced code 'production ready' it often struggles when I haven't done my job.
You get some of the benefits of pair programming without the risk of demoralizing some poor Jr.
But yes, tradeoff analysis and choosing the least worst option is the part that LLM/LRMs will never be able to do IMHO.
Courses for horses and nuance, not "best practices" as anything more than reasonable defaults that adjust for real world needs.
the balance only shifts with a language/framework I'm not familiar with.
And if you don't speak the language please spare us from your LLM generated vibe coding nonsense
I don't quite agree.
This may seem like splitting hairs, but I think the only way to learn a programming language is to write it
I don't think any amount of reading and fixing LLM code is sufficient to learn how to code yourself
Writing code from scratch is a different skill
Isn't that exactly what "vibe coding" is supposed to be?
(BRB, injecting code vulnerabilities into my state actor LLM.)
For me, I give Gemini the full context of my repo, tell it the sweeping changes I want to make, and let it do the zero to one planning step. Then I modify (mostly prune) the output and let Cursor get to work.
If the full context of your repo (which I assume means more or less the entire git history of it, since that is what you usually need for sweeping changes) fits into Gemini's context window, you're working on a very small repo, and so your problems are easy to solve, and LLMs are ok at solving small easy problems. Wait till you get to more than some few thousand lines of code, and more than two years of Git history, and then see if this strategy still works well for you.
Another way to phrase this is:
I agree for method-level changes, but the more you’re
willing to cede *understanding* for larger changes, even in
a familiar language, the more an LLM accelerates you *to an
opaque change set*.
Without understanding, the probability of a code generation tool introducing significant defects approaches 1.
The result ain't going to be what you get if you've got a focused group of 10x geniuses working on everything, but I think a lot of the aspects of "enterprise development" that people complain about are simply the result of making the best of a bad situation.
I like Java, because I've worked with people who will fuck up repeatedly without static type checking.
Meanwhile no two React projects are the same because they typically have several dependencies, each solving a small part of the problem at hand.
That's a management problem. Meaning you assess that risk and try to alleviate it. A good solution like you say is languages with good type checking support. Another is code familiarity and reuse through frameworks and libraries. A third may be enforcing writing tests to speed up code review (and checklist rules like that).
It's going to be boring, but boring is good at that scale.
It's a useful tool that can accelerate certain tasks, but it has a lot of sharp edges.
I do quite a bit of this and even here LLMs seem extremely hit and miss, leaning towards the miss side more often than not
LLMs _can_ do consistency, they're pretty good at continuing a pattern...if they can see it. Which can be hard if it's scattered around the codebase.
This describes any codebase in any programming language
This is why "programming patterns" exist as a concept
The fact that LLMs are bad at this is a pretty big mark against them
They won't even consistently provide the same answer to the same input. Occasional consistency is inconsistency.
Yeah, now get the LLM to write C++ for a public facing service, what could possibly go wrong?
By the end of the day (10-ish hours) all I got to show was about 3 screens with few buttons each… Something a normal React developer probably would’ve spat out in about an hour. On top of that, I can’t remember shit about the application itself - and I can practically recite most of the codebases that I’ve spent time on.
And here I read about people casually generating and erasing 20k lines of code. I dunno, I guess I am either holding it wrong, or most of the time developing software isn’t spent vomiting code.
I've also been writing code for a long time, did the 6502 assembly thing way back when, and lots since then. For this current project I wanted to build a web app with a frontend in Angular and a backend in Java 21 relying on javalin.io for the services layer. It had a few other integrations as well - into a remote service requiring OAuth and also into subtlecrypto. After less than 10 hours I had a fully functioning MVP that was far superior to anything I could have created without an assistant. It gave me build files, even a test skeleton. Restyling the UI or reflowing the UX to include confirmations, additional steps, modals, ... was really easy. I just had to type it, and those changes would get made. It felt like I was "director of development" for a day.
I used Aider, plugged into Gemini 2.5.
Was it a productivity boost for me - yeah, cause I know mostly shit about React. But as an end result it just felt very underwhelming. Discussing it today with my brother (who lives and breathes FE) it apparently was.
I guess I was just expecting... I dunno... more - people are claiming nX productivity boosts, and considering how the UI is mostly boilerplate...
I think I was expecting that it will turn me into a FE developer and it will feel as natural and smooth as usual when I am in my element.
It didn’t. And the results weren’t what you would get from a real FE dev. And it felt unsatisfactory, stressful and ultimately hollow.
I guess _for me_ it would be fine for a throw away MVP - something that I don’t want to put my heart into.
If a tool does not consistently produce results, you HAVE to take that at face value. You can’t just remove the numbers bringing down the average and say you have reached 95% success. When you see polar opposite experiences from so many people, the only reasonable takeaway is “so it’s unpredictable, very hard to use, or both.”
of course not everything is how i'd truly like it, but it's 80% there.
before LLMs I wouldn't have bothered with starting. I knew exactly what I want to do, but it'd take me a couple days to proof the idea in Python and then translating it to web ui in TS? forget it
So anything that can let you iterate the loop faster is good. The analogy is kind of like if you can make your compile and tests faster, it's way easier to code. Because you don't just code and test at the very end, you do it as part of a thinking loop.
Writing code to find specs is brute-forcing the solution. Which is only useful when there's no answer or data (kinda rare in most domains). Taking some time to plan and do research can resolve a lot of inconsistency in your starting design. If you have written the code before, then you'll have to refactor even if the program is correct, because it will be a pain to maintain.
In painting, even sketching is a lot of work. Which is why artists will collect references instead, mentally select the part they will extract. Once you start sketching, the end goal is always a final painting, even if you stop and redo midway. Actual prototyping is called a study and it's a separate activity.
I think the major objection is that you only want to automate real tedium, not valuable deliberation. Letting an llm drive too much of your development loop guarantees you don't discover the things you need to unless the model does by accident, and in that case it has still trained you to be a tiny bit lazier and stolen an insight you would have otherwise had yourself, so are you really better off?
I'm mostly talking about Cursor Tab - the souped up autocomplete. I think it's the perfect interface, it monitors what I type and guesses my intention (multiline autocomplete, and guessing which line I'm going to next).
It lets me easily parse if the LLM is heading in the right direction, in which case pressing tab speeds up the tedium. If it's wrong, I just keep typing till it understands what I'm trying to do. It works really really well for me.
I went back to using a non-LLM editor for a bit and I was shocked at how much I had become dependent on it. It was like having an editor that didn't understand types and didn't autocomplete function names. I guess if you're a purist and never used any IDE functionality, then this also wouldn't be for you. But for me, it's so much better of an experience.
That's only true if it doesn't negatively impact the value produced each iteration by enough to offset the speed improvement. All other things being equal, going through the loop faster is better, but automation doesn't always keep all other things equal.
I'd recommend you give `aider` specifically a try. It's slowly taken over more and more of the "what" and "how" buckets outlined in the article, especially for large-surface-area code bases.
It turns out, for me at least, there is a big mental activation hurdle between "what" and "how". I need a lot of focus time to pivot between "what" and "how" efficiently, especially for work that spans large parts of the codebase, and work that I'm not super excited about doing. Using `aider` has lowered this activation threshold dramatically. It's made "writing code" about as simple as talking about a technical solution with an intelligent colleague.
I usually follow the format (1) describe the context of the app / problem you're trying to solve (2) describe what you know the solution will look like (3) ask it for clarifying questions / if it needs any examples or context to know the problem space better, and if the solution makes sense / do they see any issues with it? (4) ask it to outline the solution in greater detail, do not write code (5) add any clarifications, now do the thing.
i.e. it's kind of similar to interacting with a super fast, eager, indefatigable junior engineer. Sometimes it misses things or misunderstands, but not nearly enough to make the juice not worth the squeeze. These days, I'd say I spend more time reading/editing claude-generated code, writing commit messages, and managing deployments than I do writing code. It's a higher level of abstraction and I get way more leverage out of the deal. The code I'm writing is, on balance, better than the code I wrote before. It took maybe a few months to get here, but I'm happier for giving it a shot.
So when it is convenient, the waterfall model is suddenly modern again?
The waterfall model didn't work, because many inefficiencies and wrong assumptions are found when writing code. This is also the advantage of Lisp and other languages that are malleable and not like a block of concrete.
LLMs are like a block of concrete in that they spit out the same (plagiarized) solutions over and over again. They remove you from the code, they impede flow state due to the constant interactions and outsource your thinking to some GPUs in California.
This waterfall rationalization is just one of the latest absurdities in the LLM blogging industrial complex.
"Modern" software development is based on the idea that waterfall doesn't work, but that you can fix it by only allowing projects that take either one week (dot com boom) or two weeks (web 2.0 boom).
Personally, I avoid all this stuff as much as possible, since I've never seen any of it work.
I have had good luck with LLMs, for what it's worth. I've found for step 4, writing an informal spec at the top of the source file works well:
// $ curl https://api.com/foo/bar?baz
// { json: "response", goes: "here" }
along with instructions like "implement the interface defined in some other file", and "look at some other existing file for guidance" works well, as does "implement some trivial data structure from scratch". Then I have to read what it wrote and fix the inevitable 2-3 bugs + compilation errors. It usually gets the boilerplate right, but misunderstands fundamental things. This maybe saves me 50% of coding time, since my first draft usually has bugs + doesn't compile on the first attempt either.
This really shines in step 5 (not listed above): Writing tests. I generally end up with more thorough tests than I'd write from scratch in maybe 20% of the time.
Anyway, LLMs let me get about 10-20% more done per week. Most time is still spent designing stuff and evaluating solutions. The above workflow doesn't work for non-trivial modules. It just saves time on boilerplate. It's plagiarism the same way IDE auto-complete and copy / paste (from the same codebase) are.
The new constraint is that we can build a prototype much quicker for user testing. If it takes a week to build a trash version to throw away by vibe coding, you should build the trash version and get it in front of users to try out. Then, you can throw it away or do it again.
If you can hammer out a prototype in 2 days (or two or three even) then that's pretty agile.
If I can hammer out that prototype apart from the larger system, even better, because the cost of that prototype is cheap. And so I can totally choose to build it in larger chunks.
I can think of an 18 month project - with a team - back in the day that I could bang out today in a prototype in a month. And I could have gotten it into the customer's hands screen by screen rather than a slow increment every two weeks. (This was an agile project.) I could have built a mock server with mock data. Some of the project would have taken just as long, but that would have been a hell of a lot more agile.
And this project needed 20% novel solution and 80% best practices. Like most software.
If it were, the median e.g. business analyst should be getting paid significantly more than the median software engineer. That's not what the data shows, however.
>I can genuinely say in all of my programming projects, I spent more time understanding the problem than writing code.
This is almost trivially true for anyone who understands the problem via writing code, though.
Business analysts mostly just scratch the top half of the first part.
But I do encourage them to go vibe coding! It's providing a lot of entertainment. On the off chance they become one of us, they would be most welcome.
My real point is claiming #3 is the easiest is just silly. It's obviously much easier to come up with good business ideas in the abstract than to bring them into being. The mixture works because software as a business is an O-ring problem. These 3 tasks are not cleanly separable, they're all part of a feedback loop together.
That's not respecting #2, which still falls squarely within the engineering profession.
The design of the solution is necessarily technical; otherwise it's just throwing around a bunch of concepts and big words that sound cool and lead nowhere.
The outcome of this solution would then go back and influence #1, which is the behavior that the customer sees. If Steve Jobs couldn't fit all of his components into his iPhone then it wouldn't have existed, and he might have had to settle for something less, like an iPod.
Obviously this would all exist in a ring and that's why everyone is continuously employed and mostly not fired once the product gets released.
What if you replace "business analyst" with "software architect"?
Writing the code is the trivial part.
The last part is wrong, unless it's purely greenfield.
Instead, you first need to read and then modify existing code and ensure it can later be still understood and easily/safely changed by whoever works on it. That is the hard part that is totally missed here.
I've worked with low-code platforms and also built my own low-code platform which allows me to assemble CRUD apps quickly and avoid/bypass a huge range of possible bugs, but even then, it's still not quite like laying bricks... What happens is that the bottleneck becomes UX decision-fatigue. Translating complex business requirements into a working product is rife with conflicts at the level of requirements engineering and UX. You can attain a certain level of software complexity much faster but the requirements also evolve faster to the point where you're constantly thinking about how to make different parts of the UX work well together in a way that's not confusing.
For my current job coding is 90% of my time. The rest is meetings, deployments, ticket management. Most of the time coding isn't particularly hard, but it sure consumes lots of time. I've had many days with 1000+ line diffs.
> What do I need to do? Designing the solution conceptually
> How am I going to do it? Actually writing the code
This is why when people call programmers coders it feels wrong imo
At least for me, one example of such programming is low-level database adjacent systems programming, it can take an extreme amount of fiddling to get it to work as you intended, even if you have a clear idea of what you want to implement.
Though, in the cases where the last part is hard and time consuming, LLM-based tools are not going to be of particularly big help (in fact, that is where I personally tend to disable Copilot, because it is more likely to be a distraction than useful).
Only for short code you want to throw away.
If you care about the quality of the code -- how it is organized, naming things, meaningful tests -- it's not.
LLMs have gotten so much better at code that I'm surprised I still don't vibe code, but it's just laughable how bad they still are -- test cases that just add fluff, how "autistic" they seem, and how much context they miss that a human would not miss.
I recently changed some code where a null was returned previously and what I really needed was sort of a java Optional but with a Reason for why the value returned was not present -- I called it AdviseDecision -- it had 2 constructors -- either the value returned or the reason a value could not be computed.
I then asked Gemini 2.5 to refactor a piece of code that dealt with the null previously.
Gemini 2.5 could not reach the conclusion that the computation result and the failure reason could not both be null at the same time.
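For readers who want to picture the type being described, here is a rough Python sketch of a value-or-reason wrapper. Only the name `AdviseDecision` and the "two constructors" idea come from the comment above; the field names, the `of`/`absent` factories, and the `parse_port` caller are assumptions for illustration, not the original Java code. The invariant that tripped up the model (exactly one side is set) is enforced in one place.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class AdviseDecision(Generic[T]):
    """Either a computed value or the reason it could not be computed."""

    value: Optional[T] = None
    reason: Optional[str] = None

    def __post_init__(self):
        # Invariant: exactly one of value / reason is set, never both or neither.
        if (self.value is None) == (self.reason is None):
            raise ValueError("exactly one of value or reason must be set")

    @classmethod
    def of(cls, value):
        # Hypothetical factory for the "value was computed" case.
        return cls(value=value)

    @classmethod
    def absent(cls, reason):
        # Hypothetical factory for the "could not compute" case.
        return cls(reason=reason)

    @property
    def is_present(self):
        return self.value is not None


def parse_port(text: str) -> AdviseDecision[int]:
    # Hypothetical caller that used to return None on failure.
    if not text.isdigit():
        return AdviseDecision.absent(f"not a number: {text!r}")
    return AdviseDecision.of(int(text))
```

With the invariant checked at construction time, code that consumes the result only has to branch on `is_present`; there is no third "both missing" state left to handle.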
Anyway, the examples of when LLMs fail are becoming the exception and it shows how good they have gotten, but I would never say the time cost has plummeted to nearly zero.
For me the biggest advantage is that even though I only get about a 30% programming speed boost, I get a 300% productivity boost because I procrastinate much less, because for me it's easier to fix/modify the LLM's tasteless code than getting over the initial bump of starting from scratch.
It probably is a contradiction then that I say LLMs are so bad and so good at the same time.
Not sure if anyone knows: how good would the generated code for a BigQuery SQL to Scala parser be? Can I use it without having to dig into the generated code?
I made the mistake of letting it go off on its own in the first few iterations before I realised just how crazy it could get if left unattended.
Once I stopped doing that and held the yoke more frequently, I got much better results.
It was generating far _less_ code but the code it generated was far _more_ useful.
I think I threw away about 40% of the code it generated over the course of the exercise. Which is where the realisation came from that it is sometimes easier to just throw stuff away and start again with a better question than it is to try and iterate garbage into something that works.
This. Yet more often than I would like the challenge is not understanding the business side of the problem but dealing with the existing code...
Overall I find that each one of the three time buckets is equally important and I strive to iterate quickly between them. Pretty often the existing code somehow challenges the assumptions I made in the two previous steps, both business-wise and design-wise.
I'm not sure if you're familiar with modern JS frameworks.
If someone thinks the last part is the most difficult, then they probably aren't an actual programmer.
It's possible that people's experiences are different to yours because you work on a specific type of software and other people work on other specific types of software.
At many big tech companies I've worked at, an abstract design proposal precedes any actual coding for many tasks. These design proposals are not about how you lay out code or name variables, but a high level description of the problem and the general approach to solve the problem.
Expressing that abstract thinking requires writing code but that's the "how" - you can write that same code many ways.
Which points at a pretty substantial limitation of LLM coding...
Correctness is not embedded in software. It's embedded in the real world.
I don’t think there’s anything where the first step is writing code. It’s like saying the first step of solving a math problem is writing down equations.
The whole argument behind TDD is that it's easier to write code that verifies something than to actually implement it, because the test only needs the answer, not the algorithm that solves the question.
So for any code you will be writing, find the answers first (expected behavior). Then add tests. Then you write the code for the algorithm to come up with the answer.
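As a small illustration of that order of work (answers, then tests, then algorithm), here is a sketch in Python. The `median` function and the `EXPECTED` table are hypothetical examples chosen for brevity, not anything from a real project.

```python
# Step 1: the answers, worked out by hand before any implementation exists.
EXPECTED = [
    ([3, 1, 2], 2),
    ([4, 1, 3, 2], 2.5),
    ([7], 7),
]


# Step 2: the test only knows the answers, not how to compute them.
def check_median(median_fn):
    for values, expected in EXPECTED:
        assert median_fn(values) == expected, (values, expected)


# Step 3: the algorithm, written last, to make the test pass.
def median(values):
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


if __name__ == "__main__":
    check_median(median)
    print("all expected answers reproduced")
```

Note that steps 1 and 2 never mention sorting; the test would accept any implementation that reproduces the known answers.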
Static typing is just another form of these. You tell the checker: This is the shape of this data, and it warns you of any code that does not respect that.
Regular static typing (assuming you don't go to the level of dependent types or something) has the advantage that it is extremely quick to write, compared to an equivalent test. So even if you get the types wrong 90% of the time on the first pass, you've still wasted only a trivial amount of time (consider how long it takes to write "int foo(int x)" versus tests that fail if "foo(x)" accepts anything except an int, or returns anything except an int for any int input - and how much work you'd throw away if you later realize you have to replace int with string).
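A rough way to see the size difference, sketched in Python with type hints. This assumes an external static checker such as mypy is run over the code (Python itself does not enforce annotations at runtime), and `foo`, `foo_untyped`, and the test are hypothetical names for illustration.

```python
def foo(x: int) -> int:
    # The whole contract is the signature; a static checker would flag
    # a call like foo("3") before the program ever runs.
    return x * 2


def foo_untyped(x):
    # Without a checker, the same contract has to be enforced by hand...
    if not isinstance(x, int) or isinstance(x, bool):
        raise TypeError(f"expected int, got {type(x).__name__}")
    return x * 2


def test_foo_untyped_contract():
    # ...and then tested by hand, one sample input at a time.
    for good in (0, 1, -5):
        assert isinstance(foo_untyped(good), int)
    for bad in ("3", 2.0, None, [1]):
        try:
            foo_untyped(bad)
        except TypeError:
            continue
        raise AssertionError(f"accepted non-int input: {bad!r}")


if __name__ == "__main__":
    test_foo_untyped_contract()
    print(foo(21))
```

Changing `int` to `str` later means editing one signature in the typed version, versus rewriting both the guard and the test samples in the untyped one.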
Why wouldn't you decide what your code should do before writing it?
And the code changes were pure waste too.
That's why it's important to try and reduce the risk of building to the wrong requirements wherever possible - fail fast and often, use spikes and experiments, use lean, etc.
It's why it's important to reduce the cost of writing tests and code as much as possible too.
Writing the test itself and showing bits of it to others can actually help uncover many requirements bugs too. That's called BDD.
However, if it turned out it was the right thing and you didn't write a test you've just made it much harder to change that code in the future without breaking something. The cost of code changes went up.
Yes, I said so as well. Though it's also important that the code changes are likely to be the thing that reveals that the feature is misunderstood/badly specified. Lots of people can take a working feature and tell you if it addresses their problem. Much fewer can look at a set of unit tests and tell you the same.
> However, if it turned out it was the right thing and you didn't write a test you've just made it much harder to change that code in the future without breaking something. The cost of code changes went up.
Very much debatable. If the code needs to change because requirements themselves change in the future, the tests that are validating the old requirements are not helpful. And many kinds of refactoring also break most kinds of unit tests too.
From my experience, unit tests are most useful for testing regression cases, and for validating certain constrained and well defined parts of the code, like implementations of an algorithm. They're much less useful for testing regular business logic - integration tests are a much better solution for those.
Not OP, but I find this a very good question. I've always found that playing with the problem in code is how I refined my understanding of the problem. Kind of like how Richard Feynman describes his problem solving. Only by tinkering with the hard problem do you really learn about it.
I always found it strange when people said they would plan out the whole thing in great detail and code later. That never worked for me, and I've also rarely seen it work for those proposing it.
It may be because I studied control systems, but I've always found you need the feedback from actually working with the problem to course correct, and it's faster, too. Don't be scared to touch some code. Play with it, find out where your mental model is deficient, find better abstractions than what you originally envisioned before wrestling with the actual problems.
Sometimes you don't have a way to get the exact answers, so you do experiments to get data. But just like scientists in a lab, they should be rigorous and all assumptions noted down.
And sometimes, there are easy answers, so you can get these modules out of the way first.
And in other cases, maybe a rough solution is better than not having anything at all. So you implement something that solves a part of the problem while you're working on the tougher parts.
Writing code without answers is brute-forcing the solution. But novel problems are rare, so with a bit of research, it's quite easy to find answers.
POCs are better for customer-facing, product management driven work. This is because they can be bad at describing what they want. There's more risk of building the wrong thing.
POCs can be okay for system design or back-end work (or really anything not involving vague asks), but chances are planning and deeper thinking will help you more there because the problems you solve tend to be less subjective. Less risk of building the wrong thing.
Kinda like scaling. Instead of going for Kubernetes, use a few VPS and a managed database to get your first customers.
As is usually the truth in practice, it’s a mess, which is why I’ve seen combinations of upfront planning and code spiking work the best.
An upfront plan ensures you can at least talk about it with words, and maybe you’ll find obvious flaws or great insights when you share your plan with others. Please, for the love of god, don’t ruin it with word vomit. Don’t clutter it with long descriptions of what a load balancer is. Get to the point. Be honest about weaknesses, defend strengths.
Because enterprise corporate code is a minefield of trash, you just have to suck it up and go figure out where the mines are. I’ve heard so many complaints “but this isn’t right! It’s bad code! How am I supposed to design around BAD code!” I’ll tell you how, you find the bad parts, and deal with them like a professional. It’s annoying and slow and awful, but it needs doing, and you know it.
By not doing the planning, you run the risk of building a whole thing, only to be told “well, this is nice, but you could have just done X in half the time.” By not doing the coding, you risk blowing up your timeline on some obvious unknown that could have been found in five minutes.
Not everything has been solved for ten thousand years.
Tax rules depend on an interaction between national and local laws that can change from year to year based on the ruling government. You can't just take tax rules from ancient Rome to generate 2024 tax software for Quebec, Canada.
As an old geezer, I appreciate very much how LLMs enable me to skip the steep part of the learning curve you have to scale to get into any unfamiliar language or framework. For instance, LLMs enabled me to get up to speed on using Pandas for data analysis. Pandas is very tough to get used to unless you emerged from the primordial swamp of data science along with it.
So much of programming is just learning a new API or framework. LLMs absolutely excel at helping you understand how to apply concept X to framework Y. And this is what makes them useful.
Each new LLM release makes things substantially better, which makes me substantially more productive, unearthing software engineering talent that was long ago buried in the accumulating dust pile of language and framework changes. To new devs, I highly encourage focusing on the big picture software engineering skills. Learn how to think about problems and what a good solution looks like. And use the LLM to help you achieve that focus.
Once you're good at it in general. I recently witnessed what happens when a junior developer just uses AI for everything, and I found it worse than if a non-developer used AI: at least they wouldn't confuse the model with their half-understood ideas and wouldn't think they could "just write some glue code", break things in the process, and then confidently state they solved the problem by adding some jargon they've picked up.
It feels more like an excavator: useful in the right hands, dangerous in the wrong hands. (I'd say excavators are super useful and extremely dangerous, I think AI is not as extreme in either direction)
Most people were not. Most tech people were not, even.
Using LLMs feels a ton like working with Google back then, to me. I would therefore expect most people to be pretty bad at it.
(it didn't stop being possible to be "good at Google" because Google Search improved and made everyone good at Google, incidentally—it's because they tuned it to make being "bad at Google" somewhat better, but eliminated much of the behavior that made it possible to be "good at Google" in the process)
0: https://www.oreilly.com/library/view/designing-large-languag...
From this it has me wondering if AI could increase the adoption of provably correct code. Dependent types have a reputation for being hard to work with, but with AI help, it seems like they could be a lot more tractable. Moreover, it'd be beneficial in the other direction too: the more constraints you can build into the type system of your domain model, the harder it will be for an AI to hallucinate something that breaks it. Anything that doesn't satisfy the constraints will fail to compile.
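To make the "constraints in the domain model" idea concrete, here is a minimal sketch. Big caveat: Python has no dependent types and nothing below is checked at compile time, so this only approximates the idea with a runtime-validated type. `Percentage` and `apply_discount` are hypothetical names made up for this example. The point is just that once an invariant lives in a type's constructor, any code (hand-written or generated) that violates it fails loudly instead of slipping through.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Percentage:
    """A value guaranteed to lie in [0, 100] once constructed (hypothetical type)."""

    value: float

    def __post_init__(self) -> None:
        # Reject invalid values at construction time, so every existing
        # Percentage instance is known to satisfy the invariant.
        if not 0 <= self.value <= 100:
            raise ValueError(f"percentage out of range: {self.value}")


def apply_discount(price_cents: int, discount: Percentage) -> int:
    """Return the discounted price in cents.

    Because `discount` is a Percentage, the result cannot be negative
    for a non-negative price.
    """
    return round(price_cents * (100 - discount.value) / 100)
```

In a language with dependent or refinement types, the range check would be enforced by the compiler rather than by `__post_init__`, which is the stronger guarantee the comment is hoping for.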
I doubt it, but wishful thinking.
Ever since, this has been my favorite use case, to cut through the accidental complexity when learning a new implementation of a familiar thing. This not only speeds up my learning process and projects using the new tool, it also gives me a lot more confidence in taking on projects with unfamiliar tools. This is extremely valuable.
No, it's not perfect and I imagine there are some large warts as a result, but it was much, much better than following a bog-standard tutorial on YouTube to get something running, and I'm always able to go refactor my scripts later now that I'm past initial scaffolding and setup.
This deeply resonates with me every time I stare at pandas code seeking to understand it.
It is truly amazing what a superpower these LLM tools are for me. This particular moment in time feels like a perfect fit for my knowledge level. I am building as many MVP ideas as quickly as I can. Hopefully, one of them sticks with users.
My personal opinion is that now experience matters a lot more.
A lot of times, the subtle mistakes that an LLM makes, or the wrong direction that it takes, can only be corrected by experience. LLMs also don't tend to question their own past decisions, and will stick with them unless explicitly told otherwise.
This means LLM-based projects accumulate subtle bugs unless there is a human in the loop who can rip them out, and once a project has accumulated enough subtle bugs it generally becomes unrecoverable spaghetti.
Dangerous as well, is that LLMs won't (unless aggressively prompted to) question your own decisions either, in contrast to something like a mentor which would help you discover a better way, if there is one.
An attribute I would like to see is the ability for an LLM to express justified self-doubt. Likewise (and perhaps directly related) would be the ability to self-critique prior to providing an answer. It's possible they are already steered to do this; if so I would like to see more of that dialogue surfaced to the user.
I feel like LLMs are just the next step on the Jobs analogy of "computers are bicycles for the mind" [0]. And if these tools are powerful bicycles available to everyone, what happens competitively? It reminds me of a Substack post I read recently:
> If everyone has AI, then competitively no one has AI, because that means you are what drives the differences. What happens if you and LeBron start juicing? Do you both get as strong? Can you inject your way to Steph’s jumpshot? What’s the differentiator? This answer is inescapable in any contested domain. The unconventionally gifted will always be ascendant, and any device that’s available to everyone manifests in pronounced power laws in their favor. The strong get stronger. The fast get faster. Disproportionately so. [1]
[0] https://youtu.be/ob_GX50Za6c?t=25
[1] https://thedosagemakesitso.substack.com/p/trashbags-of-facts...
> I've been thinking about generative AI tools as "bicycles for the mind" (to borrow an old Steve Jobs line), but I think "electric bicycles for the mind" might be more appropriate.
> They can accelerate your natural abilities, you have to learn how to use them, they can give you a significant boost that some people might feel is a bit of a cheat, and they're also quite dangerous if you're not careful with them!
Technology is part of humanity. Just as a hammer extends the hand, so too does the LLM extend the mind.
but you do manage to drive everywhere - a feat that wasn't possible previously (except perhaps for the select few who trained).
That's a big dependency, reminds me of the people in Wall-E.
And the criteria shouldn't be just the cost of the dependency, but the benefits too! I'd say every dependency we have in the modern life is worth it - otherwise people wouldn't have chosen to have it. Like electricity, mechanised farming, etc.
We often choose short-term benefits over long-term negative effects.
For instance eating too much sugar, getting too little sleep, destroying our own habitat by burning fossil fuels, etc.
AI needs lots of money and resources, and more often than not the use case is useless stuff.
I would bet that there are currently lots of people who would beat LeBron in theoretical basketball, but don't have the body or the endurance to compete.
But with a mecha-suit, the advantages of any natural born talent, and any issue with endurance or strength, etc, are diminished, leaving only mental capability as a differentiator.
That's not to say that LeBron's mental capability (in regards to basketball) is low - surely it's high. But the combination of high athleticism and mental capability is a rarity right now. Removing one of these conditions (via the mecha-suit) will then increase the pool of "high" performers imho.
It's a super rare archetype of athleticism/size+mental that only the likes of LeBron, Jokic and Magic Johnson have occupied (not meant to be an exhaustive list).
I think a huge part of most sports (especially combat ones) is muscle memory. You don't have time to think between moves. So if you want to be good you'll still have to work for days and make your body learn.
And if you think muscle memory is bullshit, try to remember how driving was hard at first and nowadays you can almost sleep through your commute.
Another analogy:
"A good archer is going to be an amazing sharpshooter and therefore I only want to field archers (with guns) as soldiers", might be a horrible way to run a modern military.
This "the best at the old thing will be the best at the new thing too!" needs to die in a fire.
What if LLMs are cars for the mind not bicycles?
Both I and Usain Bolt get a Prius. Who is faster at the shopping mall? What will happen to our fitness? Who will be the next Usain Bolt and would we even care?
Except everyone doesn't have AI. Only huge corporations with billion dollar data centers do. What happens when Sam Altman decides to de-prioritize you? The little toy model you downloaded from github won't cut it.
These AI tools shift a lot of power into a few Silicon Valley companies. Keep those skills sharp, you'll need them. Best case scenario, the VC money bonfire runs out of fuel before 3 companies own the entire software industry.
Another way to think about it is SWE agents. About a year ago Devin was billed as a dev replacement, with the now common reaction that it's over for SWEs and it's no longer useful to learn software engineering.
A year later there have been large amounts of layoffs that impacted sw devs. There have also been a lot of fluff statements attributing layoffs to increased efficiency as a result of AI adoption. But is there a link? I have my doubts and think it's more related to interest rates and the business cycle.
I've also yet to see any AI solutions that negate the need for developers. Only promises from CEOs and investors. However, I have seen how powerful it can be in the hands of people that know how to leverage it.
I guess time will tell. In my experience the current trajectory is LLMs making tasks easier and more efficient for people.
And hypefluencers, investors, CEOs, and others will continue promising that just around the corner is a future in which human software developers are obsolete.
they were just playing to this market reaction
layoffs = bad
layoffs because of AI = good
The people in those bottlenecks anecdotally are seeing pay increases btw which goes to show - the inefficient get the spoils.
So I would say there are three categories of programmers:
1. Programmers that just want to prompt, using AI agents to write the code.
2. Programmers, like me, that use LLM as tools, writing code by hand, letting the LLM write some code too, inspecting it, incorporating what makes sense, using the LLM to explore the frontier of programming and math topics that are relevant to the task at hand, to write better code.
3. Programmers that refuse to use AI.
I believe that today category "2" is what has a real advantage over the other two.
If you are interested in this perspective, a longer form of this comment is contained in this video in my YouTube channel. Enable the English subtitles if you can't understand Italian.
The future is coming, but you still need fundamentals to make sure the generated code has been properly set up for growth. That means you need to know what you expect your codebase to look like before or during your prompting so you can promote the right design patterns and direct the generation towards the proper architecture.
So software design is not going away. Or it shouldn't for software that expects to grow.
I feel reassured to see that I'm not the only one who feels this way. With all the talk about in-IDE direct code editing, I was thinking that I was being somewhat of a luddite who feels like the chat form is the best balance between getting help from the AI and understanding/deciding how things are actually structured/working.
I also have both GPT and the Claude UI open and I will often flick out to one or the other (Claude seems to be a lot better at Elixir code for me than GPT) and go into “discussion” mode if I want to open the aperture on a topic.
I’m certainly never letting it (not anymore, at least) go and write swathes of raw code on its own. I learned that lesson the hard way. It generates absolute nonsense if left to its own devices.
Usually when I am in the flow of writing code, I can think, write, tab away and review without breaking it. If I need a smallish (up to 100-ish lines) piece of code that I know the shape of - I would use the chat to generate it and merge it back after review.
Letting the agent rip always has led to more pain and suffering down the line :(
I can see the usefulness of agents however for (a) some tedious refactorings where the IDE features might not reach and (b) occasionally writing a first pass of a low-value module when I am low on energy.
For the rest of stuff I feel very happy with copy-paste.
When the cost for something goes down, demand for that thing goes up. That fancy app that you never had time to build is now something that you are expected to ship. And that niche feature that wasn't really worth your time before, completely different story now that you can get that done in 30 minutes instead of 1 week.
Individual software engineers will simply be expected to be able to do a lot more than they can do currently without LLMs. And somebody that understands what they are doing will have a better chance of delivering good results than somebody that just asks "build me a thingy conforming to my vague and naive musings/expectations that I just articulated in a brief sentence". You can waste a lot of time if you don't know your tools. That too is nothing new.
In short everything changes and that will generate more work, not less.
If I'm not 100% sure something will work, then I'll still just code it. If it doesn't work, I can throw it away and update my mental model and set out on a new typing adventure.
If you're trying to LLM your way to a new social site you're going to need to know what entities make up that site and the relationships they have ahead of time. If you have no concept of an idea then of course the LLM will be "correct" because there were no requirements!
Software design is important today and will be even more important in the future. Many companies do not require design docs for changes and I think it is a misstep. Software design is a skill that needs to be maintained.
The end game is outsourcing: instead of teammates doing the actual programming from the other side of the planet, it will be done from inside the computer.
Sure, the LLMs and agents are rather limited today, just like optimizing compilers were still a far-off dream in the 1960s.
That's not to say the output is correct, there are usually bugs and unnecessary stuff if the logic generated isn't trivial, but reading it isn't the biggest hurdle.
I think you are referring to the situation where people just don't read the code generated at all.. in that case it's not really LLM's fault.
Even if this were true, which I strongly disagree with, it actually doesn't matter if the code is easier to understand
> I think you are referring to the situation where people just don't read the code generated at all.. in that case it's not really LLM's fault
It may not be the LLM's "fault", but the LLM has enabled this behavior and therefore the LLM is the root cause of the problem
Most Copilot-style setups (not just in this domain) are designed to gather data and train/gather feedback before full automation or downsizing. If they had said so outright, they may not have got the initial usage they needed from developers. Even if it is augmentation, it feels, at least to me, like the other IT roles (e.g. BAs, Solution Engineers maybe?) are safer than SWEs going forward. Maybe devs find that harder to see because they have skin in the game and because, without AI, it's not that easy of a job over time. Respect for SWE as a job in general has fallen, at least in my anecdotal conversations, mainly due to AI. After all, long-term career prospects are a major factor in career value, social status and personal goals for most people.
Their end goal is to democratize/commoditize programming with AI as low-hanging fruit, which by definition reduces its value per unit of output. The fact that there is so much discussion on this IMO shows that many people think, even if they don't want to admit it, that there is a decent chance they will succeed at this goal.
Stop repeating their bullshit. It is never about democratizing. If it was, they would start teaching everyone how to program, the same way we started to teach everyone how to read and write not that long ago.
In any case I'm not saying I think they will achieve it, or achieve it soon - I don't have that foresight. I'm just elaborating on their implied goals; they don't state them directly, but reading their announcements on their models, code tools, etc., that's IMO their implied end game. Anthropic recently announced statistics showing that most of their model usage is for coding. Thinking it is just augmentation doesn't justify the money IMO put into these companies by VCs, funds, etc. - they are looking for bigger payoffs than that, remembering that many of these AI companies aren't breaking even yet.
I was replying to the parent comment - augmentation and/or copilots don't seem to be their end game/goal. Whether they are actually successful is another story.
But let's keep cheering until it does become good enough to come for our developer jobs.
"Hey I need a quick UI for a storefront", can be done with voice. I got pretty far with just doing this, but given my experience I don't feel fully comfortable in building the mech-suit yet because I still want to do things by hand. Think about how wonky you would feel inside of a Mech, trying to acclimate your mind to the reality that your hand movements are in unity with the mech's arm movements. Going to need a leap of faith here to trust the Mech. We've already started attacking the future by mocking it as "vibe coding". Calling it a "Mech" is so much more inspiring, and probably the truth. If I say it, I should see it. Complete instant feedback, like pen to paper.
The term ‘vibe coding’ was coined by OpenAI’s co-founder.
I'm wondering if I'm "holding it wrong", or all of these anecdotes of 10x productivity are coming from folks building prototypes or simple tools for a living.
That's why these AI companies are racing to build a replacement for you and me, something that will spend 100% of its time actually building out functionality the customer is looking forward to.
I know, I know, spending 100% of our day coding is ridiculous because that all-hands conference call to get everyone onboard with which microservice is responsible for storing button colors absolutely has to happen first.
You must be a junior coder if you think that typing the code into the computer is the activity that should take up most of your time
Writing code is the last step, the shortest step, and the easiest step of building software
I used to drink the kool aid too: writing code is the last step, the shortest and the easiest one...
Over time I came to believe, this is what people in dysfunctional organizations say to justify endless political back and forth over painfully trivial matters and constant turf wars.
Anyone speaking up about it is of course getting shamed as inexperienced or incompetent. It's no surprise, people who are holding these bullshit jobs have their livelihood on the line if the bullshit gets called out.
By the way, I'm not saying there's no need to plan things out at least just a little bit or that communication does not come with a certain overhead. Not 95% though, not even anything close to that. Especially if you aren't breaking any new grounds, which the overwhelming majority of devs aren't. No, a LOB reporting app on microservices is not it. No, another AI-enabled social network on blockchain is not it either.
Coding isn't the shortest step either, go ahead have a look into a serious codebase such as Chromium then come back and tell me with a straight face developing that codebase was the shortest step.
Things were different when I was doing contract work, and every month brought on a new project that we'd have to quickly spin up. Nowadays I work in mature (legacy) codebases where introducing a new feature requires interacting with undocumented libraries last touched decades ago. I spend 1/2 of my time debugging code, 1/3 in meetings, emails and code reviews/coaching, and the rest in planning and coding.
There's definitely some degree of dysfunctionality in the orgs I worked in, but this has been consistent across 3 employers (bigger companies / FAANGs, not startups though).
A minor point I'd make is you seem to define coding as strictly typing out... well, code. My perspective is that interacting with undocumented libs definitely counts towards coding and debugging might, depending on the context.
Now, if you scroll way up to the comment that kicked off this thread you'll see it lists three kinds of activities a software dev's job is made up of and claims that the first two are supposed to take the overwhelming majority of time and effort.
Let me quote:
> Why am I doing this? Understanding the business problem and value
> What do I need to do? Designing the solution conceptually
> How am I going to do it? Actually writing the code[, debugging, and refining]
>
> That last part is actually the easiest, and if you're spending inordinate amount of time there...
Let's go along with this notion for a moment, if a dev spends 95% of their time on the first and second parts then for every 16 hours they dedicate 51 minutes to actual coding (as in legacy libs spelunking, debugging, and typing out code)
And that's what I call utter bs on.
Maybe yes
But I'm really referring to the idea that at some point you should more or less create a solution in a spec, and then code should just be an implementation of the spec
If you are still spending 95% of your time writing code after 20 years as a programmer, then either you are incredible at creating specs in a short period of time, or you are still just doing "I'll start coding and figure it out as I go"
Or worse: "I'll just hack something together without thinking about how it fits overall into the whole"
Writing the code is translating a solution into computer language. Creating the solution is the part that should take majority of your time
This sentence is a little ambiguous, so I might be reading it wrong. But if you're literally referring to the idea of a spec so detailed it's virtually coded in a natural language, then I find this idea baffling. We have a specialized tool for this job - programming languages, which I enjoy quite a bit, btw. Are these somehow beneath a true software engineer, who's supposed to program in English?
Anyway, let's do a bit of napkin math here.
- So, a real Senior Software Engineer spends 95% of their time producing specs.
- Say, coming up with this particular spec took 16 hours, then the time dedicated to implementation works out to approx. 51 minutes.
- Assuming their typing speed is 350 characters per minute (nothing to scoff at, especially considering typing is such a minor part of their job.)
- Now, their style guide sets the cutoff for a line of code at 120 chars (they aren't some 80-char cavemen, are they?)
Putting it all together, banging out code non-stop for 51 minutes, they'd end up with O(150) lines of code to show for 16 hours of planning and speccing... I say someone is coasting as if it were the last day of their life. Curious to hear your take!
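For anyone who wants to check the napkin math above, here is a small sketch that reproduces it. It assumes the same inputs as the comment (95% of time on specs, 16 hours of spec work, 350 characters per minute, 120-character lines) and additionally assumes every line is typed at full width with no pauses, which is obviously generous. The function and constant names are made up for this example.

```python
SPEC_SHARE = 0.95        # fraction of total time spent on the spec (from the comment)
SPEC_HOURS = 16          # hours spent on this particular spec
CHARS_PER_MINUTE = 350   # typing speed assumed in the comment
LINE_WIDTH = 120         # style-guide cutoff; every line assumed to be full width


def coding_minutes(spec_hours: float, spec_share: float) -> float:
    """Minutes left for coding if spec_hours makes up spec_share of the total time."""
    total_hours = spec_hours / spec_share
    return (total_hours - spec_hours) * 60


# 16 h is 95% of the total, so the remaining 5% is about 50.5 minutes.
minutes = round(coding_minutes(SPEC_HOURS, SPEC_SHARE))

# Characters typed in that time, divided by characters per line.
lines = minutes * CHARS_PER_MINUTE / LINE_WIDTH

print(minutes, lines)  # prints: 51 148.75
```

So the "O(150) lines" figure holds up under the stated assumptions: roughly 149 full-width lines for 16 hours of speccing.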
Reality: a saddle on the developer's back.
They really want a faster horse.
Can we stop saying this? It hasn't been true for more than 15 years.
We had all the same shit that's going on with LLM labs. Benchmarks with Elo scores, leading model providers cheating (Rybka), big companies jumping in (Deep Blue), even a fucking equivalent to RAG (pre-made opening books) and I guess an analogy to prompt optimization (endgame tablebases?). It's all a repeat of shit I saw in 2009.
I'm curious about the fundamental reason why LLMs and their agents struggle with executive function over time.
On Limitations of the Transformer Architecture https://arxiv.org/abs/2402.08164
Limits of Deep Learning: Sequence Modeling through the Lens of Complexity Theory https://arxiv.org/abs/2405.16674
TL;DR transformers are inherently limited with tasks requiring composition of sequential steps
The ability to earn the big bucks, as you state, is not a function of the value delivered/produced, but of the scarcity and difficulty of acquiring said value. That is capitalism. An extreme example is the clean air that we breathe - it is currently free, but extremely valuable to most living things. If we made it scarce (e.g. through pollution) eventually people would start charging for it, potentially at extortionate prices depending on how rare it becomes.
The only exception I see is if the software encodes a domain that isn't as accessible to people and is kept secret/under wraps, has natural protections (e.g. a government system that is mandatory to use), or is complex and still requires co-ordination and understanding. This does happen, but then I would argue the value is in the adjacent domain knowledge - not in the software itself.
Instead, I use LLMs for high-level thinking first: writing detailed system design documents, reasoning about architecture, and even planning out entire features as a series of smaller tasks. I ask the LLM to break work down for me, suggest test plans, and help track step-by-step progress. This workflow has been a game changer.
As for the argument that LLMs can’t deal with large codebases—I think that critique is a bit off. Frankly, humans can’t deal with large codebases in full either. We navigate them incrementally, build mental models, and work within scoped contexts. LLMs can do the same if you guide them: ask them to summarize the structure, explain modules, or narrow focus. Once scoped properly, the model can be incredibly effective at navigating and working within complex systems.
So while there are still limitations, dismissing LLMs based on “context window size” misses the bigger picture. It’s not about dumping an entire codebase into the prompt—it’s about smart tooling, scoped interactions, and using the LLM as a thinking partner across the full dev lifecycle. Used this way, it’s been faster and more powerful than anything else I’ve tried.
That's a bingo!
My workflow is to attach my entire codebase (or just the src folder + auxiliary files like sql schemas) to a Gemini 2.5 pro chat and ask it to write an implementation plan in phases for whatever feature I need, along with a list of assumptions, types, function signatures, documentation, and tests. I then spend a few minutes iterating to make sure it uses the right libraries, patterns, and endpoints. I copy paste the plan into plan.md and instruct Cursor/Windsurf/Aider/etc to implement phase 1 of the plan, saving implementation notes to plan-notes.md (both markdown files are explicitly included in the context). Keep telling it to "continue" and "keep going with the next phase" as needed. The implementation notes keep the LLM "grounded" in each step and allows creating a new chat context when it grows too long or messes up, requiring a git reset.
The alternative first step - when I'm working on an isolated module that doesn't need to know about the rest of the codebase but is otherwise quite complicated - is to have Gemini Deep Research write a report about how to implement that feature and feed that report into the planner.
The other important part is what I call "self reflection." Give the plan or research report to an LLM and ask it about improvements, pitfalls, tradeoffs, etc. and incorporate that feedback back into the plan. It helps to mix them up, e.g. have Claude and GPT review a Gemini plan and vice versa.
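For the "attach my entire codebase (or just the src folder + auxiliary files like sql schemas)" step in the workflow above, one possible way to do it by hand is to flatten the relevant files into a single text blob and paste that into the chat. This is only a sketch of that idea, not how the commenter does it: `bundle_sources`, the `=====` header format, and the default suffix list are all assumptions made up for this example.

```python
from pathlib import Path


def bundle_sources(root: str, suffixes: tuple[str, ...] = (".py", ".sql")) -> str:
    """Concatenate matching files under `root` into one string.

    Each file is preceded by a header line with its path relative to `root`,
    so the model can tell where one file ends and the next begins.
    """
    root_path = Path(root)
    parts = []
    # Sort for a stable order, so repeated runs give the same bundle.
    for path in sorted(root_path.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            rel = path.relative_to(root_path).as_posix()
            text = path.read_text(encoding="utf-8")
            parts.append(f"===== {rel} =====\n{text}")
    return "\n\n".join(parts)
```

Something like `Path("context.txt").write_text(bundle_sources("src"), encoding="utf-8")` would then give you one file to attach. For a large repository you would likely want to filter more aggressively (skip vendored code, build output, and so on) before the bundle fits in a context window.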
> What do I need to do? Designing the solution conceptually
> How am I going to do it? Actually writing the code
This article claims that LLMs accelerate the last step in the above process, but that is not how I have been using them.
Writing the code is not a huge time sink — and sometimes LLMs write it. But in my experience, LLMs have assisted partially with all three areas of development outlined in the article.
For me, I often dump a lot of context into Claude or ChatGPT and ask "what are some potential refactorings of this codebase if I want to add feature X + here are the requirements."
This leads to a back-and-forth session where I get some inspiration about possible ways to implement a large-scale change to introduce a feature that may be tricky to fit into an existing architecture. The LLM here serves as a notepad or sketchbook of ideas, one that can quickly read an existing API that I may have written a decade ago.
I also often use LLMs at the very start to identify problems and come up with feature ideas. Something like "I would really like to do X in my product, but here's a screenshot of my UI and I'm at a bit of a loss for how to do this without redesigning from scratch. Can you think of intuitive ways to integrate this? Or are there other things I am not thinking of that may solve the same problem."
The times when I get LLMs to write code are the times when the problem is tightly defined and it is an insular component. When I let LLMs introduce changes into an existing, complex system, no matter how much context I give, I always end up having to go in and fix things by hand (with the risk that something I don't understand slips through).
if you look at the change in capability over time, it looks like the AI are climbing this hierarchy. "Centaur" seems to already be giving way towards "research assistant". I hesitate to make predictions but I would not place money on things stabilizing here.
Wise of you. If things don't stabilize we'll need our savings.
Like a toy policeman costume so you can pretend you have authority and you know what you're doing.
Expert humans are still quite a bit better than LLMs at nuanced requirements understanding and architectural design for now. Actual coding will increasingly become a smaller and cheaper part of the process, while the parts where human input cannot be reduced as much will take up a larger proportion of time and cost.
* Not everything here applies, but much of it will. https://en.m.wikipedia.org/wiki/Baumol_effect
Agentic AI will learn to complete a larger and larger chunk of the practical software development process without much human input.
It also can’t do the all important thing: telling you what to build.
Basically a lot of projects that simply wouldn't have happened are now getting complex MVPs done by non-technical people, which gets them just enough buy-in to move it forward, and that's when they need developers.
This is my experience as well. You have to know what you want, how to interfere if things go in the wrong direction, and what to do with the result as well.
What I did years ago with a team of 3-5 developers I can do now alone using Claude Code or Cursor. But I need to write a PRD, break it down into features, epics and user stories, let the LLM write code, and review the results. Vibe coding tools feel like half a dozen junior to mid-level developers for a fraction of the cost.
How far can you go with the free tiers? Do I need to invest much in order to develop a good feeling of what is possible and what is not?
Also, if experience matters, how do we help junior developers get the coding experience needed to master LLMs? While, as TFA says, this might not replace developers, it does seem like it will make things harder for inexperienced people.
(Edit: typos)
> even when AI chess engines can easily defeat grandmasters, the human-AI combination still produces superior results to the AI alone.
Is this still the case? I didn't find a conclusive answer, but intuitively it's hard to believe. With limitless resources, AI can perform exhaustive search and thus cannot lose. Even with resource limits, something like AlphaZero can be very strong. Would AlphaZero+human beat pure AlphaZero?
With the weaker engines that run on browsers, you'll still catch cases where GMs have a good understanding of the position, and it takes the engine some time before the engine's understanding catches up. -- e.g. the GM will be explaining "this position is good", but the computer eval shows +1 before then climbing to +5 after some time.
Similarly, I recall a popular technique in videos where GMs play cheaters is for the GM to then adopt a solid, defensive structure. The engines then just shuffle pieces around. (Again, I suspect that's with the weaker engines running on the computer).
Though, some of the "engine vs engine" games I've seen have involved wild and inhuman play. -- In those cases, that's where I'd doubt humans would be of much help. I don't think AlphaZero+human would beat standalone AlphaZero.
If I’m encountering a new framework I want to spend time learning it.
Every problem I overcome on my own improves my skills. And I like that.
GenAI takes that away. Makes me a passive observer. Tempts me to accept convenience with a mask of improved productivity. When, in the long term, it doesn’t do anything for me except rob me of my skills.
The real productivity gains for me would come from better programming languages.
Not for me. Put me at a company with a codebase in technology Z and I can learn it MUCH faster than starting from the docs. I will still read the docs, but everything goes far, far faster if you start me out in an existing codebase.
You can use GenAI the same way. Get a codebase that's doing a thing you're interested in immediately and dive right in. You do not HAVE to be tempted into being a passive observer, you can use it as a kickstart instead.
I'm of the opinion that we'd be a lot better off if convenience was a lot further down our priority list.
Will it just be these functional cores that are the product, and users will just use an LLM to mediate all interaction with it? The most complex stuff, the actual product, will be written by those skilled in mech suits, but what will it look like when it is written for a world where everyone else has a mech suit (albeit less capable) on too?
Think of your mother running a headless Linux install with an LLM layer on top, and it being the least frustrating and most enjoyable computing experience she has ever had. I'm sure some are already thinking like this, and really it represents a massive paradigm shift in how software is written on the whole (and will ironically resemble the early days of programming).
> Write a concise Python function `generate_scale(root: int, scale_type: str) -> list[int]` that returns a list of MIDI note numbers (0-127 inclusive) for the given `root` note and `scale_type` ("major", "minor", or "major7"). The function should generate all notes of the specified scale across all octaves, and finally filter the results to include only notes within the valid MIDI range.
... So I typed all of the above in and it basically said don't ever try to use an LLM for this, it doesn't know anything about music and is especially tripped up by it. And then it gave me an example that should actually work and then didn't. It's wild because it gets the actual scale patterns correct.
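For reference, here is a minimal sketch of what that prompt seems to be asking for. It is not the output the LLM produced. Two things are my assumptions: "minor" is taken to mean natural minor, and "major7" is taken to mean the four tones of a major seventh chord, since the prompt doesn't define either. Rather than generating every octave and filtering afterwards, it checks each MIDI note's pitch class, which should give the same result.

```python
# Semitone offsets from the root within one octave.
# Assumptions: "minor" = natural minor; "major7" = major seventh chord tones.
SCALE_INTERVALS = {
    "major": (0, 2, 4, 5, 7, 9, 11),
    "minor": (0, 2, 3, 5, 7, 8, 10),
    "major7": (0, 4, 7, 11),
}


def generate_scale(root: int, scale_type: str) -> list[int]:
    """Return every MIDI note (0-127) belonging to the given scale."""
    if scale_type not in SCALE_INTERVALS:
        raise ValueError(f"unknown scale_type: {scale_type!r}")
    # Reduce the scale to pitch classes (0-11) so the octave of `root` doesn't matter.
    pitch_classes = {(root + step) % 12 for step in SCALE_INTERVALS[scale_type]}
    # range(128) already enforces the valid MIDI range and yields sorted output.
    return [note for note in range(128) if note % 12 in pitch_classes]
```

With these definitions, `generate_scale(60, "major")` (C major) and `generate_scale(57, "minor")` (A natural minor) return the same notes, which is a quick sanity check that the interval patterns are right.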