Such an interesting sentence. An app that doesn't work doesn't seem like it has come into existence yet.
This has been my (limited) experience so far. I haven't been able to get an AI/LLM to help me build an app. Even React apps it fails at. I have been able to get an LLM to help with coding questions similar to Stack Overflow questions, though (not always).
I know it's much more powerful than that tool was, but the experience described is similar between both
While these changes may require "new ways of thinking" in humans, the LLM seems to have these conceptual approaches embedded already thanks to other languages that did these things earlier. The "what's new" just shows it the syntax for these concepts in Swift.
LLMs are not "really good at writing code". They generate statistically relevant text based on their training data set.
Expecting people who do not understand code to use LLMs for making solutions is like thinking “non-pilots” can successfully fly a 747.
And more often than not it's just plain wrong code. LLMs aren't actually writing code, they are guessing at what might satisfy an input. Writing code is more than guessing, it's about assembling instructions with intent, for a purpose. LLMs lack that intent and purpose.
> You can rightly criticise the use of LLMs in many ways but it is more useful to focus on the actual reasons they cause harm than to set red lines over language.
The "actual reasons" they cause harm are inherent to how LLMs work. The real problem is people not understanding how they work, placing too much trust in them, and believing the hype. They are not some miracle, they aren't even all that helpful, and in my experience they are more of a waste of my time.
It’s just a statistical rubber duck, “what other obvious (common) things haven’t I thought about yet Mr duck?”
My 5-year-old recreated Duck Hunt with almost zero assistance in an hour.
I helped with the copy paste.
We don't have a commercial offering yet and are planning to migrate off WebContainers for the upcoming full stack features -- WebContainers show their limits pretty quickly in a full stack context (e.g. CORS issues) and we need observability into the server side of the app for full stack debugging.
Regardless, our interests here are only lightly commercial. We're not really developing Nut to drive revenue but to help us develop the debugging API and push forward the SOTA for AI development as effectively as we can. That API is what we want to sell.
We're also interested in using our API with MCP so that e.g. Cursor could be used to fix bugs you're seeing locally, and plan to explore that angle before long.
I really hope this doesn't actually catch on in "real" engineering, except as a meme joke.
The tech will get better and better (I couldn't imagine we'd be doing this a year ago) but to be truly useful it has to reliably produce reasonably well engineered code, and effective debugging is a key piece of that.
Best case is some high-profile shit show caused by software made mostly or entirely by AI that is hopefully bad enough that legislators wake up and realize that in the modern world software is essential enough that you can’t let just anyone sell it or services based on it. Just like you can’t allow just anybody to design or build bridges or hardware or whatever.
But I’m sure that's wishful thinking. Hacks and buggy software causing consumers harm are just accepted, and software industry folk all hope to be billionaires, so nobody cares.
Would be funny though: who produces the result faster, someone on Fiverr or an AI in a loop for a day?
It spits out URLs to sites and sends 'em to Fiverr QA people; take a shot every time the app doesn't work.
I wonder about the cost-effectiveness: have a randomizer start producing and hosting code, then auto-submit it to Product Hunt.
Karpathy "coined" the term and I absolutely hate it. It's up there with "asshat" and "awesome sauce" for profoundly stupid terms.
But we are very good in our profession at making up garbage terms to do anything but describe garbage.
I'd be interested to read a blog post or technical write-up. I think conceptually it's an interesting idea.
Edit: Now everything is made of buttons.
The improvements we're making are under the hood. When you ask Nut to fix a bug it should do a much better job -- we record the app's behavior and analyze it so the AI has context for the changes it needs to make.
We've also added some UI to approve or reject the changes the AI makes. For now we're using this to gather feedback so we can improve Nut, but down the line we'll also refund the user any credits when they reject changes -- you shouldn't have to pay when the AI screws up, a big issue with these tools (and vibe coding in general).
We'll continue to keep it open source as we develop it.
I couldn't start the game though, but it seems runnable given some debugging. Great work!
Another one I thought was pretty cool was handing it some API docs for something, then having it build a UI and admin interface from scratch.
The dev agent tools these days aren't half bad, and they're getting better.
I just generated two Tetris games, one with ASCII art and one with WebGL, and I found it quite impressive. Maybe simplistic apps, but still quite impressive with its ability to create functional games and fix bugs with minor prodding.