I coded up the demo myself and didn't anticipate how disruptive the intermittent warning messages about waiting users would become. The demo is quite resource-intensive: each session currently requires its own H100 GPU, and I'm already using a dispatcher-worker setup with 8 parallel workers. Unfortunately, demand exceeded my setup, causing significant lag, and I had to limit sessions to an additional 60 seconds when others are waiting. Additionally, the underlying diffusion model itself is slow to run, resulting in a frame rate typically below 2 fps, further compounded by network bottlenecks.
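For anyone curious about the setup, here's a rough sketch of the dispatcher-worker pattern I described (not the actual demo code; the session object and its methods are hypothetical placeholders):

```python
import asyncio
import time

NUM_WORKERS = 8
SOFT_LIMIT_SECONDS = 60  # extra time granted once others are waiting


async def worker(gpu_id: int, queue: asyncio.Queue) -> None:
    """Each worker owns one GPU and serves one session at a time."""
    while True:
        session = await queue.get()
        deadline = None
        while session.active():  # hypothetical session API
            # Once someone else is waiting, start the 60-second countdown.
            if deadline is None and not queue.empty():
                deadline = time.monotonic() + SOFT_LIMIT_SECONDS
            if deadline is not None and time.monotonic() > deadline:
                session.warn_and_close()  # hypothetical: the warning message
                break
            await session.render_next_frame(gpu_id)  # hypothetical coroutine
        queue.task_done()


async def dispatcher(sessions) -> None:
    """Fan incoming sessions out to the fixed pool of GPU workers."""
    queue: asyncio.Queue = asyncio.Queue()
    workers = [asyncio.create_task(worker(i, queue)) for i in range(NUM_WORKERS)]
    for s in sessions:
        await queue.put(s)
    await queue.join()
```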
As for model capabilities, NeuralOS is indeed quite limited at this point (as acknowledged in my paper abstract). That's why the demo interactions shown in my tweet were minimal (opening Firefox, typing a URL).
Overall, this is meant as a proof-of-concept demonstrating the potential of generative, neural-network-powered GUIs. It's fully open-source, and I hope others can help improve it going forward!
Thanks again for the honest feedback.
Could you talk about your hopes for the future of this project? What are your thoughts on a more simplified interface that could combine inputs in a more abstract way, or are you only interested in simulating a traditional OS?
Thanks again.
PS: the waiting time while Firefox “loads” made me laugh. I presume this is also simulated.
However, my real dream behind this project is to blur the boundaries across applications, not just simulate traditional OS interactions. For example, imagine converting a movie we're watching directly into an interactive video game, or instantly changing the interface of an app (like Signal) to something we prefer (like Facebook Messenger) on the fly.
Of course, the current training data severely limits what's achievable today. But looking forward, I envision combining techniques from controllable text generation (such as Zhiting Hu's "Toward Controlled Generation of Text" paper) or synthesizing new interaction data to achieve greater controllability and customization. I believe this is a promising path toward creating truly generative and personalized interfaces.
Thanks again for your interest!
Although this is of course ridiculously wasteful right now, I can see this being the optimal solution for many things if a technology like thermodynamic-well-based neural networks gets to the point of viability.
A thermodynamic-well-based model could have a trillion parameters in the size of an SD card at a few milliwatts of power.
In a case like that, it's easy to imagine that mass-produced implementations could be a one-size-fits-all solution for all but the most trivial or advanced computing tasks. For perhaps less than a dollar for a 100B-parameter chip, you get the ability to “imagine” video, sound, etc., and a strong general-purpose “reasoning” capability embedded into everything right down to children's toys and toasters.
Kinda makes me think of Rick and Morty with the butter passing robot. A lot of pointless capabilities, but still cheaper than a purpose built deterministic computing device. OTOH having embedded knowledge as an ambient part of everyday life would be kinda neat, even if it would almost surely mean the end of human civilization lol.
What are the implications of relying on deep networks for instantiating and running the abstractions we usually hang on physics and transistors?
Is this a type of VM?
Is an imagined VM Turing complete?
Fascinating question. My “vibe” opinion is that it is, but there are limits on the meaning of Turing completeness that do not apply within traditional computing paradigms, vis-à-vis scaling costs. My intuition is that scaling costs in imaginary VMs would be quadratic rather than linear, e.g. a task that takes twice the memory takes four times the compute instead of twice.
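To make that intuition concrete, here's a toy back-of-the-envelope under that (purely assumed) quadratic cost model; it is not a measured property of NeuralOS:

```python
# Toy cost model: compute ~ memory ** exponent (exponent=1 is linear).
def compute_cost(memory_units: float, exponent: float = 2.0) -> float:
    return memory_units ** exponent


print(compute_cost(2, exponent=1.0))  # 2.0 -> linear: 2x memory, 2x compute
print(compute_cost(2, exponent=2.0))  # 4.0 -> quadratic: 2x memory, 4x compute
```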
It's an interesting project. I'll totally accept "for fun" or "because" but I'm interested in the why. Even if just for a very narrow thing, are there any benefits we would get from using an ML-based OS? I mean, it is definitely cool, and that has merit in its own right, but people talk about Neural OSs and I just don't "get it"
Unlike other ML-based OS projects (such as Gemini OS, which generates code and renders traditional UIs), NeuralOS directly generates every pixel. While this makes it susceptible to hallucination, in my opinion the other side of hallucination is full flexibility. In the future, I imagine operating systems running entirely (or mostly) on GPUs, adapting to user intent on the fly rather than relying on pre-designed menus and options.
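To make the contrast concrete, here's an illustrative sketch of what "generating every pixel" means; this is not the project's actual API, and all names are placeholders for the real latent-diffusion pipeline. The next screen frame is sampled from the model, conditioned on the previous frame plus raw mouse/keyboard state:

```python
import torch


def next_frame(model: torch.nn.Module,
               prev_frame: torch.Tensor,          # (C, H, W) last rendered screen
               cursor_xy: tuple[float, float],
               keys_down: list[bool]) -> torch.Tensor:
    # Encode raw input events as a flat conditioning vector.
    cond = torch.tensor([*cursor_xy, *map(float, keys_down)])
    with torch.no_grad():
        # The model predicts a plausible next frame; no widget toolkit or
        # OS code runs -- every pixel comes out of the network.
        return model(prev_frame.unsqueeze(0), cond.unsqueeze(0)).squeeze(0)
```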
That isn't to say I think there shouldn't be neural OS's. But I do imagine them being something radically different. Do we really want them to mimic what we have now? Or is that not, in some vague way, more like a mind?
Regardless, I think this is really neat. I'm a big fan of doing things "just because" and "I wonder what would happen if". So I'm not trying to knock you down. I mean, I'm wrong about a lot of things haha
This essentially is the idea of Star Trek computers, where there were "neural gel packs" being programmed/primed for different purposes on the starship's systems.
Damn, I have to think about this more. Essentially you are building a holodeck computer, where the users interacting with it just describe roughly what they want and the computer just generates it - with human language being the primary interface.
Note: The Space is intended as a template, so please duplicate it and run with your own GPU for a better experience. (The default Space has only one worker.)
Recommended GPU: At least an L40, ideally an A100-large. (The original demo at neural-os.com used H100s.)
All code and models are self-contained in the Hugging Face Space.
See my tweet for more details: https://x.com/yuntiandeng/status/1944802154314916331
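If you'd rather script the duplication, huggingface_hub ships a duplicate_space helper; the Space id below is a placeholder, so substitute the real one (and note that GPU hardware requires billing to be set up on your account):

```python
from huggingface_hub import duplicate_space

# Placeholder id -- replace with the actual demo Space before running.
url = duplicate_space(
    from_id="author/neuralos-demo",
    hardware="a100-large",  # or an L40-class flavor, per the GPU note above
)
print(url)  # URL of your own copy of the Space
```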
However, I was able to click on a folder; it opened and looked fairly convincing. The only indicator that something was off - other than lag - was that at the bottom of the file browser, it mentioned how much disk space was available: the first digit was clearly 6, the second was flickering and blurring between different numbers.
Pretty interesting idea though. What framerate should it run at? I felt I was getting <5 fps.
Looks like the entire mucky internet will be fixed with just some careful prompting as soon as this thing runs efficiently!
More seriously, it would be fun - and probably instructive - to play with a system that consistently (shallowly) simulated that. A kind of oasis.
> imagine a Petrovich layer over another operating system, such as Microsoft Windows (TM). Every time Windows does something you don't like, you could punish it, and it would never do it again...
There is no underlying kernel, no function calls, no program execution, and no networking. Everything is purely visual and imagined by the neural model. You can think of it as a safe, isolated container where nothing can actually run or cause harm, since no real code executes. It's essentially an interactive video simulation, conditioned entirely on user inputs.
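To illustrate, here's a tiny, purely hypothetical sketch (not the demo's actual code) of what a click amounts to in this setup - it becomes model input, and nothing else happens:

```python
# In a normal OS a click dispatches to an event handler that executes code;
# here it is only data appended to the model's conditioning history.
conditioning_history: list[dict] = []


def on_click(x: int, y: int) -> None:
    # No syscall, no process, no handler runs -- the click becomes input
    # for the next sampled frame and nothing else.
    conditioning_history.append({"type": "mouse_down", "x": x, "y": y})


on_click(412, 87)
print(conditioning_history)  # [{'type': 'mouse_down', 'x': 412, 'y': 87}]
```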
The purpose of an OS is to manage the resources of the computer: CPU, RAM, devices, etc. This is simply a UI generated by an NN.
Also, it isn't an OS in any way, shape, or form. It's just another slop video generator. It even tries to "simulate" the applications themselves. One can just run the application itself instead of simulating it. Case in point: try going anywhere except google.com in the "browser".
What problem is this trying to solve? And "to show how it might look" is not a valid answer, because it is designed to look like xfce4. It is not trying to generate a UI or something. And I can just run xfce4 in termux on my phone right now and be able to see exactly how it looks. How do you expect this to be a useful UI framework? Remember, the existing xfce4 works perfectly all the time, and this is just designed to (badly) simulate it only most of the time. What is the value proposition of something like this?
Although I wasn't able to really use it due to the lag.