43 points by rishikeshs 7 hours ago | 18 comments
  • eigenspace 5 hours ago
    I'm surprised they released this thing. Brand perception is probably a lot more important to Nvidia than whatever sales they could get from this thing, and if it's basically just DGX Spark, it's likely to underwhelm.

    I've heard there's still a large backlog of both software problems, and hardware problems with the platform. The software problems could be fixed with time, but they'll still give a shitty first impression. I'd have thought Nvidia would just bury this and try again with a successor run of silicon with a new design.

    This thing seems practically destined to just be a repeat of the Snapdragon laptop debacle.

    • fg137 5 hours ago
      I cannot think why someone would run those workflows on a Windows laptop, unless someone has way too much money to spend.
      • bigfishrunning 5 hours ago
        > someone has way too much money to spend.

        that's what nvidia is hoping for

      • thinkingtoilet 5 hours ago
        If the workload is offloaded to the chip, why would the host platform matter?
        • fg137 5 hours ago
          Lots of machine learning workflows support Linux better than Windows, if they run on Windows at all. (e.g. https://docs.vllm.ai/en/latest/getting_started/quickstart/ )

          DGX Spark runs Linux, and nobody is going to install Windows on that machine. This laptop got it backwards.

          If someone decides to run Ollama for local inference with this laptop, they fit perfectly into the "has too much money to waste" bracket, which is addressed by a few other comments in the discussion.

          • chris_money202 4 hours ago
            WSL
            • rvz 2 hours ago
              Believe it or not, Windows (WSL) is the best Linux distro and Nvidia knows that.
          • lostmsu 2 hours ago
            vllm-windows works well enough
  • aseipp 5 hours ago
    The GB10 itself is pretty good and I love using mine for broad Linux development. But it's too expensive for consumer level pricing, and even for the "prosumer" the price is pretty stiff. Even if they dropped the CX-7 and halved the RAM and shipped a smaller hard drive, would it be below, say, $2500 USD? I guess we'll see, but this variant is coming out pretty late so maybe it's just best to wait for the 2nd generation.
    • h14h 5 hours ago
      This feels like getting a foot in the door to ensure Apple doesn't entirely eat Nvidia's lunch if AI inference workloads start to shift from cloud to local.

      With MLX, Apple is building an answer to CUDA, and if people start switching from ChatGPT & Claude to some app that runs on their M5, suddenly Apple starts to look like Nvidia's biggest competitor.

      If Nvidia doesn't have a pathway towards getting hardware into the hands of consumers, it could be a really difficult road ahead for them.

  • comandillos 5 hours ago
    So they have basically reused the same hardware as in the DGX Spark (GB10)... That chip isn't great for LLM inference actually.

    https://www.techpowerup.com/gpu-specs/gb10.c4342 https://www.nvidia.com/en-us/products/rtx-spark/

    • general_reveal 5 hours ago
      The RTX GPU laptops run very hot. Even though they are pound for pound better, they just run too hot for local LLM usage, for me at least. Prefer Macs for this. A lot of AMD cards also run cooler. I wonder if undervolting would help with smaller models and heat.
      • comandillos 4 hours ago
        I mean the GB10 is pretty efficient for the power it has, but imho it is nowhere near the power efficiency of Apple Silicon (it was never intended to be a chip used for mobile devices). I guess this is kind of the move Apple made with the A12Z and the Mini but... the other way around?

        I think it's gonna be another failure, as we are used to seeing in the PC market these days.

    • ewklwekl 5 hours ago
      It is great for inference for single user/single session. It is not a replacement for a graphics accelerator that runs several concurrent inference sessions in parallel.

      Basically the same tradeoff as a Mac mini with unified memory.

    • joe_mamba 5 hours ago
      >That chip isn't great for LLM inference actually.

      Why do I have the feeling it's been intentionally made to be bad in order to get you onto their most expensive datacenter gear?

      • ekidd 5 hours ago
        It's probably more that LLM inference speed comes from having a large amount of fast RAM. And fast RAM is brutally expensive right now.

        At this point, your cost-efficient options include used 3090s, "frankenrigs" using recycled data center cards, and a handful of "workstation" class cards, where the originally high margins and the long enterprise purchasing cycles have kept prices from going up too fast.

        In contrast, a lot of these "personal" AI systems are basically a GPU-like core wired to larger amounts of slow RAM. Which is still semi-affordable. Generally speaking, they make for OK chatbots but extremely slow coding agents. Whereas you can run a modestly useful coding agent at reasonable speed on a 3090.

        So yeah, a lot of these systems are a bit scammy. But not because it's a secret conspiracy to protect data center cards. Rather, there simply isn't enough fast RAM in the entire world. So they'll flog you disappointingly slow RAM instead.

        TL;DR: Might be useful for some use cases, but benchmark very carefully.

  • dom96 5 hours ago
    I’m getting more and more convinced that we will end up running LLMs on our personal computers. Which makes me wonder where Anthropic's/OpenAI's moats will come from.
    • VMG 5 hours ago
      Convince me

      1. in order to run LLMs, especially the best ones, you need complicated devices which are expensive

      2. if you buy one for your personal use, you are probably not going to utilize it all the time and it will be idle a lot

      It seems to me that it will always be more economical for the LLM-running devices to be in a datacenter, where it is easier to make sure they are always utilized

      • OtherShrezzing 5 hours ago
        If a model is substantially better than most humans at most tasks, the human isn't going to be able to perceive the difference between Claude Opus 7.7 and 8.7. Humans at some point aren't going to be able to perceive the difference on benchmarks either, because they are going to get wildly abstract.

        AI vendors are really going to struggle to shift tokens far beyond the frontier of human capabilities. It's reasonable (not guaranteed) to assume that, if the trend of frontier models (doubling capabilities on benchmarks every n months) holds, then the same trend will hold for local models, and those local models will meet and exceed the perception frontier. This would mean a human cannot tell the difference between Mistral-Open-2030 and Claude Opus 2030.

        That's a bunch of "ifs", but there's nothing exceptional about those "ifs". They're basically the scenario if nothing changes between now and ~2030 with regards to capabilities trend attainment.

      • Mordisquitos 5 hours ago
        The trend over the past three decades of personal computing has been for devices to become exponentially more powerful regardless of the actual computing needs of users. The excess computing power has famously been requested by projects such as SETI@Home and Folding@Home, and been exploited by bad actors for crypto mining. The most basic laptop today used only for web browsing and word processing would be a powerful workstation 20 years ago, when the most basic laptop was also used only for web browsing and word processing (and arguably for more things, as it was all mostly local software).

        There is no ceiling to the power of consumer hardware. If it's cheap enough, it will be bought.

      • fg137 5 hours ago
        This.

        Even two or three years ago people were pointing out "The ChatGPT subscriptions you can buy with $2000 give you much more compute than whatever home setup you come up with" on r/LocalLLM. I did my own elementary school maths and came to the same conclusion.

        Yet to this day people still boast how their beefy M4 Pro/Max machine with 32+GB RAM (which is not at all a "normal person's setup" and costs $2000+) runs LLMs smoothly, and "that's the future".

        Someone needs to re-learn basic maths and take a walk around Best Buy to understand what "consumer laptop" looks like.

      • nemomarx 5 hours ago
        If there end up being useful workflows where you keep stuff running in the background or overnight that's one advantage, compared to a data center that might cut off your access during peak hours or etc.

        Think of it like having a graphics card at home versus using a cloud gaming stream? Technically subscribing to GeForce is much cheaper up front than getting a card, but people still do that. So will the audience of people running agents at home be as large as PC gaming? I think that's kind of plausible.

        • VMG 5 hours ago
          > if there end up being useful workflows where you keep stuff running in the background or overnight that's one advantage

          That is not how LLMs are typically used though in my experience

          > Think of it like having a graphics card at home versus using a cloud gaming stream?

          Latency seems to be much more important in that use case

      • OtherShrezzing 5 hours ago
        >2. if you buy one for your personal use, you are probably not going to utilize it all the time and it will be idle a lot

        I think consumers are primed for that type of behaviour though. I have an iPhone on my desk. It has something like 2-3 TFLOPS of CPU+GPU compute, which is double that of the largest supercomputer on earth when Jurassic Park came out, and is probably more computing power than existed on earth when I was born in the 80s.

        I use this device for around 1hr per day to write text messages.

      • KeplerBoy 5 hours ago
        It's inevitable. What might be a prosumer device today priced at 4000$ will be a regular consumer device in 10 years and models only get better.

        Local models today are fine for a lot of mundane tasks and will continue to be so. The use cases where paying for frontier models is worth it, will continue to shrink for folks not doing frontier work.

        • parineum 5 hours ago
          > models only get better.

          Or stall. Acceleration has been slowing significantly and gains seem to be tied to huge memory footprints.

      • Guillaume86 4 hours ago
        3. If your device runs on battery, why not use a relatively cheap network call in place of a very power-hungry local inference call?
      • davebren 5 hours ago
        Uploading your IP to the biggest IP thieves in human history seems bad idk.

        2. Eventually we'll get to where local models that don't have sycophancy and slot-machine mechanics trained into them will perform better.

      • nerbert 5 hours ago
        Just like cloud vs private server. It'll be based on use case.
    • adrian_b 4 hours ago
      While I agree with that in principle, it is very worrisome that the prices of personal computers, especially of any personal computer that is not a big desktop, have been increasing continuously.

      The price of a mini-PC with Intel Panther Lake is at least double in comparison with the price of a mini-PC with Arrow Lake H having similar specifications, and I am talking about barebones, before adding DRAM and SSDs, whose prices have risen even more.

      The rise in prices is somewhat obfuscated by the confusing names of CPUs, i.e. some old and new CPUs may seem to be at similar prices and they have similar names, but the new CPU actually corresponds to a lower segment of the market, by having e.g. a smaller GPU and a lower clock frequency, while the CPU model that really corresponds to the old is named such that it seems to belong to the class corresponding to its present price.

      As a concrete example of this obfuscation, which may confuse the buyers of laptops or mini-PCs, I have an ASUS 15 Pro with "Core Ultra 5 225H". If I were to buy an ASUS 16 Pro now, the corresponding CPU model, the cheapest which is not worse than what I have, would be "Core Ultra X7 358H".

    • fg137 5 hours ago
      The best open weight LLMs don't run on this computer, or almost any consumer grade computer. Even the memory requirement for Gemma 4 is out of reach for most consumers (by which I mean those who are not on HN). Unless there is some magic that would make high quality LLMs consume no more than 8GB RAM which makes them usable on a 16GB laptop (which is the norm these days), "local LLM for personal computing" is mostly just a myth.
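
      A rough sketch of the memory arithmetic behind the 8GB figure above, assuming weights dominate the footprint and ignoring KV cache, activations and runtime overhead (so real limits are tighter). The function names and the bit widths tried are illustrative assumptions, not numbers for any particular model.

```python
def weights_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    if params_billions < 0 or bits_per_weight <= 0:
        raise ValueError("parameter count must be >= 0 and bit width > 0")
    # billions of parameters * bytes per parameter = gigabytes
    return params_billions * bits_per_weight / 8.0


def max_params_billions(memory_budget_gb: float, bits_per_weight: float) -> float:
    """Largest parameter count (in billions) whose weights fit in the budget."""
    if memory_budget_gb < 0 or bits_per_weight <= 0:
        raise ValueError("budget must be >= 0 and bit width > 0")
    return memory_budget_gb * 8.0 / bits_per_weight


# The 8 GB budget is the figure from the comment above; bit widths are assumptions.
BUDGET_GB = 8.0

for bits in (16, 8, 4):
    limit = max_params_billions(BUDGET_GB, bits)
    print(f"{bits}-bit weights: up to ~{limit:.0f}B parameters in {BUDGET_GB:.0f} GB")
```

      Under these assumptions, even aggressive 4-bit quantization caps an 8GB budget at a fairly small model, which is the constraint the comment is pointing at.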
    • xnx 3 hours ago
      We're hitting the atomic limits of what's possible with minimum feature size in silicon. It's also very hard to remove 1 kW of heat from a laptop, let alone do it quietly or on battery.
    • itake 5 hours ago
      My biggest concern with local LLMs is there just isn't enough RAM or HD space to run multiple models, and the generic LLMs are too generic...
    • xdertz 5 hours ago
      I find it hard to see how that would ever be economical. LLMs need very expensive, power-hungry chips, and datacenters have:

      - bulk discounts
      - cheaper electricity
      - high utilisation to spread the costs among many users

      I don't see how PCs could ever compete against it. Most users' AI demands would probably result in >90% idle time on the GPU.
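
      The utilisation point can be put into numbers with a small sketch. Every figure here is a hypothetical placeholder (price, lifetime, and the 80% datacenter utilisation); only the roughly 10% home utilisation comes from the ">90% idle" estimate above. Electricity and bulk discounts are left out, so this only isolates the utilisation effect.

```python
HOURS_PER_YEAR = 365 * 24


def hardware_cost_per_busy_hour(price: float, lifetime_years: float, utilization: float) -> float:
    """Purchase price spread over the hours the device actually does work."""
    if price < 0 or lifetime_years <= 0:
        raise ValueError("price must be >= 0 and lifetime must be positive")
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    busy_hours = lifetime_years * HOURS_PER_YEAR * utilization
    return price / busy_hours


# Hypothetical placeholders, not real prices or measured utilisation.
PRICE_USD = 4000.0
LIFETIME_YEARS = 4.0

home = hardware_cost_per_busy_hour(PRICE_USD, LIFETIME_YEARS, 0.10)        # ">90% idle"
datacenter = hardware_cost_per_busy_hour(PRICE_USD, LIFETIME_YEARS, 0.80)  # assumed

print(f"home:       ${home:.3f} per busy hour")
print(f"datacenter: ${datacenter:.3f} per busy hour")
print(f"ratio:      {home / datacenter:.1f}x")
```

      With identical hardware, the cost per busy hour scales inversely with utilisation, so the gap is just the ratio of the two utilisation figures before discounts or power prices enter at all.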

    • pjmlp 5 hours ago
      First we need to actually still be employed, and have them at an affordable price.
    • eigenspace 5 hours ago
      If we do, it won't be on this chip.
    • wasmitnetzen 5 hours ago
      It'll be just another round in the client-side vs server-side processing cycle. We've been through these rounds, and we will keep going through them.
    • notepad0x90 5 hours ago
      i think a lot of that is for government and enterprise use. even for personal computers themselves (i.e.: laptops) they're usually loss leaders, they don't turn profit. You can run a server (and many do) on laptops, but that didn't replace cloud services or server hosting. You can't store enormous amounts of data on your laptop/phone for the llm to use, or access tools the app dev wouldn't want exposed on untrusted devices.

      The whole replacing people angle is just the short term use case the more ghoulish executives are thinking about. In practice, lots and lots of new use cases have been made possible by LLMs. A lot of which can be done locally. But whatever capacity you have locally, they can have more of and for cheaper, and they manage the model instead of you doing it yourself. I think you put it nicely though, their moat will be thinned, and I doubt they'll be as profitable as their funding suggests, but at the same time the demand for them won't go away either. I don't know if OpenAI and Anthropic will be viable, but I'm nearly certain Deepseek is.

      The tipping point will be power usage: if a local LLM can run the same workload for less power, that would be a game changer. Nvidia might get decimated, but even Google and others have moved on from GPUs already, they have faster and more power efficient TPUs. Add to that network bandwidth and availability issues, their moat remains. Also consider that even for graphics capabilities, user devices just don't have a consistent spec to make things like widespread 3D graphics and WebGL usage viable. Someone's cheap Android phone will never run a local LLM reliably, same as it won't run a 3D game. Even if they have a high-end iPhone, network providers aren't always as performant as they are in western countries, and then there are people that won't want to install your app or local software, and then browser based exposure of the capability to sites which will have similar hardware spec issues, OS instabilities, competing tabs, etc...

  • orthoxerox 5 hours ago
    Well, it was only a matter of time, since both AMD and now Intel are switching to APUs. Nvidia could either cede the desktop GPU market to them, going all-in into AI datacenter chips, or it could challenge them.

    Maybe the Nth time's the charm and Microsoft+Nvidia will manage to make Windows on ARM a viable platform.

  • chris_money202 5 hours ago
    It’s a step in the right direction, but there’s still a long ways to go in terms of smaller LLMs' ability and hardware costs
  • xeyownt 5 hours ago
    Great! More pressure on fabs, price of standard GPU will again rise.

    Guess I need to postpone my gamer PC renewal to end 2030.

  • lanycrost 5 hours ago
    I'm waiting for powerful on-device LLM models, since that's not worth it
    • Hugsun 5 hours ago
      Have you tried Qwen 3.6 or Gemma 4? They're not frontier level but certainly have their uses.
  • agnosticmantis 5 hours ago
    How would these compare to a MacBook Pro M5 in terms of performance and price?
  • koolkao 5 hours ago
    Very exciting! sounds like we're finally leaving x86 behind
  • PunchyHamster 5 hours ago
    The fact they advertise it as some step forward in PCs is outright bizarre.

    It's just a worse Strix Halo, as you are landing square in the middle of Windows on ARM problems

    • Iolaum 5 hours ago
      Strix Halo chips have around 210+ GB/sec gpu memory bandwidth and announcements put the new nvidia chip at around 300GB/sec gpu memory bandwidth.

      I'd say that is an improvement if you want to run local llm inference. Still well below what you can achieve with Apple chips though.
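
      For a feel of what those bandwidth numbers mean, here is a back-of-envelope sketch. It assumes token generation is purely memory-bandwidth bound and that a dense model streams all of its weights once per generated token, which is a common rule of thumb rather than a benchmark. The 20 GB model size is a made-up placeholder; the two bandwidth figures are the approximate ones quoted above.

```python
def decode_tokens_per_second_ceiling(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on decode speed if every token must stream all weights once."""
    if bandwidth_gb_s <= 0 or weights_gb <= 0:
        raise ValueError("bandwidth and weight size must be positive")
    return bandwidth_gb_s / weights_gb


# Assumption: a hypothetical dense model whose quantized weights occupy 20 GB.
HYPOTHETICAL_WEIGHTS_GB = 20.0

# Approximate bandwidth figures as quoted in the comment above (GB/s).
chips = {"Strix Halo": 210.0, "new Nvidia chip": 300.0}

for name, bandwidth in chips.items():
    ceiling = decode_tokens_per_second_ceiling(bandwidth, HYPOTHETICAL_WEIGHTS_GB)
    print(f"{name}: at most ~{ceiling:.1f} tokens/s")
```

      Real throughput lands below this ceiling, and mixture-of-experts models read only part of their weights per token, so this is only useful for comparing chips on the same model.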

  • ocdtrekkie 5 hours ago
    The thing I think is really funny is that if this takes off, frontier model companies and datacenters will end up holding the bag, and as per usual after the last few tech hype cycles, NVIDIA will still be selling.

    Eventually a lot of inference will get right-sized into something you affordably run yourself.

  • LatencyKills 5 hours ago
    First:

    > "Our goal is to deliver unmetered intelligence to every home and every desk with Windows," said Satya Nadella, chairman and head of Microsoft.

    Then:

    > However, Ian Fogg, Research Director at industry analyst firm FDM CCS Insight said the change was "likely to come with a significant price tag" and Nvidia would be targeting "those looking for workstation-class performance".

    So... not every desk with Windows.

    • pitched 5 hours ago
      First, make it possible. Then, expand the market. The early adopters help pay R&D for later efforts. Every desk is a good goal, even if not hit by the first doodad.

      It just feels too much like what they said about Apple II and early Windows. A play at nostalgia instead of putting real thought into it.

      • LatencyKills 5 hours ago
        I was an engineer at both MS and Apple, and wholeheartedly agree with you.

        My question is, what happens to the people who use RTX cards for gaming? This new solution isn't meant for that. Do they need an "AI accelerator" and a gaming-centric GPU?

    • cryo32 5 hours ago
      I don’t know anyone other than a very small but vocal minority who will give a shit about this.

      Even in the analytics side most of the stuff is some shonky ass numpy or excel gank.

      I don’t know what the market is. I just can’t see it.

    • netdevphoenix 5 hours ago
      The constant deliberate conflation of LLMs with general intelligence is so grating.
  • sylware 7 hours ago
    Did they tell Trump that if you don't use chips with the latest silicon process, machine learning will just take a bit more time and more energy, but it will happen anyway and at the same level of quality if the machine learning "recipes" and training data are close enough?
    • twobitshifter 5 hours ago
      Right, the export controls are only forcing Chinese AI to innovate, build their own fabs, and make training and inference more efficient. The endgame of this will be that NVIDIA chips won’t be wanted because you can get a $50 Chinese chip running a ternary model that is competitive with Claude in English and is much better in Mandarin.
      • adrian_b 5 hours ago
        The US government has failed to learn from its own history.

        60 years ago the US government had forbidden the export of fast computers to France, with the hope that this sanction would prevent the French from developing thermonuclear bombs.

        The result was that the French state (which at that time was led by de Gaulle, not much less autocratically than China) subsidized some of their computer manufacturers, which previously could not compete with the American companies like IBM and CDC, and also their semiconductor manufacturing industry, which had to provide the components for the locally-made computers.

        Eventually, the French produced TTL circuits and mainframe computers made with them, and finally they also made thermonuclear bombs.

        So the American "sanctions" against France have been a complete failure and have been great for the French industry of semiconductors and computers.

        Many years later, when the USA no longer had export restrictions towards France and the French state no longer protected its industry, the French integrated circuit and computer industries shrank greatly, their companies either going bankrupt or being bought or merged into multinational companies.

      • pitched 5 hours ago
        I would order that in a heartbeat. Even if it required proprietary Chinese-government drivers. I would try to segregate in a VM without internet or something. Please make this happen! Tokens cost too much in the current system.
  • asimovDev 5 hours ago
    >Lenovo, HP, Dell and Apple accounted for almost 75% of the world's PC market in the first three months of this year, according to research firm Gartner.

    https://www.gartner.com/en/newsroom/press-releases/2026-4-10...