16 points by mdp2021 9 days ago | 3 comments
  • Const-me 9 days ago
    If I were an enthusiast, I would rather consider a mini PC with AMD Strix Halo APU. These things have been coming soon for a few months now.

    The memory is slower, but not by much; 256 GB/s is much faster than the system memory found in most consumer-targeted PCs. The devices also have way more memory, up to 128 GB. A system with a Strix Halo APU is a general-purpose computer; these special accelerator cards can only be used for one thing.

    • fxtentacle 9 days ago
      256 GB/s is excruciatingly slow for LLM inference. The 5090 has roughly 8x as much, and since the task is mostly RAM BW bound, performance scales almost linearly with it.
      • reitzensteinm 9 days ago
        There's a sweet spot for running MoE models, though. If you need the entire model in VRAM but only need to retrieve a part of it per token, trading more memory for less bandwidth can be a win.

        I have a 4090, and given the MoE trend, I'd be more tempted to purchase a Strix Halo next than a 5090.

      • Const-me 9 days ago
        The specialized accelerators discussed in the article have much slower memory than a 5090 GPU. The memory in them delivers 448 or 512 GB/s, only around 2x compared to Strix Halo.
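
The bandwidth argument in this subthread can be sketched as back-of-the-envelope arithmetic. If decoding is memory-bandwidth bound, each generated token has to stream the active weights from memory once, so tokens per second is bounded above by bandwidth divided by bytes read per token. The sketch below is only a rough upper bound under that assumption. The bandwidth figures are the ones quoted in the comments (the 5090 entry uses the "roughly 8x" figure from the thread rather than a spec sheet). The model sizes, the 4-bit weight format, and the function names are hypothetical and chosen purely for illustration.

```python
# Rough upper bound on decode speed for a memory-bandwidth-bound workload.
# Assumption: every generated token reads all *active* weights from memory once.

def max_tokens_per_second(bandwidth_gb_s, active_params_billion, bytes_per_param):
    """Upper bound on tokens/s: bandwidth divided by GB read per token."""
    gb_read_per_token = active_params_billion * bytes_per_param
    return bandwidth_gb_s / gb_read_per_token


def fits_in_memory(memory_gb, total_params_billion, bytes_per_param):
    """True if the full set of weights fits in the given memory (ignores KV cache)."""
    return total_params_billion * bytes_per_param <= memory_gb


# Bandwidth figures as quoted in the thread, in GB/s.
BANDWIDTH_GB_S = {
    "Strix Halo": 256.0,
    "accelerator (lower figure)": 448.0,
    "accelerator (higher figure)": 512.0,
    "5090 (thread's 'roughly 8x')": 8 * 256.0,
}

# Hypothetical models (assumptions, not taken from the thread).
BYTES_PER_PARAM = 0.5   # 4-bit quantized weights
DENSE_PARAMS_B = 70.0   # dense model: all 70B parameters are read per token
MOE_TOTAL_B = 100.0     # MoE model: 100B parameters must be resident...
MOE_ACTIVE_B = 10.0     # ...but only 10B are read per token

for name, bw in BANDWIDTH_GB_S.items():
    dense = max_tokens_per_second(bw, DENSE_PARAMS_B, BYTES_PER_PARAM)
    moe = max_tokens_per_second(bw, MOE_ACTIVE_B, BYTES_PER_PARAM)
    print(f"{name:<30} dense ~{dense:6.1f} tok/s   MoE ~{moe:6.1f} tok/s")
```

Under these assumed numbers the hypothetical MoE model needs 50 GB resident, which `fits_in_memory(128, MOE_TOTAL_B, BYTES_PER_PARAM)` accepts for a 128 GB Strix Halo system, while a card with less memory than that would have to offload part of the model. Real throughput will be lower than these bounds, since compute, KV-cache reads, and achievable (rather than peak) bandwidth are all ignored.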
  • fxtentacle 9 days ago
    "Both Blackhole cards offer roughly half the memory bandwidth of a used RTX 3090"

    And that means I have no idea what these cards could be useful for. They are more expensive, have roughly the same VRAM, but are much slower.

    • Carstairs 9 days ago
      These have the advantage of not being 5 years old with no warranty.