1 point by gcollard-2 hours ago | 1 comment
  • gcollard-2 hours ago
    Testing this hardware LLM (Llama 3.1 8B on a chip), I get ~16k tokens per second.

    With frontier models plateauing, I’ve been convinced AI will end up like bitcoin mining, and that NVIDIA’s general-purpose GPUs will be replaced by model-specific chips.

    Glad to see someone innovating in this space.