3 points by truetotosse 3 hours ago | 2 comments
  • truetotosse 3 hours ago
    Hi HN, I built llmrequirements.com to answer "what GPU should I buy for local models?" for myself without leaving the site to google something.

    It's a static site that maps every model in the open-weights ecosystem (Llama, Qwen, Mistral, DeepSeek, GLM, Kimi, Flux, Wan, ...) to the hardware that can actually run it, with three numbers per build/model pair sourced from llama.cpp / vLLM benchmarks rather than vendor marketing:

      - tg/s (single-stream generation)
      - pp (prefill / prompt-processing throughput)
      - TTFT at 100k-token context, null when the KV cache won't fit
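
    To make the "null when the KV cache won't fit" rule concrete, here is a rough sketch of the arithmetic. It is not the site's actual code: the function names, the fp16 cache assumption, and the example numbers (a 32-layer model with 8 KV heads of dimension 128, 16 GB of weights, 2000 tok/s prefill) are all illustrative assumptions. It also approximates TTFT as context length divided by prefill throughput, which ignores the fact that prefill speed usually drops as context grows.

```python
from typing import Optional


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_tokens: int, bytes_per_elem: int = 2) -> int:
    """Size of a standard attention KV cache for a single sequence.

    The factor 2 covers the separate K and V tensors stored per layer.
    bytes_per_elem=2 assumes an fp16/bf16 cache (an assumption, not a site setting).
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_tokens * bytes_per_elem


def ttft_seconds(context_tokens: int, pp_tokens_per_s: float,
                 weights_bytes: int, kv_bytes: int,
                 memory_bytes: int) -> Optional[float]:
    """Rough time-to-first-token, or None when weights + KV cache exceed memory."""
    if weights_bytes + kv_bytes > memory_bytes:
        return None  # the model cannot hold this context on this build
    # Simplification: treats prefill throughput as constant over the whole prompt.
    return context_tokens / pp_tokens_per_s


# Hypothetical example values, chosen only to illustrate the calculation.
kv = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, context_tokens=100_000)
small_gpu = ttft_seconds(100_000, 2000.0, 16_000_000_000, kv, 24_000_000_000)
large_gpu = ttft_seconds(100_000, 2000.0, 16_000_000_000, kv, 48_000_000_000)
print(kv, small_gpu, large_gpu)
```

    With these made-up inputs the cache alone is about 13.1 GB, so a 24 GB card returns None at 100k tokens while a 48 GB card gets a number.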
    
    Hardware ranges from a Framework laptop up to an 8x H200 rack; software-stack maturity and extensibility get explicit 0-5 scores.

    The data is exported to a public repo, so anyone can PR a correction and the diff is reviewable.

    The project started with the picker as its landing page, but it now also has a "State of Local AI" page (SOLAI). I added it because local use cases have largely converged on coding, agents, and personal assistants, and the models that can handle those use cases are well defined too.
