5 points by limondas an hour ago | 1 comment
  • denn-gubsky an hour ago
    Try the qwen3-coder or qwen3-coder-next models, which should fit your configuration. These are mixture-of-experts (MoE) models, so it may be possible to load only the active experts onto the GPU.