2 points by TechPreacher 6 hours ago | 3 comments
  • orbanlevi 6 hours ago
    I have one DGX Spark and run models with vLLM too. Out of curiosity, why not use llama.cpp, TensorRT-LLM, or one of the other alternatives?
  • awedisee 3 hours ago
    Oh thank god. Finally a man of the people who can show us how to optimize 10k worth of equipment.

    Because we all have at least two of these. Shout out to OP!!

  • TechPreacher 6 hours ago
    [flagged]