5 points by saivegasena 2 hours ago | 1 comment
  • saivegasena 2 hours ago
    Hi everyone! This weekend I shipped a quant of the Flash-Base model in the DeepSeek V4 series. I posted all the quality, throughput, and verification metrics in the repo:

    https://huggingface.co/EnsueAI/DeepSeek-V4-Flash-Base-INT4

    lmk what you think!

    It packs the full 284B params into 157 GiB and runs at the same speed as FP8. I ran most of my tests on 4 H100s with about 320 GB of VRAM in total.
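For anyone who wants to sanity-check the size numbers, here is a rough back-of-the-envelope sketch. It only uses the figures quoted above (284B params, 157 GiB, 4 GPUs); the 80 GB per H100 is an assumption that matches the "about 320 GB" total, and the function names are made up for illustration.

```python
GIB = 2**30  # bytes in one gibibyte
GB = 10**9   # bytes in one (decimal) gigabyte


def bits_per_param(size_gib: float, n_params: float) -> float:
    """Average storage cost per parameter, in bits."""
    return size_gib * GIB * 8 / n_params


def total_vram_gb(n_gpus: int, gb_per_gpu: float = 80.0) -> float:
    """Combined VRAM across GPUs (80 GB per H100 is an assumption)."""
    return n_gpus * gb_per_gpu


bpp = bits_per_param(157, 284e9)    # figures from the post
weights_gb = 157 * GIB / GB         # checkpoint size in decimal GB
vram_gb = total_vram_gb(4)          # 4 x 80 GB
headroom_gb = vram_gb - weights_gb  # left for KV cache, activations, etc.

print(f"{bpp:.2f} bits per parameter")
print(f"{weights_gb:.1f} GB of weights, {headroom_gb:.1f} GB headroom on {vram_gb:.0f} GB")
```

The average lands a bit above 4 bits per parameter. That would be consistent with quantization scales or some layers kept at higher precision, but that is a guess; the repo is the place to check.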

    • mandeepj 2 hours ago
      Would you mind sharing your bill?
      • saivegasena an hour ago
        I built it autonomously using my company's AI research agents, so it was technically free for me. It took 80 experiments and 49 hours in total. I checked rental rates, which were $6.771 an hour, so ~$350, which seems pretty worth it imo.
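A quick sketch of the rental math, using only the numbers in this comment (49 hours, $6.771 an hour, 80 experiments). It assumes the hourly rate covers the whole multi-GPU machine and that every hour is billed at that rate; the helper name is hypothetical.

```python
def rental_cost(hours: float, rate_per_hour: float) -> float:
    """Total cost assuming a flat hourly rate for the whole machine."""
    return hours * rate_per_hour


cost = rental_cost(49, 6.771)  # figures from the comment
per_experiment = cost / 80     # averaged over the 80 experiments

print(f"total: ${cost:.2f}")
print(f"per experiment: ${per_experiment:.2f}")
```

The straight product comes out slightly under the ~$350 quoted, so that figure reads as rounded up.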