First DeepSeek V4 Flash-Base-Int4 Quant (huggingface.co)

5 points by saivegasena 2 hours ago | 1 comment

saivegasena 2 hours ago
Hi everyone! This weekend I shipped a quant for the Flash-Base model in the DeepSeek V4 series. I posted all the quality, throughput, and verification metrics in the repo:
https://huggingface.co/EnsueAI/DeepSeek-V4-Flash-Base-INT4
lmk what you think!
It is the full 284B params in 157 GiB at full FP8 speed. I ran most of my tests on 4 H100s with about 320 GB of VRAM.
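As a rough sanity check on those numbers, here is a minimal sketch that converts the quoted checkpoint size into an effective bit width per parameter and estimates what is left of the 320 GB. It assumes 80 GB per H100 (which matches 4 cards at about 320 GB) and treats the whole 157 GiB as weights; the variable names are illustrative and not taken from the repo.

```python
# Figures quoted in the post; everything else here is an assumption.
PARAMS = 284e9          # 284B parameters
CHECKPOINT_GIB = 157    # quantized checkpoint size in GiB
NUM_GPUS = 4
GPU_VRAM_GB = 80        # assumption: 80 GB H100s, giving the ~320 GB quoted

# GiB is 2**30 bytes; GB is 10**9 bytes.
checkpoint_bytes = CHECKPOINT_GIB * 2**30

# Effective storage cost per parameter, in bits.
bits_per_param = checkpoint_bytes * 8 / PARAMS

# VRAM left over once the weights are loaded (decimal GB).
total_vram_gb = NUM_GPUS * GPU_VRAM_GB
headroom_gb = total_vram_gb - checkpoint_bytes / 1e9

print(f"{bits_per_param:.2f} bits per parameter")
print(f"{headroom_gb:.0f} GB of VRAM left after loading weights")
```

This comes out to a bit under 5 bits per parameter. The amount above a flat 4 bits would be consistent with quantization scales or some tensors being kept at higher precision, but that is a guess; the repo is the place to confirm it.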
- mandeepj 2 hours ago
  Would you mind sharing your bill?
  - saivegasena an hour ago
    I built it autonomously using my company's AI research agents, so it was technically free for me. It took 80 experiments and 49 hours in total. I checked rental rates, which were $6.771 an hour, so roughly $330, which seemed pretty worth it imo.
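For anyone who wants to redo the estimate with their own rental rate, here is a small sketch of the arithmetic. It assumes the $6.771 hourly rate covers the whole multi-GPU machine and that all 49 hours were billed at that rate; neither is confirmed in the thread.

```python
# Figures quoted in the reply above.
HOURS = 49
EXPERIMENTS = 80
RATE_USD_PER_HOUR = 6.771  # assumption: one rate for the whole machine

# Total rental cost if every hour were billed at the quoted rate.
cost_usd = HOURS * RATE_USD_PER_HOUR

# Average cost of a single experiment.
cost_per_experiment_usd = cost_usd / EXPERIMENTS

print(f"total: ${cost_usd:.2f}")
print(f"per experiment: ${cost_per_experiment_usd:.2f}")
```

With these inputs the total works out to roughly $332, or a little over $4 per experiment.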