4 points by danebalia 15 hours ago | 1 comment
  • bigyabai 15 hours ago
    > The 35B Trick (Your SSD Is the New GPU Memory)

    Wave "bye bye" to your write cycles.

    • RobMurray 13 hours ago
      Why? It's mostly reads; the weights are static.
      • bigyabai 12 hours ago
        llama.cpp's own access pattern is mostly reads, but macOS itself will swap hard when 10-14 GB of memory is paged for LLM inference. Dense models especially would thrash the compressed swap.
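
The distinction the thread is circling can be sketched in a few lines: llama.cpp memory-maps the model file read-only, so under memory pressure the kernel can simply drop those pages and re-read them from the SSD, rather than writing them to swap. A minimal illustration (the file path and size here are stand-ins, not real model weights):

```python
import mmap
import os
import tempfile

# Stand-in "weights" file -- real GGUF model files are many GB.
path = os.path.join(tempfile.mkdtemp(), "weights.bin")
with open(path, "wb") as f:
    f.write(os.urandom(1 << 20))  # 1 MiB of fake static weights

# Map the file read-only, roughly as llama.cpp does by default.
# Because the pages are file-backed and never dirtied, evicting them
# costs no swap writes: the kernel re-reads from the file on demand.
with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    first_bytes = mm[:16]  # touching a page faults it in (a read from SSD)
    mm.close()
```

The write-cycle concern in the thread applies to the *other* memory in play: anonymous pages (activations, KV cache, other apps) that macOS compresses and, past a threshold, writes to the swap file. Those are dirty pages with no backing file, so evicting them is a write.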