1 point by HFerrahoglu 8 hours ago | 1 comment
  • akssassin907 7 hours ago
    Running local LLMs on Apple Silicon has gotten surprisingly capable. The M-series chips handle models that used to require expensive GPU setups, so tools like this that actually speak to that hardware are welcome.

    The quantization comparison is the feature I'd use most. It sounds simple, but in practice nobody wants to dig through benchmarks just to figure out whether Q8 is worth the extra memory over Q4 on their specific machine.
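    For anyone curious, the napkin-math version of that comparison is easy to sketch. The bits-per-weight figures below are rough numbers for llama.cpp-style quants (Q4_K_M lands near 4.8 bpw, Q8_0 near 8.5 once you count scales), and the overhead fudge factor for KV cache and runtime buffers is my own guess, not anything this tool necessarily does:

        def estimate_model_memory_gb(params_billion, bits_per_weight, overhead_frac=0.2):
            # weights = params * bits-per-weight / 8 bits-per-byte,
            # plus a rough fudge factor for KV cache and runtime buffers
            weights_gb = params_billion * 1e9 * bits_per_weight / 8 / 1e9
            return weights_gb * (1 + overhead_frac)

        for label, bpw in [("Q4_K_M", 4.8), ("Q8_0", 8.5)]:
            print(f"7B @ {label}: ~{estimate_model_memory_gb(7, bpw):.1f} GB")

    The value a tool can add on top of that napkin math is measuring speed and quality per quant on your actual chip, which is exactly the part nobody wants to benchmark by hand.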

    Does it account for what's already running when estimating how much your machine can handle? The usable headroom shifts a lot depending on what else has memory tied up.
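    The naive check would look something like the sketch below. psutil is an assumption on my part, purely for illustration, and the 8.9 GB constant is the Q8 7B figure from the napkin math above; a native Mac tool would presumably query memory pressure through the OS directly instead:

        import psutil  # assumed dependency, not necessarily what the tool uses

        EST_MODEL_GB = 8.9  # Q8 7B estimate from the sketch above

        # RAM available right now without swapping, in GB
        avail_gb = psutil.virtual_memory().available / 1e9
        print(f"available: {avail_gb:.1f} GB, model wants ~{EST_MODEL_GB} GB")
        if avail_gb < EST_MODEL_GB:
            print("tight fit: expect swapping, or drop to Q4")

    A point-in-time check like this still goes stale the moment another app grabs memory, which is why I'd want the estimate to be live rather than computed once at launch.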