Hacker News
Re-quantizing a local LLM 14x faster by skipping the tensors that didn't change (andreaborio.substack.com)
6 points by andreaborio 6 hours ago | 1 comment
andreaborio 6 hours ago [dead]