Hacker News
New KV cache compaction technique cuts LLM memory 50x without accuracy loss (venturebeat.com)
8 points by mellosouls 8 hours ago | 1 comment
androiddrew 4 hours ago
I hope this is real.