Hacker News
RIS-Kernel: Running 64k context LLMs on CPU via sparse attention (github.com)
2 points by santosardr 7 hours ago | 1 comment
santosardr 7 hours ago [flagged]