Hacker News
Why LLM decode is memory-bound, not compute-bound (github.com)
4 points by harshuljain13 2 hours ago | 1 comment
harshuljain13 2 hours ago [flagged]