Decoupling Compute and Memory for Async GPUs
5 points by yiyingzhang an hour ago | 1 comment
bobbyzhu2008 an hour ago
67% less kernel code is the more interesting number here — Hopper's async capabilities have been underutilized largely because the programming model is painful. Curious how it handles cases where compute and memory phases aren't cleanly separable.