Hacker News
The Bayesian Geometry of Transformer Attention (arxiv.org)
4 points by samwillis a month ago | 1 comment
samwillis a month ago
Higher-level overview and links to the other related papers:
https://medium.com/@vishalmisra/attention-is-bayesian-infere...